Computes the cosine similarity between labels and predictions.
Inherits From: Loss
tf.keras.losses.CosineSimilarity(
    axis=-1,
    reduction=losses_utils.ReductionV2.AUTO,
    name='cosine_similarity'
)
The loss is the negative of the cosine similarity, so it is a number
between -1 and 1. A value of 0 indicates orthogonality, values closer to -1
indicate greater similarity, and values closer to 1 indicate greater
dissimilarity. This makes it usable as a loss function in a setting
where you try to maximize the proximity between predictions and targets.
If either y_true or y_pred is a zero vector, the cosine similarity is 0
regardless of the proximity between predictions and targets.
loss = -sum(l2_norm(y_true) * l2_norm(y_pred))
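As a rough illustration of the formula above, the following pure-Python sketch computes the per-sample loss for a batch of row vectors. It is not the TensorFlow implementation: the helper names l2_normalize and cosine_similarity_loss are hypothetical, and returning a zero vector unchanged is an assumption chosen to match the documented zero-vector behavior.

```python
import math


def l2_normalize(vec):
    """Scale vec to unit L2 norm (hypothetical helper).

    Assumption: a zero vector is returned as all zeros, so that its
    cosine similarity with any other vector comes out as 0.
    """
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def cosine_similarity_loss(y_true, y_pred):
    """Return one loss per row: -sum(l2_norm(y_true) * l2_norm(y_pred))."""
    losses = []
    for t_row, p_row in zip(y_true, y_pred):
        t_unit = l2_normalize(t_row)
        p_unit = l2_normalize(p_row)
        losses.append(-sum(t * p for t, p in zip(t_unit, p_unit)))
    return losses


# Orthogonal rows give a loss of 0; identical directions give about -1.
per_sample = cosine_similarity_loss([[0., 1.], [1., 1.]],
                                    [[1., 0.], [1., 1.]])

# A zero vector on either side yields a loss of 0.
zero_case = cosine_similarity_loss([[0., 0.]], [[3., 4.]])
```

For these inputs the per-sample losses are approximately 0 and -1, so their mean is approximately -0.5.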
Standalone usage:

    >>> import tensorflow as tf
    >>> y_true = [[0., 1.], [1., 1.]]
    >>> y_pred = [[1., 0.], [1., 1.]]
    >>> # Using 'auto'/'sum_over_batch_size' reduction type.
    >>> cosine_loss = tf.keras.losses.CosineSimilarity(axis=1)
    >>> # l2_norm(y_true) = [[0., 1.], [1./1.414, 1./1.414]]
    >>> # l2_norm(y_pred) = [[1., 0.], [1./1.414, 1./1.414]]
    >>> # l2_norm(y_true) . l2_norm(y_pred) = [[0., 0.], [0.5, 0.5]]
    >>> # loss = -mean(sum(l2_norm(y_true) . l2_norm(y_pred), axis=1))
    >>> #      = -((0. + 0.) + (0.5 + 0.5)) / 2
    >>> cosine_loss(y_true, y_pred).numpy()
    -0.5

    >>> # Calling with 'sample_weight'.
    >>> cosine_loss(y_true, y_pred, sample_weight=[0.8, 0.2]).numpy()
    -0.0999

    >>> # Using 'sum' reduction type.
    >>> cosine_loss = tf.keras.losses.CosineSimilarity(
    ...     axis=1, reduction=tf.keras.losses.Reduction.SUM)
    >>> cosine_loss(y_true, y_pred).numpy()
    -0.999

    >>> # Using 'none' reduction type.
    >>> cosine_loss = tf.keras.losses.CosineSimilarity(
    ...     axis=1, reduction=tf.keras.losses.Reduction.NONE)
    >>> cosine_loss(y_true, y_pred).numpy()
    array([-0., -0.999], dtype=float32)

Usage with the compile() API:
model.compile(optimizer='sgd',
              loss=tf.keras.losses.CosineSimilarity(axis=1))
| Args | |
|---|---|
| axis | The axis along which the cosine similarity is computed (the features axis). Defaults to -1. |
| reduction | Type of tf.keras.losses.Reduction to apply to loss. Default value is AUTO. AUTO indicates that the reduction option will be determined by the usage context. For almost all cases this defaults to SUM_OVER_BATCH_SIZE. When used under a tf.distribute.Strategy, except via Model.compile() and Model.fit(), using AUTO or SUM_OVER_BATCH_SIZE will raise an error. Please see this custom training tutorial for more details. |
| name | Optional name for the instance. |
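
To make the reduction options concrete, here is a minimal pure-Python sketch of how the three explicit reduction types combine a list of per-sample losses. The function reduce_losses is a hypothetical helper written for illustration, not part of TensorFlow, and it only handles a flat list of losses. The string names mirror the 'none', 'sum', and 'sum_over_batch_size' reduction types used in the standalone example.

```python
def reduce_losses(per_sample, reduction="sum_over_batch_size"):
    """Apply a Keras-style reduction to a flat list of per-sample losses.

    Hypothetical helper for illustration only.
    """
    if reduction == "none":
        return list(per_sample)          # keep one loss per sample
    total = sum(per_sample)
    if reduction == "sum":
        return total                     # add up all per-sample losses
    if reduction == "sum_over_batch_size":
        return total / len(per_sample)   # average over the batch
    raise ValueError(f"Unknown reduction: {reduction!r}")


# Per-sample cosine losses from the standalone example, rounded.
per_sample = [-0.0, -1.0]

unreduced = reduce_losses(per_sample, "none")
summed = reduce_losses(per_sample, "sum")
averaged = reduce_losses(per_sample)     # default: sum_over_batch_size
```

With these inputs, the 'sum' reduction gives -1.0 and the 'sum_over_batch_size' reduction gives -0.5, in line with the outputs of the standalone example up to floating-point rounding.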
Methods
from_config
@classmethod
from_config(config)
Instantiates a Loss from its config (output of get_config()).
| Args | |
|---|---|
| config | Output of get_config(). |

| Returns | |
|---|---|
| A keras.losses.Loss instance. | |
get_config
get_config()
Returns the config dictionary for a Loss instance.
__call__
__call__(
y_true, y_pred, sample_weight=None
)
Invokes the Loss instance.
| Args | |
|---|---|
| y_true | Ground truth values. shape = [batch_size, d0, .. dN], except sparse loss functions such as sparse categorical crossentropy where shape = [batch_size, d0, .. dN-1]. |
| y_pred | The predicted values. shape = [batch_size, d0, .. dN]. |
| sample_weight | Optional sample_weight acts as a coefficient for the loss. If a scalar is provided, then the loss is simply scaled by the given value. If sample_weight is a tensor of size [batch_size], then the total loss for each sample of the batch is rescaled by the corresponding element in the sample_weight vector. If the shape of sample_weight is [batch_size, d0, .. dN-1] (or can be broadcast to this shape), then each loss element of y_pred is scaled by the corresponding value of sample_weight. (Note on dN-1: all loss functions reduce by 1 dimension, usually axis=-1.) |

| Returns | |
|---|---|
| Weighted loss float Tensor. If reduction is NONE, this has shape [batch_size, d0, .. dN-1]; otherwise, it is scalar. (Note dN-1 because all loss functions reduce by 1 dimension, usually axis=-1.) | |

| Raises | |
|---|---|
| ValueError | If the shape of sample_weight is invalid. |
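
The effect of sample_weight can be sketched in a few lines of pure Python. This is a simplified illustration under stated assumptions, not TensorFlow code: weighted_loss is a hypothetical helper, it handles only a scalar weight or one weight per sample, and it always applies the 'sum_over_batch_size' reduction. Note that the weighted sum is divided by the batch size, not by the sum of the weights.

```python
def weighted_loss(per_sample, sample_weight=None):
    """Scale per-sample losses by sample_weight, then average over the batch.

    Hypothetical helper: supports None, a scalar, or one weight per sample.
    """
    n = len(per_sample)
    if sample_weight is None:
        weights = [1.0] * n
    elif isinstance(sample_weight, (int, float)):
        weights = [float(sample_weight)] * n    # scalar scales every loss
    else:
        weights = [float(w) for w in sample_weight]
        if len(weights) != n:
            # Simplified stand-in for the documented ValueError.
            raise ValueError("sample_weight must have one entry per sample")
    weighted = [loss * w for loss, w in zip(per_sample, weights)]
    return sum(weighted) / n                    # divide by batch size


# Per-sample cosine losses from the standalone example, rounded.
per_sample = [-0.0, -1.0]

unweighted = weighted_loss(per_sample)
per_sample_weighted = weighted_loss(per_sample, [0.8, 0.2])
scaled = weighted_loss(per_sample, 2.0)
```

With weights [0.8, 0.2] the weighted losses are 0 and -0.2, so the result is about -0.1, which agrees with the -0.0999 shown in the standalone example up to rounding.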