Warning: This project is deprecated. TensorFlow Addons has stopped development; the project will only provide minimal maintenance releases until May 2024. See the full announcement on GitHub.
# tfa.seq2seq.sequence_loss
Computes the weighted cross-entropy loss for a sequence of logits.
```
tfa.seq2seq.sequence_loss(
    logits: tfa.types.TensorLike,
    targets: tfa.types.TensorLike,
    weights: tfa.types.TensorLike,
    average_across_timesteps: bool = True,
    average_across_batch: bool = True,
    sum_over_timesteps: bool = False,
    sum_over_batch: bool = False,
    softmax_loss_function: Optional[Callable] = None,
    name: Optional[str] = None
) -> tf.Tensor
```
Depending on the values of `average_across_timesteps` / `sum_over_timesteps` and `average_across_batch` / `sum_over_batch`, the returned Tensor will have rank 0, 1, or 2, as these arguments reduce the cross-entropy at each target, which has shape `[batch_size, sequence_length]`, over their respective dimensions. For example, if `average_across_timesteps` is `True` and `average_across_batch` is `False`, then the returned Tensor will have shape `[batch_size]`.
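As an illustration, here is a minimal sketch of that shape behavior (the tensor sizes are made up for the example):

```python
import tensorflow as tf
import tensorflow_addons as tfa

batch_size, sequence_length, num_decoder_symbols = 4, 7, 10  # illustrative sizes

logits = tf.random.normal([batch_size, sequence_length, num_decoder_symbols])
targets = tf.random.uniform(
    [batch_size, sequence_length], maxval=num_decoder_symbols, dtype=tf.int32)
weights = tf.ones([batch_size, sequence_length])

# Averaging over timesteps but not over the batch leaves one loss per example.
per_example_loss = tfa.seq2seq.sequence_loss(
    logits, targets, weights,
    average_across_timesteps=True,
    average_across_batch=False)
print(per_example_loss.shape)  # (4,), i.e. [batch_size]
```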
Note that `average_across_timesteps` and `sum_over_timesteps` cannot both be `True` at the same time, and likewise for `average_across_batch` and `sum_over_batch`.
The recommended loss reduction in TF 2.x has changed to `sum_over_*` instead of a weighted average. Users are encouraged to use `sum_over_timesteps` and `sum_over_batch` for reduction, as in the sketch below.
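A minimal sketch of the recommended reduction; since the `average_across_*` flags default to `True` and cannot be combined with their `sum_over_*` counterparts, they must be disabled explicitly:

```python
import tensorflow as tf
import tensorflow_addons as tfa

logits = tf.random.normal([4, 7, 10])  # [batch_size, sequence_length, num_decoder_symbols]
targets = tf.random.uniform([4, 7], maxval=10, dtype=tf.int32)
weights = tf.ones([4, 7])

# The average_across_* flags default to True, so turn them off before
# enabling the sum_over_* reductions.
loss = tfa.seq2seq.sequence_loss(
    logits, targets, weights,
    average_across_timesteps=False,
    average_across_batch=False,
    sum_over_timesteps=True,
    sum_over_batch=True)
print(loss.shape)  # () -- a rank-0 (scalar) loss
```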
| Args | |
|------|---|
| `logits` | A Tensor of shape `[batch_size, sequence_length, num_decoder_symbols]` and dtype float. The logits correspond to the prediction across all classes at each timestep. |
| `targets` | A Tensor of shape `[batch_size, sequence_length]` and dtype int. The target represents the true class at each timestep. |
| `weights` | A Tensor of shape `[batch_size, sequence_length]` and dtype float. `weights` constitutes the weighting of each prediction in the sequence. When using `weights` as masking, set all valid timesteps to 1 and all padded timesteps to 0, e.g. a mask returned by `tf.sequence_mask` (see the sketch after this table). |
| `average_across_timesteps` | If set, sum the cost across the sequence dimension and divide the cost by the total label weight across timesteps. |
| `average_across_batch` | If set, sum the cost across the batch dimension and divide the returned cost by the batch size. |
| `sum_over_timesteps` | If set, sum the cost across the sequence dimension and divide by the size of the sequence. Note that any element with 0 weights will be excluded from the size calculation. |
| `sum_over_batch` | If set, sum the cost across the batch dimension and divide the total cost by the batch size. Note that any element with 0 weights will be excluded from the size calculation. |
| `softmax_loss_function` | Function `(labels, logits) -> loss-batch` to be used instead of the standard softmax (the default if this is `None`). Note that to avoid confusion, the function is required to accept named arguments (see the sketch after this table). |
| `name` | Optional name for this operation; defaults to `"sequence_loss"`. |
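A minimal sketch combining a `tf.sequence_mask` weight mask with a custom `softmax_loss_function`. The label-smoothing function below is hypothetical (its name, the 0.1 smoothing factor, and the tensor sizes are made up for illustration); what matters is that it accepts `labels` and `logits` as named arguments and returns one loss value per prediction:

```python
import tensorflow as tf
import tensorflow_addons as tfa

logits = tf.random.normal([3, 6, 12])  # [batch_size, sequence_length, num_decoder_symbols]
targets = tf.random.uniform([3, 6], maxval=12, dtype=tf.int32)

# Mask the padded timesteps: example i is lengths[i] steps long, the rest is padding.
lengths = tf.constant([6, 4, 2])
weights = tf.sequence_mask(lengths, maxlen=6, dtype=tf.float32)

def smoothed_crossent(labels, logits):
    # Hypothetical replacement for the default sparse softmax cross-entropy:
    # cross-entropy against label-smoothed one-hot targets (factor 0.1).
    num_classes = tf.shape(logits)[-1]
    onehot = tf.one_hot(labels, depth=num_classes)
    smoothed = onehot * 0.9 + 0.1 / tf.cast(num_classes, tf.float32)
    return tf.nn.softmax_cross_entropy_with_logits(labels=smoothed, logits=logits)

loss = tfa.seq2seq.sequence_loss(
    logits, targets, weights, softmax_loss_function=smoothed_crossent)
```

Because the padded timesteps carry weight 0, they contribute nothing to the loss, and under the `sum_over_*` reductions they are also excluded from the divisor.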
| Returns |
|---------|
| A float Tensor of rank 0, 1, or 2 depending on the `average_across_timesteps` and `average_across_batch` arguments. By default, it has rank 0 (scalar) and is the weighted average cross-entropy (log-perplexity) per symbol. |
| Raises | |
|--------|---|
| `ValueError` | `logits` does not have 3 dimensions, `targets` does not have 2 dimensions, or `weights` does not have 2 dimensions. |