A common use case is to use this method for training and calculate the full
sigmoid loss for evaluation or inference. In this case, you must set
`partition_strategy="div"` for the two losses to be consistent, as in the
following example:
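```python
if mode == "train":
  loss = tf.nn.nce_loss(
      weights=weights,
      biases=biases,
      labels=labels,
      inputs=inputs,
      ...,
      partition_strategy="div")
elif mode == "eval":
  logits = tf.matmul(inputs, tf.transpose(weights))
  logits = tf.nn.bias_add(logits, biases)
  labels_one_hot = tf.one_hot(labels, n_classes)
  loss = tf.nn.sigmoid_cross_entropy_with_logits(
      labels=labels_one_hot,
      logits=logits)
  loss = tf.reduce_sum(loss, axis=1)
```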
#### Args

* `weights`: A `Tensor` of shape `[num_classes, dim]`, or a list of `Tensor`
  objects whose concatenation along dimension 0 has shape
  `[num_classes, dim]`. The (possibly-partitioned) class embeddings.
* `biases`: A `Tensor` of shape `[num_classes]`. The class biases.
* `labels`: A `Tensor` of type `int64` and shape `[batch_size, num_true]`.
  The target classes.
* `inputs`: A `Tensor` of shape `[batch_size, dim]`. The forward activations
  of the input network.
* `num_sampled`: An `int`. The number of negative classes to randomly sample
  per batch. This single sample of negative classes is evaluated for each
  element in the batch.
* `num_classes`: An `int`. The number of possible classes.
* `num_true`: An `int`. The number of target classes per training example.
* `sampled_values`: A tuple of (`sampled_candidates`, `true_expected_count`,
  `sampled_expected_count`) returned by a `*_candidate_sampler` function.
  If `None`, defaults to `log_uniform_candidate_sampler`.
* `remove_accidental_hits`: A `bool`. Whether to remove "accidental hits"
  where a sampled class equals one of the target classes. If set to `True`,
  this is a "Sampled Logistic" loss instead of NCE, and we are learning to
  generate log-odds instead of log probabilities. See the
  [Candidate Sampling Algorithms Reference](https://www.tensorflow.org/extras/candidate_sampling.pdf).
  Defaults to `False`.
* `partition_strategy`: A string specifying the partitioning strategy,
  relevant if `len(weights) > 1`. Currently `"div"` and `"mod"` are
  supported. Defaults to `"mod"`. See `tf.nn.embedding_lookup` for more
  details.
* `name`: A name for the operation (optional).
#### Returns

A `batch_size` 1-D tensor of per-example NCE losses.
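To see how the arguments fit together, here is a minimal training-side sketch. The sizes (`num_classes`, `dim`, `batch_size`, `num_sampled`) and the variable initializers are illustrative assumptions, not values prescribed by this API:

```python
import tensorflow as tf  # TF 1.x, matching this page's version

# Illustrative sizes -- assumptions for the sketch, not API requirements.
num_classes, dim, batch_size, num_sampled = 10000, 128, 32, 64

# Class embeddings and biases, shaped as described in Args.
weights = tf.Variable(
    tf.random.truncated_normal([num_classes, dim], stddev=0.05))
biases = tf.Variable(tf.zeros([num_classes]))

# Stand-ins for the forward activations and target classes.
inputs = tf.random.normal([batch_size, dim])
labels = tf.random.uniform([batch_size, 1], maxval=num_classes,
                           dtype=tf.int64)

per_example_loss = tf.nn.nce_loss(
    weights=weights,
    biases=biases,
    labels=labels,             # [batch_size, num_true], int64
    inputs=inputs,             # [batch_size, dim]
    num_sampled=num_sampled,
    num_classes=num_classes,
    num_true=1,
    partition_strategy="div")  # "div" keeps training consistent with a
                               # full-sigmoid evaluation loss (see above)

loss = tf.reduce_mean(per_example_loss)  # scalar loss to minimize
```

At evaluation time, the full sigmoid loss from the earlier example can then be computed against the same `weights` and `biases`.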
[null,null,["Last updated 2022-10-21 UTC."],[],[],null,["# tf.nn.nce_loss\n\n\u003cbr /\u003e\n\n|---------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------|\n| [TensorFlow 2 version](/api_docs/python/tf/nn/nce_loss) | [View source on GitHub](https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/python/ops/nn_impl.py#L1917-L2025) |\n\nComputes and returns the noise-contrastive estimation training loss.\n\n#### View aliases\n\n\n**Compat aliases for migration**\n\nSee\n[Migration guide](https://www.tensorflow.org/guide/migrate) for\nmore details.\n\n[`tf.compat.v1.nn.nce_loss`](/api_docs/python/tf/compat/v1/nn/nce_loss)\n\n\u003cbr /\u003e\n\n tf.nn.nce_loss(\n weights, biases, labels, inputs, num_sampled, num_classes, num_true=1,\n sampled_values=None, remove_accidental_hits=False, partition_strategy='mod',\n name='nce_loss'\n )\n\nSee [Noise-contrastive estimation: A new estimation principle for\nunnormalized statistical\nmodels](https://arxiv.org/abs/1806.03664).\nAlso see our [Candidate Sampling Algorithms\nReference](https://www.tensorflow.org/extras/candidate_sampling.pdf)\n\nA common use case is to use this method for training, and calculate the full\nsigmoid loss for evaluation or inference. In this case, you must set\n`partition_strategy=\"div\"` for the two losses to be consistent, as in the\nfollowing example: \n\n if mode == \"train\":\n loss = tf.nn.nce_loss(\n weights=weights,\n biases=biases,\n labels=labels,\n inputs=inputs,\n ...,\n partition_strategy=\"div\")\n elif mode == \"eval\":\n logits = tf.matmul(inputs, tf.transpose(weights))\n logits = tf.nn.bias_add(logits, biases)\n labels_one_hot = tf.one_hot(labels, n_classes)\n loss = tf.nn.sigmoid_cross_entropy_with_logits(\n labels=labels_one_hot,\n logits=logits)\n loss = tf.reduce_sum(loss, axis=1)\n\n| **Note:** By default this uses a log-uniform (Zipfian) distribution for sampling, so your labels must be sorted in order of decreasing frequency to achieve good results. For more details, see [`tf.random.log_uniform_candidate_sampler`](../../tf/random/log_uniform_candidate_sampler).\n| **Note:** In the case where `num_true` \\\u003e 1, we assign to each target class the target probability 1 / `num_true` so that the target probabilities sum to 1 per-example.\n| **Note:** It would be useful to allow a variable number of target classes per example. We hope to provide this functionality in a future release. For now, if you have a variable number of target classes, you can pad them out to a constant number by either repeating them or by padding with an otherwise unused class.\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Args ---- ||\n|--------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `weights` | A `Tensor` of shape `[num_classes, dim]`, or a list of `Tensor` objects whose concatenation along dimension 0 has shape \\[num_classes, dim\\]. The (possibly-partitioned) class embeddings. |\n| `biases` | A `Tensor` of shape `[num_classes]`. The class biases. |\n| `labels` | A `Tensor` of type `int64` and shape `[batch_size, num_true]`. 
The target classes. |\n| `inputs` | A `Tensor` of shape `[batch_size, dim]`. The forward activations of the input network. |\n| `num_sampled` | An `int`. The number of negative classes to randomly sample per batch. This single sample of negative classes is evaluated for each element in the batch. |\n| `num_classes` | An `int`. The number of possible classes. |\n| `num_true` | An `int`. The number of target classes per training example. |\n| `sampled_values` | a tuple of (`sampled_candidates`, `true_expected_count`, `sampled_expected_count`) returned by a `*_candidate_sampler` function. (if None, we default to `log_uniform_candidate_sampler`) |\n| `remove_accidental_hits` | A `bool`. Whether to remove \"accidental hits\" where a sampled class equals one of the target classes. If set to `True`, this is a \"Sampled Logistic\" loss instead of NCE, and we are learning to generate log-odds instead of log probabilities. See our [Candidate Sampling Algorithms Reference](https://www.tensorflow.org/extras/candidate_sampling.pdf). Default is False. |\n| `partition_strategy` | A string specifying the partitioning strategy, relevant if `len(weights) \u003e 1`. Currently `\"div\"` and `\"mod\"` are supported. Default is `\"mod\"`. See `tf.nn.embedding_lookup` for more details. |\n| `name` | A name for the operation (optional). |\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n| Returns ------- ||\n|---|---|\n| A `batch_size` 1-D tensor of per-example NCE losses. ||\n\n\u003cbr /\u003e"]]