Computes metrics across the top K candidates surfaced by a retrieval model.

Inherits From: Factorized

The default metric is top K categorical accuracy: how often the true candidate is in the top K candidates for a given query.
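
As an illustration of the definition above, here is a minimal NumPy sketch of top K categorical accuracy evaluated at several values of k. The function name `top_k_accuracy`, the dense `scores` matrix, and the toy data are assumptions made for this example; this is not the library's implementation, which retrieves candidates through a retrieval layer rather than scoring every candidate.

```python
import numpy as np

def top_k_accuracy(scores, true_index, ks=(1, 2)):
    """Hypothetical helper: fraction of queries whose true candidate is in the top k.

    scores: [num_queries, num_candidates] array of query-candidate scores.
    true_index: [num_queries] array with the column index of each true candidate.
    """
    # Rank candidates for each query from highest to lowest score.
    ranked = np.argsort(-scores, axis=1)
    results = {}
    for k in ks:
        # A query is a hit if its true candidate appears among the first k ranks.
        hits = (ranked[:, :k] == true_index[:, None]).any(axis=1)
        results[k] = float(hits.mean())
    return results

scores = np.array([
    [0.9, 0.1, 0.3],  # true candidate 0 is ranked first
    [0.2, 0.8, 0.5],  # true candidate 2 is ranked second
    [0.4, 0.6, 0.1],  # true candidate 2 is ranked last
])
true_index = np.array([0, 2, 2])
accuracy = top_k_accuracy(scores, true_index, ks=(1, 2, 3))
```

With this toy data one of three queries is a hit at k=1, two at k=2, and all three at k=3.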

candidates: A layer for retrieving top candidates in response to a query, or a dataset of candidate embeddings from which candidates should be retrieved.
ks: A sequence of values of k at which to perform retrieval evaluation.
name: Optional name.



This is where the layer's logic lives.

The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state, including tf.Variable instances and nested Layer instances, in __init__(), or in the build() method that is called automatically before call() executes for the first time.

inputs: Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules:

  • inputs must be explicitly passed. A layer cannot have zero arguments, and inputs cannot be provided via the default value of a keyword argument.
  • NumPy array or Python scalar values in inputs get cast as tensors.
  • Keras mask metadata is only collected from inputs.
  • Layers are built (build(input_shape) method) using shape info from inputs only.
  • input_spec compatibility is only checked against inputs.
  • Mixed precision input casting is only applied to inputs. If a layer has tensor arguments in *args or **kwargs, their casting behavior in mixed precision should be handled manually.
  • The SavedModel input specification is generated using inputs only.
  • Integration with ecosystem packages such as TFMOT, TFLite, and TF.js is only supported for inputs, not for tensors passed as other positional or keyword arguments.
*args: Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above.
**kwargs: Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved:
  • training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.
  • mask: Boolean input mask. If the layer's call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if input did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).
Returns: A tensor or list/tuple of tensors.


    Resets the metrics.


    Returns a list of metric results.


    Updates the metrics.

    query_embeddings: [num_queries, embedding_dim] tensor of query embeddings.
    true_candidate_embeddings: [num_queries, embedding_dim] tensor of embeddings for candidates that were selected for the query.
    true_candidate_ids: Ids of the true candidates. If supplied, evaluation will be id-based: the supplied ids will be matched against the ids of the top candidates returned from the retrieval index, which should have been constructed with the appropriate identifiers.

    If not supplied, evaluation will be score-based: the score of the true candidate will be computed and compared with the scores returned from the index for the top candidates.

    Score-based evaluation is useful when the true candidate is not in the retrieval index. Id-based evaluation is useful when scores returned from the index are not directly comparable to scores computed by multiplying the query and candidate embedding vectors. For example, scores returned by ScaNN are quantized and cannot be compared to full-precision scores.
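
The two evaluation modes described above can be sketched in NumPy as follows. This is an illustrative approximation, not the library's code: the helper names `id_based_hits` and `score_based_hits`, the toy arrays, and the tie-handling rule (a true score equal to the k-th retrieved score counts as a hit) are all assumptions made for this example. `top_scores` and `top_ids` stand in for what a retrieval index would return, sorted from best to worst.

```python
import numpy as np

def id_based_hits(true_ids, top_ids, k):
    """Hit if the true candidate's id is among the first k retrieved ids."""
    return (top_ids[:, :k] == true_ids[:, None]).any(axis=1)

def score_based_hits(query_emb, true_emb, top_scores, k):
    """Hit if the true candidate's score reaches the k-th best retrieved score."""
    # Score of the true candidate: dot product of query and candidate embeddings.
    true_scores = np.sum(query_emb * true_emb, axis=1)
    # Assumption: ties with the k-th retrieved score count as hits.
    return true_scores >= top_scores[:, k - 1]

query_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
true_emb = np.array([[0.5, 0.0], [0.0, 0.2]])   # true scores: 0.5 and 0.2

# Hypothetical index output for the top 2 candidates of each query.
top_scores = np.array([[0.9, 0.4], [0.7, 0.3]])
top_ids = np.array([[7, 3], [5, 9]])
true_ids = np.array([3, 4])

id_hits_k1 = id_based_hits(true_ids, top_ids, k=1)
id_hits_k2 = id_based_hits(true_ids, top_ids, k=2)
score_hits_k1 = score_based_hits(query_emb, true_emb, top_scores, k=1)
score_hits_k2 = score_based_hits(query_emb, true_emb, top_scores, k=2)
```

In this toy data the first query's true candidate (id 3) is retrieved with an index score of 0.4 while its full-precision score is 0.5. That is the kind of mismatch that makes id matching the safer choice when index scores are approximate.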

    sample_weight: Optional weighting of each example. Defaults to 1.
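
To show what per-example weighting does to the reported accuracy, here is a small sketch that assumes the metric is a weighted mean of per-query hits (the sum of weighted hits divided by the sum of weights). The helper `weighted_accuracy` and the data are hypothetical and only illustrate that assumption.

```python
import numpy as np

def weighted_accuracy(hits, sample_weight=None):
    """Weighted mean of 0/1 hits; every example weighs 1 when no weights are given."""
    hits = np.asarray(hits, dtype=float)
    if sample_weight is None:
        sample_weight = np.ones_like(hits)
    sample_weight = np.asarray(sample_weight, dtype=float)
    return float(np.sum(hits * sample_weight) / np.sum(sample_weight))

hits = [1, 0, 1, 0]
unweighted = weighted_accuracy(hits)              # each example counts equally
weighted = weighted_accuracy(hits, [3, 1, 0, 0])  # the first example dominates
```

Under this assumption, a weight of zero removes an example from both the numerator and the denominator, so it has no effect on the result.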

    Returns: Update op. Only used in graph mode.