tf_agents.bandits.metrics.tf_metrics.RegretMetric

Computes the regret with respect to a baseline.

Inherits From: TFStepMetric

Args
baseline_reward_fn Function that computes the reward used as a baseline for computing the regret.
name (str) Name of the metric.
dtype dtype of the metric value.
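
As a rough illustration of what the baseline is for, regret over a batch can be pictured as the baseline reward minus the reward actually received, averaged over the batch. The sketch below is plain Python and is not the library's implementation. The names `optimal_reward_fn` and `batch_regret`, the toy observations, and the choice to average over the batch are all assumptions made for the example.

```python
def optimal_reward_fn(observations):
    """Hypothetical baseline: best achievable reward per observation.

    Each observation here is a list of per-arm expected rewards, so the
    baseline is simply the largest entry.
    """
    return [max(obs) for obs in observations]


def batch_regret(baseline_reward_fn, observations, rewards):
    """Mean of (baseline reward - received reward) over the batch."""
    baselines = baseline_reward_fn(observations)
    regrets = [b - r for b, r in zip(baselines, rewards)]
    return sum(regrets) / len(regrets)


observations = [[1.0, 3.0], [2.0, 0.5]]  # toy per-arm expected rewards
rewards = [3.0, 0.5]                     # rewards the policy actually received
regret = batch_regret(optimal_reward_fn, observations, rewards)
print(regret)
```

In this toy batch the first choice matches the baseline and the second falls short of it, so the mean regret is positive.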

Methods

call

Update the regret value.

Args
trajectory A tf_agents.trajectory.Trajectory

Returns
The arguments, for easy chaining.
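
Returning the input unchanged lets a metric sit in a pipeline of observers. The following is a minimal plain-Python sketch of that pattern only. `LastRewardMetric` and the dict-based trajectory are hypothetical stand-ins, not TF-Agents types.

```python
class LastRewardMetric:
    """Hypothetical stand-in metric that records the latest reward."""

    def __init__(self):
        self.value = 0.0

    def call(self, trajectory):
        # Update internal state, then hand the input back unchanged.
        self.value = trajectory["reward"]
        return trajectory


first = LastRewardMetric()
second = LastRewardMetric()
trajectory = {"reward": 1.5}

# Because call() returns its argument, calls can be nested or chained.
returned = second.call(first.call(trajectory))
print(first.value, second.value, returned is trajectory)
```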

init_variables

Initializes this Metric's variables.

Should be called after variables are created in the first execution of __call__(). If using graph execution, the returned op should be run in a session before running the op returned by __call__().

Returns
If using graph execution, this returns an op to perform the initialization. Under eager execution, the variables are reset to their initial values as a side effect and this function returns None.

reset

Resets the values being tracked by the metric.

result

Computes and returns a final value for the metric.

tf_summaries

Generates summaries against train_step and all step_metrics.

Args
train_step (Optional) Step counter for training iterations. If None, no metric is generated against the global step.
step_metrics (Optional) Iterable of step metrics to generate summaries against.

Returns
A list of summaries.

__call__

Returns op to execute to update this metric for these inputs.

Returns None if eager execution is enabled. Returns a graph-mode function if graph execution is enabled.

Args
*args, **kwargs A mini-batch of inputs to the Metric, passed on to call().