TensorFlow (TF) metrics for bandit algorithms.
class ConstraintViolationsMetric: Computes the violations of a certain constraint.
class DistanceFromGreedyMetric: Difference between the estimated reward of the chosen action and that of the best action.
class RegretMetric: Computes the regret with respect to a baseline.
class SuboptimalArmsMetric: Computes the number of suboptimal arms with respect to a baseline.
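
The quantities summarized above can be illustrated with a minimal, framework-free sketch. It is plain Python and does not use or reproduce the classes listed here. The function names, argument names, and the choice to average over steps (or to count, for suboptimal arms) are assumptions made for illustration only, and the baseline is assumed to be supplied as precomputed per-step values.

```python
from typing import Sequence


def mean_regret(baseline_rewards: Sequence[float],
                observed_rewards: Sequence[float]) -> float:
    """Average gap between the baseline reward and the reward received.

    Hypothetical helper: the baseline is assumed to be given per step.
    """
    if len(baseline_rewards) != len(observed_rewards):
        raise ValueError("inputs must have the same length")
    gaps = [b - r for b, r in zip(baseline_rewards, observed_rewards)]
    return sum(gaps) / len(gaps)


def count_suboptimal_arms(baseline_actions: Sequence[int],
                          chosen_actions: Sequence[int]) -> int:
    """Number of steps where the chosen arm differs from the baseline arm."""
    if len(baseline_actions) != len(chosen_actions):
        raise ValueError("inputs must have the same length")
    return sum(1 for b, c in zip(baseline_actions, chosen_actions) if b != c)


def mean_distance_from_greedy(estimated_rewards: Sequence[Sequence[float]],
                              chosen_actions: Sequence[int]) -> float:
    """Average of (best estimated reward - estimated reward of chosen arm).

    estimated_rewards[i][a] is the assumed reward estimate of arm a at step i.
    """
    if len(estimated_rewards) != len(chosen_actions):
        raise ValueError("inputs must have the same length")
    gaps = [max(row) - row[a] for row, a in zip(estimated_rewards, chosen_actions)]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    # Three steps; the agent misses the baseline arm on the second step only.
    print(mean_regret([1.0, 1.0, 0.5], [1.0, 0.0, 0.5]))
    print(count_suboptimal_arms([0, 2, 1], [0, 1, 1]))
    print(mean_distance_from_greedy([[0.2, 0.8], [0.5, 0.1]], [0, 0]))
```

In this sketch a regret of zero means the agent matched the baseline at every step, and a distance from greedy of zero means the agent always picked the arm with the highest estimated reward. A constraint-violation metric would follow the same pattern, with a per-step check of the constraint in place of the reward gap.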