
Module: tf_agents.bandits.agents.linear_bandit_agent

An agent that maintains linear estimates for rewards and their uncertainty.

LinUCB and Linear Thompson Sampling agents are subclasses of this agent.
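As a rough illustration of what "linear estimates for rewards and their uncertainty" can mean, the following NumPy sketch computes a reward estimate and its variance for one arm, then scores the arm in the LinUCB and Thompson Sampling styles. It is not the TF-Agents implementation: the function names, the `alpha` parameter, and the per-arm quantities `a` (a covariance-like matrix) and `b` (a reward-weighted sum of observations) are assumptions made for this example.

```python
import numpy as np


def linear_estimate(a, b, x):
    """Return (mean, variance) of the estimated reward for context x.

    Hypothetical helper: `a` is a positive-definite matrix and `b` a vector,
    as accumulated for a single arm.
    """
    theta = np.linalg.solve(a, b)                # estimated weights: a^{-1} b
    mean = float(x @ theta)                      # point estimate of the reward
    variance = float(x @ np.linalg.solve(a, x))  # x^T a^{-1} x
    return mean, variance


def linucb_score(a, b, x, alpha=1.0):
    """Optimistic score: estimate plus alpha times the standard deviation."""
    mean, variance = linear_estimate(a, b, x)
    return mean + alpha * np.sqrt(variance)


def thompson_score(a, b, x, rng):
    """Randomized score: one draw from a normal around the estimate."""
    mean, variance = linear_estimate(a, b, x)
    return rng.normal(mean, np.sqrt(variance))


a = np.eye(2)
b = np.array([1.0, 0.0])
x = np.array([1.0, 0.0])
ucb = linucb_score(a, b, x, alpha=2.0)
sample = thompson_score(a, b, x, np.random.default_rng(0))
```

In both styles the agent would pick the arm with the highest score. They differ only in how the uncertainty term enters: deterministically for LinUCB, through sampling for Thompson Sampling.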


class ExplorationPolicy: Possible exploration policies.

class LinearBanditAgent: An agent that maintains linear reward estimates and their uncertainties.

class LinearBanditVariableCollection: A collection of variables used by LinearBanditAgent.


update_a_and_b_with_forgetting(...): Update the covariance matrix a and the weighted sum of rewards b.
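
The update with forgetting can be sketched in NumPy as follows. This is an illustration under assumptions, not the library function itself: the name `update_with_forgetting`, its argument order, and the choice to discount the previous statistics by a factor `gamma` before adding the new batch are all assumed here. Consult the function's own reference page for the actual signature.

```python
import numpy as np


def update_with_forgetting(a_prev, b_prev, rewards, observations, gamma):
    """Discount the old statistics by gamma, then add the new batch.

    observations: array of shape [batch_size, context_dim]
    rewards:      array of shape [batch_size]
    gamma:        forgetting factor in (0, 1]; 1.0 means no forgetting
    """
    a_new = gamma * a_prev + observations.T @ observations  # sum of x x^T
    b_new = gamma * b_prev + observations.T @ rewards       # sum of r * x
    return a_new, b_new


a_prev = np.eye(2)
b_prev = np.zeros(2)
observations = np.array([[1.0, 0.0], [0.0, 2.0]])
rewards = np.array([1.0, 0.5])
a_new, b_new = update_with_forgetting(a_prev, b_prev, rewards, observations, gamma=0.5)
```

With `gamma` below 1, older observations lose weight at every update, which lets the estimates track rewards that drift over time.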