View source on GitHub
An agent that maintains linear estimates for rewards and their uncertainty.
LinUCB and Linear Thompson Sampling agents are subclasses of this agent.
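As a rough illustration of what "linear estimates for rewards and their uncertainty" can mean, the NumPy sketch below computes a per-arm reward estimate and an optimistic (LinUCB-style) score from a matrix `a` and a vector `b`. The helper name `linear_estimate_and_ucb` and the parameter `alpha` are hypothetical and are not part of this module; the agent's actual implementation may differ.

```python
import numpy as np


def linear_estimate_and_ucb(a, b, x, alpha=1.0):
    """Hypothetical helper (not part of this module's API).

    a: (d, d) positive-definite matrix accumulated from observed contexts.
    b: (d,) reward-weighted sum of observed contexts.
    x: (d,) context vector to score.
    alpha: assumed exploration strength for the optimistic score.
    """
    theta = np.linalg.solve(a, b)                 # estimated weights: a^{-1} b
    mean = float(x @ theta)                       # estimated reward for x
    variance = float(x @ np.linalg.solve(a, x))   # x^T a^{-1} x
    ucb = mean + alpha * np.sqrt(variance)        # optimistic score
    return mean, ucb


# With no data (a = identity, b = 0) the estimate is 0 and the
# uncertainty term alone determines the score.
mean, ucb = linear_estimate_and_ucb(np.eye(2), np.zeros(2), np.array([1.0, 0.0]))
```

A Thompson Sampling variant would, under the same assumptions, draw the weights from a Gaussian centred on `theta` instead of adding the `alpha`-scaled bonus.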
Classes
class ExplorationPolicy: Possible exploration policies.
class LinearBanditAgent: An agent that maintains linear reward estimates and their uncertainties.
class LinearBanditVariableCollection: A collection of variables used by LinearBanditAgent.
Functions
update_a_and_b_with_forgetting(...): Update the covariance matrix a and the weighted sum of rewards b.
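The signature of `update_a_and_b_with_forgetting` is elided on this page, so the following is only a minimal NumPy sketch of one common form of such an update, assuming a forgetting factor `gamma` in (0, 1] that discounts the previous statistics before a batch of observations is added. The function name `update_with_forgetting_sketch` and its arguments are illustrative assumptions, not the library's API.

```python
import numpy as np


def update_with_forgetting_sketch(a_prev, b_prev, r, x, gamma):
    """Illustrative only; not the library function.

    a_prev: (d, d) previous covariance-like matrix.
    b_prev: (d,) previous reward-weighted sum of contexts.
    r: (n,) rewards observed for the batch.
    x: (n, d) contexts observed for the batch.
    gamma: assumed forgetting factor; 1.0 means no forgetting.
    """
    a_new = gamma * a_prev + x.T @ x   # discount old statistics, add outer products
    b_new = gamma * b_prev + x.T @ r   # discount old sum, add reward-weighted contexts
    return a_new, b_new


a1, b1 = update_with_forgetting_sketch(
    a_prev=np.eye(2),
    b_prev=np.zeros(2),
    r=np.array([3.0]),
    x=np.array([[1.0, 2.0]]),
    gamma=0.5,
)
```

With `gamma` below 1, older observations contribute geometrically less to `a` and `b`, which lets the estimates track rewards that drift over time.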
| Other Members | |
|---|---|
| absolute_import | Instance of __future__._Feature |
| division | Instance of __future__._Feature |
| print_function | Instance of __future__._Feature |