tf_agents.bandits.agents.exp3_mixture_agent.Exp3MixtureVariableCollection
A collection of variables used by subclasses of MixtureAgent.
tf_agents.bandits.agents.exp3_mixture_agent.Exp3MixtureVariableCollection(
num_agents: int,
reward_aggregates: Optional[List[float]] = None,
inverse_temperature: float = 0.0
)
Note that this variable collection only contains the mixture weights. The
variables of the sub-agents that the mixture agent mixes are in variable
collections of the respective sub-agents.
Args
num_agents | (int) The number of agents mixed by the mixture agent.
reward_aggregates | A list of floats containing the reward aggregates for each agent. If not set, the initial values will be 0.
inverse_temperature | The initial value for the inverse temperature variable used by the mixture agent.
Attributes
inverse_temperature
reward_aggregates
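The constructor defaults described above can be illustrated with a small pure-Python sketch. This is not the TF-Agents implementation: `MixtureVariablesSketch` and `mixture_weights` are hypothetical names, plain floats stand in for TensorFlow variables, the length check is a sketch-level assumption, and the softmax weighting rule is only an assumption about how a mixture agent might turn these two values into mixture weights.

```python
import math
from typing import List, Optional


class MixtureVariablesSketch:
    """Hypothetical stand-in, NOT the TF-Agents class (plain floats, no tf.Variable)."""

    def __init__(self,
                 num_agents: int,
                 reward_aggregates: Optional[List[float]] = None,
                 inverse_temperature: float = 0.0):
        if reward_aggregates is None:
            # Documented default: every reward aggregate starts at 0.
            reward_aggregates = [0.0] * num_agents
        elif len(reward_aggregates) != num_agents:
            # Sanity check added for this sketch only (an assumption).
            raise ValueError("expected one reward aggregate per agent")
        self.reward_aggregates = list(reward_aggregates)
        self.inverse_temperature = inverse_temperature


def mixture_weights(variables: MixtureVariablesSketch) -> List[float]:
    """Softmax over inverse_temperature * reward_aggregates (assumed rule)."""
    logits = [variables.inverse_temperature * r
              for r in variables.reward_aggregates]
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - shift) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# With the defaults, all logits are 0, so the weights are uniform.
defaults = MixtureVariablesSketch(num_agents=3)
uniform_weights = mixture_weights(defaults)

# Non-zero aggregates and temperature favor agents with larger aggregates.
skewed = MixtureVariablesSketch(
    3, reward_aggregates=[1.0, 2.0, 4.0], inverse_temperature=0.5)
skewed_weights = mixture_weights(skewed)
```

Under this assumed rule, an inverse temperature of 0.0 makes every sub-agent equally likely regardless of the reward aggregates, which is one plausible reading of why 0.0 is the default starting value.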
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-04-26 UTC.