Returned with every call to step and reset on an environment.
tf_agents.trajectories.TimeStep(
step_type, reward, discount, observation
)
A TimeStep contains the data emitted by an environment at each step of
interaction. A TimeStep holds a step_type, an observation (typically a
NumPy array or a dict or list of arrays), and an associated reward and
discount.
The first TimeStep in a sequence has a step_type equal to StepType.FIRST.
The final TimeStep has a step_type equal to StepType.LAST. All other
TimeSteps in a sequence have a step_type equal to StepType.MID.
Methods
is_first
is_first() -> tf_agents.typing.types.Bool
is_last
is_last() -> tf_agents.typing.types.Bool
is_mid
is_mid() -> tf_agents.typing.types.Bool
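The relationship between step_type and the three predicates can be illustrated with a small, dependency-free sketch. This is not the library's implementation: the `StepType` and `TimeStep` classes below are simplified stand-ins that mirror only the documented field names and method names, the integer values of the enum are an assumption, and `label_episode` is a hypothetical helper. Observations are plain lists here rather than NumPy arrays, and the discount is held constant for simplicity.

```python
from enum import IntEnum
from typing import Any, List, NamedTuple


class StepType(IntEnum):
    """Stand-in for StepType; the integer values are assumed for illustration."""
    FIRST = 0
    MID = 1
    LAST = 2


class TimeStep(NamedTuple):
    """Simplified stand-in carrying the four documented fields."""
    step_type: StepType
    reward: float
    discount: float
    observation: Any

    def is_first(self) -> bool:
        return self.step_type == StepType.FIRST

    def is_mid(self) -> bool:
        return self.step_type == StepType.MID

    def is_last(self) -> bool:
        return self.step_type == StepType.LAST


def label_episode(observations: List[Any], rewards: List[float],
                  discount: float = 1.0) -> List[TimeStep]:
    """Hypothetical helper: tag each step of one episode with its step type."""
    last_index = len(observations) - 1
    steps = []
    for i, (obs, reward) in enumerate(zip(observations, rewards)):
        if i == 0:
            step_type = StepType.FIRST
        elif i == last_index:
            step_type = StepType.LAST
        else:
            step_type = StepType.MID
        steps.append(TimeStep(step_type, reward, discount, obs))
    return steps


# A three-step episode: one FIRST, one MID, one LAST.
episode = label_episode([[0.0], [0.5], [1.0]], [0.0, 1.0, 2.0])
for step in episode:
    print(step.step_type.name, step.is_first(), step.is_mid(), step.is_last())
```

Exactly one of the three predicates is true for any given step, which is what lets a training loop branch on the episode boundary without inspecting step_type directly.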