tf_agents.trajectories.to_transition

Create a transition from a trajectory or two adjacent trajectories.

Main aliases

tf_agents.trajectories.trajectory.to_transition

tf_agents.trajectories.to_transition(
    trajectory: tf_agents.trajectories.Trajectory,
    next_trajectory: Optional[tf_agents.trajectories.Trajectory] = None
) -> tf_agents.trajectories.Transition

NOTE: If `next_trajectory` is not provided, the tensors of `trajectory` are sliced along their second (time) dimension; for example:

time_steps.step_type = trajectory.step_type[:,:-1]
time_steps.observation = trajectory.observation[:,:-1]
next_time_steps.observation = trajectory.observation[:,1:]
next_time_steps.step_type = trajectory.next_step_type[:,:-1]
next_time_steps.reward = trajectory.reward[:,:-1]
next_time_steps.discount = trajectory.discount[:,:-1]

Note that `reward` and `discount` for `time_steps` are undefined and are therefore filled with zeros.
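The slicing above can be illustrated without TensorFlow. The following is only a sketch under stated assumptions: nested Python lists stand in for `[B, T]` tensors, and `ToyTrajectory`, `ToyTimeStep`, `drop_last`, `drop_first`, `zeros_like`, and `toy_to_transition` are hypothetical names invented for this example, not part of `tf_agents`. It omits `policy_steps` and covers only the case where `next_trajectory` is `None`.

```python
from collections import namedtuple

# Hypothetical stand-ins for illustration only; NOT the tf_agents classes.
ToyTrajectory = namedtuple(
    "ToyTrajectory",
    ["step_type", "observation", "next_step_type", "reward", "discount"])
ToyTimeStep = namedtuple(
    "ToyTimeStep", ["step_type", "reward", "discount", "observation"])


def drop_last(batch):
    """Nested-list analogue of tensor[:, :-1] for shape [B, T]."""
    return [row[:-1] for row in batch]


def drop_first(batch):
    """Nested-list analogue of tensor[:, 1:] for shape [B, T]."""
    return [row[1:] for row in batch]


def zeros_like(batch):
    """Nested list of zeros with the same [B, T] shape as `batch`."""
    return [[0.0 for _ in row] for row in batch]


def toy_to_transition(traj):
    """Mimics the documented slicing when no next_trajectory is given."""
    reward = drop_last(traj.reward)
    discount = drop_last(traj.discount)
    time_steps = ToyTimeStep(
        step_type=drop_last(traj.step_type),
        reward=zeros_like(reward),      # undefined, so filled with zeros
        discount=zeros_like(discount),  # undefined, so filled with zeros
        observation=drop_last(traj.observation))
    next_time_steps = ToyTimeStep(
        step_type=drop_last(traj.next_step_type),
        reward=reward,
        discount=discount,
        observation=drop_first(traj.observation))
    return time_steps, next_time_steps


# B = 1, T = 3. Step types: 0 = FIRST, 1 = MID, 2 = LAST.
traj = ToyTrajectory(
    step_type=[[0, 1, 1]],
    observation=[[10, 11, 12]],
    next_step_type=[[1, 1, 2]],
    reward=[[0.5, 1.0, 2.0]],
    discount=[[1.0, 1.0, 0.0]])

time_steps, next_time_steps = toy_to_transition(traj)
print(time_steps.observation)       # [[10, 11]]
print(next_time_steps.observation)  # [[11, 12]]
print(next_time_steps.reward)       # [[0.5, 1.0]]
print(time_steps.reward)            # [[0.0, 0.0]]
```

A trajectory with `T` time steps therefore yields `T - 1` transitions per batch entry: the last frame contributes only its observation, as the final `next_time_steps.observation`.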

Args
`trajectory`	An instance of `Trajectory`. The tensors in `Trajectory` must have shape `[B, T, ...]` when `next_trajectory` is `None`. `discount` is assumed to be a scalar float; hence the shape of `trajectory.discount` must be `[B, T]`.
`next_trajectory`	(optional) An instance of `Trajectory`.

Returns
A tuple `(time_steps, policy_steps, next_time_steps)`. The `reward` and `discount` fields of `time_steps` are filled with zeros because these cannot be deduced (please do not use them).

Raises
`ValueError`	if `discount` rank is not within the range [1, 2].
