tf_agents.trajectories.boundary

Create a Trajectory that transitions from StepType LAST to StepType FIRST, i.e. one that spans an episode boundary.

Main aliases

tf_agents.trajectories.trajectory.boundary

All inputs may be batched.

The outer shape of the inputs (batch and/or time dimensions) is inferred from discount, because discount is always expected to have a scalar inner shape, so its full shape equals the outer shape.

Args

observation: (possibly nested tuple of) Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...].
action: (possibly nested tuple of) Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...].
policy_info: (possibly nested tuple of) Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...].
reward: (possibly nested tuple of) Tensor or np.ndarray; all shaped [B, ...], [T, ...], or [B, T, ...].
discount: A floating point vector Tensor or np.ndarray; shaped [B], [T], or [B, T] (optional).

Returns

A Trajectory instance.
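
Example

The shape inference described above can be illustrated with a minimal NumPy-only sketch. It is not the library's implementation: boundary_like is a hypothetical helper, it returns a plain dict rather than a Trajectory, and it does not handle nested observation or action structures. The integer codes 0, 1, and 2 are assumed to match StepType FIRST, MID, and LAST in tf_agents.trajectories.time_step.

```python
import numpy as np

# Assumed to match tf_agents.trajectories.time_step.StepType codes.
FIRST, MID, LAST = 0, 1, 2


def boundary_like(observation, action, policy_info, reward, discount):
    """Hypothetical stand-in for `boundary`; returns a plain dict."""
    discount = np.asarray(discount, dtype=np.float32)
    # discount has a scalar inner shape, so its full shape is the
    # outer shape: [B], [T], or [B, T].
    outer_shape = discount.shape
    return {
        # Every entry is the last step of an episode ...
        "step_type": np.full(outer_shape, LAST, dtype=np.int32),
        "observation": np.asarray(observation),
        "action": np.asarray(action),
        "policy_info": policy_info,  # passed through unchanged in this sketch
        # ... and is followed by the first step of the next episode.
        "next_step_type": np.full(outer_shape, FIRST, dtype=np.int32),
        "reward": np.asarray(reward),
        "discount": discount,
    }


# A batch of B=2 boundary transitions with 3-dimensional observations.
traj = boundary_like(
    observation=np.zeros((2, 3), dtype=np.float32),
    action=np.array([0, 1]),
    policy_info=(),
    reward=np.array([1.0, 0.5], dtype=np.float32),
    discount=np.array([1.0, 1.0], dtype=np.float32),
)
print(traj["step_type"].shape)  # (2,)
print(traj["step_type"])        # [2 2]
print(traj["next_step_type"])   # [0 0]
```

Passing a discount of shape [B, T] instead would give step_type and next_step_type of shape [B, T], with no change to the helper.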