tf_agents.trajectories.transition

Returns a TimeStep with step_type set equal to StepType.MID.

Main aliases

tf_agents.trajectories.time_step.transition

tf_agents.trajectories.transition(
    observation: tf_agents.typing.types.NestedTensorOrArray,
    reward: tf_agents.typing.types.NestedTensorOrArray,
    discount: tf_agents.typing.types.Float = 1.0,
    outer_dims: Optional[types.Shape] = None
) -> tf_agents.trajectories.TimeStep

For TF transitions, the batch size is inferred from the shape of reward.

If discount is a scalar and observation contains Tensors, discount is broadcast to match reward.shape.
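The shape rules described here can be made concrete with a small NumPy-only sketch. It is not the library's implementation: `sketch_mid_fields` is a hypothetical stand-in that mimics only the documented behavior (batch shape taken from the reward, scalar discount expanded to match), and it assumes the reward is a single array of rank 0 or 1 rather than a nest.

```python
import numpy as np

MID = 1  # integer value of StepType.MID in TF-Agents (FIRST=0, MID=1, LAST=2)


def sketch_mid_fields(reward, discount=1.0):
    """Hypothetical stand-in: mimics the documented shape rules only."""
    reward = np.asarray(reward, dtype=np.float32)
    if reward.ndim > 1:
        raise ValueError("this sketch only handles rank-0 or rank-1 rewards")
    # The batch shape is taken from the reward: () if scalar, (B,) if 1-D.
    step_type = np.full(reward.shape, MID, dtype=np.int32)
    discount = np.asarray(discount, dtype=np.float32)
    if discount.ndim == 0:
        # A scalar discount is expanded to one value per batch entry.
        discount = np.full(reward.shape, discount, dtype=np.float32)
    return step_type, reward, discount


step_type, reward, discount = sketch_mid_fields([0.5, -1.0, 2.0], discount=0.9)
print(step_type.shape, discount.shape)
```

With a three-element reward, both `step_type` and `discount` come back with one entry per batch element, which is the behavior the paragraph above describes for TF transitions.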

Args
`observation`	A NumPy array, tensor, or a nested dict, list or tuple of arrays or tensors.
`reward`	A NumPy array, tensor, or a nested dict, list or tuple of arrays or tensors.
`discount`	(optional) A scalar, 1D NumPy array, or tensor.
`outer_dims`	(optional) If provided, it is used to determine the batch dimensions. If not, the batch dimensions are inferred from reward's shape. If reward is a vector but not batched, use `()`.
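The effect of `outer_dims` on a vector-valued reward can be illustrated with another NumPy-only sketch. `sketch_step_type` is a hypothetical helper, not part of TF-Agents; it assumes a single-array reward and models only how the batch shape for `step_type` is chosen, leaving discount handling out.

```python
import numpy as np

MID = 1  # integer value of StepType.MID


def sketch_step_type(reward, outer_dims=None):
    """Hypothetical stand-in showing how the batch shape is chosen."""
    reward = np.asarray(reward, dtype=np.float32)
    # An explicit outer_dims wins; otherwise fall back to the reward's shape.
    batch_shape = reward.shape if outer_dims is None else tuple(outer_dims)
    return np.full(batch_shape, MID, dtype=np.int32)


vector_reward = [1.0, 0.25]  # one unbatched step with a two-component reward

inferred = sketch_step_type(vector_reward)  # treated as a batch of 2
unbatched = sketch_step_type(vector_reward, outer_dims=())  # a single step
print(inferred.shape, unbatched.shape)
```

Without `outer_dims`, the two reward components are read as two batch entries; passing `()` marks the reward as belonging to one unbatched step.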

Returns
A `TimeStep`.

Raises
`ValueError`	If observations are tensors but reward's statically known rank is not `0` or `1`.
