tf_agents.utils.common.discounted_future_sum_masked

Discounted future sum of batch-major values.

values A Tensor of shape [batch_size, total_steps] and dtype float32.
gamma A float discount value.
num_steps A positive integer number of future steps to sum.
episode_lengths A vector of shape [batch_size] giving the number of valid steps in each episode.

A Tensor of shape [batch_size, total_steps], where each entry is the discounted sum computed as in discounted_future_sum, except that values at or beyond each episode's length (as given by episode_lengths) are masked to 0 before summing.

ValueError If values is not of rank 2, or if total_steps is not statically defined.
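
The masking behaviour described above can be illustrated with a small pure-Python sketch. This is not the library's implementation: the helper name `discounted_future_sum_masked_reference` is hypothetical, it works on nested lists rather than Tensors, and it assumes that entry `(i, j)` sums `gamma^k * values[i, j + k]` for `k` from 0 to `num_steps - 1`, with positions past the end of the row treated as 0.

```python
def discounted_future_sum_masked_reference(values, gamma, num_steps, episode_lengths):
    """Illustrative reference only (hypothetical helper, not part of tf_agents)."""
    result = []
    for row, length in zip(values, episode_lengths):
        total_steps = len(row)
        # Zero out every entry at or beyond this episode's length.
        masked = [v if j < length else 0.0 for j, v in enumerate(row)]
        out_row = []
        for j in range(total_steps):
            # Sum at most num_steps entries starting at j; steps past the
            # end of the row contribute nothing (equivalent to zero padding).
            end = min(j + num_steps, total_steps)
            out_row.append(sum(gamma ** (k - j) * masked[k] for k in range(j, end)))
        result.append(out_row)
    return result


values = [[5.0, 6.0, 7.0],
          [1.0, 2.0, 3.0]]
# The second episode has only 2 valid steps, so its last value (3.0) is masked.
out = discounted_future_sum_masked_reference(
    values, gamma=0.5, num_steps=2, episode_lengths=[3, 2])
print(out)  # [[8.0, 9.5, 7.0], [2.0, 2.0, 0.0]]
```

In the second row, the entry at index 1 is 2.0 rather than 2.0 + 0.5 * 3.0, because the value at index 2 lies beyond that episode's length and is zeroed before the sum is taken.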