Discounted future sum of batch-major values, with values past the end of each episode masked to 0.
tf_agents.utils.common.discounted_future_sum_masked(
values, gamma, num_steps, episode_lengths
)
| Args | |
|---|---|
| values | A Tensor of shape [batch_size, total_steps] and dtype float32. |
| gamma | A float discount value. |
| num_steps | A positive integer number of future steps to sum. |
| episode_lengths | A vector of shape [batch_size] giving the number of steps in each episode. |
| Returns |
|---|
| A Tensor of shape [batch_size, total_steps], where each entry is the discounted sum as in discounted_future_sum, except that values past the end of each episode, as given by episode_lengths, are masked to 0. |
| Raises | |
|---|---|
| ValueError | If values is not of rank 2, or if the total_steps dimension of values is not defined. |
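
The masking behaviour described under Returns can be illustrated with a plain-Python sketch. It is not the TF-Agents implementation: the helper name discounted_future_sum_masked_ref is hypothetical, it works on nested lists rather than Tensors, and it assumes that discounted_future_sum computes, for entry (i, j), the sum of gamma^k * values[i][j + k] for k from 0 to num_steps - 1, with positions past total_steps treated as 0.

```python
def discounted_future_sum_masked_ref(values, gamma, num_steps, episode_lengths):
    """Hypothetical reference helper; mirrors the documented semantics on lists."""
    result = []
    for row, length in zip(values, episode_lengths):
        # Assumption: values at or beyond the episode length are zeroed first.
        masked = [v if t < length else 0.0 for t, v in enumerate(row)]
        out_row = []
        for j in range(len(masked)):
            # Slicing past the end of the row simply yields fewer terms,
            # which is equivalent to padding the row with zeros.
            window = masked[j:j + num_steps]
            out_row.append(sum(gamma ** k * v for k, v in enumerate(window)))
        result.append(out_row)
    return result


values = [[5.0, 6.0, 7.0],
          [1.0, 2.0, 3.0]]
# The second episode has only 2 valid steps, so its last value (3.0) is masked.
out = discounted_future_sum_masked_ref(
    values, gamma=0.9, num_steps=3, episode_lengths=[3, 2])
```

In this sketch the first row is unaffected by masking, while in the second row the entry at index 2 contributes nothing to any sum and its own output entry is 0.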