tf_agents.replay_buffers.reverb_utils.ReverbAddEpisodeObserver

Observer for writing episodes to the Reverb replay buffer.

Args
py_client Python client for the Reverb replay server.
table_name Name of the table that samples are written to.
max_sequence_length An integer. The max_sequence_length used when writing to the replay buffer tables. This defines the size of the internal buffer, which sets the upper limit on the number of timesteps that can be referenced in a single prioritized item. Note that this is the maximum number of trajectories across all the cached episodes that you are writing into the replay buffer (e.g. number_of_episodes). max_sequence_length does not limit how many timesteps or items can be inserted into the replay buffer. Because max_sequence_length controls the size of the internal buffer, setting it to a very large number is not recommended. If the number of steps in an episode is more than max_sequence_length, only items up to max_sequence_length are written into the table; see bypass_partial_episodes for how over-length episodes are handled.
priority Initial priority for the table.
bypass_partial_episodes If False (default) and an episode is longer than max_sequence_length, a ValueError is raised. If True, episodes longer than max_sequence_length do not raise a ValueError; instead they are bypassed (NOT written into the replay buffer) and an error message is shown to the user. Note that in this case (bypass_partial_episodes=True) the steps of such episodes are discarded. This guarantees that the replay buffer always holds FULL episodes. Note that max_sequence_length is only an upper bound.

Raises
ValueError If table_name is not a string.
ValueError If priority is not numeric.
ValueError If max_sequence_length is not positive.
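
The constructor arguments above can be put together roughly as follows. This is a minimal sketch, not taken from the original page: the helper name make_episode_observer, the table settings (uniform sampler, FIFO remover, max_size=1000) and the default max_sequence_length=200 are illustrative assumptions, and the imports are deferred so the snippet loads even where reverb and tf_agents are not installed.

```python
def make_episode_observer(max_sequence_length=200, table_name="episode_table"):
    """Starts a local Reverb server and returns (server, observer).

    Hypothetical helper for illustration; the table settings are assumptions.
    """
    # Imported lazily so the module loads without the packages installed.
    import reverb
    from tf_agents.replay_buffers import reverb_utils

    table = reverb.Table(
        name=table_name,
        sampler=reverb.selectors.Uniform(),
        remover=reverb.selectors.Fifo(),
        max_size=1000,  # illustrative capacity, counted in items
        rate_limiter=reverb.rate_limiters.MinSize(1),
    )
    server = reverb.Server(tables=[table])  # picks a free local port
    client = reverb.Client(f"localhost:{server.port}")

    observer = reverb_utils.ReverbAddEpisodeObserver(
        py_client=client,
        table_name=table_name,  # must be a string
        max_sequence_length=max_sequence_length,  # must be positive
        priority=1,  # must be numeric
        bypass_partial_episodes=False,
    )
    return server, observer


# Typical use (not executed here; `trajectories` is a hypothetical iterable
# of unbatched trajectories):
#   server, observer = make_episode_observer()
#   for traj in trajectories:
#       observer(traj)
#   observer.close()
```

Keep the returned server object alive for as long as the observer is in use, since the client connects to it.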

Methods

close

Closes the writer of the observer.

open

Opens the writer of the observer.

reset

Resets the state of the observer.

The observed data (appended to the writer) is written to the replay buffer after reset is called. Note that each write creates a separate entry in the replay buffer.

update_priority

Updates the table priority.

Args
priority The new priority of the observer.

Raises
ValueError If priority is not numeric.

write_cached_steps

Writes the cached steps into the writer.

__call__

Writes the trajectory into the underlying replay buffer.

Allows trajectory to be a flattened trajectory. No batch dimension allowed.

Args
trajectory The trajectory to be written, which can be a (possibly nested) trajectory object or a flattened version of a trajectory. It is assumed to have no batch dimension.

Raises
ValueError If bypass_partial_episodes is False and the episode length is greater than max_sequence_length.
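
The interaction between max_sequence_length and bypass_partial_episodes can be illustrated with a small toy model. This is a sketch of the documented rules only, not the TF-Agents implementation: the class ToyEpisodeObserver, its is_last flag and its in-memory table list are hypothetical stand-ins for the real observer, trajectory boundaries and Reverb table.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class ToyEpisodeObserver:
    """Toy stand-in for the documented overflow rules (hypothetical class)."""

    max_sequence_length: int
    bypass_partial_episodes: bool = False
    table: List[List[Any]] = field(default_factory=list)  # one entry per episode
    _cached: List[Any] = field(default_factory=list)
    _overflow: bool = False

    def __call__(self, step: Any, is_last: bool) -> None:
        # The cache is full and the episode is still going: it is too long.
        if len(self._cached) >= self.max_sequence_length and not self._overflow:
            if not self.bypass_partial_episodes:
                raise ValueError("episode is longer than max_sequence_length")
            self._overflow = True  # bypass: the whole episode will be dropped
        if not self._overflow:
            self._cached.append(step)
        if is_last:
            if not self._overflow:
                self.table.append(list(self._cached))  # only FULL episodes
            self._cached.clear()
            self._overflow = False


observer = ToyEpisodeObserver(max_sequence_length=3, bypass_partial_episodes=True)
episodes = (["a0", "a1"], ["b0", "b1", "b2", "b3"], ["c0", "c1", "c2"])
for episode in episodes:
    for i, step in enumerate(episode):
        observer(step, is_last=(i == len(episode) - 1))

# The four-step episode is bypassed; the other two are stored whole.
print(observer.table)  # [['a0', 'a1'], ['c0', 'c1', 'c2']]
```

With bypass_partial_episodes left at False, the same four-step episode would raise a ValueError on its fourth step in this model instead of being dropped.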