tf_agents.replay_buffers.ReverbAddEpisodeObserver

Observer for writing episodes to Reverb.

Main aliases

tf_agents.replay_buffers.reverb_utils.ReverbAddEpisodeObserver

tf_agents.replay_buffers.ReverbAddEpisodeObserver(
    py_client: tf_agents.typing.types.ReverbClient,
    table_name: Union[Text, Sequence[Text]],
    max_sequence_length: int,
    priority: Union[float, int] = 1,
    bypass_partial_episodes: bool = False
)

Used in the tutorials: REINFORCE agent

This observer should be called at every step. It does not support batched trajectories. The steps are cached and written at the end of the episode.

At the end of each episode, an item is written to Reverb. Each item is the trajectory containing an episode, including a boundary step at the end. Therefore, the sequence lengths of the items may vary. If you want a fixed sequence length, use `ReverbAddTrajectoryObserver` instead.

Unfinished episodes remain in the cache and do not get written until `reset(write_cached_steps=True)` is called.
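As a rough illustration of how the observer might be wired up, the sketch below follows the pattern used in the REINFORCE agent tutorial: a Reverb table, a local server, a `ReverbReplayBuffer`, and the observer built from the buffer's `py_client`. The function name `build_episode_observer`, its parameters, and the default values are assumptions made for this example; they are not part of the API.

```python
def build_episode_observer(collect_data_spec,
                           table_name="uniform_table",
                           capacity=2000):
    """Builds a local Reverb server, a replay buffer and an episode observer.

    `collect_data_spec` is assumed to be an agent's `collect_data_spec`.
    `table_name` and `capacity` are illustrative defaults, not required values.
    """
    # Imported lazily so this sketch can be loaded without Reverb installed.
    import reverb
    from tf_agents.replay_buffers import reverb_replay_buffer
    from tf_agents.replay_buffers import reverb_utils
    from tf_agents.specs import tensor_spec

    # Episodes vary in length, so the table signature gets an outer
    # (time) dimension of unknown size.
    signature = tensor_spec.add_outer_dim(
        tensor_spec.from_spec(collect_data_spec))

    table = reverb.Table(
        table_name,
        max_size=capacity,
        sampler=reverb.selectors.Uniform(),
        remover=reverb.selectors.Fifo(),
        rate_limiter=reverb.rate_limiters.MinSize(1),
        signature=signature)
    server = reverb.Server([table])

    # sequence_length=None lets the buffer return whole episodes.
    replay_buffer = reverb_replay_buffer.ReverbReplayBuffer(
        collect_data_spec,
        table_name=table_name,
        sequence_length=None,
        local_server=server)

    observer = reverb_utils.ReverbAddEpisodeObserver(
        replay_buffer.py_client,
        table_name,
        max_sequence_length=capacity)
    return server, replay_buffer, observer
```

The returned observer would typically be passed in a driver's list of observers so that it is called once per unbatched step.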

consumer, if your episodes have variable lengths.

Args
`py_client`	Python client for the reverb replay server.
`table_name`	The table name(s) where samples will be written to.
`max_sequence_length`	An integer. The maximum sequence length used when writing to the replay buffer tables. It defines the size of the internal buffer, which sets the upper limit on the number of timesteps that can be referenced in a single prioritized item. Note that this is the maximum number of trajectories across all the cached episodes that you are writing into the replay buffer (e.g. `number_of_episodes`). `max_sequence_length` is not a limit on how many timesteps or items can be inserted into the replay buffer. Because `max_sequence_length` controls the size of the internal buffer, avoid setting it to a very large number. An episode with more steps than `max_sequence_length` cannot be written as a single item; see `bypass_partial_episodes` for how such episodes are handled.
`priority`	Initial priority for the table.
`bypass_partial_episodes`	If `False` (default), a `ValueError` is raised when an episode is longer than `max_sequence_length`. If `True`, such episodes do not raise a `ValueError`; instead they are bypassed (NOT written into the replay buffer) and an error message is shown to the user. In that case, all steps of the over-long episode are discarded. This guarantees that the replay buffer only ever contains FULL episodes. Note that `max_sequence_length` is only an upper bound.
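The interaction between `max_sequence_length` and `bypass_partial_episodes` can be modelled with a few lines of plain Python. This is only a toy model of the behavior described in the argument table, not the TF-Agents implementation; the function name `episodes_written` and its inputs are hypothetical.

```python
from typing import List


def episodes_written(episode_lengths: List[int],
                     max_sequence_length: int,
                     bypass_partial_episodes: bool = False) -> List[int]:
    """Returns the lengths of the episodes that would reach the table.

    Toy model only: episodes are represented by their step counts.
    """
    if max_sequence_length <= 0:
        raise ValueError("max_sequence_length must be positive.")

    written = []
    for length in episode_lengths:
        if length > max_sequence_length:
            if not bypass_partial_episodes:
                raise ValueError(
                    f"Episode of length {length} exceeds "
                    f"max_sequence_length={max_sequence_length}.")
            # Bypassed: the whole episode is dropped, never truncated,
            # so the table only ever holds full episodes.
            continue
        written.append(length)
    return written


# With bypassing enabled, the 12-step episode is silently skipped.
kept = episodes_written([3, 12, 5], max_sequence_length=10,
                        bypass_partial_episodes=True)
```

With `bypass_partial_episodes=False`, the same call would raise a `ValueError` on the 12-step episode instead of skipping it.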

Raises
`ValueError`	If `priority` is not numeric.
`ValueError`	If `max_sequence_length` is not positive.

Attributes
`py_client`

Methods

`close`

close() -> None

Closes the writer of the observer.

`flush`

flush()

Ensures that items are pushed to the service.

`get_table_signature`

get_table_signature()

`open`

open() -> None

Opens the writer of the observer. This is a no-op if it is already open.

`reset`

reset(
    write_cached_steps: bool = True
) -> None

Resets the state of the observer.

Args
`write_cached_steps`	If `True` (default), any data remaining in the cache is written to Reverb before the cache is cleared. If `False`, the cached data is discarded instead.
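The caching and `reset` semantics described above can be sketched with a small stand-in class. This is a simplified model for illustration, not the real observer: the class `EpisodeCacheSketch`, its `items` list (standing in for the Reverb table), and the explicit `is_last` flag are all assumptions of this example.

```python
from typing import Any, List


class EpisodeCacheSketch:
    """Toy stand-in that caches steps and writes one item per episode."""

    def __init__(self) -> None:
        self.cache: List[Any] = []        # steps of the episode in progress
        self.items: List[List[Any]] = []  # stands in for the Reverb table

    def __call__(self, step: Any, is_last: bool) -> None:
        # Called once per step; the item is written only at episode end.
        self.cache.append(step)
        if is_last:
            self._write()

    def _write(self) -> None:
        if self.cache:
            self.items.append(list(self.cache))
        self.cache.clear()

    def reset(self, write_cached_steps: bool = True) -> None:
        if write_cached_steps:
            self._write()       # flush the unfinished episode as an item
        else:
            self.cache.clear()  # throw the unfinished episode away


observer = EpisodeCacheSketch()
for step, is_last in [("s0", False), ("s1", True), ("s2", False)]:
    observer(step, is_last)

# "s2" belongs to an unfinished episode and is still only in the cache.
items_before_reset = list(observer.items)
observer.reset(write_cached_steps=True)
items_after_reset = list(observer.items)
```

Calling `reset(write_cached_steps=False)` at the same point would have left `items` unchanged and dropped the cached step.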

`update_priority`

update_priority(
    priority: Union[float, int]
) -> None

Updates the table priority.

Args
`priority`	The new priority used by the observer.

Raises
`ValueError`	If `priority` is not numeric.

`__call__`

__call__(
    trajectory: tf_agents.trajectories.Trajectory
) -> None

Caches the single-step trajectory to be written into Reverb.

Allows trajectory to be a flattened trajectory. No batch dimension allowed.

Args
`trajectory`	The trajectory to be written, which can be a (possibly nested) trajectory object or a flattened version of a trajectory. It is assumed to have no batch dimension.

Raises
`ValueError`	If `bypass_partial_episodes` is `False` and the episode length exceeds `max_sequence_length`.
