Reverb ReplayBuffer exposed as a TF-Agents replay buffer.
Inherits From: ReplayBuffer
tf_agents.replay_buffers.ReverbReplayBuffer(
data_spec,
table_name,
sequence_length,
server_address=None,
local_server=None,
dataset_buffer_size=None,
max_cycle_length=32,
num_workers_per_iterator=-1,
max_samples_per_stream=-1,
rate_limiter_timeout_ms=-1
)
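A minimal construction sketch, following the common TF-Agents-with-Reverb pattern. The table configuration and the toy `data_spec` below are illustrative assumptions; in practice `data_spec` usually comes from an agent's `collect_data_spec`:

```python
import reverb
import tensorflow as tf

from tf_agents.replay_buffers import reverb_replay_buffer

# Toy data spec for illustration; normally taken from the agent.
data_spec = (
    tf.TensorSpec([3], tf.float32, 'observation'),
    tf.TensorSpec([], tf.int64, 'action'),
)

# A Reverb table with uniform sampling and FIFO eviction (assumed settings).
table_name = 'uniform_table'
table = reverb.Table(
    table_name,
    max_size=100000,
    sampler=reverb.selectors.Uniform(),
    remover=reverb.selectors.Fifo(),
    rate_limiter=reverb.rate_limiters.MinSize(1))
reverb_server = reverb.Server([table])

# Expose the local Reverb server as a TF-Agents replay buffer.
replay_buffer = reverb_replay_buffer.ReverbReplayBuffer(
    data_spec,
    table_name=table_name,
    sequence_length=2,
    local_server=reverb_server)
```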
Methods
add_batch
add_batch(
items
)
Adds a batch of items to the replay buffer.
| Args | |
|---|---|
| `items` | Ignored. |

| Returns | |
|---|---|
| Nothing. | |

Raises: `NotImplementedError`
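Since `add_batch` is not supported on this buffer, data is typically written through the Reverb client instead, e.g. with an observer attached to a collection driver. A sketch assuming the `replay_buffer` and `table_name` from the construction example above:

```python
from tf_agents.replay_buffers import reverb_utils

# Observer that writes collected trajectories into the Reverb table.
# It is normally added to a driver's `observers` list during collection.
rb_observer = reverb_utils.ReverbAddTrajectoryObserver(
    replay_buffer.py_client,
    table_name,
    sequence_length=2)
```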
as_dataset
as_dataset(
sample_batch_size=None,
num_steps=None,
num_parallel_calls=None,
sequence_preprocess_fn=None,
single_deterministic_pass=False
)
Creates and returns a dataset that returns entries from the buffer.
A single entry from the dataset is the result of the following pipeline:
- Sample sequences from the underlying data store.
- (optionally) Process them with `sequence_preprocess_fn`.
- (optionally) Split them into subsequences of length `num_steps`.
- (optionally) Batch them into batches of size `sample_batch_size`.

In practice, this pipeline is executed in parallel as much as possible if
`num_parallel_calls != 1`.
Some additional notes:
If `num_steps` is `None`, different replay buffers will behave differently. For
example, `TFUniformReplayBuffer` will return single time steps without a time
dimension. In contrast, e.g. `EpisodicReplayBuffer` will return full sequences
(since each sequence may be an episode of unknown length, the outermost shape
dimension will be `None`).

If `sample_batch_size` is `None`, no batching is performed, and there is no
outer batch dimension in the returned Dataset entries. This setting is useful
with variable episode lengths when using e.g. `EpisodicReplayBuffer`, because
it allows the user to get full episodes back and use `tf.data` to build padded
or truncated batches themselves.

If `single_deterministic_pass == True`, the replay buffer will make every
attempt to ensure every time step is visited once and exactly once in a
deterministic manner (though true determinism depends on the underlying data
store). Additional work may be done to ensure minibatches do not have multiple
rows from the same episode. In some cases, this may mean arguments like
`num_parallel_calls` are ignored.
| Args | |
|---|---|
| `sample_batch_size` | (Optional.) An optional batch_size to specify the number of items to return. If `None` (default), a single item is returned which matches the `data_spec` of this class (without a batch dimension). Otherwise, a batch of `sample_batch_size` items is returned, where each tensor in items will have its first dimension equal to `sample_batch_size` and the rest of the dimensions match the corresponding `data_spec`. |
| `num_steps` | (Optional.) Optional way to specify that sub-episodes are desired. If `None` (default), a batch of single items is returned. Otherwise, a batch of sub-episodes is returned, where a sub-episode is a sequence of consecutive items in the replay buffer. The returned tensors will have first dimension equal to `sample_batch_size` (if `sample_batch_size` is not `None`), subsequent dimension equal to `num_steps`, and remaining dimensions which match the `data_spec` of this class. |
| `num_parallel_calls` | (Optional.) A `tf.int32` scalar `tf.Tensor`, representing the number of elements to process in parallel. If not specified, elements will be processed sequentially. |
| `sequence_preprocess_fn` | (Optional.) Function for preprocessing the collected data before it is split into subsequences of length `num_steps`. Defined in `TFAgent.preprocess_sequence`. Defaults to pass-through. |
| `single_deterministic_pass` | Python boolean. If `True`, the dataset will return a single deterministic pass through its underlying data. NOTE: If the buffer is modified while a Dataset iterator is iterating over this data, the iterator may miss any new data or otherwise have subtly invalid data. |
| Returns | |
|---|---|
| A dataset of type `tf.data.Dataset`, elements of which are 2-tuples of an item (or sequence/batch of items) matching `data_spec`, and auxiliary info for the items (e.g. ids, probabilities). | |
| Raises | |
|---|---|
| `NotImplementedError` | If a non-default argument value is not supported. |
| `ValueError` | If the data spec contains lists that must be converted to tuples. |
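A sampling sketch, assuming the `replay_buffer` from the construction example above and a table that already contains data; the batch size, parallelism, and prefetch values are illustrative:

```python
# Sample batches of 64 two-step sequences from the Reverb table.
dataset = replay_buffer.as_dataset(
    sample_batch_size=64,
    num_steps=2,
    num_parallel_calls=3).prefetch(3)

iterator = iter(dataset)
experience, sample_info = next(iterator)
# `experience` matches `data_spec` with leading dimensions [64, 2, ...];
# `sample_info` carries Reverb metadata such as item keys and probabilities.
```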
clear
clear()
Resets the contents of the replay buffer.

| Returns | |
|---|---|
| Clears the replay buffer contents. | |
gather_all
gather_all()
Returns all the items in the buffer.

| Returns | |
|---|---|
| Nothing. | |

| Raises | |
|---|---|
| `NotImplementedError` | |
get_next
get_next(
sample_batch_size=None, num_steps=None, time_stacked=True
)
Returns an item or batch of items from the buffer.
| Args | |
|---|---|
| `sample_batch_size` | Ignored. |
| `num_steps` | Ignored. |
| `time_stacked` | Ignored. |

| Returns | |
|---|---|
| Nothing. | |

| Raises | |
|---|---|
| `NotImplementedError` | |
get_table_info
get_table_info()
num_frames
num_frames()
Returns the number of frames in the replay buffer.
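A small usage sketch, assuming the `replay_buffer` from the earlier examples and some other process writing to the table concurrently; the frame threshold is an arbitrary illustration:

```python
import time

# Wait until the buffer holds enough frames before starting to train.
min_frames = 1000
while replay_buffer.num_frames() < min_frames:
    time.sleep(1.0)
```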
update_priorities
update_priorities(
keys, priorities
)
Updates the priorities for the given keys.
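A prioritized-replay-style sketch, assuming the dataset example from `as_dataset` above. The key layout of `sample_info` and the constant priority value are assumptions for illustration; real priorities would typically be derived from e.g. TD errors:

```python
experience, sample_info = next(iterator)

# One Reverb item key per sampled sequence; `sample_info.key` is assumed to
# have shape [batch, num_steps], so take the key of the first step.
keys = sample_info.key[:, 0]

# Reverb priorities are doubles; a placeholder constant priority is used here.
new_priorities = tf.fill(tf.shape(keys), tf.constant(0.5, tf.float64))
replay_buffer.update_priorities(keys, new_priorities)
```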