tf_agents.train.Actor

Actor.

Main aliases

tf_agents.train.actor.Actor

tf_agents.train.Actor(
    env,
    policy,
    train_step,
    steps_per_run=None,
    episodes_per_run=None,
    observers=None,
    transition_observers=None,
    info_observers=None,
    metrics=None,
    reference_metrics=None,
    image_metrics=None,
    summary_dir=None,
    summary_interval=1000,
    end_episode_on_boundary=True,
    name=''
)

Used in the tutorials: SAC minitaur with the Actor-Learner API

The actor manages interactions between a policy and an environment. Users should configure the metrics and summaries for a specific task like evaluation or data collection.

The main point of access for users is the run method. Each call iterates for either steps_per_run steps or episodes_per_run episodes. At least one of steps_per_run or episodes_per_run must be provided.
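The stopping rule described above can be illustrated with a minimal pure-Python sketch. This is not the TF-Agents implementation: `run_budget` and `episode_length` are hypothetical names, and treating a missing limit as unbounded (so that whichever limit is reached first ends the run) is an assumption.

```python
import math


def run_budget(episode_length, steps_per_run=None, episodes_per_run=None):
    """Simulates one run call over fixed-length episodes (hypothetical helper).

    Returns the (steps, episodes) consumed before the run stops.
    """
    if steps_per_run is None and episodes_per_run is None:
        raise ValueError(
            "At least one of steps_per_run or episodes_per_run must be provided.")
    # Assumption: a limit that is not provided is treated as unbounded.
    max_steps = math.inf if steps_per_run is None else steps_per_run
    max_episodes = math.inf if episodes_per_run is None else episodes_per_run

    steps = episodes = 0
    while steps < max_steps and episodes < max_episodes:
        steps += 1
        if steps % episode_length == 0:  # an episode boundary was reached
            episodes += 1
    return steps, episodes


print(run_budget(episode_length=5, steps_per_run=12))    # (12, 2)
print(run_budget(episode_length=5, episodes_per_run=2))  # (10, 2)
```

A step budget suits data collection, where a fixed number of transitions per call is wanted; an episode budget suits evaluation, where metrics are usually averaged over whole episodes.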

Args
`env`	An instance of either a tf or py environment. Note that the policy and observers should match the tf/pyness of the env.
`policy`	An instance of a policy used to interact with the environment.
`train_step`	A scalar tf.int64 `tf.Variable` which will keep track of the number of train steps. This is used for created artifacts such as summaries.
`steps_per_run`	Number of steps to evaluate per run call. At least one of steps_per_run or episodes_per_run must be provided.
`episodes_per_run`	Number of episodes to evaluate per run call.
`observers`	A list of observers that are notified after every step in the environment. Each observer is a callable(trajectory.Trajectory).
`transition_observers`	A list of observers that are updated after every step in the environment. Each observer is a callable((TimeStep, PolicyStep, NextTimeStep)). The transition is shaped just as trajectories are for regular observers.
`info_observers`	A list of observers that are notified after every step in the environment. Each observer is a callable(info).
`metrics`	A list of metric observers that output a scalar.
`reference_metrics`	Optional list of metrics against which other metrics are plotted. For example, passing in a metric that tracks the number of environment episodes produces summaries of all other metrics over that value. Summaries against the train_step are written by default. If you want reference_metrics to be updated, make sure they are also added to the metrics list.
`image_metrics`	A list of metric observers that output an image.
`summary_dir`	Path used for summaries. If no path is provided no summaries are written.
`summary_interval`	How often summaries are written.
`end_episode_on_boundary`	Should be False when using transition observers and True when using trajectory observers. It is used in py_driver.
`name`	Name for the actor used as a prefix to generated summaries.
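
The observer arguments differ mainly in what each callable receives. A minimal sketch with plain Python stand-ins follows. The `Trajectory` namedtuple is a hypothetical substitute for `trajectory.Trajectory`, the class names are invented, and the driving loop is for illustration only; it does not show how the Actor dispatches to observers internally.

```python
from collections import namedtuple

# Hypothetical stand-in for trajectory.Trajectory with only two fields.
Trajectory = namedtuple("Trajectory", ["observation", "reward"])


class RewardSum:
    """Trajectory observer: a callable that receives one trajectory per step."""

    def __init__(self):
        self.total = 0.0

    def __call__(self, traj):
        self.total += traj.reward


class TransitionCounter:
    """Transition observer: receives (time_step, policy_step, next_time_step)."""

    def __init__(self):
        self.count = 0

    def __call__(self, transition):
        # Unpack to show the expected three-element shape.
        time_step, policy_step, next_time_step = transition
        self.count += 1


reward_sum = RewardSum()
transition_counter = TransitionCounter()

# Illustrative loop: notify each observer once per environment step.
for step, reward in enumerate([1.0, 0.5, 2.0]):
    reward_sum(Trajectory(observation=step, reward=reward))
    transition_counter((step, "action", step + 1))

print(reward_sum.total)          # 3.5
print(transition_counter.count)  # 3
```

With a real Actor, callables like these would be passed as `observers=[reward_sum]` or `transition_observers=[transition_counter]`, with `end_episode_on_boundary` set as the argument description advises.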

Attributes
`image_metrics`
`metrics`
`policy`
`summary_writer`
`train_step`

Methods

`log_metrics`

log_metrics()

Logs metric results to stdout.

`reset`

reset()

Reset the environment to the start and the policy state.

`run`

run()

Runs the policy in the environment for steps_per_run steps or episodes_per_run episodes.

`run_and_log`

run_and_log()

`write_metric_summaries`

write_metric_summaries()

Generates scalar summaries for the actor metrics.
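
To make the role of reference_metrics concrete, here is a rough pure-Python sketch of which summary records one might expect: every metric against the train step, plus every metric against each reference metric's current value. The function `metric_summaries`, the `_vs_` tag naming, and skipping a metric plotted against itself are all assumptions for illustration, not the library's confirmed behavior.

```python
def metric_summaries(metrics, train_step, reference_names=()):
    """Returns (tag, value, step) records; tag names are illustrative only.

    `metrics` maps a metric name to its current scalar result.
    `reference_names` lists metrics whose values serve as alternative x-axes.
    """
    records = []
    for name, value in metrics.items():
        # Summaries against the train step are produced by default.
        records.append((name, value, train_step))
        for ref_name in reference_names:
            if ref_name == name:
                continue  # Assumption: a metric is not plotted against itself.
            # The reference metric's current value plays the role of the step.
            records.append((f"{name}_vs_{ref_name}", value, metrics[ref_name]))
    return records


metrics = {"AverageReturn": 12.5, "NumberOfEpisodes": 40}
records = metric_summaries(
    metrics, train_step=1000, reference_names=["NumberOfEpisodes"])
for record in records:
    print(record)
```

The sketch looks up reference values in the same `metrics` mapping, which mirrors the note in the argument description that reference metrics must also appear in the metrics list in order to be updated.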
