Actor.
tf_agents.train.Actor(
    env,
    policy,
    train_step,
    steps_per_run=None,
    episodes_per_run=None,
    observers=None,
    transition_observers=None,
    info_observers=None,
    metrics=None,
    reference_metrics=None,
    image_metrics=None,
    summary_dir=None,
    summary_interval=1000,
    end_episode_on_boundary=True,
    name=''
)
The actor manages interactions between a policy and an environment. Users should configure the metrics and summaries for a specific task, such as evaluation or data collection.
The main point of access for users is the run method, which iterates for either steps_per_run steps or episodes_per_run episodes. At least one of steps_per_run or episodes_per_run must be provided.
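A minimal data-collection sketch, assuming a Python environment from suite_gym and a random Python policy (the names collect_env, collect_policy, and replay are illustrative, not part of this API):

```python
import tensorflow as tf
from tf_agents.environments import suite_gym
from tf_agents.policies import random_py_policy
from tf_agents.train import actor

# Illustrative setup: a Python environment with a matching Python policy.
collect_env = suite_gym.load('CartPole-v0')
collect_policy = random_py_policy.RandomPyPolicy(
    time_step_spec=collect_env.time_step_spec(),
    action_spec=collect_env.action_spec())

# A scalar tf.int64 tf.Variable, as required by the train_step arg.
train_step = tf.Variable(0, dtype=tf.int64)

# A trivial trajectory observer: each observer is called with a
# trajectory.Trajectory after every environment step.
replay = []

collect_actor = actor.Actor(
    collect_env,
    collect_policy,
    train_step,
    steps_per_run=100,          # one of steps_per_run/episodes_per_run is required
    observers=[replay.append],
    metrics=actor.collect_metrics(buffer_size=10),
    name='collect_actor')

collect_actor.run()  # steps the environment, notifying observers and metrics
```

Here a plain list serves as the trajectory observer; in practice this is typically a replay buffer observer.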
| Args | |
|---|---|
| env | An instance of either a TF or py environment. Note that the policy and observers should match the env's type (TF or py). | 
| policy | An instance of a policy used to interact with the environment. | 
| train_step | A scalar tf.int64 tf.Variable which keeps track of the number of train steps. It is used for generated artifacts such as summaries. | 
| steps_per_run | Number of steps evaluated per run call. See below. | 
| episodes_per_run | Number of episodes evaluated per run call. | 
| observers | A list of observers that are notified after every step in the environment. Each observer is a callable(trajectory.Trajectory). | 
| transition_observers | A list of observers that are updated after every step in the environment. Each observer is a callable((TimeStep, PolicyStep, NextTimeStep)). The transition is shaped just as trajectories are for regular observers. | 
| info_observers | A list of observers that are notified after every step in the environment. Each observer is a callable(info). | 
| metrics | A list of metric observers, each of which outputs a scalar. | 
| reference_metrics | Optional list of metrics against which other metrics are plotted. For example, passing in a metric that tracks the number of environment episodes results in summaries of all other metrics over that value. Note that summaries against the train_step are written by default. If you want reference_metrics to be updated, make sure they are also added to the metrics list. | 
| image_metrics | A list of metric observers that output an image. | 
| summary_dir | Path used for summaries. If no path is provided, no summaries are written. | 
| summary_interval | How often summaries are written. | 
| end_episode_on_boundary | Should be False when using transition observers and True when using trajectory observers. It is passed through to the underlying py_driver. | 
| name | Name for the actor used as a prefix to generated summaries. | 
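As an illustration of reference_metrics (a sketch reusing the illustrative collect_env, collect_policy, and train_step from the example above; the summary path is hypothetical), the episode counter is passed both as a metric, so that it is updated, and as a reference metric, so that other metrics are plotted against it:

```python
from tf_agents.metrics import py_metrics

num_episodes = py_metrics.NumberOfEpisodes()

collect_actor = actor.Actor(
    collect_env,
    collect_policy,
    train_step,
    steps_per_run=100,
    # num_episodes must also appear in metrics so it is updated each step.
    metrics=[num_episodes, py_metrics.AverageReturnMetric(buffer_size=10)],
    # Other metrics are additionally summarized against num_episodes
    # (summaries against train_step are written by default).
    reference_metrics=[num_episodes],
    summary_dir='/tmp/actor_summaries',  # hypothetical path
    summary_interval=1000,
    name='collect_actor')
```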
| Attributes | |
|---|---|
| image_metrics | The image metric observers passed to the constructor. | 
| metrics | The metric observers passed to the constructor. | 
| policy | The policy used to interact with the environment. | 
| summary_writer | The writer used to generate summaries. | 
| train_step | The train step tf.Variable. | 
Methods
log_metrics
log_metrics()
Logs metric results to stdout.
reset
reset()
Reset the environment to the start and the policy state.
run
run()
Runs the policy in the environment for steps_per_run steps or episodes_per_run episodes, notifying the configured observers and metrics along the way.
run_and_log
run_and_log()
Runs the actor and then logs and summarizes the resulting metrics.
write_metric_summaries
write_metric_summaries()
Generates scalar summaries for the actor metrics.
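A sketch of a typical pass over an actor built as in the examples above, chaining the methods listed here:

```python
collect_actor.reset()        # reset the environment and the policy state
collect_actor.run()          # collect for steps_per_run steps
collect_actor.log_metrics()  # print metric results to stdout
collect_actor.write_metric_summaries()  # write scalar summaries for the metrics
```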