tf_agents.train.Actor
Actor.
tf_agents.train.Actor(
env,
policy,
train_step,
steps_per_run=None,
episodes_per_run=None,
observers=None,
transition_observers=None,
info_observers=None,
metrics=None,
reference_metrics=None,
image_metrics=None,
summary_dir=None,
summary_interval=1000,
end_episode_on_boundary=True,
name=''
)
Used in the notebooks

This class is used in the tutorial [SAC minitaur with the Actor-Learner API](https://www.tensorflow.org/agents/tutorials/7_SAC_minitaur_tutorial).
The actor manages interactions between a policy and an environment. Users
should configure the metrics and summaries for a specific task like evaluation
or data collection.
The main point of access for users is the `run` method, which iterates over either `steps_per_run` steps or `episodes_per_run` episodes. At least one of `steps_per_run` or `episodes_per_run` must be provided.
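The step/episode accounting described above can be sketched in plain Python. The classes below are illustrative mocks, not the tf_agents implementation: the environment, policy, and observer interfaces are simplified stand-ins.

```python
class ToyEnv:
    """Illustrative environment whose episodes last exactly 3 steps."""

    def __init__(self):
        self._t = 0

    def reset(self):
        self._t = 0
        return {"obs": self._t, "is_last": False}

    def step(self, action):
        self._t += 1
        return {"obs": self._t, "is_last": self._t >= 3}


class SketchActor:
    """Mimics Actor's run() accounting: at least one limit is required,
    and the loop stops when either limit is reached."""

    def __init__(self, env, policy, steps_per_run=None, episodes_per_run=None,
                 observers=None):
        if steps_per_run is None and episodes_per_run is None:
            raise ValueError(
                "Either steps_per_run or episodes_per_run must be provided.")
        self._env = env
        self._policy = policy
        self._steps_per_run = steps_per_run
        self._episodes_per_run = episodes_per_run
        self._observers = list(observers or [])

    def run(self):
        steps = episodes = 0
        time_step = self._env.reset()
        while ((self._steps_per_run is None or
                steps < self._steps_per_run) and
               (self._episodes_per_run is None or
                episodes < self._episodes_per_run)):
            action = self._policy(time_step)
            time_step = self._env.step(action)
            # The real Actor notifies observers with a trajectory.Trajectory.
            for observer in self._observers:
                observer(time_step)
            steps += 1
            if time_step["is_last"]:
                episodes += 1
                time_step = self._env.reset()


seen = []
actor = SketchActor(ToyEnv(), policy=lambda ts: 0,
                    episodes_per_run=2, observers=[seen.append])
actor.run()  # two 3-step episodes -> observers notified 6 times
```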
| Args | |
|---|---|
| `env` | An instance of either a TF or Py environment. Note that the policy and observers must match the env type (TF vs. Py). |
| `policy` | An instance of a policy used to interact with the environment. |
| `train_step` | A scalar `tf.int64` `tf.Variable` that keeps track of the number of train steps. It is used for generated artifacts such as summaries. |
| `steps_per_run` | Number of steps evaluated per `run` call. At least one of `steps_per_run` or `episodes_per_run` must be provided. |
| `episodes_per_run` | Number of episodes evaluated per `run` call. |
| `observers` | A list of observers notified after every step in the environment. Each observer is a `callable(trajectory.Trajectory)`. |
| `transition_observers` | A list of observers updated after every step in the environment. Each observer is a `callable((TimeStep, PolicyStep, NextTimeStep))`. The transition is shaped just as trajectories are for regular observers. |
| `info_observers` | A list of observers notified after every step in the environment. Each observer is a `callable(info)`. |
| `metrics` | A list of metric observers that output a scalar. |
| `reference_metrics` | Optional list of metrics against which other metrics are plotted. For example, passing in a metric that tracks the number of environment episodes produces summaries of all other metrics over that value. Summaries against `train_step` are written by default. For `reference_metrics` to be updated, they must also be added to the `metrics` list. |
| `image_metrics` | A list of metric observers that output an image. |
| `summary_dir` | Path used for summaries. If no path is provided, no summaries are written. |
| `summary_interval` | How often summaries are written. |
| `end_episode_on_boundary` | Should be `False` when using transition observers and `True` when using trajectory observers. It is passed through to `py_driver`. |
| `name` | Name for the actor, used as a prefix for generated summaries. |
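The `reference_metrics` behavior can be illustrated with a small sketch (hypothetical names, not the tf_agents API): every metric is summarized against the train step by default, and additionally against the current value of each reference metric.

```python
def summary_points(metrics, reference_metrics, train_step):
    """Return {(metric_name, x_axis_name): (x_value, metric_value)} pairs.

    `metrics` and `reference_metrics` are {name: current_value} dicts;
    reference metrics are assumed to also appear in `metrics`, as the
    Actor docs require.
    """
    points = {}
    for name, value in metrics.items():
        # Summaries against the train step are written by default.
        points[(name, "train_step")] = (train_step, value)
        # Each reference metric adds another x-axis for the other metrics.
        for ref_name, ref_value in reference_metrics.items():
            if ref_name != name:
                points[(name, ref_name)] = (ref_value, value)
    return points


pts = summary_points(
    metrics={"AverageReturn": 1.5, "EnvironmentEpisodes": 10},
    reference_metrics={"EnvironmentEpisodes": 10},
    train_step=5000,
)
# AverageReturn is plotted against both train_step and EnvironmentEpisodes.
```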
Attributes

- `image_metrics`
- `metrics`
- `policy`
- `summary_writer`
- `train_step`
Methods
log_metrics
[View source](https://github.com/tensorflow/agents/blob/v0.19.0/tf_agents/train/actor.py#L225-L230)
log_metrics()
Logs metric results to stdout.
reset
[View source](https://github.com/tensorflow/agents/blob/v0.19.0/tf_agents/train/actor.py#L232-L237)
reset()
Resets the environment to its initial state and resets the policy state.
run
[View source](https://github.com/tensorflow/agents/blob/v0.19.0/tf_agents/train/actor.py#L166-L177)
run()
run_and_log
[View source](https://github.com/tensorflow/agents/blob/v0.19.0/tf_agents/train/actor.py#L179-L181)
run_and_log()
write_metric_summaries
[View source](https://github.com/tensorflow/agents/blob/v0.19.0/tf_agents/train/actor.py#L183-L223)
write_metric_summaries()
Generates scalar summaries for the actor metrics.
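The `summary_interval` gating can be sketched as follows (an illustrative mock of the cadence, not the tf_agents source): summaries are written at most once per interval-sized bucket of train steps, so repeated calls within the same interval do not write again.

```python
class SummaryCadence:
    """Writes at most once per summary_interval-sized bucket of train steps."""

    def __init__(self, summary_interval):
        self._interval = summary_interval
        self._last_bucket = -1

    def should_write(self, train_step):
        bucket = train_step // self._interval
        if bucket > self._last_bucket:
            self._last_bucket = bucket
            return True
        return False


cadence = SummaryCadence(summary_interval=1000)
writes = [cadence.should_write(s) for s in (0, 500, 1000, 1999, 2000)]
# -> [True, False, True, False, True]
```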
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-04-26 UTC.