tf.compat.v1.train.replica_device_setter
Return a `device function` to use when building a Graph for replicas.
    tf.compat.v1.train.replica_device_setter(
        ps_tasks=0,
        ps_device='/job:ps',
        worker_device='/job:worker',
        merge_devices=True,
        cluster=None,
        ps_ops=None,
        ps_strategy=None
    )
Device functions are used in a `with tf.device(device_function):` statement to automatically assign devices to `Operation` objects as they are constructed. Device constraints are added from the inner-most context first, working outwards. The merging behavior adds constraints only to fields that are still unset by a more inner context. Currently the fields are (job, task, cpu/gpu).
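This set-only-if-unset merge rule can be illustrated with a small pure-Python sketch. Device specs are simplified here to plain dicts of the fields above; the real `tf.DeviceSpec` objects behave analogously when contexts nest:

```python
def merge_device_specs(inner, outer):
    """Merge two simplified device specs as nested tf.device contexts do.

    Fields set by the inner (more specific) context win; an outer
    context only fills in fields the inner context left unset.
    """
    merged = dict(inner)
    for field, value in outer.items():
        merged.setdefault(field, value)
    return merged

# The inner context pins the job; the outer context still supplies the task.
spec = merge_device_specs({"job": "worker"}, {"job": "ps", "task": 0})
# → {"job": "worker", "task": 0}
```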
If `cluster` is `None` and `ps_tasks` is 0, the returned function is a no-op. Otherwise, the value of `ps_tasks` is derived from `cluster`.
By default, only Variable ops are placed on ps tasks, and the placement strategy is round-robin over all ps tasks. A custom `ps_strategy` may be used to do more intelligent placement, such as `tf.contrib.training.GreedyLoadBalancingStrategy`.
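A custom strategy is just a callable that maps a ps `Operation` to a ps task index. As an illustration only (not the actual `tf.contrib` implementation), a greedy load-balancing strategy might track the load already assigned to each task and always pick the lightest one; ops are stand-ins here, represented as `(name, size_bytes)` tuples:

```python
class GreedyLoadBalancingStrategy:
    """Assign each ps op to the ps task with the least load so far.

    load_fn estimates the cost of an op. Real strategies receive
    tf.Operation objects; this sketch uses (name, size_bytes) tuples.
    """

    def __init__(self, num_tasks, load_fn):
        self._loads = [0] * num_tasks
        self._load_fn = load_fn

    def __call__(self, op):
        # Pick the task with the smallest accumulated load (ties -> lowest index).
        task = min(range(len(self._loads)), key=lambda i: self._loads[i])
        self._loads[task] += self._load_fn(op)
        return task

strategy = GreedyLoadBalancingStrategy(2, load_fn=lambda op: op[1])
# A 100-byte variable goes to task 0; the next two small ones balance onto task 1.
tasks = [strategy(op) for op in [("v1", 100), ("v2", 10), ("v3", 10)]]
# → [0, 1, 1]
```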
For example,

    # To build a cluster with two ps jobs on hosts ps0 and ps1, and 3 worker
    # jobs on hosts worker0, worker1 and worker2.
    cluster_spec = {
        "ps": ["ps0:2222", "ps1:2222"],
        "worker": ["worker0:2222", "worker1:2222", "worker2:2222"]}
    with tf.compat.v1.device(tf.compat.v1.train.replica_device_setter(cluster=cluster_spec)):
        # Build your graph
        v1 = tf.Variable(...)  # assigned to /job:ps/task:0
        v2 = tf.Variable(...)  # assigned to /job:ps/task:1
        v3 = tf.Variable(...)  # assigned to /job:ps/task:0
        # Run compute
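The placement the setter performs above can be sketched in plain Python. This sketch reduces ops to their type strings, uses only a subset of `STANDARD_PS_OPS`, and introduces a hypothetical helper `make_device_fn`; the real device function inspects full `tf.Operation` objects:

```python
# A small subset of STANDARD_PS_OPS, for illustration.
PS_OPS = ("Variable", "VariableV2", "VarHandleOp")

def make_device_fn(ps_tasks, ps_device="/job:ps", worker_device="/job:worker"):
    """Return a device function: ps ops round-robin over ps tasks,
    everything else on the worker device."""
    state = {"next": 0}

    def device_fn(op_type):
        if op_type in PS_OPS:
            task = state["next"]
            state["next"] = (state["next"] + 1) % ps_tasks
            return f"{ps_device}/task:{task}"
        return worker_device

    return device_fn

device_fn = make_device_fn(ps_tasks=2)
# Variables cycle over the two ps tasks; compute ops stay on the worker.
devices = [device_fn(t) for t in ("VariableV2", "MatMul", "VariableV2", "VariableV2")]
# → ["/job:ps/task:0", "/job:worker", "/job:ps/task:1", "/job:ps/task:0"]
```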
Args

| Argument | Description |
|---|---|
| `ps_tasks` | Number of tasks in the `ps` job. Ignored if `cluster` is provided. |
| `ps_device` | String. Device of the `ps` job. If empty, no `ps` job is used. Defaults to `/job:ps`. |
| `worker_device` | String. Device of the `worker` job. If empty, no `worker` job is used. |
| `merge_devices` | `Boolean`. If `True`, merges device specifications rather than overriding them: a device field is set only where the existing constraint leaves it unset. |
| `cluster` | `ClusterDef` proto or `ClusterSpec`. |
| `ps_ops` | List of strings representing `Operation` types that need to be placed on `ps` devices. If `None`, defaults to `STANDARD_PS_OPS`. |
| `ps_strategy` | A callable invoked for every ps `Operation` (i.e. matched by `ps_ops`) that takes the `Operation` and returns the ps task index to use. If `None`, defaults to a round-robin strategy across all `ps` devices. |

Returns

A function to pass to `tf.device()`.

Raises

`TypeError` if `cluster` is not a dictionary or `ClusterDef` protocol buffer, or if `ps_strategy` is provided but not a callable.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates. Some content is licensed under the numpy license.
Last updated 2024-04-26 UTC.