tf.config.experimental_connect_to_cluster

View source on GitHub: https://github.com/tensorflow/tensorflow/blob/v2.16.1/tensorflow/python/eager/remote.py#L76-L275

Compat alias for migration: tf.compat.v1.config.experimental_connect_to_cluster

Connects to the given cluster.
tf.config.experimental_connect_to_cluster(
    cluster_spec_or_resolver,
    job_name='localhost',
    task_index=0,
    protocol=None,
    make_master_device_default=True,
    cluster_device_filters=None
)
Used in the notebooks

Used in the guide:
- Migrate from TPU embedding_columns to TPUEmbedding layer (https://www.tensorflow.org/guide/migrate/tpu_embedding)
- Migrate from TPUEstimator to TPUStrategy (https://www.tensorflow.org/guide/migrate/tpu_estimator)
- Use TPUs (https://www.tensorflow.org/guide/tpu)

Used in the tutorials:
- Training with Orbit (https://www.tensorflow.org/tfmodels/orbit/index)
- Solve GLUE tasks using BERT on TPU (https://www.tensorflow.org/text/tutorials/bert_glue)
Will make devices on the cluster available to use. Note that calling this more
than once will work, but will invalidate any tensor handles on the old remote
devices.
If the given local job name is not present in the cluster specification, it
will be automatically added, using an unused port on the localhost.
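As an illustrative sketch, connecting with an explicit ClusterSpec might look like the following. The worker and ps addresses are hypothetical, and the connect call is gated behind a flag because it requires live tf.distribute.Server tasks at those addresses:

```python
# Hypothetical cluster definition; in a real deployment each address
# would host a running tf.distribute.Server task.
cluster_def = {
    "worker": ["10.0.0.1:8470", "10.0.0.2:8470"],
    "ps": ["10.0.0.3:8470"],
}

# Flip to True only on a machine that can actually reach the cluster.
RUN_CONNECT = False

if RUN_CONNECT:
    import tensorflow as tf

    spec = tf.train.ClusterSpec(cluster_def)
    # The local 'localhost' job is absent from the spec, so it is added
    # automatically on an unused local port.
    tf.config.experimental_connect_to_cluster(spec)

    # Remote devices are now usable via explicit placement.
    with tf.device("/job:worker/replica:0/task:0"):
        result = tf.constant(1.0) + tf.constant(2.0)
```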
Device filters can be specified to isolate groups of remote tasks and avoid
undesired accesses between workers. Workers accessing resources or launching
ops / functions on filtered remote devices will result in errors (unknown
devices). For any remote task, if no device filter is present, all cluster
devices will be visible; if any device filter is specified, the task can only
see devices matching at least one filter. Devices on the task itself are
always visible. Device filters can be partially specified.
For example, for a cluster set up for parameter server training, the following
device filters might be specified:
cdf = tf.config.experimental.ClusterDeviceFilters()
# For any worker, only the devices on PS nodes and itself are visible
for i in range(num_workers):
    cdf.set_device_filters('worker', i, ['/job:ps'])
# Similarly for any ps, only the devices on workers and itself are visible
for i in range(num_ps):
    cdf.set_device_filters('ps', i, ['/job:worker'])

tf.config.experimental_connect_to_cluster(cluster_def,
                                          cluster_device_filters=cdf)
Args:

cluster_spec_or_resolver: A ClusterSpec or ClusterResolver describing the cluster.
job_name: The name of the local job.
task_index: The local task index.
protocol: The communication protocol, such as "grpc". If unspecified, the default from python/platform/remote_utils.py is used.
make_master_device_default: If True and a cluster resolver is passed, automatically enters the master task device scope, which makes the master the default device for running ops. Has no effect if a cluster spec is passed. Throws an error if the caller is already in a device scope.
cluster_device_filters: An instance of tf.train.experimental.ClusterDeviceFilters that specifies device filters for the remote tasks in the cluster.
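The make_master_device_default behavior is easiest to see with a ClusterResolver, as in the TPU guides listed above. A gated sketch (it needs a reachable TPU worker, so the flag stays False elsewhere):

```python
# Flip to True only in an environment with an attached TPU.
RUN_ON_TPU = False

if RUN_ON_TPU:
    import tensorflow as tf

    # tpu='' resolves the locally attached TPU (e.g. a Colab or TPU VM).
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="")
    tf.config.experimental_connect_to_cluster(resolver)
    # Because make_master_device_default defaults to True and a resolver
    # (not a ClusterSpec) was passed, the master task becomes the default
    # device for subsequent ops.
    tf.tpu.experimental.initialize_tpu_system(resolver)
```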
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates. Some content is licensed under the numpy license.
Last updated 2024-04-26 UTC.