# tf.test.create_local_cluster
Create and start local servers and return the associated `Server` objects.
```python
tf.test.create_local_cluster(
    num_workers,
    num_ps,
    protocol='grpc',
    worker_config=None,
    ps_config=None
)
```
"PS" stands for "parameter server": a task responsible for storing and
updating the model's parameters. Other tasks send updates to these parameters
as they work on optimizing the parameters. This particular division of labor
between tasks is not required, but is common for distributed training.
Read more at https://www.tensorflow.org/guide/extend/architecture

Note that "/job:worker/task:0" and "/job:ps/task:0" are both tasks that run
worker services; the job name reflects a task's role in the cluster, not the
service it runs.
#### Example:

```python
workers, _ = tf.test.create_local_cluster(num_workers=2, num_ps=2)

worker_sessions = [tf.compat.v1.Session(w.target) for w in workers]

with tf.device("/job:ps/task:0"):
    ...
with tf.device("/job:ps/task:1"):
    ...
with tf.device("/job:worker/task:0"):
    ...
with tf.device("/job:worker/task:1"):
    ...

worker_sessions[0].run(...)
```
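To make the sketch above concrete, here is a hypothetical filled-in version. The variable names (`v0`, `v1`) and the reduction op are illustrative assumptions, not part of the API; only `tf.test.create_local_cluster` and the device strings come from the documentation:

```python
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

workers, _ = tf.test.create_local_cluster(num_workers=2, num_ps=2)
worker_sessions = [tf.compat.v1.Session(w.target) for w in workers]

# Place one variable on each parameter-server task (names are illustrative).
with tf.device("/job:ps/task:0"):
    v0 = tf.Variable(tf.zeros([10]), name="v0")
with tf.device("/job:ps/task:1"):
    v1 = tf.Variable(tf.ones([10]), name="v1")

# Compute on a worker task, reading the PS-hosted variables.
with tf.device("/job:worker/task:0"):
    total = tf.reduce_sum(v0 + v1)

worker_sessions[0].run(tf.compat.v1.global_variables_initializer())
print(worker_sessions[0].run(total))  # 10.0
```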
| Args | |
|---|---|
| `num_workers` | Number of worker servers to start. |
| `num_ps` | Number of PS servers to start. |
| `protocol` | Communication protocol. Allowed values are documented in `tf.distribute.Server`. |
| `worker_config` | (optional) `tf.ConfigProto` used to initialize workers. Can be used, for example, to instantiate multiple devices. |
| `ps_config` | (optional) `tf.ConfigProto` used to initialize PS servers. |
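As an illustration of `worker_config`, a `tf.compat.v1.ConfigProto` can request extra logical CPU devices on each worker. A minimal sketch, assuming two CPU devices per worker is the desired configuration:

```python
import tensorflow as tf

# Give each worker two logical CPU devices (illustrative choice).
worker_config = tf.compat.v1.ConfigProto(device_count={"CPU": 2})

workers, ps = tf.test.create_local_cluster(
    num_workers=2, num_ps=1, worker_config=worker_config)
```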
| Returns |
|---|
| A tuple `(worker_servers, ps_servers)`. `worker_servers` is a list of `num_workers` objects of type `tf.distribute.Server` (all running locally), and `ps_servers` is a list of `num_ps` objects of the same type. |
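Since the returned values are ordinary `tf.distribute.Server` objects, their targets and server definitions can be inspected directly; a small sketch:

```python
import tensorflow as tf

workers, ps = tf.test.create_local_cluster(num_workers=1, num_ps=1)

print(workers[0].target)               # e.g. "grpc://localhost:<picked port>"
print(workers[0].server_def.job_name)  # "worker"
print(ps[0].server_def.job_name)       # "ps"
```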
| Raises | |
|---|---|
| `ImportError` | If the `portpicker` module was not found at load time. |