Synchronous training across multiple replicas on one machine.
tf.compat.v1.distribute.MirroredStrategy(
    devices=None, cross_device_ops=None
)
This strategy is typically used for training on one machine with multiple GPUs. For TPUs, use tf.distribute.TPUStrategy. To use MirroredStrategy with multiple workers, please refer to tf.distribute.experimental.MultiWorkerMirroredStrategy.
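As a minimal construction sketch, assuming two GPUs (the explicit device list and the choice of tf.distribute.HierarchicalCopyAllReduce for cross_device_ops are illustrative, not defaults):

import tensorflow as tf

# Mirror across two named GPUs and use hierarchical-copy all-reduce
# instead of the default cross-device reduction.
strategy = tf.compat.v1.distribute.MirroredStrategy(
    devices=["/gpu:0", "/gpu:1"],
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())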
For example, a variable created under a MirroredStrategy is a MirroredVariable. If no devices are specified in the constructor, the strategy will use all the available GPUs. If no GPUs are found, it will use the available CPUs. Note that TensorFlow treats all CPUs on a machine as a single device and uses threads internally for parallelism.
strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
with strategy.scope():
  x = tf.Variable(1.)  # Created in scope, so it is mirrored.
x
MirroredVariable:{
  0: <tf.Variable ... shape=() dtype=float32, numpy=1.0>,
  1: <tf.Variable ... shape=() dtype=float32, numpy=1.0>
}
When using a distribution strategy, all variable creation should be done within the strategy's scope. This replicates the variables across all the replicas and keeps them in sync using an all-reduce algorithm.
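A minimal sketch of how this plays out in a training step (the variable, optimizer, and squared-error loss below are illustrative placeholders, not part of this API):

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
with strategy.scope():
  # Created in scope: the variable and optimizer state are mirrored.
  w = tf.Variable(2.)
  optimizer = tf.keras.optimizers.SGD(0.1)

@tf.function
def train_step(x):
  def step_fn(x):
    with tf.GradientTape() as tape:
      loss = tf.reduce_mean((w * x - 1.) ** 2)
    grads = tape.gradient(loss, [w])
    # apply_gradients all-reduces the per-replica gradients,
    # keeping the mirrored copies of `w` in sync.
    optimizer.apply_gradients(zip(grads, [w]))
    return loss
  per_replica_loss = strategy.run(step_fn, args=(x,))
  return strategy.reduce(
      tf.distribute.ReduceOp.MEAN, per_replica_loss, axis=None)

Calling train_step(tf.constant(1.)) runs step_fn once per replica and averages the per-replica losses.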
Variables created inside a MirroredStrategy which is wrapped with a tf.function are still MirroredVariables.
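A sketch of that behavior, following the style of the example above (the create_variable helper is illustrative):

import tensorflow as tf

x = []

@tf.function  # Variable creation happens inside a tf.function.
def create_variable():
  if not x:
    x.append(tf.Variable(1.))
  return x[0]

strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
with strategy.scope():
  _ = create_variable()
  print(x[0])  # Still a MirroredVariable, one copy per replica.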