Help protect the Great Barrier Reef with TensorFlow on Kaggle Join Challenge

tf.compat.v1.distribute.ReplicaContext

A class with a collection of APIs that can be called in a replica context.

You can use tf.distribute.get_replica_context to get an instance of ReplicaContext, which can only be called inside the function passed to tf.distribute.Strategy.run.

strategy = tf.distribute.MirroredStrategy(['GPU:0', 'GPU:1'])
def func():
  replica_context = tf.distribute.get_replica_context()
  return replica_context.replica_id_in_sync_group
strategy.run(func)
PerReplica:{
  0: <tf.Tensor: shape=(), dtype=int32, numpy=0>,
  1: <tf.Tensor: shape=(), dtype=int32, numpy=1>
}

strategy A tf.distribute.Strategy.
replica_id_in_sync_group An integer, a Tensor or None. Prefer an integer whenever possible to avoid issues with nested tf.function. It accepts a Tensor only to be compatible with tpu.replicate.

devices Returns the devices this replica is to be executed on, as a tuple of strings. (deprecated)

num_replicas_in_sync Returns number of replicas that are kept in sync.
replica_id_in_sync_group Returns the id of the replica.

This identifies the replica among all replicas that are kept in sync. The value of the replica id can range from 0 to tf.distribute.ReplicaContext.num_replicas_in_sync - 1.

strategy The current tf.distribute.Strategy object.

Methods

all_reduce

View source

All-reduces value across all replicas.

strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
def step_fn():
  ctx = tf.distribute.get_replica_context()
  value = tf.identity(1.)
  return ctx.all_reduce(tf.distribute.ReduceOp.SUM, value)
strategy.experimental_local_results(strategy.run(step_fn))
(<tf.Tensor: shape=(), dtype=float32, numpy=2.0>,
 <tf.Tensor: shape=(), dtype=float32, numpy=2.0>)

It supports batched operations. You can pass a list of values and it attempts to batch them when possible. You can also specify options to indicate the desired batching behavior, e.g. batch the values into multiple packs so that they can better overlap with computations.

strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
def step_fn():
  ctx = tf.distribute.get_replica_context()
  value1 = tf.identity(1.)
  value2 = tf.identity(2.)
  return ctx.all_reduce(tf.distribute.ReduceOp.SUM, [value1, value2])
strategy.experimental_local_results(strategy.run(step_fn))
([<tf.Tensor: shape=(), dtype=float32, numpy=2.0>,
<tf.Tensor: shape=(), dtype=float32, numpy=4.0>],
[<tf.Tensor: shape=(), dtype=float32, numpy=2.0>,
<tf.Tensor: shape=(), dtype=float32, numpy=4.0>])

Note that all replicas need to participate in the all-reduce, otherwise this operation hangs. Note that if there're multiple all-reduces, they need to execute in the same order on all replicas. Dispatching all-reduce based on conditions is usually error-prone.

Known limitation: if value contains tf.IndexedSlices, attempting to compute gradient w.r.t value would result in an error.

This API currently can only be called in the replica context. Other variants to reduce values across replicas are: