Hierarchical copy all-reduce implementation of CrossDeviceOps.
Inherits From: CrossDeviceOps
tf.distribute.HierarchicalCopyAllReduce(
    num_packs=1
)
It reduces to one GPU along edges in some hierarchy and broadcasts back to each GPU along the same path. For the batch API, tensors will be repacked or aggregated for more efficient cross-device transportation.
This is a reduction created for the Nvidia DGX-1, and it assumes GPUs are connected as they are on a DGX-1 machine. If your GPUs are interconnected differently, it is likely to be slower than tf.distribute.ReductionToOneDevice.
For reductions that are not all-reduce, it falls back to
tf.distribute.ReductionToOneDevice.
Here is how you can use HierarchicalCopyAllReduce in
tf.distribute.MirroredStrategy:
  strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())
| Args | |
|---|---|
| num_packs | a non-negative integer. The number of packs to split values into. If zero, no packing will be done. | 
| Raises | |
|---|---|
| ValueError | if num_packs is negative. |
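As an illustrative sketch only (assuming a single host with multiple GPUs visible), num_packs can be tuned when constructing the cross-device op; values all-reduced inside strategy.run then flow through the hierarchical copy path:
import tensorflow as tf

# Sketch: num_packs=2 splits values into two packs before the hierarchical
# all-reduce; with num_packs=0, no packing is done.
strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce(num_packs=2))

@tf.function
def step():
  def replica_fn():
    # Each replica contributes 1.0; the sum is all-reduced using the
    # configured HierarchicalCopyAllReduce.
    return tf.distribute.get_replica_context().all_reduce(
        tf.distribute.ReduceOp.SUM, tf.constant(1.0))
  return strategy.run(replica_fn)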
Methods
batch_reduce
batch_reduce(
    reduce_op, value_destination_pairs, options=None
)
Reduce values to destinations in batches.
See tf.distribute.StrategyExtended.batch_reduce_to. This can only be
called in the cross-replica context.
| Args | |
|---|---|
| reduce_op | a tf.distribute.ReduceOp specifying how values should be combined. | 
| value_destination_pairs | a sequence of (value, destinations) pairs. See tf.distribute.CrossDeviceOps.reduce for descriptions. | 
| options | a tf.distribute.experimental.CommunicationOptions. See tf.distribute.experimental.CommunicationOptions for details. | 
| Returns | |
|---|---|
| A list of tf.Tensor or tf.distribute.DistributedValues, one per pair in value_destination_pairs. | 
| Raises | |
|---|---|
| ValueError | if value_destination_pairs is not an iterable of tuples of tf.distribute.DistributedValues and destinations. |
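A minimal sketch of batch reduction (assumptions: a multi-GPU host and execution in the cross-replica context). It goes through the strategy-level tf.distribute.StrategyExtended.batch_reduce_to, which dispatches to the configured cross-device op, and uses each value as its own destinations so every pair is an all-reduce:
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())

# Per-replica values produced in the replica context.
per_replica_a = strategy.run(lambda: tf.constant(1.0))
per_replica_b = strategy.run(lambda: tf.constant(2.0))

# Reduce both pairs in one batch; passing a value as its own destinations
# makes that reduction an all-reduce.
summed_a, summed_b = strategy.extended.batch_reduce_to(
    tf.distribute.ReduceOp.SUM,
    [(per_replica_a, per_replica_a), (per_replica_b, per_replica_b)])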
broadcast
broadcast(
    tensor, destinations
)
Broadcast tensor to destinations.
This can only be called in the cross-replica context.
| Args | |
|---|---|
| tensor | a tf.Tensor-like object. The value to broadcast. | 
| destinations | a tf.distribute.DistributedValues, a tf.Variable, a tf.Tensor-like object, or a device string. It specifies the devices to broadcast to. Note that if it's a tf.Variable, the value is broadcast to the devices of that variable; this method doesn't update the variable. | 
| Returns | |
|---|---|
| A tf.Tensor or tf.distribute.DistributedValues. | 
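A minimal sketch (assuming a multi-GPU host and the cross-replica context) that broadcasts a host tensor to the devices of an existing per-replica value; no variable is updated along the way:
import tensorflow as tf

cross_ops = tf.distribute.HierarchicalCopyAllReduce()
strategy = tf.distribute.MirroredStrategy(cross_device_ops=cross_ops)

# The per-replica value only supplies the destination devices.
per_replica = strategy.run(lambda: tf.zeros([2]))
tensor = tf.constant([1.0, 2.0])

# Mirrors `tensor` onto each destination device; nothing is modified in place.
mirrored = cross_ops.broadcast(tensor, per_replica)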
reduce
reduce(
    reduce_op, per_replica_value, destinations, options=None
)
Reduce per_replica_value to destinations.
See tf.distribute.StrategyExtended.reduce_to. This can only be called in
the cross-replica context.
| Args | |
|---|---|
| reduce_op | a tf.distribute.ReduceOp specifying how values should be combined. | 
| per_replica_value | a tf.distribute.DistributedValues, or a tf.Tensor-like object. | 
| destinations | a tf.distribute.DistributedValues, a tf.Variable, a tf.Tensor-like object, or a device string. It specifies the devices to reduce to. To perform an all-reduce, pass the same to value and destinations. Note that if it's a tf.Variable, the value is reduced to the devices of that variable, and this method doesn't update the variable. | 
| options | a tf.distribute.experimental.CommunicationOptions. See tf.distribute.experimental.CommunicationOptions for details. | 
| Returns | |
|---|---|
| A tf.Tensor or tf.distribute.DistributedValues. | 
| Raises | |
|---|---|
| ValueError | if per_replica_value can't be converted to a tf.distribute.DistributedValues or if destinations is not a string, tf.Variable, or tf.distribute.DistributedValues. |
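A minimal sketch of reduce (assuming a multi-GPU host and the cross-replica context). Passing the same object as per_replica_value and destinations makes it an all-reduce, so the hierarchical copy path is used rather than the ReductionToOneDevice fallback:
import tensorflow as tf

cross_ops = tf.distribute.HierarchicalCopyAllReduce()
strategy = tf.distribute.MirroredStrategy(cross_device_ops=cross_ops)

per_replica = strategy.run(lambda: tf.constant(1.0))

# Same value and destinations -> all-reduce across the replica devices.
reduced = cross_ops.reduce(
    tf.distribute.ReduceOp.SUM, per_replica, destinations=per_replica)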