tf.experimental.dtensor.create_tpu_mesh

Returns a distributed TPU mesh optimized for AllReduce ring reductions.

Only as many leading axes of ring_axes as necessary are used to build rings, as long as the subslice formed by these axes has enough cores to contain a ring of the required size. The leftover axes in ring_axes do not affect the result.

This function always uses all TPU devices, and offers more customization than tf.experimental.dtensor.create_distributed_mesh.
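A minimal usage sketch follows. The accelerator-system initialization call, mesh shape, dimension names, and mesh name are illustrative assumptions, not requirements of this API beyond what the argument list below describes:

```python
import tensorflow as tf
from tensorflow.experimental import dtensor

# Assumed setup: the TPU system is initialized before any TPU mesh is created.
dtensor.initialize_accelerator_system("TPU")

# Illustrative 8-core slice: a 2x4 mesh with a "batch" and a "model" dimension.
mesh = dtensor.create_tpu_mesh(
    mesh_dim_names=["batch", "model"],
    mesh_shape=[2, 4],
    mesh_name="my_tpu_mesh")

# The resulting mesh can then be used to build layouts for sharded tensors.
layout = dtensor.Layout(["batch", "model"], mesh)
```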

Args
mesh_dim_names: List of mesh dimension names.
mesh_shape: Shape of the mesh.
mesh_name: A unique name for the mesh. If empty, one is generated internally.
ring_dims: Optional; The number of leading (ring_dims > 0) or trailing (ring_dims < 0) mesh dimensions to build rings for. If unspecified, build rings for all but the first dimension.
ring_axes: Optional; A permutation of ["x", "y", "z", "core"], specifying the order of TPU topology axes to build rings in. If unspecified, defaults to ["core", "x", "y", "z"] (see the sketch after this list for a customization example).
ring_bounds: Optional; The maximum number of devices on each axis, in the x, y, z, core order. If unspecified, defaults to the physical topology limits.
can_split_host_across_rings: Optional; If True, devices attached to the same host (i.e., DTensor client) may be assigned to different rings. Setting it to False may make some combinations of arguments infeasible; see DeviceAssignmentTest.testCreateMesh[No]SplittingHosts* for examples.
build_ring_across_rings: Optional; If True, also build a data-parallel ring across the model-parallel rings. This ring may be strided.
rotate_ring_across_rings: Optional; If True, build the data-parallel ring in column-major instead of row-major order.
use_xla_spmd: Optional; If True, use XLA SPMD instead of DTensor SPMD.
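The following sketch shows how the ring-related arguments might be combined. The specific values are illustrative assumptions, not a recommended configuration:

```python
# Build rings only for the trailing "model" dimension (ring_dims=-1), walk the
# physical "core" axis first when forming rings, and keep each host's devices
# within a single ring (can_split_host_across_rings=False).
mesh = dtensor.create_tpu_mesh(
    mesh_dim_names=["batch", "model"],
    mesh_shape=[2, 4],
    mesh_name="custom_ring_mesh",
    ring_dims=-1,
    ring_axes=["core", "x", "y", "z"],
    can_split_host_across_rings=False)
```

Note that with can_split_host_across_rings=False, some combinations of mesh shape and ring arguments can be infeasible, as mentioned in the argument list above.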