View on TensorFlow.org | View source on GitHub | Download notebook |

TensorFlow provides a set of pseudo-random number generators (RNG), in the `tf.random`

module. This document describes how you can control the random number generators, and how these generators interact with other tensorflow sub-systems.

TensorFlow provides two approaches for controlling the random number generation process:

Through the explicit use of

`tf.random.Generator`

objects. Each such object maintains a state (in`tf.Variable`

) that will be changed after each number generation.Through the purely-functional stateless random functions like

`tf.random.stateless_uniform`

. Calling these functions with the same arguments (which include the seed) and on the same device will always produce the same results.

## Setup

```
import tensorflow as tf
# Creates some virtual devices (cpu:0, cpu:1, etc.) for using distribution strategy
physical_devices = tf.config.list_physical_devices("CPU")
tf.config.experimental.set_virtual_device_configuration(
physical_devices[0], [
tf.config.experimental.VirtualDeviceConfiguration(),
tf.config.experimental.VirtualDeviceConfiguration(),
tf.config.experimental.VirtualDeviceConfiguration()
])
```

## The `tf.random.Generator`

class

The `tf.random.Generator`

class is used in cases where you want each RNG call to produce different results. It maintains an internal state (managed by a `tf.Variable`

object) which will be updated every time random numbers are generated. Because the state is managed by `tf.Variable`

, it enjoys all facilities provided by `tf.Variable`

such as easy checkpointing, automatic control-dependency and thread safety.

You can get a `tf.random.Generator`

by manually creating an object of the class or call `tf.random.get_global_generator()`

to get the default global generator:

```
g1 = tf.random.Generator.from_seed(1)
print(g1.normal(shape=[2, 3]))
g2 = tf.random.get_global_generator()
print(g2.normal(shape=[2, 3]))
```

tf.Tensor( [[ 0.43842277 -0.53439844 -0.07710262] [ 1.5658046 -0.1012345 -0.2744976 ]], shape=(2, 3), dtype=float32) tf.Tensor( [[-0.5496427 0.24263908 -1.1436267 ] [ 1.861458 -0.6756685 -0.9900103 ]], shape=(2, 3), dtype=float32)

There are multiple ways to create a generator object. The easiest is `Generator.from_seed`

, as shown above, that creates a generator from a seed. A seed is any non-negative integer. `from_seed`

also takes an optional argument `alg`

which is the RNG algorithm that will be used by this generator:

```
g1 = tf.random.Generator.from_seed(1, alg='philox')
print(g1.normal(shape=[2, 3]))
```

tf.Tensor( [[ 0.43842277 -0.53439844 -0.07710262] [ 1.5658046 -0.1012345 -0.2744976 ]], shape=(2, 3), dtype=float32)

See the *Algorithms* section below for more information about it.

Another way to create a generator is with `Generator.from_non_deterministic_state`

. A generator created this way will start from a non-deterministic state, depending on e.g. time and OS.

```
g = tf.random.Generator.from_non_deterministic_state()
print(g.normal(shape=[2, 3]))
```

tf.Tensor( [[-0.9078738 0.11009752 1.037219 ] [ 0.661036 0.4169741 1.4539026 ]], shape=(2, 3), dtype=float32)

There are yet other ways to create generators, such as from explicit states, which are not covered by this guide.

When using `tf.random.get_global_generator`

to get the global generator, you need to be careful about device placement. The global generator is created (from a non-deterministic state) at the first time `tf.random.get_global_generator`

is called, and placed on the default device at that call. So, for example, if the first site you call `tf.random.get_global_generator`

is within a `tf.device("gpu")`

scope, the global generator will be placed on the GPU, and using the global generator later on from the CPU will incur a GPU-to-CPU copy.

There is also a function `tf.random.set_global_generator`

for replacing the global generator with another generator object. This function should be used with caution though, because the old global generator may have been captured by a `tf.function`

(as a weak reference), and replacing it will cause it to be garbage collected, breaking the `tf.function`

. A better way to reset the global generator is to use one of the "reset" functions such as `Generator.reset_from_seed`

, which won't create new generator objects.

```
g = tf.random.Generator.from_seed(1)
print(g.normal([]))
print(g.normal([]))
g.reset_from_seed(1)
print(g.normal([]))
```

tf.Tensor(0.43842277, shape=(), dtype=float32) tf.Tensor(1.6272374, shape=(), dtype=float32) tf.Tensor(0.43842277, shape=(), dtype=float32)

### Creating independent random-number streams

In many applications one needs multiple independent random-number streams, independent in the sense that they won't overlap and won't have any statistically detectable correlations. This is achieved by using `Generator.split`

to create multiple generators that are guaranteed to be independent of each other (i.e. generating independent streams).

```
g = tf.random.Generator.from_seed(1)
print(g.normal([]))
new_gs = g.split(3)
for new_g in new_gs:
print(new_g.normal([]))
print(g.normal([]))
```

tf.Tensor(0.43842277, shape=(), dtype=float32) tf.Tensor(2.536413, shape=(), dtype=float32) tf.Tensor(0.33186463, shape=(), dtype=float32) tf.Tensor(-0.07144657, shape=(), dtype=float32) tf.Tensor(-0.79253083, shape=(), dtype=float32)

`split`

will change the state of the generator on which it is called (`g`

in the above example), similar to an RNG method such as `normal`

. In addition to being independent of each other, the new generators (`new_gs`

) are also guaranteed to be independent of the old one (`g`

).

Spawning new generators is also useful when you want to make sure the generator you use is on the same device as other computations, to avoid the overhead of cross-device copy. For example:

```
with tf.device("cpu"): # change "cpu" to the device you want
g = tf.random.get_global_generator().split(1)[0]
print(g.normal([])) # use of g won't cause cross-device copy, unlike the global generator
```

tf.Tensor(0.4142675, shape=(), dtype=float32)

You can do splitting recursively, calling `split`

on splitted generators. There are no limits (barring integer overflow) on the depth of recursions.

### Interaction with `tf.function`

`tf.random.Generator`

obeys the same rules as `tf.Variable`

when used with `tf.function`

. This includes three aspects.

#### Creating generators outside `tf.function`

`tf.function`

can use a generator created outside of it.

```
g = tf.random.Generator.from_seed(1)
@tf.function
def foo():
return g.normal([])
print(foo())
```

tf.Tensor(0.43842277, shape=(), dtype=float32)

The user needs to make sure that the generator object is still alive (not garbage-collected) when the function is called.

#### Creating generators inside `tf.function`

Creation of generators inside a `tf.function`

can only happend during the first run of the function.

```
g = None
@tf.function
def foo():
global g
if g is None:
g = tf.random.Generator.from_seed(1)
return g.normal([])
print(foo())
print(foo())
```

tf.Tensor(0.43842277, shape=(), dtype=float32) tf.Tensor(1.6272374, shape=(), dtype=float32)

#### Passing generators as arguments to `tf.function`

When used as an argument to a `tf.function`

, different generator objects will cause retracing of the `tf.function`

.

```
num_traces = 0
@tf.function
def foo(g):
global num_traces
num_traces += 1
return g.normal([])
foo(tf.random.Generator.from_seed(1))
foo(tf.random.Generator.from_seed(2))
print(num_traces)
```

2

Note that this retracing behavior is consistent with `tf.Variable`

:

```
num_traces = 0
@tf.function
def foo(v):
global num_traces
num_traces += 1
return v.read_value()
foo(tf.Variable(1))
foo(tf.Variable(2))
print(num_traces)
```

2

### Interaction with distribution strategies

There are two ways in which `Generator`

interacts with distribution strategies.

#### Creating generators outside distribution strategies

If a generator is created outside strategy scopes, all replicas’ access to the generator will be serialized, and hence the replicas will get different random numbers.

```
g = tf.random.Generator.from_seed(1)
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
def f():
print(g.normal([]))
results = strat.run(f)
```

WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce. INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `run` inside a tf.function to get the best performance. tf.Tensor(0.43842274, shape=(), dtype=float32) tf.Tensor(1.6272374, shape=(), dtype=float32)

Note that this usage may have performance issues because the generator's device is different from the replicas.

#### Creating generators inside distribution strategies

If a generator is created inside a strategy scope, each replica will get a different and independent stream of random numbers.

```
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
g = tf.random.Generator.from_seed(1)
print(strat.run(lambda: g.normal([])))
print(strat.run(lambda: g.normal([])))
```

WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce. INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `run` inside a tf.function to get the best performance. PerReplica:{ 0: tf.Tensor(-0.87930447, shape=(), dtype=float32), 1: tf.Tensor(0.020661574, shape=(), dtype=float32) } WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `run` inside a tf.function to get the best performance. PerReplica:{ 0: tf.Tensor(-1.5822568, shape=(), dtype=float32), 1: tf.Tensor(0.77539235, shape=(), dtype=float32) }

If the generator is seeded (e.g. created by `Generator.from_seed`

), the random numbers are determined by the seed, even though different replicas get different and uncorrelated numbers. One can think of a random number generated on a replica as a hash of the replica ID and a "primary" random number that is common to all replicas. Hence, the whole system is still deterministic.

`tf.random.Generator`

can also be created inside `Strategy.run`

:

```
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
def f():
g = tf.random.Generator.from_seed(1)
a = g.normal([])
b = g.normal([])
return tf.stack([a, b])
print(strat.run(f))
print(strat.run(f))
```

WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce. INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `run` inside a tf.function to get the best performance. PerReplica:{ 0: tf.Tensor([-0.87930447 -1.5822568 ], shape=(2,), dtype=float32), 1: tf.Tensor([0.02066157 0.77539235], shape=(2,), dtype=float32) } WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `run` inside a tf.function to get the best performance. PerReplica:{ 0: tf.Tensor([-0.87930447 -1.5822568 ], shape=(2,), dtype=float32), 1: tf.Tensor([0.02066157 0.77539235], shape=(2,), dtype=float32) }

We no longer recommend passing `tf.random.Generator`

as arguments to `Strategy.run`

, because `Strategy.run`

generally expects the arguments to be tensors, not generators.

### Saving generators

Generally for saving or serializing you can handle a `tf.random.Generator`

the same way you would handle a `tf.Variable`

or a `tf.Module`

(or its subclasses). In TF there are two mechanisms for serialization: Checkpoint and SavedModel.

#### Checkpoint

Generators can be freely saved and restored using `tf.train.Checkpoint`

. The random-number stream from the restoring point will be the same as that from the saving point.

```
filename = "./checkpoint"
g = tf.random.Generator.from_seed(1)
cp = tf.train.Checkpoint(generator=g)
print(g.normal([]))
```

tf.Tensor(0.43842277, shape=(), dtype=float32)

```
cp.write(filename)
print("RNG stream from saving point:")
print(g.normal([]))
print(g.normal([]))
```

RNG stream from saving point: tf.Tensor(1.6272374, shape=(), dtype=float32) tf.Tensor(1.6307176, shape=(), dtype=float32)

```
cp.restore(filename)
print("RNG stream from restoring point:")
print(g.normal([]))
print(g.normal([]))
```

RNG stream from restoring point: tf.Tensor(1.6272374, shape=(), dtype=float32) tf.Tensor(1.6307176, shape=(), dtype=float32)

You can also save and restore within a distribution strategy:

```
filename = "./checkpoint"
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
g = tf.random.Generator.from_seed(1)
cp = tf.train.Checkpoint(my_generator=g)
print(strat.run(lambda: g.normal([])))
```

WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce. INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') PerReplica:{ 0: tf.Tensor(-0.87930447, shape=(), dtype=float32), 1: tf.Tensor(0.020661574, shape=(), dtype=float32) }

```
with strat.scope():
cp.write(filename)
print("RNG stream from saving point:")
print(strat.run(lambda: g.normal([])))
print(strat.run(lambda: g.normal([])))
```

RNG stream from saving point: PerReplica:{ 0: tf.Tensor(-1.5822568, shape=(), dtype=float32), 1: tf.Tensor(0.77539235, shape=(), dtype=float32) } PerReplica:{ 0: tf.Tensor(-0.5039703, shape=(), dtype=float32), 1: tf.Tensor(0.1251838, shape=(), dtype=float32) }

```
with strat.scope():
cp.restore(filename)
print("RNG stream from restoring point:")
print(strat.run(lambda: g.normal([])))
print(strat.run(lambda: g.normal([])))
```

RNG stream from restoring point: PerReplica:{ 0: tf.Tensor(-1.5822568, shape=(), dtype=float32), 1: tf.Tensor(0.77539235, shape=(), dtype=float32) } PerReplica:{ 0: tf.Tensor(-0.5039703, shape=(), dtype=float32), 1: tf.Tensor(0.1251838, shape=(), dtype=float32) }

You should make sure that the replicas don't diverge in their RNG call history (e.g. one replica makes one RNG call while another makes two RNG calls) before saving. Otherwise, their internal RNG states will diverge and `tf.train.Checkpoint`

(which only saves the first replica's state) won't properly restore all the replicas.

You can also restore a saved checkpoint to a different distribution strategy with a different number of replicas. Because a `tf.random.Generator`

object created in a strategy can only be used in the same strategy, to restore to a different strategy, you have to create a new `tf.random.Generator`

in the target strategy and a new `tf.train.Checkpoint`

for it, as shown in this example:

```
filename = "./checkpoint"
strat1 = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat1.scope():
g1 = tf.random.Generator.from_seed(1)
cp1 = tf.train.Checkpoint(my_generator=g1)
print(strat1.run(lambda: g1.normal([])))
```

WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce. INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') PerReplica:{ 0: tf.Tensor(-0.87930447, shape=(), dtype=float32), 1: tf.Tensor(0.020661574, shape=(), dtype=float32) }

```
with strat1.scope():
cp1.write(filename)
print("RNG stream from saving point:")
print(strat1.run(lambda: g1.normal([])))
print(strat1.run(lambda: g1.normal([])))
```

RNG stream from saving point: PerReplica:{ 0: tf.Tensor(-1.5822568, shape=(), dtype=float32), 1: tf.Tensor(0.77539235, shape=(), dtype=float32) } PerReplica:{ 0: tf.Tensor(-0.5039703, shape=(), dtype=float32), 1: tf.Tensor(0.1251838, shape=(), dtype=float32) }

```
strat2 = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1", "cpu:2"])
with strat2.scope():
g2 = tf.random.Generator.from_seed(1)
cp2 = tf.train.Checkpoint(my_generator=g2)
cp2.restore(filename)
print("RNG stream from restoring point:")
print(strat2.run(lambda: g2.normal([])))
print(strat2.run(lambda: g2.normal([])))
```

WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce. INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1', '/job:localhost/replica:0/task:0/device:CPU:2') RNG stream from restoring point: PerReplica:{ 0: tf.Tensor(-1.5822568, shape=(), dtype=float32), 1: tf.Tensor(0.77539235, shape=(), dtype=float32), 2: tf.Tensor(0.6851049, shape=(), dtype=float32) } PerReplica:{ 0: tf.Tensor(-0.5039703, shape=(), dtype=float32), 1: tf.Tensor(0.1251838, shape=(), dtype=float32), 2: tf.Tensor(-0.58519536, shape=(), dtype=float32) }

Although `g1`

and `cp1`

are different objects from `g2`

and `cp2`

, they are linked via the common checkpoint file `filename`

and object name `my_generator`

. Overlapping replicas between strategies (e.g. `cpu:0`

and `cpu:1`

above) will have their RNG streams properly restored like in previous examples. This guarantee doesn't cover the case when a generator is saved in a strategy scope and restored outside of any strategy scope or vice versa, because a device outside strategies is treated as different from any replica in a strategy.

#### SavedModel

`tf.random.Generator`

can be saved to a SavedModel. The generator can be created within a strategy scope. The saving can also happen within a strategy scope.

```
filename = "./saved_model"
class MyModule(tf.Module):
def __init__(self):
super(MyModule, self).__init__()
self.g = tf.random.Generator.from_seed(0)
@tf.function
def __call__(self):
return self.g.normal([])
@tf.function
def state(self):
return self.g.state
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
m = MyModule()
print(strat.run(m))
print("state:", m.state())
```

WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce. INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0', '/job:localhost/replica:0/task:0/device:CPU:1') PerReplica:{ 0: tf.Tensor(-1.4154755, shape=(), dtype=float32), 1: tf.Tensor(-0.113884404, shape=(), dtype=float32) } state: tf.Tensor([256 0 0], shape=(3,), dtype=int64)

```
with strat.scope():
tf.saved_model.save(m, filename)
print("RNG stream from saving point:")
print(strat.run(m))
print("state:", m.state())
print(strat.run(m))
print("state:", m.state())
```

INFO:tensorflow:Assets written to: ./saved_model/assets RNG stream from saving point: PerReplica:{ 0: tf.Tensor(-0.68758255, shape=(), dtype=float32), 1: tf.Tensor(0.8084062, shape=(), dtype=float32) } state: tf.Tensor([512 0 0], shape=(3,), dtype=int64) PerReplica:{ 0: tf.Tensor(-0.27342677, shape=(), dtype=float32), 1: tf.Tensor(-0.53093255, shape=(), dtype=float32) } state: tf.Tensor([768 0 0], shape=(3,), dtype=int64) 2021-09-22 20:45:46.222281: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.

```
imported = tf.saved_model.load(filename)
print("RNG stream from loading point:")
print("state:", imported.state())
print(imported())
print("state:", imported.state())
print(imported())
print("state:", imported.state())
```

RNG stream from loading point: state: tf.Tensor([256 0 0], shape=(3,), dtype=int64) tf.Tensor(-1.0359411, shape=(), dtype=float32) state: tf.Tensor([512 0 0], shape=(3,), dtype=int64) tf.Tensor(-0.06425078, shape=(), dtype=float32) state: tf.Tensor([768 0 0], shape=(3,), dtype=int64)

Loading a SavedModel containing `tf.random.Generator`

into a distribution strategy is not recommended because the replicas will all generate the same random-number stream (which is because replica ID is frozen in SavedModel's graph).

Loading a distributed `tf.random.Generator`

(a generator created within a distribution strategy) into a non-strategy environment, like the above example, also has a caveat. The RNG state will be properly restored, but the random numbers generated will be different from the original generator in its strategy (again because a device outside strategies is treated as different from any replica in a strategy).

## Stateless RNGs

Usage of stateless RNGs is simple. Since they are just pure functions, there is no state or side effect involved.

```
print(tf.random.stateless_normal(shape=[2, 3], seed=[1, 2]))
print(tf.random.stateless_normal(shape=[2, 3], seed=[1, 2]))
```

tf.Tensor( [[ 0.5441101 0.20738031 0.07356433] [ 0.04643455 -1.30159 -0.95385665]], shape=(2, 3), dtype=float32) tf.Tensor( [[ 0.5441101 0.20738031 0.07356433] [ 0.04643455 -1.30159 -0.95385665]], shape=(2, 3), dtype=float32)

Every stateless RNG requires a `seed`

argument, which needs to be an integer Tensor of shape `[2]`

. The results of the op are fully determined by this seed.

The RNG algorithm used by stateless RNGs is device-dependent, meaning the same op running on a different device may produce different outputs.

## Algorithms

### General

Both the `tf.random.Generator`

class and the `stateless`

functions support the Philox algorithm (written as `"philox"`

or `tf.random.Algorithm.PHILOX`

) on all devices.

Different devices will generate the same integer numbers, if using the same algorithm and starting from the same state. They will also generate "almost the same" float-point numbers, though there may be small numerical discrepancies caused by the different ways the devices carry out the float-point computation (e.g. reduction order).

### XLA devices

On XLA-driven devices (such as TPU, and also CPU/GPU when XLA is enabled) the ThreeFry algorithm (written as `"threefry"`

or `tf.random.Algorithm.THREEFRY`

) is also supported. This algorithm is fast on TPU but slow on CPU/GPU compared to Philox.

See paper 'Parallel Random Numbers: As Easy as 1, 2, 3' for more details about these algorithms.