
# Random number generation

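TensorFlow provides a set of pseudo-random number generators (RNGs) in the `tf.random` module. This document describes how to control the random number generators and how these generators interact with other TensorFlow subsystems.

TensorFlow provides two approaches for controlling the random number generation process:

1. Through the explicit use of `tf.random.Generator` objects. Each such object maintains a state in a `tf.Variable`, and that state changes every time random numbers are generated.

2. Through purely functional, stateless random functions such as `tf.random.stateless_uniform`. Calling these functions with the same arguments (including the seed) on the same device produces the same results.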

## Setup

```python
import tensorflow as tf

# Creates 2 virtual devices cpu:0 and cpu:1 for using distribution strategy
physical_devices = tf.config.experimental.list_physical_devices("CPU")
tf.config.experimental.set_virtual_device_configuration(
    physical_devices[0], [
        tf.config.experimental.VirtualDeviceConfiguration(),
        tf.config.experimental.VirtualDeviceConfiguration()
    ])
```

## The `tf.random.Generator` class

```python
g1 = tf.random.Generator.from_seed(1)
print(g1.normal(shape=[2, 3]))
g2 = tf.random.get_global_generator()
print(g2.normal(shape=[2, 3]))
```
```
tf.Tensor(
[[ 0.43842274 -0.53439844 -0.07710262]
[ 1.5658046  -0.1012345  -0.2744976 ]], shape=(2, 3), dtype=float32)
tf.Tensor(
[[-1.7979205  -0.9676012   0.01789179]
[ 0.14531721 -0.69562066  1.3617809 ]], shape=(2, 3), dtype=float32)
```

```python
g1 = tf.random.Generator.from_seed(1, alg='philox')
print(g1.normal(shape=[2, 3]))
```
```
tf.Tensor(
[[ 0.43842274 -0.53439844 -0.07710262]
[ 1.5658046  -0.1012345  -0.2744976 ]], shape=(2, 3), dtype=float32)
```

```python
g = tf.random.Generator.from_non_deterministic_state()
print(g.normal(shape=[2, 3]))
```
```
tf.Tensor(
[[-0.8661606   1.5540929   0.20434628]
[-0.58820033 -0.774897    0.5584449 ]], shape=(2, 3), dtype=float32)
```

```python
g = tf.random.Generator.from_seed(1)
print(g.normal([]))
print(g.normal([]))
g.reset_from_seed(1)
print(g.normal([]))
```
```
tf.Tensor(0.43842274, shape=(), dtype=float32)
tf.Tensor(1.6272374, shape=(), dtype=float32)
tf.Tensor(0.43842274, shape=(), dtype=float32)
```
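As a complement to `reset_from_seed`, the sketch below restores a generator from a previously captured state rather than from a seed. It assumes that the `state` property and `tf.random.Generator.from_state` are available in your TensorFlow version. The names `snapshot` and `restored` are illustrative only.

```python
import tensorflow as tf

# Assumption: Philox is requested explicitly so both generators use the same algorithm.
g = tf.random.Generator.from_seed(1, alg="philox")
g.normal([])  # advance the state once

# Copy the current state out of the generator's underlying variable.
snapshot = tf.identity(g.state)
expected = g.normal([2])

# Build a second generator that starts from the captured state.
restored = tf.random.Generator.from_state(snapshot, alg="philox")
replayed = restored.normal([2])

states_match = bool(tf.reduce_all(tf.equal(expected, replayed)))
print(states_match)
```

Unlike `reset_from_seed`, this resumes the stream from a point in the middle rather than from its beginning.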

### Creating independent random-number streams

```python
g = tf.random.Generator.from_seed(1)
print(g.normal([]))
new_gs = g.split(3)
for new_g in new_gs:
  print(new_g.normal([]))
print(g.normal([]))
```
```
tf.Tensor(0.43842274, shape=(), dtype=float32)
tf.Tensor(2.536413, shape=(), dtype=float32)
tf.Tensor(0.33186463, shape=(), dtype=float32)
tf.Tensor(-0.07144657, shape=(), dtype=float32)
tf.Tensor(-0.79253083, shape=(), dtype=float32)
```

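Like RNG methods such as `normal`, `split` changes the state of the generator on which it is called (`g` in the example above). In addition to being independent of each other, the new generators (`new_gs`) are also guaranteed to be independent of the old one (`g`).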
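The state change caused by `split` can be observed directly. The following is a minimal sketch, assuming the generator exposes its state through the `state` property. The variable names are illustrative only.

```python
import tensorflow as tf

g = tf.random.Generator.from_seed(1)

# Record the parent's state before and after splitting.
state_before = g.state.numpy().copy()
children = g.split(3)
state_after = g.state.numpy()

# Draw one sample from each child stream.
samples = [float(child.normal([])) for child in children]

parent_state_changed = bool((state_before != state_after).any())
print(parent_state_changed, samples)
```

Each child can then be handed to a separate worker or device without sharing state with the parent.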

```python
with tf.device("cpu"):  # change "cpu" to the device you want
  g = tf.random.get_global_generator().split(1)[0]
  print(g.normal([]))  # use of g won't cause cross-device copy, unlike the global generator
```
```
tf.Tensor(1.720571, shape=(), dtype=float32)
```

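### Interaction with `tf.function`

When used with `tf.function`, `tf.random.Generator` follows the same rules as `tf.Variable`. This includes three aspects:

#### Creating generators outside `tf.function`

A `tf.function` can use a generator that was created outside of it.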

```python
g = tf.random.Generator.from_seed(1)
@tf.function
def foo():
  return g.normal([])
print(foo())
```
```
tf.Tensor(0.43842274, shape=(), dtype=float32)
```
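Because the generator's state lives in a `tf.Variable`, it should persist across calls to the traced function. The sketch below illustrates this. The function name `sample` is hypothetical and is not part of the original example.

```python
import tensorflow as tf

g = tf.random.Generator.from_seed(1)

@tf.function
def sample():
  # Reads and advances the state variable owned by the outer generator.
  return g.normal([])

first = float(sample())
second = float(sample())

# Resetting the generator outside the function rewinds the stream
# that the traced function sees.
g.reset_from_seed(1)
again = float(sample())

print(first, second, again)
```

Here the second call yields a new value, and the call after the reset repeats the first one.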

#### Creating generators inside `tf.function`

```python
g = None
@tf.function
def foo():
  global g
  if g is None:
    g = tf.random.Generator.from_seed(1)
  return g.normal([])
print(foo())
print(foo())
```
```
tf.Tensor(0.43842274, shape=(), dtype=float32)
tf.Tensor(1.6272374, shape=(), dtype=float32)
```

#### Passing generators as arguments to `tf.function`

```python
num_traces = 0
@tf.function
def foo(g):
  global num_traces
  num_traces += 1
  return g.normal([])
foo(tf.random.Generator.from_seed(1))
foo(tf.random.Generator.from_seed(2))
print(num_traces)
```
```
1
```

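### Interaction with distribution strategies

There are three ways in which `Generator` interacts with distribution strategies.

#### Creating generators outside distribution strategies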

```python
g = tf.random.Generator.from_seed(1)
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
  def f():
    print(g.normal([]))
  results = strat.run(f)
```
```
WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce.
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
tf.Tensor(0.43842274, shape=(), dtype=float32)
tf.Tensor(1.6272374, shape=(), dtype=float32)
```

#### Creating generators inside distribution strategies

```python
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
with strat.scope():
  try:
    tf.random.Generator.from_seed(1)
  except ValueError as e:
    print("ValueError:", e)
```
```
WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce.
ValueError: Creating a generator within a strategy scope is disallowed, because there is ambiguity on how to replicate a generator (e.g. should it be copied so that each replica gets the same random numbers, or 'split' so that each replica gets different random numbers).
```

```python
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
def f():
  tf.random.Generator.from_seed(1)
try:
  strat.run(f)
except ValueError as e:
  print("ValueError:", e)
```
```
WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce.
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
INFO:tensorflow:Error reported to Coordinator: Creating a generator within a strategy scope is disallowed, because there is ambiguity on how to replicate a generator (e.g. should it be copied so that each replica gets the same random numbers, or 'split' so that each replica gets different random numbers).
Traceback (most recent call last):
File "/tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/coordinator.py", line 297, in stop_on_exception
yield
File "/tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_run.py", line 323, in run
self.main_result = self.main_fn(*self.main_args, **self.main_kwargs)
File "/tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 275, in wrapper
return func(*args, **kwargs)
File "<ipython-input-1-2cd7806456bd>", line 3, in f
tf.random.Generator.from_seed(1)
File "/tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/ops/stateful_random_ops.py", line 441, in from_seed
return cls(state=state, alg=alg)
File "/tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/ops/stateful_random_ops.py", line 363, in __init__
trainable=False)
File "/tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/ops/stateful_random_ops.py", line 378, in _create_variable
"Creating a generator within a strategy scope is disallowed, because "
ValueError: Creating a generator within a strategy scope is disallowed, because there is ambiguity on how to replicate a generator (e.g. should it be copied so that each replica gets the same random numbers, or 'split' so that each replica gets different random numbers).
ValueError: Creating a generator within a strategy scope is disallowed, because there is ambiguity on how to replicate a generator (e.g. should it be copied so that each replica gets the same random numbers, or 'split' so that each replica gets different random numbers).
```

#### Passing generators as arguments to `Strategy.run`

```python
strat = tf.distribute.MirroredStrategy(devices=["cpu:0", "cpu:1"])
gs = tf.random.get_global_generator().split(2)
# to_args is a workaround for the absence of APIs to create arguments for
# run. It will be replaced when such APIs are available.
def to_args(gs):
  with strat.scope():
    def f():
      return [gs[tf.distribute.get_replica_context().replica_id_in_sync_group]]
    return strat.run(f)
args = to_args(gs)
def f(g):
  print(g.normal([]))
results = strat.run(f, args=args)
```
```
WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce.
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
tf.Tensor(0.39796075, shape=(), dtype=float32)
tf.Tensor(-1.3158226, shape=(), dtype=float32)
```

## Stateless RNGs

```python
print(tf.random.stateless_normal(shape=[2, 3], seed=[1, 2]))
print(tf.random.stateless_normal(shape=[2, 3], seed=[1, 2]))
```
```
tf.Tensor(
[[ 0.5441101   0.20738031  0.07356433]
[ 0.04643455 -1.3015898  -0.95385665]], shape=(2, 3), dtype=float32)
tf.Tensor(
[[ 0.5441101   0.20738031  0.07356433]
[ 0.04643455 -1.3015898  -0.95385665]], shape=(2, 3), dtype=float32)
```
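Since the output depends only on the arguments, one way to get a different but reproducible draw at each step is to vary the seed deliberately. The sketch below assumes a seed layout of `[base_seed, step]`. The names `base_seed` and `noise_for_step` are hypothetical.

```python
import tensorflow as tf

base_seed = 7  # hypothetical base seed chosen for this illustration

def noise_for_step(step):
  # The seed must have shape [2]; the second entry encodes the step index.
  return tf.random.stateless_normal(shape=[4], seed=[base_seed, step])

step0 = noise_for_step(0)
step0_again = noise_for_step(0)
step1 = noise_for_step(1)

same_seed_matches = bool(tf.reduce_all(tf.equal(step0, step0_again)))
different_seed_differs = not bool(tf.reduce_all(tf.equal(step0, step1)))
print(same_seed_matches, different_seed_differs)
```

No state is stored between calls, so reproducing step `k` only requires knowing `base_seed` and `k`.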

## Algorithms

### General

Both the `tf.random.Generator` class and the `stateless` functions support the Philox algorithm (written as `"philox"` or `tf.random.Algorithm.PHILOX`) on all devices.
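The two spellings of the algorithm name are expected to be interchangeable. The following is a minimal sketch, assuming a TensorFlow version that exposes the enum as `tf.random.Algorithm` (some older releases may place it elsewhere).

```python
import tensorflow as tf

# Same seed, same algorithm, specified once as a string and once as an enum.
g_str = tf.random.Generator.from_seed(1, alg="philox")
g_enum = tf.random.Generator.from_seed(1, alg=tf.random.Algorithm.PHILOX)

from_str = g_str.normal(shape=[2, 3])
from_enum = g_enum.normal(shape=[2, 3])

spellings_agree = bool(tf.reduce_all(tf.equal(from_str, from_enum)))
print(spellings_agree)
```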
