Use a GPU


TensorFlow code and tf.keras models run transparently on a single GPU with no code changes required.

Note: Use tf.config.list_physical_devices('GPU') to confirm that TensorFlow is using the GPU.
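
As a minimal sketch of the check mentioned in the note, the snippet below counts the GPUs that TensorFlow can see. The variable name `gpus` is only illustrative; on a machine without a GPU the list is simply empty.

```python
import tensorflow as tf

# List the physical GPUs that TensorFlow has detected on this machine.
gpus = tf.config.list_physical_devices('GPU')

# An empty list means TensorFlow will run everything on the CPU.
print("Num GPUs Available:", len(gpus))
```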

The simplest way to run on multiple GPUs, on one or many machines, is to use Distribution Strategies.

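This guide is for users who have tried these approaches and found that they need fine-grained control over how TensorFlow uses the GPU. To learn how to debug performance issues in single- and multi-GPU scenarios, see the Optimize TensorFlow GPU Performance guide.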

Setup

Make sure the latest version of TensorFlow is installed.

import tensorflow as tf
2022-12-14 21:22:27.507083: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2022-12-14 21:22:27.507177: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2022-12-14 21:22:27.507186: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

Overview

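TensorFlow identifies devices by string names, for example:

  • "/device:CPU:0": The CPU of your machine.
  • "/GPU:0": Short-hand notation for the first GPU of your machine that is visible to TensorFlow.
  • "/job:localhost/replica:0/task:0/device:GPU:1": Fully qualified name of the second GPU of your machine that is visible to TensorFlow.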
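
To make the parts of a fully qualified name concrete, here is a small sketch that parses one with tf.DeviceSpec.from_string and then lists the logical devices that actually exist on the current machine. The name being parsed is the one from the list above; it does not need to exist locally for parsing to work.

```python
import tensorflow as tf

# Parse a fully qualified device name into its components.
full_name = "/job:localhost/replica:0/task:0/device:GPU:1"
spec = tf.DeviceSpec.from_string(full_name)
print(spec.job, spec.replica, spec.task, spec.device_type, spec.device_index)

# Show the devices TensorFlow can actually use on this machine.
for device in tf.config.list_logical_devices():
  print(device.name, device.device_type)
```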

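If a TensorFlow operation has both CPU and GPU implementations, the GPU device is prioritized by default when the operation is assigned. For example, tf.matmul has both CPU and GPU kernels, and on a system with devices CPU:0 and GPU:0, the GPU:0 device is selected to run tf.matmul unless you explicitly request to run it on another device.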

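If a TensorFlow operation has no corresponding GPU implementation, the operation falls back to the CPU device. For example, since tf.cast only has a CPU kernel, on a system with devices CPU:0 and GPU:0, the CPU:0 device is selected to run tf.cast, even if it was requested to run on the GPU:0 device.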

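Logging device placement

To find out which devices your operations and tensors are assigned to, put tf.debugging.set_log_device_placement(True) as the first statement of your program. Enabling device placement logging causes any tensor allocations or operations to be printed.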

tf.debugging.set_log_device_placement(True)

# Create some tensors
a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)

print(c)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)

The code above prints an indication that the MatMul op was executed on GPU:0.
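
Besides reading the log, you can inspect where a single eager tensor lives through its device attribute. The sketch below repeats the computation above and prints the device of the result; on a machine without a GPU the name ends in CPU:0 instead of GPU:0.

```python
import tensorflow as tf

a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)

# Fully qualified name of the device that holds the result tensor.
print(c.device)
```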

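Manual device placement

If you want a particular operation to run on a device of your choice instead of the one selected automatically, use with tf.device to create a device context. All operations within that context run on the designated device.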

tf.debugging.set_log_device_placement(True)

# Place tensors on the CPU
with tf.device('/CPU:0'):
  a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

c = tf.matmul(a, b)
print(c)
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)

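Here a and b are assigned to CPU:0. Since no device was explicitly specified for the MatMul operation, the TensorFlow runtime chooses one based on the operation and the available devices (GPU:0 in this example) and automatically copies tensors between devices if required.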
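
If the operation itself should also stay on the CPU, it has to be issued inside the device context. The following is a minimal sketch of that variant; it runs the same matrix product but keeps MatMul within the with tf.device block.

```python
import tensorflow as tf

with tf.device('/CPU:0'):
  a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
  # MatMul is now inside the context, so it is pinned to the CPU as well.
  c = tf.matmul(a, b)

print(c.device)
print(c)
```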

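Limiting GPU memory growth

By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to CUDA_VISIBLE_DEVICES) visible to the process. This is done to use the relatively precious GPU memory resources on the devices more efficiently by reducing memory fragmentation. To limit TensorFlow to a specific set of GPUs, use the tf.config.set_visible_devices method.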

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only use the first GPU
  try:
    tf.config.set_visible_devices(gpus[0], 'GPU')
  except RuntimeError as e:
    # Visible devices must be set before GPUs have been initialized
    print(e)
Visible devices cannot be modified after being initialized
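
To confirm that the restriction took effect, you can compare the physical, visible, and logical GPU lists. This is a sketch under the assumption that it runs at program start; if the runtime is already initialized, the RuntimeError shown above is printed and the lists are left unchanged.

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  try:
    tf.config.set_visible_devices(gpus[0], 'GPU')
  except RuntimeError as e:
    # Visible devices must be set before GPUs have been initialized
    print(e)

# Devices TensorFlow is allowed to use, and the logical devices built on them.
visible_gpus = tf.config.get_visible_devices('GPU')
logical_gpus = tf.config.list_logical_devices('GPU')
print(len(gpus), "Physical GPUs,", len(visible_gpus), "Visible GPUs,",
      len(logical_gpus), "Logical GPUs")
```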

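In some cases it is desirable for the process to allocate only a subset of the available memory, or to grow its memory usage only as the process needs it. TensorFlow provides two methods to control this.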

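The first option is to turn on memory growth by calling tf.config.experimental.set_memory_growth, which attempts to allocate only as much GPU memory as is needed for runtime allocations. It starts out allocating very little memory, and as the program runs and more GPU memory is needed, the GPU memory region is extended for the TensorFlow process. Memory is not released, since releasing it can lead to memory fragmentation. To turn on memory growth for a specific GPU, use the following code prior to allocating any tensors or executing any ops.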

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  try:
    tf.config.experimental.set_memory_growth(gpus[0], True)
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)
Physical devices cannot be modified after being initialized
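
On a multi-GPU machine you will usually want the same setting on every GPU. The sketch below loops over all physical GPUs and then reads the setting back with tf.config.experimental.get_memory_growth; the dictionary `growth` is just an illustrative way to collect the results.

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
  try:
    # Apply the same memory growth setting to every GPU.
    tf.config.experimental.set_memory_growth(gpu, True)
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)

# Read the setting back for each GPU (empty on a CPU-only machine).
growth = {gpu.name: tf.config.experimental.get_memory_growth(gpu)
          for gpu in gpus}
print(growth)
```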

Another way to enable this option is to set the environment variable TF_FORCE_GPU_ALLOW_GROWTH to true. This configuration is platform specific.
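
The variable is normally exported in the shell, but it can also be set from Python. The sketch below assumes it runs at the very top of the program, before TensorFlow initializes any GPU; setting it later is not expected to have an effect.

```python
import os

# Set the variable before TensorFlow touches the GPUs.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

import tensorflow as tf

print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
print(tf.config.list_physical_devices('GPU'))
```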

The second method is to configure a virtual GPU device with tf.config.set_logical_device_configuration and set a hard limit on the total memory to allocate on the GPU.

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
  try:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
Virtual devices cannot be modified after being initialized

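This is useful if you want to truly bound the amount of GPU memory available to the TensorFlow process. It is common practice for local development when the GPU is shared with other applications such as a workstation GUI.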
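
The memory_limit argument is given in megabytes, so 1024 corresponds to 1GB. As a sketch, the hypothetical helper gb_to_mb below (not part of TensorFlow) makes that conversion explicit, and tf.config.get_logical_device_configuration reads the configuration back.

```python
import tensorflow as tf

def gb_to_mb(gigabytes):
  """Hypothetical helper: convert gigabytes to the megabytes memory_limit expects."""
  return int(gigabytes * 1024)

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  try:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=gb_to_mb(2))])
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
  # Show the logical device configuration currently attached to the GPU.
  print(tf.config.get_logical_device_configuration(gpus[0]))
```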

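Using a single GPU on a multi-GPU system

If you have more than one GPU in your system, the GPU with the lowest ID is selected by default. If you would like to run on a different GPU, you need to specify the preference explicitly: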

tf.debugging.set_log_device_placement(True)

try:
  # Specify an invalid GPU device
  with tf.device('/device:GPU:2'):
    a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    c = tf.matmul(a, b)
except RuntimeError as e:
  print(e)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:2

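If the device you have specified does not exist, a RuntimeError is raised. The machine that produced the log above does have a GPU:2, so the ops ran there and no error was printed.

If you would like TensorFlow to automatically choose an existing and supported device to run the operations in case the specified one does not exist, call tf.config.set_soft_device_placement(True).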

tf.config.set_soft_device_placement(True)
tf.debugging.set_log_device_placement(True)

# Create some tensors
a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)

print(c)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)
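
To see the fallback in action, the sketch below requests a GPU index that is assumed not to exist on the machine (99 is an arbitrary, deliberately invalid choice) and relies on soft placement to pick a supported device instead. The printed device name depends on the hardware that is actually available.

```python
import tensorflow as tf

tf.config.set_soft_device_placement(True)

a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# GPU:99 is assumed to be missing; soft placement falls back to a
# device that exists and supports MatMul.
with tf.device('/device:GPU:99'):
  c = tf.matmul(a, b)

print(c.device)
print(c)
```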

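Using multiple GPUs

Developing for multiple GPUs allows a model to scale with the additional resources. If you are developing on a system with a single GPU, you can simulate multiple GPUs with virtual devices. This enables easy testing of multi-GPU setups without requiring additional resources.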

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Create 2 virtual GPUs with 1GB memory each
  try:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=1024),
         tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPU,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
Virtual devices cannot be modified after being initialized

Once multiple logical GPUs are available to the runtime, you can utilize them with tf.distribute.Strategy or with manual placement.

With tf.distribute.Strategy

The best practice for using multiple GPUs is to use tf.distribute.Strategy. Here is a simple example:

tf.debugging.set_log_device_placement(True)
gpus = tf.config.list_logical_devices('GPU')
strategy = tf.distribute.MirroredStrategy(gpus)
with strategy.scope():
  inputs = tf.keras.layers.Input(shape=(1,))
  predictions = tf.keras.layers.Dense(1)(inputs)
  model = tf.keras.models.Model(inputs=inputs, outputs=predictions)
  model.compile(loss='mse',
                optimizer=tf.keras.optimizers.SGD(learning_rate=0.2))
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0', '/job:localhost/replica:0/task:0/device:GPU:1', '/job:localhost/replica:0/task:0/device:GPU:2', '/job:localhost/replica:0/task:0/device:GPU:3')
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op StatelessRandomGetKeyCounter in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op StatelessRandomUniformV2 in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Sub in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Mul in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AddV2 in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
[... the same ReadVariableOp / Identity / VarHandleOp / AssignVariableOp pattern repeats for each remaining variable ...]
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
value_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:1
value_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:1
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:2
value_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:2
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:3
value_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:3
inputs_0: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0
inputs_1: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0
inputs_2: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0
inputs_3: (_Arg): /job:localhost/replica:0/task:0/device:CPU:0
AddN: (AddN): /job:localhost/replica:0/task:0/device:CPU:0
Executing op AddN in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:CPU:0
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AddN in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AddN in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:CPU:0
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AddN in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0

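This program runs a copy of the model on each GPU and splits the input data among them, an approach also known as "data parallelism".

For more information about distribution strategies, see the distribution strategy guide.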
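
The idea behind data parallelism can be sketched without any framework. The example below is only an illustration under simplifying assumptions: a one-parameter model `y = w * x` with a squared-error loss, and hypothetical helper names (`grad_sum`, `split`) that are not part of TensorFlow. Each simulated replica computes a partial gradient on its own shard of the batch, and the partial results are then summed and divided by the global batch size, much like the `AddN` reduction on `CPU:0` that appears in the log above. It does not reproduce how `tf.distribute.MirroredStrategy` is actually implemented.

```python
# Framework-free sketch of data parallelism (no TensorFlow or GPU required).
# Model: y = w * x with a squared-error loss. All names here are illustrative.

def grad_sum(w, xs, ys):
    """Sum over the given examples of d/dw (w * x - y) ** 2."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))

def split(items, num_replicas):
    """Cut items into at most num_replicas contiguous shards of near-equal size."""
    size = -(-len(items) // num_replicas)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
w = 0.5
num_replicas = 4  # stands in for four GPUs

# Every simulated replica uses the same weight w but sees only its own shard.
per_replica = [
    grad_sum(w, shard_x, shard_y)
    for shard_x, shard_y in zip(split(xs, num_replicas), split(ys, num_replicas))
]

# Cross-replica reduction: add the partial sums, then divide by the global batch size.
data_parallel_grad = sum(per_replica) / len(xs)

# Reference value computed on the whole batch in one place.
full_batch_grad = grad_sum(w, xs, ys) / len(xs)

print(per_replica)
print(data_parallel_grad, full_batch_grad)
```

Both printed gradients agree, which is the property that lets each replica work on a different slice of the batch while every copy of the model receives the same update.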

Without tf.distribute.Strategy

tf.distribute.Strategy works by replicating computation across devices. You can implement this replication manually by constructing your model on each GPU. For example:

tf.debugging.set_log_device_placement(True)

gpus = tf.config.list_logical_devices('GPU')
if gpus:
  # Replicate your computation on multiple GPUs
  c = []
  for gpu in gpus:
    with tf.device(gpu.name):
      a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
      b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
      c.append(tf.matmul(a, b))

  with tf.device('/CPU:0'):
    matmul_sum = tf.add_n(c)

  print(matmul_sum)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AddN in device /job:localhost/replica:0/task:0/device:CPU:0
tf.Tensor(
[[ 88. 112.]
 [196. 256.]], shape=(2, 2), dtype=float32)
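
The printed values can be checked by hand. The sketch below repeats the same arithmetic in plain Python, assuming four GPUs as shown in the log above; the helpers `matmul` and `add_n` are hypothetical stand-ins written for this check, not TensorFlow functions. Each device computes the same 2x2 product, and the final sum on the CPU adds the four identical copies.

```python
# Pure-Python check of the printed result (no TensorFlow or GPU required).
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
num_gpus = 4  # assumption: the log above shows GPU:0 through GPU:3

def matmul(x, y):
    """Naive matrix product of two matrices stored as lists of rows."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]

def add_n(mats):
    """Element-wise sum of equally shaped matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]

# The result that each device computes independently.
product = matmul(a, b)

# The reduction step: add one copy of the product per device.
matmul_sum = add_n([product] * num_gpus)

print(product)     # [[22.0, 28.0], [49.0, 64.0]]
print(matmul_sum)  # [[88.0, 112.0], [196.0, 256.0]]
```

On a machine with a different number of GPUs the final tensor scales accordingly, since the sum contains one copy of the product per device.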