tf.debugging.experimental.enable_dump_debug_info

Enable dumping debugging information from a TensorFlow program.

The debugging information is dumped to a directory on the file system specified as dump_root.

The dumped debugging information can be ingested by debugger UIs.

The files in the dump directory contain the following information:

  • TensorFlow Function construction (e.g., compilation of Python functions decorated with @tf.function), including the op types, names (if available), context, input and output tensors, and the associated stack traces.
  • Execution of TensorFlow operations (ops) and Functions, along with their stack traces, op types, names (if available) and contexts. In addition, depending on the value of the tensor_debug_mode argument (see Args section below), either the value(s) of the output tensors or more concise summaries of the tensor values will be dumped.
  • A snapshot of Python source files involved in the execution of the TensorFlow program.
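
To check what a run actually wrote, it can help to list the contents of the dump directory. The following is a minimal sketch using only the Python standard library. summarize_dump_dir is a hypothetical helper, not part of TensorFlow, and because the file naming scheme is not described here, it simply groups files by whatever suffix it finds.

```python
from collections import defaultdict
from pathlib import Path


def summarize_dump_dir(dump_root):
    """Group the files directly under dump_root by file suffix.

    Hypothetical helper (not a TensorFlow API). Returns a dict that maps
    each suffix to a list of (file_name, size_in_bytes) tuples. Returns an
    empty dict if dump_root is not an existing directory.
    """
    root = Path(dump_root)
    if not root.is_dir():
        return {}
    summary = defaultdict(list)
    for path in sorted(root.iterdir()):
        if path.is_file():
            # Path.suffix is the text from the last dot onward, e.g. ".txt".
            summary[path.suffix].append((path.name, path.stat().st_size))
    return dict(summary)
```

Calling summarize_dump_dir(dump_root) after the program has run gives a quick view of which files exist and how large they are, which is useful for confirming that dumping was enabled before pointing a debugger UI at the directory.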

Once enabled, dumping can be disabled with the corresponding disable_dump_debug_info() method under the same Python namespace. Calling enable_dump_debug_info() more than once with the same dump_root is idempotent. Calling it more than once with a different tensor_debug_mode or a different circular_buffer_size raises a ValueError. Calling it with a different dump_root replaces the previously enabled dump_root.

Usage example:

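import tensorflow as tf

tf.debugging.experimental.enable_dump_debug_info('/tmp/my-tfdbg-dumps')

# Code to build, train and run your TensorFlow model...

On TPUs, call tf.config.set_soft_device_placement(True) before enabling the dump, for example:

import tensorflow as tf

logdir = '/tmp/my-tfdbg-dumps'  # Placeholder path; substitute your own dump directory.

tf.config.set_soft_device_placement(True)
tf.debugging.experimental.enable_dump_debug_info(
    logdir, tensor_debug_mode="FULL_HEALTH")

resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='')
strategy = tf.distribute.TPUStrategy(resolver)
with strategy.scope():
  # Build and train your model here...
  pass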

Args:

dump_root: The directory path where the dumping information will be written.
tensor_debug_mode: Debug mode for tensor values, as a string. The currently supported options are:

  • "NO_TENSOR": (Default) Only traces the output tensors of all executed ops (including those executed eagerly at the Python level or as a part of a TensorFlow graph) and functions, while not extracting any information from the values of the tensors.
  • "CURT_HEALTH": For each floating-dtype tensor (e.g., tensors of dtypes such as float32, float64 and bfloat16), extracts a binary bit indicating whether it contains any -infinity, +infinity or NaN.
  • "CONCISE_HEALTH": For each floating-d