Enable dumping debugging information from a TensorFlow program.
tf.debugging.experimental.enable_dump_debug_info(
    dump_root,
    tensor_debug_mode=DEFAULT_TENSOR_DEBUG_MODE,
    circular_buffer_size=1000,
    op_regex=None,
    tensor_dtypes=None
)
The debugging information is dumped to a directory on the file system
specified as dump_root.
The dumped debugging information can be ingested by debugger UIs.
The files in the dump directory contain the following information:
- TensorFlow Function construction (e.g., compilation of Python functions decorated with @tf.function), the op types, names (if available), context, the input and output tensors, and the associated stack traces.
- Execution of TensorFlow operations (ops) and Functions and their stack traces, op types, names (if available) and contexts. In addition, depending on the value of the tensor_debug_mode argument (see Args section below), the value(s) of the output tensors or more concise summaries of the tensor values will be dumped.
- A snapshot of Python source files involved in the execution of the TensorFlow program.
Once enabled, the dumping can be disabled with the corresponding
disable_dump_debug_info() method under the same Python namespace.
Calling this method more than once with the same dump_root is idempotent.
Calling this method more than once with different tensor_debug_mode values
leads to a ValueError.
Calling this method more than once with different circular_buffer_size values
leads to a ValueError.
Calling this method with a different dump_root abolishes the
previously enabled dump_root.
Usage example:
tf.debugging.experimental.enable_dump_debug_info('/tmp/my-tfdbg-dumps')
# Code to build, train and run your TensorFlow model...
On TPUs, the same call is preceded by tf.config.set_soft_device_placement(True):
tf.config.set_soft_device_placement(True)
tf.debugging.experimental.enable_dump_debug_info(
    logdir, tensor_debug_mode="FULL_HEALTH")
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='')
strategy = tf.distribute.TPUStrategy(resolver)
with strategy.scope():
  # ...
Args | |
---|---|
dump_root
|
The directory path where the dumping information will be written. |
tensor_debug_mode
|
Debug mode for tensor values, as a string.
The currently supported options are:
|
circular_buffer_size
|
Size of the circular buffers for execution events.
These circular buffers are designed to reduce the overhead of debug
dumping. They hold the most recent debug events concerning eager execution
of ops and tf.functions and traces of tensor values computed inside
tf.functions. They are written to the file system only when the proper
flushing method is called (see description of return values below).
Expected to be an integer. If <= 0, the circular-buffer behavior will be
disabled, i.e., the execution debug events will be written to the file
writers in the same way as non-execution events such as op creations and
source-file snapshots.
|
op_regex
|
Dump data only from the tensors of ops whose op type matches the
regular expression (through Python's re.match()).
"Op type" refers to the names of the TensorFlow operations (e.g.,
"MatMul", "LogSoftmax"), which may repeat in a TensorFlow
function. It does not refer to the names of nodes (e.g.,
"dense/MatMul", "dense_1/MatMul_1"), which are unique within a function.
Examples:
- op_regex="^(MatMul|Relu)$" dumps tensor data from only MatMul and Relu ops.
- op_regex="(?!^Relu$)" dumps tensor data from all ops except Relu.
This filter operates in a logical AND relation with tensor_dtypes.
|
tensor_dtypes
|
Dump data only from the tensors of the specified dtypes. This optional
argument can be in either of the following formats:
- A list or tuple of DType objects or strings that can be converted
to DType objects via tf.as_dtype().
- A callable that takes a single DType argument and returns a Python
boolean indicating whether the dtype is to be included in the data
dumping.
|
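Because op_regex is applied through Python's re.match(), matching is anchored at the start of the op type but not at the end unless the pattern says so. The sketch below illustrates this with the standard library only. The helper op_type_matches and the list of op types are hypothetical, chosen for illustration; this is not TensorFlow's internal filtering code.

```python
import re


def op_type_matches(op_regex, op_type):
    """Return True if op_type passes a filter applied with re.match().

    Hypothetical helper: it mimics the matching rule described for
    op_regex, not TensorFlow's actual implementation.
    """
    return re.match(op_regex, op_type) is not None


# Illustrative op types (an assumption made for this example).
op_types = ["MatMul", "Relu", "LogSoftmax", "BatchMatMul", "Relu6"]

# Fully anchored pattern: only the exact op types MatMul and Relu pass.
only_matmul_and_relu = [
    t for t in op_types if op_type_matches("^(MatMul|Relu)$", t)
]

# Negative lookahead: every op type passes except the exact name "Relu".
all_except_relu = [t for t in op_types if op_type_matches("(?!^Relu$)", t)]

# Unanchored pattern: re.match() still starts at the beginning of the
# string, so "BatchMatMul" does not pass a bare "MatMul" pattern.
starts_with_matmul = [t for t in op_types if op_type_matches("MatMul", t)]

print(only_matmul_and_relu)
print(all_except_relu)
print(starts_with_matmul)
```

Anchoring the pattern with ^ and $ is the safer choice when only exact op types should be dumped, since an unanchored pattern also accepts any op type that merely begins with the given text.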
Returns | |
---|---|
A DebugEventsWriter instance used by the dumping callback. The caller
may use its flushing methods, including FlushNonExecutionFiles() and
FlushExecutionFiles().
|
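The returned writer can be used to flush buffered events explicitly. The following is a minimal sketch, assuming TensorFlow 2.x is installed. The function run_with_debug_dump and the traced function squared_sum are hypothetical names introduced for illustration, and the dump path is the one from the usage example above.

```python
def run_with_debug_dump(dump_root="/tmp/my-tfdbg-dumps"):
    """Enable dumping, run a small tf.function, flush, then disable dumping.

    Hypothetical helper for illustration; assumes TensorFlow 2.x.
    """
    # Imported lazily so that defining this function does not require
    # TensorFlow to be installed.
    import tensorflow as tf

    writer = tf.debugging.experimental.enable_dump_debug_info(
        dump_root, tensor_debug_mode="FULL_HEALTH", circular_buffer_size=1000)
    try:
        @tf.function
        def squared_sum(x):
            return tf.reduce_sum(tf.square(x))

        result = squared_sum(tf.constant([1.0, 2.0, 3.0]))

        # Execution events sit in the circular buffers until flushed.
        writer.FlushNonExecutionFiles()
        writer.FlushExecutionFiles()
        return float(result)
    finally:
        # Turn dumping off again, even if the model code raised.
        tf.debugging.experimental.disable_dump_debug_info()
```

Wrapping the model code in try/finally is a design choice of this sketch, not a requirement of the API; it simply ensures that dumping does not stay enabled for unrelated code that runs later in the same process.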