Creates a MonitoredSession for training.
tf.compat.v1.train.MonitoredTrainingSession(
    master='',
    is_chief=True,
    checkpoint_dir=None,
    scaffold=None,
    hooks=None,
    chief_only_hooks=None,
    save_checkpoint_secs=USE_DEFAULT,
    save_summaries_steps=USE_DEFAULT,
    save_summaries_secs=USE_DEFAULT,
    config=None,
    stop_grace_period_secs=120,
    log_step_count_steps=100,
    max_wait_secs=7200,
    save_checkpoint_steps=USE_DEFAULT,
    summary_dir=None,
    save_graph_def=True
)
Migrate to TF2
This API is not compatible with eager execution and tf.function. To migrate
to TF2, rewrite the code to be compatible with eager execution. Check the
migration guide on replacing Session.run calls. In Keras, session hooks can
be replaced by Callbacks (for example, see the logging hook notebook).
For more details, please read Better performance with tf.function.
Description
For a chief, this utility sets the proper session initializer/restorer. It also
creates hooks related to checkpoint and summary saving. For workers, this
utility sets a session creator that waits for the chief to
initialize/restore. Please check tf.compat.v1.train.MonitoredSession for
more information.
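
As an illustration of the chief case described above, here is a minimal single-process sketch. The function name `train`, the toy variable `w`, the loss, and the learning rate are hypothetical and chosen only for this example. It assumes a TensorFlow installation that exposes `tf.compat.v1`, and it builds the model inside an explicit `tf.Graph` because this API does not work with eager execution.

```python
import tensorflow as tf

tf1 = tf.compat.v1


def train(num_steps=5, checkpoint_dir=None):
    """Runs a toy training loop; returns how many times train_op was run."""
    steps_run = 0
    # Build ops in an explicit graph, since eager execution is not supported.
    with tf.Graph().as_default():
        global_step = tf1.train.get_or_create_global_step()
        # Hypothetical one-parameter model: minimize (w - 3)^2.
        w = tf1.get_variable("w", shape=[], initializer=tf1.zeros_initializer())
        loss = tf.square(w - 3.0)
        optimizer = tf1.train.GradientDescentOptimizer(learning_rate=0.1)
        train_op = optimizer.minimize(loss, global_step=global_step)

        # StopAtStepHook requests a stop once global_step reaches last_step.
        hooks = [tf1.train.StopAtStepHook(last_step=num_steps)]

        # is_chief defaults to True, so this session initializes variables
        # itself. With checkpoint_dir=None, nothing is restored or saved.
        with tf1.train.MonitoredTrainingSession(
                checkpoint_dir=checkpoint_dir, hooks=hooks) as sess:
            while not sess.should_stop():
                sess.run(train_op)
                steps_run += 1
    return steps_run
```

Calling `train(num_steps=5)` runs the loop until the hook requests a stop. If `checkpoint_dir` points at a directory that already holds a checkpoint, the global step is restored from it, so fewer iterations would run before `last_step` is reached.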
| Args | |
|---|---|
| master | String. The TensorFlow master to use. |
| is_chief | If True, it will take care of initialization and recovery of the underlying TensorFlow session. If False, it will wait on a chief to initialize or recover the TensorFlow session. |
| checkpoint_dir | A string. Optional path to a directory from which to restore variables. |
| scaffold | A Scaffold used for gathering or building supportive ops. If not specified, a default one is created. It's used to finalize the graph. |
| hooks | Optional list of SessionRunHook objects. |
| chief_only_hooks | List of SessionRunHook objects. Activate these hooks if is_chief==True, ignore otherwise. |
| save_checkpoint_secs | The frequency, in seconds, that a checkpoint is saved using a default checkpoint saver. If both save_checkpoint_steps and save_checkpoint_secs are set to None, then the default checkpoint saver isn't used. If both are provided, then only save_checkpoint_secs is used. Default 600. |
| save_summaries_steps | The frequency, in number of global steps, that the summaries are written to disk using a default summary saver. If both save_summaries_steps and save_summaries_secs are set to None, then the default summary saver isn't used. Default 100. |
| save_summaries_secs | The frequency, in seconds, that the summaries are written to disk using a default summary saver. If both save_summaries_steps and save_summaries_secs are set to None, then the default summary saver isn't used. Default not enabled. |
| config | An instance of tf.compat.v1.ConfigProto used to configure the session. It's the config argument of the constructor of tf.compat.v1.Session. |
| stop_grace_period_secs | Number of seconds given to threads to stop after close() has been called. |
| log_step_count_steps | The frequency, in number of global steps, that the global step/sec is logged. |
| max_wait_secs | Maximum time workers should wait for the session to become available. This should be kept relatively short to help detect incorrect code, but sometimes may need to be increased if the chief takes a while to start up. |
| save_checkpoint_steps | The frequency, in number of global steps, that a checkpoint is saved using a default checkpoint saver. If both save_checkpoint_steps and save_checkpoint_secs are set to None, then the default checkpoint saver isn't used. If both are provided, then only save_checkpoint_secs is used. Default not enabled. |
| summary_dir | A string. Optional path to a directory where to save summaries. If None, checkpoint_dir is used instead. |
| save_graph_def | Whether to save the GraphDef and MetaGraphDef to checkpoint_dir. The GraphDef is saved after the session is created as graph.pbtxt. MetaGraphDefs are saved out for every checkpoint as model.ckpt-*.meta. |
| Returns | |
|---|---|
| A MonitoredSession object. |
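
The USE_DEFAULT values in the signature interact with the documented defaults for the checkpoint and summary frequencies. The pure-Python sketch below models one reading of that behavior and is not TensorFlow's own code: `USE_DEFAULT` here is a local stand-in object, and `resolve_save_frequencies` and `default_saver_enabled` are hypothetical helpers. It assumes that when only one of the pair is passed explicitly, the other is treated as not enabled.

```python
# Hypothetical stand-in for the USE_DEFAULT sentinel shown in the signature.
USE_DEFAULT = object()


def resolve_save_frequencies(secs=USE_DEFAULT, steps=USE_DEFAULT,
                             default_secs=None, default_steps=None):
    """Returns the effective (secs, steps) pair for one saver."""
    # Neither value passed: fall back to the documented defaults.
    if secs is USE_DEFAULT and steps is USE_DEFAULT:
        return default_secs, default_steps
    # Assumption: an explicitly passed value disables the other default.
    if secs is USE_DEFAULT:
        secs = None
    if steps is USE_DEFAULT:
        steps = None
    return secs, steps


def default_saver_enabled(secs, steps):
    """The default saver isn't used when both frequencies are None."""
    return secs is not None or steps is not None


# Checkpoints: documented default is every 600 seconds.
checkpoint_defaults = resolve_save_frequencies(default_secs=600)
# Summaries: documented default is every 100 global steps.
summary_defaults = resolve_save_frequencies(default_steps=100)
# Passing only save_checkpoint_steps switches to step-based saving.
checkpoint_by_steps = resolve_save_frequencies(steps=1000, default_secs=600)
# Setting both to None turns the default checkpoint saver off.
checkpoint_off = resolve_save_frequencies(secs=None, steps=None,
                                          default_secs=600)
```

Under these assumptions, leaving both checkpoint arguments untouched yields time-based saving, while `save_checkpoint_secs=None, save_checkpoint_steps=None` is the way to opt out of the default checkpoint saver entirely.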