View source on GitHub

Session-like object that handles initialization, recovery and hooks.

Example usage:

saver_hook = CheckpointSaverHook(...)
summary_hook = SummarySaverHook(...)
with MonitoredSession(session_creator=ChiefSessionCreator(...),
                      hooks=[saver_hook, summary_hook]) as sess:
  while not sess.should_stop():

Initialization: At creation time the monitored session does following things in given order:

  • calls hook.begin() for each given hook
  • finalizes the graph via scaffold.finalize()
  • create session
  • initializes the model via initialization ops provided by Scaffold
  • restores variables if a checkpoint exists
  • launches queue runners
  • calls hook.after_create_session()

Run: When run() is called, the monitored session does following things:

  • calls hook.before_run()
  • calls TensorFlow with merged fetches and feed_dict
  • calls hook.after_run()
  • returns result of asked by user
  • if AbortedError or UnavailableError occurs, it recovers or reinitializes the session before executing the run() call again

Exit: At the close(), the monitored session does following things in order:

  • calls hook.end()
  • closes the queue runners and the session
  • suppresses OutOfRange error which indicates that all inputs have been processed if the monitored_session is used as a context

How to set tf.compat.v1.Session arguments:

  • In most cases you can set session arguments as follows:
  session_creator=ChiefSessionCreator(master=..., config=...))
  • In distributed setting for a non-chief worker, you can use following:
  session_creator=WorkerSessionCreator(master=..., config=...))

See MonitoredTrainingSession for an example usage based on chief or worker.

  • it cannot be set as default session.
  • it cannot be sent to
  • it cannot be sent to tf.train.start_queue_runners.

session_creator A factory object to create session. Typically a ChiefSessionCreator which is the default one.
hooks An iterable of `SessionRunHook' objects.

A MonitoredSession object.

session_creator A factory object to create session. Typically a ChiefSessionCreator or a WorkerSessionCreator.
hooks An iterable of SessionRunHook' objects. </td> </tr><tr> <td>should_recover</td> <td> A bool. Indicates whether to recover fromAbortedErrorandUnavailableErroror not. </td> </tr><tr> <td>stop_grace_period_secs</td> <td> Number of seconds given to threads to stop afterclose()` has been called.

graph The graph that was launched in this session.

Child Classes

class StepContext



View source


View source

Run ops in the monitored session.

This method is completely compatible with the method.

fetches Same as
feed_dict Same as
options Same as
run_metadata Same as

Same as


View source

Run ops using a step function.

step_fn A function or a method with a single argument of type StepContext. The function may use methods of the argument to perform computations with access to a raw session. The returned value of the step_fn will be returned from run_step_fn, unless a stop is requested. In that case, the next should_stop call will return True. Example usage:

with tf.Graph().as_default():
c = tf.compat.v1.placeholder(dtypes.float32)
v = tf.add(c, 4.0)
w = tf.add(c, 0.5)
def step_fn(step_context):
a =, feed_dict={c: 0.5})
if a <= 4.5:
return step_context.run_with_hooks(fetches=w,
feed_dict={c: 0.1})

with tf.MonitoredSession() as session:
while not session.should_stop():
a = session.run_step_fn(step_fn)

Hooks interact with the run_with_hooks() call inside the step_fn as they do with a call.

Returns the returned value of step_fn.

StopIteration if step_fn has called request_stop(). It may be caught by with tf.MonitoredSession() to close the session.
ValueError if step_fn doesn't have a single argument called step_context. It may also optionally have self for cases when it belongs to an object.


View source


View source


View source