tf.train.CheckpointManager
Deletes old checkpoints.
tf.train.CheckpointManager(
    checkpoint, directory, max_to_keep, keep_checkpoint_every_n_hours=None,
    checkpoint_name='ckpt'
)
Example usage:
import tensorflow as tf

checkpoint = tf.train.Checkpoint(optimizer=optimizer, model=model)
manager = tf.train.CheckpointManager(
    checkpoint, directory="/tmp/model", max_to_keep=5)
status = checkpoint.restore(manager.latest_checkpoint)
while True:
    # train
    manager.save()
CheckpointManager preserves its own state across instantiations (see the
__init__ documentation for details). Only one CheckpointManager should be
active in a particular directory at a time.
Args:
  checkpoint: The tf.train.Checkpoint instance to save and manage
    checkpoints for.
  directory: The path to a directory in which to write checkpoints. A
    special file named "checkpoint" is also written to this directory (in a
    human-readable text format) which records the state of the
    CheckpointManager.
  max_to_keep: An integer, the number of checkpoints to keep. Unless
    preserved by keep_checkpoint_every_n_hours, checkpoints are deleted
    from the active set, oldest first, until only max_to_keep checkpoints
    remain. If None, no checkpoints are deleted and everything stays in the
    active set. Note that max_to_keep=None keeps all checkpoint paths in
    memory and in the checkpoint state protocol buffer on disk.
  keep_checkpoint_every_n_hours: Upon removal from the active set, a
    checkpoint is preserved if at least keep_checkpoint_every_n_hours have
    passed since the last preserved checkpoint. The default of None does
    not preserve any checkpoints this way.
  checkpoint_name: Custom name for the checkpoint file.
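To make the interaction between max_to_keep and keep_checkpoint_every_n_hours concrete, here is a pure-Python sketch of the retention policy described above. This is an illustration of the documented semantics, not TensorFlow's implementation; the function name and data layout are hypothetical.

```python
# Hypothetical sketch of the documented retention policy, using hours as
# plain floats instead of real timestamps.

def sweep(active, max_to_keep, every_n_hours=None, last_preserved_time=None):
    """Drop oldest (path, timestamp_hours) entries until max_to_keep remain.

    A checkpoint leaving the active set is preserved (kept on disk, outside
    the active set) if at least every_n_hours have passed since the last
    preserved checkpoint; otherwise it is deleted.
    """
    preserved = []
    if max_to_keep is None:
        return list(active), preserved  # nothing is ever deleted
    active = list(active)
    while len(active) > max_to_keep:
        path, t = active.pop(0)  # oldest first
        if every_n_hours is not None and (
                last_preserved_time is None
                or t - last_preserved_time >= every_n_hours):
            preserved.append(path)
            last_preserved_time = t
        # otherwise the checkpoint would be deleted here
    return active, preserved

ckpts = [("ckpt-1", 0.0), ("ckpt-2", 1.0), ("ckpt-3", 2.5), ("ckpt-4", 3.0)]
active, preserved = sweep(ckpts, max_to_keep=2, every_n_hours=2)
print([p for p, _ in active])  # ['ckpt-3', 'ckpt-4']
print(preserved)               # ['ckpt-1']: ckpt-2 was only 1h after it
```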
Raises:
  ValueError: If max_to_keep is not a positive integer.
Attributes:
  checkpoints: A list of managed checkpoints. Note that checkpoints saved
    due to keep_checkpoint_every_n_hours will not show up in this list (to
    avoid ever-growing filename lists).
  latest_checkpoint: The prefix of the most recent checkpoint in directory.
    Equivalent to tf.train.latest_checkpoint(directory) where directory is
    the constructor argument to CheckpointManager. Suitable for passing to
    tf.train.Checkpoint.restore to resume training.
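The "checkpoint" state file mentioned under directory is plain text, and its model_checkpoint_path field names the most recent checkpoint prefix. The sketch below hand-parses such a file purely for illustration (the parsing helper and sample text are this section's own, and assume the usual text layout of the state file); in real code use manager.latest_checkpoint or tf.train.latest_checkpoint(directory) instead.

```python
# Illustrative only: parse a "checkpoint" state file's text to find the
# most recent checkpoint prefix. In practice, prefer
# tf.train.latest_checkpoint(directory), which handles this for you.

def parse_latest(state_text):
    for line in state_text.splitlines():
        if line.startswith("model_checkpoint_path:"):
            return line.split(":", 1)[1].strip().strip('"')
    return None

state = (
    'model_checkpoint_path: "ckpt-5"\n'
    'all_model_checkpoint_paths: "ckpt-4"\n'
    'all_model_checkpoint_paths: "ckpt-5"\n'
)
print(parse_latest(state))  # ckpt-5
```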
Methods

save

save(
    checkpoint_number=None
)
Creates a new checkpoint and manages it.
Args:
  checkpoint_number: An optional integer, or an integer-dtype Variable or
    Tensor, used to number the checkpoint. If None (default), checkpoints
    are numbered using checkpoint.save_counter. Even if checkpoint_number
    is provided, save_counter is still incremented. A user-provided
    checkpoint_number is not incremented even if it is a Variable.
Returns:
  The path to the new checkpoint. It is also recorded in the checkpoints
  and latest_checkpoint properties.
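The numbering rules above (save_counter always increments; an explicit checkpoint_number only overrides the number used in the path) can be modeled with a small pure-Python sketch. The class below is a hypothetical illustration of the documented behavior, not the real implementation:

```python
# Hypothetical model of save() numbering: save_counter is incremented on
# every call, while an explicit checkpoint_number only changes the path.

class SaveCounterSketch:
    def __init__(self, checkpoint_name="ckpt"):
        self.save_counter = 0
        self.checkpoint_name = checkpoint_name

    def save(self, checkpoint_number=None):
        self.save_counter += 1  # incremented even when a number is provided
        number = (checkpoint_number if checkpoint_number is not None
                  else self.save_counter)
        return f"{self.checkpoint_name}-{number}"

m = SaveCounterSketch()
print(m.save())                      # ckpt-1
print(m.save(checkpoint_number=42))  # ckpt-42 (counter still advanced to 2)
print(m.save())                      # ckpt-3  (counter kept incrementing)
```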
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2020-10-01 UTC.