A CheckpointManager that also exports SavedModels.
tfm.core.savedmodel_checkpoint_manager.SavedModelCheckpointManager(
checkpoint: tf.train.Checkpoint,
directory: str,
max_to_keep: int,
modules_to_export: Optional[Mapping[str, tf.Module]] = None,
keep_checkpoint_every_n_hours: Optional[int] = None,
checkpoint_name: str = 'ckpt',
step_counter: Optional[tf.Variable] = None,
checkpoint_interval: Optional[int] = None,
init_fn: Optional[Callable[[], None]] = None
)
| Attributes | |
|---|---|
| checkpoint | Returns the tf.train.Checkpoint object. |
| checkpoint_interval | |
| checkpoints | A list of managed checkpoints. Note that checkpoints saved due to |
| directory | |
| latest_checkpoint | The prefix of the most recent checkpoint in directory. Equivalent to Suitable for passing to |
| latest_savedmodel | The path of the most recent SavedModel in directory. |
| modules_to_export | |
| savedmodels | A list of managed SavedModels. |
Methods
get_existing_savedmodels
get_existing_savedmodels() -> List[str]
Gets a list of all existing SavedModel paths in directory.
| Returns | |
|---|---|
| A list of all existing SavedModel paths. |
get_savedmodel_number_from_path
get_savedmodel_number_from_path(
savedmodel_path: str
) -> Union[int, None]
Gets the savedmodel_number/checkpoint_number from a SavedModel file path.
The savedmodel_number is the global step when used with the Orbit controller.
| Args | |
|---|---|
| savedmodel_path | savedmodel directory path. |
| Returns | |
|---|---|
| The SavedModel number, or None if no matching pattern is found in the SavedModel path. |
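As a rough illustration of the kind of parsing this method describes, the sketch below pulls a trailing step number out of a directory path with a regular expression. The naming scheme (prefix, hyphen, number), the prefix ckpt_saved_modules, and the helper name parse_savedmodel_number are assumptions made for this example; the pattern that SavedModelCheckpointManager actually matches may differ.

```python
import os
import re
from typing import Optional

# Hypothetical naming scheme: "<prefix>-<number>", e.g. "ckpt_saved_modules-1000".
_HYPOTHETICAL_PREFIX = "ckpt_saved_modules"


def parse_savedmodel_number(savedmodel_path: str,
                            prefix: str = _HYPOTHETICAL_PREFIX) -> Optional[int]:
    """Returns the number in the path's last component, or None if no match."""
    # Strip any trailing separator so basename() returns the directory name.
    name = os.path.basename(savedmodel_path.rstrip("/"))
    match = re.fullmatch(rf"{re.escape(prefix)}-(\d+)", name)
    return int(match.group(1)) if match else None


print(parse_savedmodel_number("/tmp/model_dir/ckpt_saved_modules-1000"))
print(parse_savedmodel_number("/tmp/model_dir/unrelated_dir"))
```

Returning None rather than raising mirrors the documented contract, so callers can skip directories that do not follow the expected naming.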
restore_or_initialize
restore_or_initialize()
Restore items in checkpoint from the latest checkpoint file.
This method will first try to restore from the most recent checkpoint in
directory. If no checkpoints exist in directory, and init_fn is
specified, this method will call init_fn to do customized
initialization. This can be used to support initialization from pretrained
models.
Note that unlike tf.train.Checkpoint.restore(), this method doesn't return
a load status object that users can run assertions on
(e.g. assert_consumed()). To run assertions, users should call
tf.train.Checkpoint.restore() directly.
| Returns | |
|---|---|
| The restored checkpoint path if the latest checkpoint is found and restored. Otherwise None. |
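The restore-or-initialize control flow described for this method can be sketched without TensorFlow. This illustrates the documented behavior only and is not the library's implementation: the latest argument and the restore_fn callback are hypothetical stand-ins for the real checkpoint machinery.

```python
from typing import Callable, Optional


def restore_or_initialize(latest: Optional[str],
                          restore_fn: Callable[[str], None],
                          init_fn: Optional[Callable[[], None]] = None
                          ) -> Optional[str]:
    """Restores from `latest` if present; otherwise runs `init_fn` if given."""
    if latest is not None:
        restore_fn(latest)
        return latest  # Path of the checkpoint that was restored.
    if init_fn is not None:
        init_fn()  # E.g. load weights from a pretrained model.
    return None  # Nothing was restored.


events = []
# No checkpoint yet: init_fn runs and None is returned.
first = restore_or_initialize(None, events.append,
                              lambda: events.append("init"))
# A checkpoint exists: it is restored and init_fn is skipped.
second = restore_or_initialize("/tmp/model_dir/ckpt-3", events.append,
                               lambda: events.append("init"))
print(first, second, events)
```

Because init_fn only runs when no checkpoint exists, a job that restarts after its first save resumes from its own checkpoint instead of reloading the pretrained weights.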
save
save(
checkpoint_number: Optional[int] = None,
check_interval: bool = True,
options: Optional[tf.train.CheckpointOptions] = None
)
See base class.
savedmodels_iterator
savedmodels_iterator(
min_interval_secs: float = 0,
timeout: Optional[float] = None,
timeout_fn: Optional[Callable[[], bool]] = None
)
Continuously yield new SavedModel files as they appear.
The iterator only checks for new savedmodels when control flow has been
returned to it. The logic is the same as in train.checkpoints_iterator.
| Args | |
|---|---|
| min_interval_secs | The minimum number of seconds between yielding savedmodels. |
| timeout | The maximum number of seconds to wait between savedmodels. If left as None, then the process will wait indefinitely. |
| timeout_fn | Optional function to call after a timeout. If the function returns True, then it means that no new savedmodels will be generated and the iterator will exit. The function is called with no arguments. |
| Yields | |
|---|---|
| String paths to latest SavedModel files as they arrive. |
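The interaction between min_interval_secs, timeout, and timeout_fn is easier to follow in code. The generator below is a minimal stdlib sketch of that polling pattern, not the library's implementation: paths_iterator, the latest_fn callback, and poll_secs are hypothetical names introduced for this example.

```python
import time
from typing import Callable, Iterator, Optional


def paths_iterator(latest_fn: Callable[[], Optional[str]],
                   min_interval_secs: float = 0,
                   timeout: Optional[float] = None,
                   timeout_fn: Optional[Callable[[], bool]] = None,
                   poll_secs: float = 0.01) -> Iterator[str]:
    """Yields each new path reported by `latest_fn` (a hypothetical callback)."""
    last = None
    while True:
        # Poll until a path different from the last one appears or we time out.
        deadline = None if timeout is None else time.monotonic() + timeout
        new = latest_fn()
        while new is None or new == last:
            if deadline is not None and time.monotonic() >= deadline:
                new = None
                break
            time.sleep(poll_secs)
            new = latest_fn()
        if new is None:
            # Timed out: stop unless timeout_fn says more paths may still come.
            if timeout_fn is None or timeout_fn():
                return
            continue
        start = time.monotonic()
        last = new
        yield new
        # Enforce the minimum spacing between consecutive yields.
        remaining = start + min_interval_secs - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)


# Fake producer: successive calls report these paths, then repeat the last one.
calls = iter(["/tmp/export/step-100", "/tmp/export/step-100",
              "/tmp/export/step-200"])
state = {"latest": None}


def fake_latest() -> Optional[str]:
    state["latest"] = next(calls, state["latest"])
    return state["latest"]


seen = list(paths_iterator(fake_latest, timeout=0.05))
print(seen)
```

Polling happens only while the consumer is asking for the next item, which is the sense in which the iterator checks for new paths only when control flow returns to it.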
sync
sync()
Wait for any outstanding save or restore operations.
wait_for_new_savedmodel
wait_for_new_savedmodel(
last_savedmodel: Optional[str] = None,
seconds_to_sleep: float = 1.0,
timeout: Optional[float] = None
) -> Union[str, None]
Waits until a new savedmodel file is found.
| Args | |
|---|---|
| last_savedmodel | The last savedmodel path used or None if we're expecting a savedmodel for the first time. |
| seconds_to_sleep | The number of seconds to sleep for before looking for a new savedmodel. |
| timeout | The maximum number of seconds to wait. If left as None, then the process will wait indefinitely. |
| Returns | |
|---|---|
| A new savedmodel path, or None if the timeout was reached. |
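A polling loop with the same three arguments might look like the sketch below. It is an illustration under stated assumptions rather than the library's code: wait_for_new_entry is a hypothetical helper, and it treats the lexicographically last entry of the directory as the newest one, which only matches step order when names are zero-padded.

```python
import os
import tempfile
import time
from typing import Optional


def wait_for_new_entry(directory: str,
                       last_entry: Optional[str] = None,
                       seconds_to_sleep: float = 1.0,
                       timeout: Optional[float] = None) -> Optional[str]:
    """Polls `directory` until its newest entry differs from `last_entry`."""
    stop_time = None if timeout is None else time.monotonic() + timeout
    while True:
        # Assumption: the lexicographically last name is the newest export.
        entries = sorted(os.listdir(directory))
        newest = os.path.join(directory, entries[-1]) if entries else None
        if newest is not None and newest != last_entry:
            return newest
        if (stop_time is not None
                and time.monotonic() + seconds_to_sleep > stop_time):
            return None  # Another sleep would overrun the timeout.
        time.sleep(seconds_to_sleep)


with tempfile.TemporaryDirectory() as export_dir:
    # Nothing exported yet, so the call times out and returns None.
    print(wait_for_new_entry(export_dir, seconds_to_sleep=0.01, timeout=0.05))
    os.mkdir(os.path.join(export_dir, "export-000100"))
    first = wait_for_new_entry(export_dir, seconds_to_sleep=0.01, timeout=0.05)
    print(os.path.basename(first))
    # No entry newer than `first`, so this call times out as well.
    print(wait_for_new_entry(export_dir, first,
                             seconds_to_sleep=0.01, timeout=0.05))
```

Passing the previously returned path as last_entry is what keeps a caller from processing the same export twice.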