tff.program.FileProgramStateManager

A tff.program.ProgramStateManager that is backed by a file system.

Inherits From: ProgramStateManager

A tff.program.FileProgramStateManager is a utility for saving program state to, and loading it from, a file system in a federated program. It is used to implement fault tolerance. In particular, it is intended only for restarting the same simulation, run with the same version of TensorFlow Federated.

Program state is saved to the file system using the SavedModel (see tf.saved_model) format. When the program state is saved, each tff.program.MaterializableValueReference is materialized and each tff.Serializable is serialized. The structure of the program state is not saved, so the caller must supply it again in order to load the program state.
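The role of the structure argument can be illustrated with a minimal pure-Python sketch. This is not how the library stores state internally; the flatten and unflatten helpers are hypothetical names introduced here, and they handle only dicts, lists, and tuples. The point is that once only the leaf values are persisted, the original nesting can be rebuilt only if the caller provides a matching structure.

```python
def flatten(structure):
    """Return the leaves of a nested dict/list/tuple in a fixed order."""
    if isinstance(structure, dict):
        # Sort keys so that the leaf order is deterministic.
        return [leaf for key in sorted(structure) for leaf in flatten(structure[key])]
    if isinstance(structure, (list, tuple)):
        return [leaf for item in structure for leaf in flatten(item)]
    return [structure]


def unflatten(structure, leaves):
    """Rebuild a value shaped like `structure` from an iterator of leaves."""
    if isinstance(structure, dict):
        return {key: unflatten(structure[key], leaves) for key in sorted(structure)}
    if isinstance(structure, (list, tuple)):
        return type(structure)(unflatten(item, leaves) for item in structure)
    return next(leaves)


program_state = {"round": 3, "weights": [0.1, 0.2], "metrics": ("loss", 0.5)}

# Only the flat list of leaves is "saved"; the nesting is discarded.
saved_leaves = flatten(program_state)

# Loading needs a value with the same structure to restore the nesting.
restored = unflatten(program_state, iter(saved_leaves))
```

In this sketch the values inside the structure passed to unflatten are ignored; only its shape matters.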

See https://www.tensorflow.org/guide/saved_model for more information about the SavedModel format.

Args
root_dir A path on the file system to save program state. If this path does not exist it will be created.
prefix A string to use as the prefix for filenames.
keep_total An integer representing the total number of program states to keep. If the value is zero or smaller, there is no limit on the number of program states kept; in that case, if keep_every_k is also 1, all states are kept.
keep_first A boolean indicating if the first program state should be kept, irrespective of whether it is the oldest program state or not. This is desirable in settings where you would like to ensure full reproducibility of the simulation, especially in settings where model weights or optimizer states are initialized randomly. By loading from the initial program state, one can avoid re-initializing and obtaining different results.
keep_every_k An integer representing how often program states should be kept. The latest version will always be kept. Defaults to 1. Even when keep_total is zero or negative, this setting will still be applied. To keep all states, set this to 1.

Raises
ValueError If root_dir is an empty string.
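The interaction between keep_total, keep_first, and keep_every_k can be explored with a small pure-Python model. This is only one plausible reading of the argument descriptions above, not the library's actual retention algorithm. The function name versions_to_keep is hypothetical, and the model assumes that a version is retained by keep_every_k when it is a multiple of keep_every_k, and that the first version, when kept, counts toward keep_total.

```python
def versions_to_keep(versions, keep_total, keep_first, keep_every_k=1):
    """Hypothetical retention model; NOT the library's actual algorithm."""
    if not versions:
        return []
    ordered = sorted(versions)
    first, latest = ordered[0], ordered[-1]
    # Assumption: a version is a candidate when it is a multiple of keep_every_k.
    candidates = [v for v in ordered if v % keep_every_k == 0 and v != latest]
    # The latest version is always kept.
    candidates.append(latest)
    if keep_total > 0:
        # Assumption: the first version, when kept, counts toward keep_total.
        budget = keep_total - 1 if keep_first else keep_total
        candidates = candidates[-max(budget, 1):]
    kept = set(candidates)
    if keep_first:
        kept.add(first)
    return sorted(kept)


saved = list(range(1, 11))  # versions 1 through 10

keep_everything = versions_to_keep(saved, keep_total=0, keep_first=False)
first_plus_newest = versions_to_keep(saved, keep_total=3, keep_first=True)
every_fourth = versions_to_keep(saved, keep_total=0, keep_first=False, keep_every_k=4)
```

Under these assumptions, a non-positive keep_total with keep_every_k left at 1 retains every version, while keep_every_k still thins the history even when keep_total imposes no limit.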

Methods

get_versions

Returns a list of saved versions or None.

Returns
A list of saved versions or None if there is no saved program state.

load

Returns the program state for the given version.

Args
version An integer representing the version of a saved program state.
structure The structure of the saved program state for the given version, used to support serialization and deserialization of user-defined classes in the structure.

Raises
ProgramStateNotFoundError If there is no program state for the given version.

load_latest

Returns the latest saved program state and version or (None, 0).

Args
structure The structure of the latest saved program state, used to support serialization and deserialization of user-defined classes in the structure.

Returns
A tuple of the latest saved (program state, version) or (None, 0) if there is no latest saved program state.

remove_all

Removes all program states.

save

Saves program_state for the given version.

Args
program_state A tff.program.ProgramStateStructure to save.
version A strictly increasing integer representing the version of a saved program_state.

Raises
ProgramStateExistsError If there is already program state for the given version.
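The method contracts above suggest a common fault-tolerance pattern: load the latest state, initialize only if nothing was saved, and resume at the next version. The sketch below illustrates that pattern with a hypothetical in-memory stand-in rather than the real class, so it runs without TensorFlow Federated. InMemoryProgramStateManager, run, and the two locally defined exception classes are assumptions made for illustration. The stand-in is synchronous and ignores the structure argument, so the real manager's calling conventions may differ.

```python
class ProgramStateNotFoundError(Exception):
    """Local stand-in for the documented error of the same name."""


class ProgramStateExistsError(Exception):
    """Local stand-in for the documented error of the same name."""


class InMemoryProgramStateManager:
    """Hypothetical stand-in that mimics the documented method contracts."""

    def __init__(self):
        self._states = {}

    def get_versions(self):
        # A list of saved versions, or None if nothing has been saved.
        return sorted(self._states) or None

    def load(self, version, structure=None):
        if version not in self._states:
            raise ProgramStateNotFoundError(version)
        return self._states[version]

    def load_latest(self, structure=None):
        versions = self.get_versions()
        if versions is None:
            return None, 0
        return self.load(versions[-1], structure), versions[-1]

    def save(self, program_state, version):
        if version in self._states:
            raise ProgramStateExistsError(version)
        self._states[version] = program_state

    def remove_all(self):
        self._states.clear()


def run(manager, total_rounds, crash_after=None):
    """Runs rounds, resuming from the latest saved state if one exists."""
    state, version = manager.load_latest(structure=0)
    if state is None:
        state = 0  # fresh start: initialize the program state
    for round_num in range(version + 1, total_rounds + 1):
        state = state + round_num  # placeholder for one round of training
        manager.save(state, version=round_num)
        if crash_after == round_num:
            return None  # simulate the program stopping mid-run
    return state


manager = InMemoryProgramStateManager()
run(manager, total_rounds=5, crash_after=3)  # first attempt stops after round 3
final_state = run(manager, total_rounds=5)   # restart resumes at round 4
```

Because load_latest returns version 0 when nothing is saved, the same loop covers both a fresh start and a restart, and each save uses a version that has not been written before.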