tff.learning.programs.EvaluationManager
A manager for facilitating multiple in-progress evaluations.
    tff.learning.programs.EvaluationManager(
        data_source: tff.program.FederatedDataSource,
        aggregated_metrics_manager: Optional[release_manager.ReleaseManager[release_manager.ReleasableStructure, int]],
        create_state_manager_fn: Callable[[str], tff.program.FileProgramStateManager],
        create_process_fn: Callable[[str], tuple[learning_process.LearningProcess, Optional[release_manager.ReleaseManager[release_manager.ReleasableStructure, int]]]],
        cohort_size: int,
        duration: datetime.timedelta = datetime.timedelta(hours=24)
    )
This manager has three responsibilities:
- Prepares, starts, and tracks new evaluation loops. This involves creating a new evaluation process and a state manager for that process, adding the new process to the list of tracked in-progress evaluations, and creating a new asyncio.Task to run the evaluation loop.
- Records evaluations that have finished. This removes the evaluation from the list of in-progress evaluations.
- If the program has restarted, loads the most recent state of in-progress evaluations and restarts each of them.
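The start/track/record cycle above can be sketched with plain asyncio. All names below are hypothetical stand-ins, not the TFF implementation:

```python
import asyncio


class TrackedEvaluations:
    """Minimal sketch of the start/track/record cycle; not the TFF implementation."""

    def __init__(self) -> None:
        self._in_progress: dict[int, asyncio.Task] = {}

    def start_evaluation(self, train_round: int, coro) -> None:
        # Prepare a new evaluation loop and track it as an asyncio.Task.
        self._in_progress[train_round] = asyncio.ensure_future(coro)

    def record_evaluations_finished(self, train_round: int) -> None:
        # Remove the evaluation from the list of in-progress evaluations.
        if train_round not in self._in_progress:
            raise RuntimeError(f"Round {train_round} is not being tracked.")
        del self._in_progress[train_round]

    async def wait_for_evaluations_to_finish(self) -> None:
        await asyncio.gather(*self._in_progress.values())


async def main() -> list[int]:
    manager = TrackedEvaluations()

    async def fake_eval_loop(round_num: int) -> None:
        await asyncio.sleep(0)  # stand-in for running evaluation rounds

    for r in (1, 2):
        manager.start_evaluation(r, fake_eval_loop(r))
    await manager.wait_for_evaluations_to_finish()
    finished = sorted(manager._in_progress)
    for r in finished:
        manager.record_evaluations_finished(r)
    return finished


print(asyncio.run(main()))  # [1, 2]
```

Each evaluation loop runs concurrently as its own task, so a slow evaluation for one training round never blocks starting evaluations for later rounds.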
This class uses N + 1 tff.program.ProgramStateManagers to enable resumable evaluations.
- The first state manager is for this class itself and manages the list of in-progress evaluations via two tensor objects. Tensor objects must be used (rather than Python lists) because tff.program.FileProgramStateManager does not support state objects whose Python structure changes across versions (e.g., to load the next version we must know its shape, but after a restart we don't). Alternatively, tensor or ndarray objects with shape [None] can be used to support changing shapes of the structure's leaf elements.
- The next N state managers manage the cross-round metric aggregation for each evaluation process started, one per evaluation process.
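To illustrate the fixed-structure constraint, here is a hypothetical sketch (not the TFF implementation) of the manager's own state as two parallel, variable-length vectors, one of train rounds and one of start timestamps. The Python structure (always exactly two leaves) never changes between versions; only the leaf length does, which is what a tensor of shape [None] can express:

```python
# State is a pair of equal-length tuples: (train_rounds, start_timestamps).
# The structure always has exactly two leaves; only their length varies.
State = tuple[tuple[int, ...], tuple[int, ...]]


def add_evaluation(state: State, train_round: int, start_timestamp: int) -> State:
    # Track a newly started evaluation by appending to both leaves.
    rounds, timestamps = state
    return rounds + (train_round,), timestamps + (start_timestamp,)


def remove_evaluation(state: State, train_round: int) -> State:
    # Drop a finished evaluation from both leaves, keeping them parallel.
    rounds, timestamps = state
    keep = [i for i, r in enumerate(rounds) if r != train_round]
    return tuple(rounds[i] for i in keep), tuple(timestamps[i] for i in keep)


state: State = ((), ())
state = add_evaluation(state, 5, 1000)
state = add_evaluation(state, 10, 2000)
state = remove_evaluation(state, 5)
print(state)  # ((10,), (2000,))
```

A plain Python list of (round, timestamp) pairs would change structure as evaluations start and finish, which is exactly what the file-backed state manager cannot load back after a restart.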
Args:
  data_source: A tff.program.FederatedDataSource that the manager will use to create iterators for evaluation loops.
  aggregated_metrics_manager: A tff.program.ReleaseManager for releasing the total aggregated metrics at the end of each evaluation loop.
  create_state_manager_fn: A callable that returns a tff.program.FileProgramStateManager. It is used to create the overall evaluation manager's state manager, as well as each per-evaluation-loop state manager that enables resuming and checkpointing.
  create_process_fn: A callable that returns a 2-tuple of a tff.learning.templates.LearningProcess and a tff.program.ReleaseManager for releasing per-evaluation-round metrics; the pair will be used to start each evaluation loop.
  cohort_size: An integer denoting the size of each evaluation round to select from the iterator created from data_source.
  duration: The datetime.timedelta duration to run each evaluation loop.
Attributes:
  aggregated_metrics_manager: A manager for releasing metrics at the end of each evaluation loop.
  cohort_size: The size of each evaluation round to select from the iterator.
  create_process_fn: A callable that returns a process and a release manager for each evaluation loop.
  create_state_manager_fn: A callable that returns a program state manager for each evaluation loop.
  data_source: A data source used to create iterators for each evaluation loop.
  duration: The duration to run each evaluation loop.
Methods
record_evaluations_finished

    record_evaluations_finished(
        train_round
    )

Removes the evaluation for train_round from the internal state manager.

Args:
  train_round: The integer round number of the training round that has finished evaluation.

Raises:
  RuntimeError: If train_round is not currently being tracked as an in-progress evaluation.
resume_from_previous_state

    resume_from_previous_state()

Loads the most recent state and restarts in-progress evaluations.
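The restart behavior can be sketched in plain asyncio: recover the list of tracked rounds from saved state, then rebuild one evaluation task per round. The names here are hypothetical, not the TFF implementation:

```python
import asyncio


async def resume_from_saved_rounds(saved_rounds: tuple[int, ...]) -> list[int]:
    # Rebuild one evaluation task per round recovered from saved program state.
    async def eval_loop(round_num: int) -> int:
        await asyncio.sleep(0)  # stand-in for the remaining evaluation work
        return round_num

    tasks = [asyncio.ensure_future(eval_loop(r)) for r in saved_rounds]
    return await asyncio.gather(*tasks)


print(asyncio.run(resume_from_saved_rounds((3, 7))))  # [3, 7]
```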
start_evaluation

    start_evaluation(
        train_round, start_timestamp_seconds, model_weights
    )
Starts a new evaluation loop for the incoming model_weights.
wait_for_evaluations_to_finish

    wait_for_evaluations_to_finish()
Creates an awaitable that blocks until all evaluations are finished.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-09-20 UTC.