Creates a TFF computation to aggregate metrics via finalize_then_sample.

The returned federated TFF computation is polymorphic: it accepts unfinalized client metrics and returns finalized, sampled metrics placed at the server. Invoking it initiates tracing on the argument and raises a ValueError if the keys (i.e., metric names) in metric_finalizers differ from those of the argument it is invoked on.

The returned computation is intended to be invoked on the output of tff.learning.models.VariableModel.report_local_unfinalized_metrics() when placed at CLIENTS. The output is computed by first finalizing each client's metrics locally, and then collecting metrics from at most sample_size clients at the SERVER. If more than sample_size clients participate, sample_size of them are chosen by reservoir sampling; otherwise, all clients' metrics are collected. Sampling is done "per client", i.e., a client, once sampled, contributes all of its metrics to the final result.
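The finalize-then-sample flow described above can be sketched in plain Python. This is only an illustration of per-client reservoir sampling (the classic "Algorithm R"), not TFF's implementation. The helper names `finalize_client` and `reservoir_sample_clients`, the `seed` parameter, and the toy finalizers are assumptions made for this example.

```python
import collections
import random


def finalize_client(unfinalized, finalizers):
    """Applies each finalizer to the matching unfinalized metric of one client."""
    return collections.OrderedDict(
        (name, finalizers[name](value)) for name, value in unfinalized.items()
    )


def reservoir_sample_clients(client_metrics, sample_size, seed=None):
    """Keeps at most `sample_size` whole client records (Algorithm R)."""
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(client_metrics):
        if i < sample_size:
            # The first `sample_size` clients always enter the reservoir.
            reservoir.append(record)
        else:
            # Later clients replace a kept record with probability
            # sample_size / (i + 1).
            j = rng.randint(0, i)
            if j < sample_size:
                reservoir[j] = record
    return reservoir


# Hypothetical finalizers: each metric is stored as a [numerator, denominator] pair.
finalizers = collections.OrderedDict(
    accuracy=lambda v: v[0] / v[1],
    loss=lambda v: v[0] / v[1],
)
clients = [
    collections.OrderedDict(accuracy=[k, 10], loss=[float(k), 2])
    for k in range(1, 8)
]

# Step 1: finalize locally on each client. Step 2: sample whole client records.
finalized = [finalize_client(c, finalizers) for c in clients]
sampled = reservoir_sample_clients(finalized, sample_size=3, seed=0)
```

Because whole records are sampled rather than individual values, every metric of a sampled client appears in the result. When fewer clients participate than `sample_size`, the reservoir simply holds all of them.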

The metrics collected at SERVER have the same structure (i.e., the same keys in a dictionary) as each client's local metrics, except that each leaf node contains a list of scalar metric values, one per sampled client, e.g.,

  sampled_metrics_at_server = {
      'metric_a': [a1, a2, ...],
      'metric_b': [b1, b2, ...],
  }

where "a1" and "b1" come from the same client (similarly for "a2" and "b2", etc.).
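Since position i of every list belongs to the same client, the sampled structure can be turned back into per-client records or summarized per metric. The following is a minimal sketch with made-up values; the metric names `accuracy` and `loss` and the variables `per_client` and `summary` are illustrative only.

```python
import collections
import statistics

# Hypothetical server-side result for three sampled clients.
sampled_metrics_at_server = collections.OrderedDict(
    accuracy=[0.5, 0.75, 1.0],
    loss=[1.2, 0.8, 0.4],
)

# Zipping the per-metric lists recovers one record per sampled client,
# because index i in each list comes from the same client.
names = list(sampled_metrics_at_server)
per_client = [
    dict(zip(names, values))
    for values in zip(*sampled_metrics_at_server.values())
]

# Summary statistics per metric across the sampled clients.
summary = {
    name: {"mean": statistics.mean(vals), "max": max(vals)}
    for name, vals in sampled_metrics_at_server.items()
}
```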

Example usage:

  training_process = tff.learning.algorithms.build_weighted_fed_avg(
      model_fn=...,
      ...,
      metrics_aggregator=tff.learning.metrics.finalize_then_sample,
  )
  state = training_process.initialize()
  for i in range(num_rounds):
    output = training_process.next(state, client_data_at_round_i)
    state = output.state
    sampled_client_metrics = output.metrics['client_work']

metric_finalizers Either the result of tff.learning.models.VariableModel.metric_finalizers (an OrderedDict of callables) or the tff.learning.models.FunctionalModel.finalize_metrics method (a callable that takes an OrderedDict argument). If the former, the keys must be the same as those of the OrderedDict returned by tff.learning.models.VariableModel.report_local_unfinalized_metrics. If the latter, the callable must compute over the same keyspace as the result returned by tff.learning.models.FunctionalModel.update_metrics_state.
local_unfinalized_metrics_type Unused, will be removed from the API in the future.
sample_size An integer specifying the number of clients sampled by the reservoir sampling algorithm. Metrics from the sampled clients are collected at the server. If the total number of participating clients is smaller than this value, then all clients' metrics are collected. Default value is 100.
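To make the OrderedDict form of metric_finalizers concrete, here is a minimal plain-Python sketch of finalizers applied to one client's unfinalized metrics, with a key check that mirrors the documented ValueError on mismatched metric names. The function `finalize_with_key_check` is hypothetical, the metric names and values are made up, and treating key order as significant is an assumption of this sketch rather than documented TFF behavior.

```python
import collections


def finalize_with_key_check(finalizers, unfinalized):
    """Finalizes one client's metrics, rejecting mismatched metric names."""
    # Assumption of this sketch: keys must match in both name and order.
    if list(finalizers.keys()) != list(unfinalized.keys()):
        raise ValueError(
            f"Metric names differ: finalizers have {list(finalizers)}, "
            f"unfinalized metrics have {list(unfinalized)}."
        )
    return collections.OrderedDict(
        (name, finalizers[name](unfinalized[name])) for name in finalizers
    )


# Hypothetical finalizers: accuracy is stored as a [correct, total] pair,
# while num_examples is already final and passes through unchanged.
finalizers = collections.OrderedDict(
    accuracy=lambda v: v[0] / v[1],
    num_examples=lambda v: v,
)
unfinalized = collections.OrderedDict(accuracy=[8.0, 10.0], num_examples=10)
finalized = finalize_with_key_check(finalizers, unfinalized)
```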

A federated TFF computation that finalizes the unfinalized metrics from CLIENTS, samples the clients, and returns the sampled metrics at SERVER.

TypeError If the inputs are of the wrong types.
ValueError If sample_size is not positive.