tff.analytics.build_hierarchical_histogram_process

Creates an IterativeProcess for hierarchical histogram aggregation.

This function wraps the tff.computation created by build_hierarchical_histogram_computation in an IterativeProcess that is compatible with tff.backends.mapreduce.MapReduceForm.

lower_bound A float specifying the lower bound of the data range.
upper_bound A float specifying the upper bound of the data range.
num_bins The integer number of bins to compute.
arity The branching factor of the tree. Defaults to 2.
clip_mechanism A str representing the clipping mechanism. Currently supported mechanisms are:
  • 'sub-sampling': (Default) Uniformly sample up to max_records_per_user records without replacement from the client dataset.
  • 'distinct': Deduplicate the client dataset, then uniformly sample up to max_records_per_user records without replacement from it.
max_records_per_user An int representing the maximum number of records each user can include in their local histogram. Defaults to 10.
dp_mechanism A str representing the differentially private mechanism to use. Currently supported mechanisms are:
  • 'no-noise': (Default) Tree aggregation mechanism without noise.
  • 'central-gaussian': Tree aggregation mechanism with the central Gaussian mechanism.
  • 'distributed-discrete-gaussian': Tree aggregation mechanism with the distributed discrete Gaussian mechanism described in "The Distributed Discrete Gaussian Mechanism for Federated Learning with Secure Aggregation" (Peter Kairouz, Ziyu Liu, Thomas Steinke).
noise_multiplier A float specifying the noise multiplier (central noise stddev / L2 clip norm) for model updates. Defaults to 0.0.
expected_clients_per_round An int specifying the lower bound on the expected number of clients. Only needed when dp_mechanism is 'distributed-discrete-gaussian'. Defaults to 10.
bits A positive integer specifying the communication bit-width B (where 2**B will be the field size for SecAgg operations). Only needed when dp_mechanism is 'distributed-discrete-gaussian'. Read the following precautions carefully and set bits accordingly; otherwise, unexpected overflow or accuracy degradation may occur:
  (1) bits should be in the inclusive range [1, 22] to avoid overflow inside secure aggregation.
  (2) bits should be at least log2(4 * sqrt(expected_clients_per_round) * noise_multiplier * l2_norm_bound + expected_clients_per_round * max_records_per_user) + 1 to avoid accuracy degradation caused by frequent modular clipping.
  (3) If the number of clients exceeds expected_clients_per_round, overflow may occur.
enable_secure_sum Whether to aggregate client updates by secure sum. Defaults to True. When dp_mechanism is set to 'distributed-discrete-gaussian', enable_secure_sum must be True.
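
The range and lower-bound precautions on bits can be checked before building the process. The following is a minimal sketch in plain Python, not part of the TFF API: the helper names min_bits_for_ddgauss and check_ddgauss_args are hypothetical, and l2_norm_bound is treated as a value supplied by the caller because it appears in the formula above but is not one of the listed arguments.

```python
import math


def min_bits_for_ddgauss(expected_clients_per_round, noise_multiplier,
                         l2_norm_bound, max_records_per_user):
    """Smallest integer bit-width meeting the documented lower bound on bits.

    Hypothetical helper: evaluates
    log2(4 * sqrt(n) * noise_multiplier * l2_norm_bound + n * max_records) + 1
    and rounds up to the next integer.
    """
    magnitude = (4 * math.sqrt(expected_clients_per_round)
                 * noise_multiplier * l2_norm_bound
                 + expected_clients_per_round * max_records_per_user)
    return math.ceil(math.log2(magnitude) + 1)


def check_ddgauss_args(bits, enable_secure_sum, expected_clients_per_round,
                       noise_multiplier, l2_norm_bound, max_records_per_user):
    """Raises ValueError if the documented precautions are violated."""
    if not enable_secure_sum:
        raise ValueError("'distributed-discrete-gaussian' requires "
                         "enable_secure_sum=True.")
    if not 1 <= bits <= 22:
        raise ValueError("bits must be in the inclusive range [1, 22].")
    needed = min_bits_for_ddgauss(expected_clients_per_round, noise_multiplier,
                                  l2_norm_bound, max_records_per_user)
    if bits < needed:
        raise ValueError(f"bits={bits} is below the recommended minimum "
                         f"of {needed}; expect frequent modular clipping.")


# Illustrative values only; l2_norm_bound=10.0 is an assumption.
print(min_bits_for_ddgauss(expected_clients_per_round=100,
                           noise_multiplier=1.0,
                           l2_norm_bound=10.0,
                           max_records_per_user=10))  # 12
```

With these illustrative values the sum inside the logarithm is 4 * 10 * 1.0 * 10.0 + 100 * 10 = 1400, so any bits value from 12 through 22 satisfies both checks. Precaution (3) cannot be verified ahead of time because it depends on how many clients actually participate in a round.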

A federated computation that performs hierarchical histogram aggregation.
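
To make the arguments above concrete, the following plain-Python sketch walks through what a single client's contribution could look like without noise: clip the records, bin them into num_bins leaves over [lower_bound, upper_bound), and sum groups of arity children into parent nodes. It is an illustration under stated assumptions, not the TFF implementation: the function names are hypothetical, bins are assumed to be equal-width and half-open, and out-of-range records are assumed to be dropped.

```python
import random


def clip_records(records, clip_mechanism="sub-sampling",
                 max_records_per_user=10, rng=None):
    """Keeps at most max_records_per_user records, sampled without replacement."""
    rng = rng or random.Random()
    if clip_mechanism == "distinct":
        records = sorted(set(records))  # deduplicate before sampling
    elif clip_mechanism != "sub-sampling":
        raise ValueError(f"Unsupported clip_mechanism: {clip_mechanism!r}")
    records = list(records)
    k = min(max_records_per_user, len(records))
    return rng.sample(records, k)


def leaf_histogram(records, lower_bound, upper_bound, num_bins):
    """Counts records per equal-width bin (assumption: out-of-range dropped)."""
    width = (upper_bound - lower_bound) / num_bins
    counts = [0] * num_bins
    for x in records:
        if lower_bound <= x < upper_bound:
            # min() guards against float rounding pushing the index to num_bins.
            idx = min(int((x - lower_bound) / width), num_bins - 1)
            counts[idx] += 1
    return counts


def build_tree(leaves, arity=2):
    """Returns tree layers from root to leaves; each parent sums arity children."""
    if arity < 2:
        raise ValueError("arity must be at least 2.")
    layers = [list(leaves)]
    while len(layers[0]) > 1:
        children = layers[0]
        parents = [sum(children[i:i + arity])
                   for i in range(0, len(children), arity)]
        layers.insert(0, parents)
    return layers


records = [0.5, 1.5, 1.7, 3.2]
clipped = clip_records(records, "sub-sampling", max_records_per_user=10)
leaves = leaf_histogram(clipped, lower_bound=0.0, upper_bound=4.0, num_bins=4)
print(build_tree(leaves, arity=2))  # [[4], [3, 1], [1, 2, 0, 1]]
```

Because the client holds fewer records than max_records_per_user, clipping keeps all four and the result is deterministic. A larger arity produces a shallower tree with fewer internal nodes, which is the trade-off the arity argument controls.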