tff.aggregators.DifferentiallyPrivateFactory

UnweightedAggregationFactory for tensorflow_privacy DPQueries.

Inherits From: UnweightedAggregationFactory

The created tff.templates.AggregationProcess aggregates values placed at CLIENTS according to the provided DPQuery, and outputs the result placed at SERVER.

A DPQuery defines preprocessing to perform on each value, and postprocessing to perform on the aggregated, preprocessed values. Provided the preprocessed values ("records") are aggregated in a way that is consistent with the DPQuery, formal (epsilon, delta) privacy guarantees can be derived. This aggregation is controlled by record_aggregation_factory.

A simple summation (using the default tff.aggregators.SumFactory) is usually acceptable. Aggregations that change the records (such as compression or secure aggregation) may be allowed so long as they do not increase the sensitivity of the query. It is the user's responsibility to ensure that the mode of aggregation is consistent with the DPQuery. Note that the DPQuery's built-in aggregation functions (accumulate_preprocessed_record and merge_sample_states) are ignored in favor of the provided aggregator.
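The division of labor described above (per-record preprocessing, a pluggable sum, then postprocessing of the aggregate) can be sketched in plain Python. This is a conceptual illustration only, not the library's code: the function names (preprocess, sum_records, postprocess, private_sum) are hypothetical, and the choice of L2 clipping plus Gaussian noise with standard deviation clip * noise_multiplier is an assumed example of a query.

```python
import math
import random


def l2_norm(record):
    return math.sqrt(sum(x * x for x in record))


def preprocess(record, clip):
    """Clip one client record so that its L2 norm is at most `clip`."""
    norm = l2_norm(record)
    scale = min(1.0, clip / norm) if norm > 0 else 1.0
    return [x * scale for x in record]


def sum_records(records):
    """Stand-in for the record aggregator: a plain coordinate-wise sum."""
    return [sum(coords) for coords in zip(*records)]


def postprocess(total, stddev, rng):
    """Add Gaussian noise to the aggregated, preprocessed records."""
    return [x + rng.gauss(0.0, stddev) for x in total]


def private_sum(client_values, clip, noise_multiplier, rng, aggregate=sum_records):
    # Hypothetical pipeline: the aggregator is swappable, but it must still
    # behave like a sum that does not increase the norm of any record.
    records = [preprocess(v, clip) for v in client_values]
    total = aggregate(records)
    return postprocess(total, clip * noise_multiplier, rng)


rng = random.Random(0)
clients = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
clipped = [preprocess(v, clip=1.0) for v in clients]
noiseless = private_sum(clients, clip=1.0, noise_multiplier=0.0, rng=rng)
```

With noise_multiplier set to 0 the result is just the sum of the clipped records, which makes it easy to check that a replacement aggregator still computes a sum.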

The state of the created AggregationProcess contains a DPEvent released by the DPQuery that can be extracted using differential_privacy.extract_dp_event_from_state.

Args
query A tfp.SumAggregationDPQuery to perform private estimation.
record_aggregation_factory A tff.aggregators.UnweightedAggregationFactory to aggregate values after preprocessing by the query. If None, defaults to tff.aggregators.SumFactory. The provided factory is assumed to implement a sum, and to have the property that it does not increase the sensitivity of the query - typically this means that it should not increase the l2 norm of the records when aggregating.

Raises
TypeError If query is not an instance of tfp.SumAggregationDPQuery or record_aggregation_factory is not an instance of tff.aggregators.UnweightedAggregationFactory.

Methods

create

Creates a tff.templates.AggregationProcess without weights.

The provided value_type is a non-federated tff.Type, that is, not a tff.FederatedType.

The returned tff.templates.AggregationProcess will be created for aggregation of values matching value_type placed at tff.CLIENTS. That is, its next method will expect type <S@SERVER, {value_type}@CLIENTS>, where S is the unplaced return type of its initialize method.

Args
value_type A non-federated tff.Type.

Returns
A tff.templates.AggregationProcess.

gaussian_adaptive

DifferentiallyPrivateFactory with adaptive clipping and Gaussian noise.

Performs adaptive clipping and addition of Gaussian noise for differentially private learning. For details of the DP algorithm, see McMahan et al. (2017), https://arxiv.org/abs/1710.06963. The adaptive clipping uses the geometric method described in Thakkar et al. (2019), https://arxiv.org/abs/1905.03871.

The adaptive clipping parameters have been chosen to yield a process that starts small and adapts relatively quickly to the median, without using much of the privacy budget. This works well on most problems.
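To make the geometric adaptation concrete, here is a minimal sketch of a clipping norm tracking a target quantile. It assumes the multiplicative update rule from the adaptive clipping paper, clip * exp(-learning_rate * (unclipped_fraction - target_quantile)), and it omits the noise that the real algorithm adds to the clipped counts. The function name update_clip, the client norms, and the parameter values are all hypothetical and are not the library's defaults.

```python
import math


def update_clip(clip, unclipped_fraction, target_quantile, learning_rate):
    """One geometric update: shrink the clip if too many updates are unclipped."""
    return clip * math.exp(-learning_rate * (unclipped_fraction - target_quantile))


# Hypothetical L2 norms of the client updates, identical in every round.
norms = [float(n) for n in range(1, 10)]  # median is 5.0

clip = 0.1  # start small, as the text above describes
learning_rate = 0.2
target_quantile = 0.5
max_step_ratio = 0.0

for _ in range(300):
    # Fraction of updates whose norm already fits under the current clip.
    unclipped_fraction = sum(n <= clip for n in norms) / len(norms)
    new_clip = update_clip(clip, unclipped_fraction, target_quantile, learning_rate)
    max_step_ratio = max(max_step_ratio, new_clip / clip, clip / new_clip)
    clip = new_clip
```

In this noiseless setting the clip grows quickly while nothing fits under it and then settles into a small oscillation around the median norm. No single step changes the clip by more than a factor of exp(learning_rate).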

Args
noise_multiplier A float specifying the noise multiplier for the Gaussian mechanism for model updates. A value of 1.0 or higher may be needed for strong privacy. See the papers cited above to compute the (epsilon, delta) privacy guarantee. Note that this is the effective total noise multiplier, accounting for the privacy loss due to adaptive clipping. The noise actually added to the aggregated values will be slightly higher.
clients_per_round A float specifying the expected number of clients per round. Must be positive.
initial_l2_norm_clip The initial value of the adaptive clipping norm.
target_unclipped_quantile The quantile to which the clipping norm should adapt.
learning_rate The learning rate for the adaptive clipping process.
clipped_count_stddev The stddev of the noise added to the clipped counts in the adaptive clipping algorithm. If None, defaults to 0.05 * clients_per_round (unless noise_multiplier is 0, in which case it is also 0).

Returns
A DifferentiallyPrivateFactory with adaptive clipping and Gaussian noise.

gaussian_fixed

DifferentiallyPrivateFactory with fixed clipping and Gaussian noise.

Performs fixed clipping and addition of Gaussian noise for differentially private learning. For details of the DP algorithm, see McMahan et al. (2017), https://arxiv.org/abs/1710.06963.

Args
noise_multiplier A float specifying the noise multiplier for the Gaussian mechanism for model updates. A value of 1.0 or higher may be needed for strong privacy. See the paper cited above to compute the (epsilon, delta) privacy guarantee.
clients_per_round A float specifying the expected number of clients per round. Must be positive.
clip The value of the clipping norm.

Returns
A DifferentiallyPrivateFactory with fixed clipping and Gaussian noise.

tree_adaptive

DifferentiallyPrivateFactory with adaptive clipping and tree aggregation.

Clips records on each client, averages the clients' records, and adds noise for differential privacy. The noise is generated by tree aggregation over the cumulative sum across rounds; the noise applied in a given round is the residual between the current round's and the previous round's cumulative values. Combining this aggregator with an SGD optimizer on the server can be used to implement the DP-FTRL algorithm from "Practical and Private (Deep) Learning without Sampling or Shuffling" (https://arxiv.org/abs/2103.00039).

The standard deviation of the Gaussian noise added at each tree node is l2_norm_clip * noise_multiplier. Note that noise is added during summation of client model updates per round, before normalization (the noise will be scaled down when dividing by clients_per_round). Thus noise_multiplier can be used to compute the (epsilon, delta) privacy guarantee as described in the paper.
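As a small worked example of the scaling described above, the following computes the per-node noise standard deviation and its size after normalization. The numeric values and the variable names node_stddev and normalized_node_stddev are hypothetical, chosen only for illustration.

```python
# Hypothetical settings, chosen only to make the arithmetic easy to follow.
l2_norm_clip = 0.5
noise_multiplier = 2.0
clients_per_round = 100

# Noise is added to the sum of clipped client updates at each tree node.
node_stddev = l2_norm_clip * noise_multiplier

# Dividing the noised sum by clients_per_round scales the noise down too.
normalized_node_stddev = node_stddev / clients_per_round
```

Here node_stddev is 1.0 and normalized_node_stddev is 0.01. The privacy accounting depends on noise_multiplier alone, because the sensitivity of the sum and the noise both scale with l2_norm_clip.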

The l2_norm_clip is estimated and periodically reset for tree aggregation based on "Differentially Private Learning with Adaptive Clipping" (https://arxiv.org/abs/1905.03871).

Args
noise_multiplier Noise multiplier for the Gaussian noise in tree aggregation. Must be non-negative; zero means no noise is applied.
clients_per_round A positive number specifying the expected number of clients per round.
record_specs The specs of client results to be aggregated.
initial_l2_norm_clip The value of the initial clipping norm. Must be positive.
restart_warmup Restart the tree and adopt the estimated clip norm after the first restart_warmup calls to next.
restart_frequency Restart the tree and adopt the estimated clip norm after every restart_frequency calls to next.
target_unclipped_quantile The desired quantile of updates which should be unclipped.
clip_learning_rate The learning rate for the clipping norm adaptation. With geometric updating, a rate of r means that the clipping norm will change by a maximum factor of exp(r) at each round.
clipped_count_stddev The stddev of the noise added to the clipped_count. If None, set to clients_per_round / 20.
noise_seed Random seed for the Gaussian noise generator. If None, a nondeterministic seed based on system time will be generated when initialize is called.

Returns
A DifferentiallyPrivateFactory with Gaussian noise by tree aggregation.

tree_aggregation

DifferentiallyPrivateFactory with tree aggregation noise.

Clips records on each client, averages the clients' records, and adds noise for differential privacy. The noise is generated by tree aggregation over the cumulative sum across rounds; the noise applied in a given round is the residual between the current round's and the previous round's cumulative values. Combining this aggregator with an SGD optimizer on the server can be used to implement the DP-FTRL algorithm from "Practical and Private (Deep) Learning without Sampling or Shuffling" (https://arxiv.org/abs/2103.00039).
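The idea of tree-aggregated noise with per-round residuals can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: the TreeNoise class and its methods are hypothetical, the records are scalars, and it models only the basic binary-tree mechanism (not the variant selected by use_efficient). Each tree node gets one independent Gaussian sample, the noise on the cumulative sum through round t is the sum over the nodes that tile rounds [0, t), and the noise for a single round is the difference of consecutive cumulative noises.

```python
import random


class TreeNoise:
    """Lazily samples one Gaussian value per node of a binary tree over rounds."""

    def __init__(self, stddev, seed=None):
        self._stddev = stddev
        self._rng = random.Random(seed)
        self._nodes = {}  # (level, index) -> sampled noise value

    def _node(self, level, index):
        key = (level, index)
        if key not in self._nodes:
            self._nodes[key] = self._rng.gauss(0.0, self._stddev)
        return self._nodes[key]

    def covering_nodes(self, t):
        """Returns the nodes whose leaf ranges exactly tile rounds [0, t)."""
        nodes, start, level = [], 0, t.bit_length()
        while level >= 0:
            size = 1 << level  # number of rounds covered by a node at this level
            if start + size <= t:
                nodes.append((level, start // size))
                start += size
            level -= 1
        return nodes

    def prefix_noise(self, t):
        """Noise on the cumulative sum over the first t rounds."""
        return sum(self._node(lvl, idx) for lvl, idx in self.covering_nodes(t))

    def round_noise(self, t):
        """Noise for round t (1-based): residual of consecutive prefix noises."""
        return self.prefix_noise(t) - self.prefix_noise(t - 1)


tree = TreeNoise(stddev=1.0, seed=0)
residuals = [tree.round_noise(t) for t in range(1, 9)]
```

Because the residuals telescope, their running total after t rounds equals the prefix noise, which involves only as many nodes as there are 1 bits in t. That is why the error on the cumulative sum grows slowly with the number of rounds.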

The standard deviation of the Gaussian noise added at each tree node is l2_norm_clip * noise_multiplier. Note that noise is added during summation of client model updates per round, before normalization (the noise will be scaled down when dividing by clients_per_round). Thus noise_multiplier can be used to compute the (epsilon, delta) privacy guarantee as described in the paper.

Args
noise_multiplier Noise multiplier for the Gaussian noise in tree aggregation. Must be non-negative; zero means no noise is applied.
clients_per_round A positive number specifying the expected number of clients per round.
l2_norm_clip The value of the clipping norm. Must be positive.
record_specs The specs of client results to be aggregated.
noise_seed Random seed for the Gaussian noise generator. If None, a nondeterministic seed based on system time will be generated.
use_efficient If true, use the efficient tree aggregation algorithm based on the paper "Efficient Use of Differentially Private Binary Trees".
record_aggregation_factory An optional tff.aggregators.UnweightedAggregationFactory to aggregate values after preprocessing by the query. See the constructor arguments above for more details.

Returns
A DifferentiallyPrivateFactory with Gaussian noise by tree aggregation.