Implements DPQuery for adding correlated noise through tree structure.
Inherits From: SumAggregationDPQuery, DPQuery
tf_privacy.TreeResidualSumQuery(
record_specs, noise_generator, clip_fn, clip_value, use_efficient=True
)
Clips and sums the records in the current sample, x_i = sum_{j=0}^{n-1} x_{i,j}, and returns the current sample plus the noise residual from tree aggregation.
The returned value is conceptually equivalent to the following. First, compute the cumulative sum of samples over time, s_i = sum_{k=0}^{i} x_k (instead of only the current sample), with noise added by the tree aggregation protocol; this noise is proportional to log(T), where T is the number of times the query is called. Then, each time the query is called, return the residual between the current noised cumulative sum noised(s_i) and the previous one noised(s_{i-1}).
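The residual construction described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: `TreeNoise` and `noised_residuals` are hypothetical names, each sample x_i is a single already-clipped scalar, and every binary-tree node gets one Gaussian draw that is reused whenever that node appears in a prefix.

```python
import random


class TreeNoise:
    """Hypothetical helper: one Gaussian draw per binary-tree node, reused."""

    def __init__(self, stddev, seed=0):
        self._rng = random.Random(seed)
        self._stddev = stddev
        self._nodes = {}  # (level, index) -> noise value

    def _node(self, level, index):
        key = (level, index)
        if key not in self._nodes:
            self._nodes[key] = self._rng.gauss(0.0, self._stddev)
        return self._nodes[key]

    def prefix_noise(self, t):
        """Noise attached to the cumulative sum of the first t samples."""
        total, start = 0.0, 0
        # Walk the binary expansion of t; each set bit selects one node
        # whose subtree covers 2**level consecutive samples.
        for level in reversed(range(t.bit_length())):
            if t & (1 << level):
                total += self._node(level, start >> level)
                start += 1 << level
        return total


def noised_residuals(samples, stddev, seed=0):
    """Returns noised(s_i) - noised(s_{i-1}) for each sample x_i."""
    tree = TreeNoise(stddev, seed)
    residuals, prev_noise = [], 0.0
    for i, x in enumerate(samples, start=1):
        noise = tree.prefix_noise(i)
        # The true cumulative sums cancel, leaving x_i plus a noise difference.
        residuals.append(x + noise - prev_noise)
        prev_noise = noise
    return residuals
```

Summing the first i residuals telescopes back to the noised cumulative sum, so code that accumulates the returned values ends up with noise from only a few tree nodes.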
This can be used as a drop-in replacement for GaussianSumQuery, and can
offer stronger utility/privacy tradeoffs when amplification-via-sampling is not
possible, or when privacy epsilon is relatively large. This may result in
more noise by a log(T) factor in each individual estimate of x_i, but if the
x_i are used in the underlying code to compute cumulative sums, the noise in
those sums can be less. That is, this allows us to adapt code that was written
to use a regular SumQuery to benefit from the tree aggregation protocol.
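To see why cumulative sums can end up with less noise, it helps to count how many independent noise draws enter the i-th cumulative sum. The sketch below assumes a plain binary tree (it does not model the `use_efficient` variant), and the function names are hypothetical. It only counts terms: the per-draw standard deviation needed for a given privacy level differs between the two approaches, as the log(T) factor above indicates.

```python
def independent_noise_terms(i):
    """Adding fresh noise to every x_k: the i-th cumulative sum carries i draws."""
    return i


def tree_noise_terms(i):
    """Binary tree aggregation: one draw per set bit in the binary form of i."""
    return bin(i).count("1")


for step in (1, 10, 100, 1000):
    print(step, independent_noise_terms(step), tree_noise_terms(step))
```

The tree count never exceeds the bit length of i, which is where the logarithmic dependence on the number of calls comes from.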
Combining this query with an SGD optimizer implements the DP-FTRL algorithm in "Practical and Private (Deep) Learning without Sampling or Shuffling".
Example usage | |
---|---|
query = TreeResidualSumQuery(...)
global_state = query.initial_global_state()
params = query.derive_sample_params(global_state)
for i, samples in enumerate(streaming_samples):
    sample_state = query.initial_sample_state(samples[0])
    # Compute x_i = sum_{j=0}^{n-1} x_{i,j}
    for j, sample in enumerate(samples):
        sample_state = query.accumulate_record(params, sample_state, sample)
    # noised_sum is a privatized estimate of x_i, obtained by conceptually
    # postprocessing the noised cumulative sum s_i
    noised_sum, global_state, event = query.get_noised_result(
        sample_state, global_state)
|
Args | |
---|---|
record_specs
|
A nested structure of tf.TensorSpecs specifying structure
and shapes of records.
|
noise_generator
|
tree_aggregation.ValueGenerator to generate the noise
value for a tree node. Should be coupled with clipping norm to guarantee
privacy.
|
clip_fn
|
Callable that specifies clipping function. Input to clip is a flat list of vars in a record. |
clip_value
|
Float indicating the value at which to clip the record. |
use_efficient
|
Boolean indicating the usage of the efficient tree aggregation algorithm based on the paper "Efficient Use of Differentially Private Binary Trees". |
Attributes | |
---|---|
clip_fn
|
Callable that specifies clipping function. clip_fn receives two
arguments: a flat list of vars in a record and a clip_value to clip the
corresponding record, e.g. clip_fn(flat_record, clip_value).
|
clip_value
|
float indicating the value at which to clip the record. |
record_specs
|
A nested structure of tf.TensorSpecs specifying structure
and shapes of records.
|
tree_aggregator
|
tree_aggregation.TreeAggregator initialized with the user-defined
noise_generator. noise_generator is a
tree_aggregation.ValueGenerator that generates the noise value for a tree
node. The noise standard deviation is specified outside the dp_query by the
user when defining noise_fn and should have order
O(clip_norm*log(T)/eps) to guarantee eps-DP.
|
Methods
accumulate_preprocessed_record
accumulate_preprocessed_record(
sample_state, preprocessed_record
)
Implements tensorflow_privacy.DPQuery.accumulate_preprocessed_record.
accumulate_record
accumulate_record(
params, sample_state, record
)
Accumulates a single record into the sample state.
This is a helper method that simply delegates to preprocess_record and
accumulate_preprocessed_record for the common case when both of those
functions run on a single device. Typically this will be a simple sum.
Args | |
---|---|
params
|
The parameters for the sample. In standard DP-SGD training, the clipping norm for the sample's microbatch gradients (i.e., a maximum norm magnitude to which each gradient is clipped) |
sample_state
|
The current sample state. In standard DP-SGD training, the accumulated sum of previous clipped microbatch gradients. |
record
|
The record to accumulate. In standard DP-SGD training, the gradient computed for the examples in one microbatch, which may be the gradient for just one example (for size 1 microbatches). |
Returns | |
---|---|
The updated sample state. In standard DP-SGD training, the set of previous microbatch gradients with the addition of the record argument. |
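The delegation described for accumulate_record can be sketched without TensorFlow. This is an illustration only: records are flat Python lists of floats, `params` is taken to be the clipping value, and the clipping shown is a simple L2 rescale chosen for the example, whereas the real query applies whatever `clip_fn` it was constructed with.

```python
import math


def preprocess_record(params, record):
    """Rescales the record so its L2 norm is at most params (the clip value)."""
    norm = math.sqrt(sum(v * v for v in record))
    scale = min(1.0, params / norm) if norm > 0.0 else 1.0
    return [v * scale for v in record]


def accumulate_preprocessed_record(sample_state, preprocessed_record):
    """Adds an already-clipped record to the running sum."""
    return [s + r for s, r in zip(sample_state, preprocessed_record)]


def accumulate_record(params, sample_state, record):
    """Clips, then sums: the two steps run back to back on one device."""
    clipped = preprocess_record(params, record)
    return accumulate_preprocessed_record(sample_state, clipped)


state = [0.0, 0.0]
for rec in ([3.0, 4.0], [0.1, 0.2]):
    state = accumulate_record(1.0, state, rec)
```

Keeping the two steps separate matters when clipping and summation run on different devices; accumulate_record is only the single-device shortcut.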
build_l2_gaussian_query
@classmethod
build_l2_gaussian_query( clip_norm, noise_multiplier, record_specs, noise_seed=None, use_efficient=True )
Returns TreeResidualSumQuery with L2 norm clipping and Gaussian noise.
Args | |
---|---|
clip_norm
|
Each record will be clipped so that it has L2 norm at most
clip_norm.
|
noise_multiplier
|
The effective noise multiplier for the sum of records.
Noise standard deviation is clip_norm*noise_multiplier. The value can
be used as the input of the privacy accounting functions in
analysis.tree_aggregation_accountant.
|
record_specs
|
A nested structure of tf.TensorSpecs specifying structure
and shapes of records.
|
noise_seed
|
Integer seed for the Gaussian noise generator. If None, a
nondeterministic seed based on system time will be generated.
|
use_efficient
|
Boolean indicating the usage of the efficient tree aggregation algorithm based on the paper "Efficient Use of Differentially Private Binary Trees". |
derive_metrics
derive_metrics(
global_state
)
Returns the clip norm as a metric.
derive_sample_params
derive_sample_params(
global_state
)
Implements tensorflow_privacy.DPQuery.derive_sample_params.
get_noised_result
get_noised_result(
sample_state, global_state
)
Implements tensorflow_privacy.DPQuery.get_noised_result.
Updates the tree state and returns the residual of the noised cumulative sum.
Args | |
---|---|
sample_state
|
Sum of clipped records for this round. |
global_state
|
Global state with the current sample's cumulative sum and tree state. |
Returns | |
---|---|
A tuple of (noised_sum, new_global_state, event), where noised_sum is the residual of the noised cumulative sum. |
initial_global_state
initial_global_state()
Implements tensorflow_privacy.DPQuery.initial_global_state.
initial_sample_state
initial_sample_state(
template=None
)
Implements tensorflow_privacy.DPQuery.initial_sample_state.
merge_sample_states
merge_sample_states(
sample_state_1, sample_state_2
)
Implements tensorflow_privacy.DPQuery.merge_sample_states.
preprocess_record
preprocess_record(
params, record
)
Implements tensorflow_privacy.DPQuery.preprocess_record.
Args | |
---|---|
params
|
clip_value for the record.
|
record
|
The record to be processed. |
Returns | |
---|---|
Structure of clipped tensors. |
preprocess_record_l2_impl
preprocess_record_l2_impl(
params, record
)
Clips the l2 norm, returning the clipped record and the l2 norm.
Args | |
---|---|
params
|
The parameters for the sample. |
record
|
The record to be processed. |
Returns | |
---|---|
A tuple (preprocessed_records, l2_norm) where preprocessed_records is
the structure of preprocessed tensors, and l2_norm is the total l2 norm
before clipping.
|
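As a rough picture of the return value described above, the following sketch clips a record made of several tensors by its total L2 norm and returns both the clipped record and the pre-clipping norm. It is an assumption-laden stand-in, not the library's code: `clip_by_global_l2_norm` is a hypothetical name, and tensors are modeled as plain lists of floats.

```python
import math


def clip_by_global_l2_norm(record, clip_norm):
    """record is a list of tensors, each modeled as a list of floats."""
    # Total L2 norm over every coordinate of every tensor, before clipping.
    l2_norm = math.sqrt(sum(v * v for tensor in record for v in tensor))
    # Records already inside the bound are left unchanged (scale of 1.0).
    scale = min(1.0, clip_norm / l2_norm) if l2_norm > 0.0 else 1.0
    clipped = [[v * scale for v in tensor] for tensor in record]
    return clipped, l2_norm


clipped, norm = clip_by_global_l2_norm([[3.0], [4.0]], clip_norm=1.0)
```

Returning the unclipped norm alongside the clipped record lets callers track how often clipping is active without recomputing it.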
reset_l2_clip_gaussian_noise
reset_l2_clip_gaussian_noise(
global_state, clip_norm, stddev
)
reset_state
reset_state(
noised_results, global_state
)
Returns state after resetting the tree.
This function will be used in restart_query.RestartQuery after calling
get_noised_result when the restarting condition is met.
Args | |
---|---|
noised_results
|
Noised results returned by get_noised_result.
|
global_state
|
Updated global state returned by get_noised_result, which
records noise for the conceptual cumulative sum of the current leaf
node, and tree state for the next conceptual cumulative sum.
|
Returns | |
---|---|
New global state with zero noise and restarted tree state. |