Interface for differentially private query mechanisms.
Differential privacy is achieved by processing records to bound sensitivity, accumulating the processed records (usually by summing them) and then adding noise to the aggregated result. The process can be repeated to compose applications of the same mechanism, possibly with different parameters.
The DPQuery interface specifies a functional approach to this process. A global state object holds any information that must persist across applications of the mechanism. For each application, the following steps are performed:
- Use the global state to derive parameters to use for the next sample of records.
- Initialize a sample state that will accumulate processed records.
- For each record: (a) process the record, then (b) accumulate the processed record into the sample state.
- Get the result of the mechanism, possibly updating the global state to use in the next application.
- Derive metrics from the global state.
Here is an example using the `GaussianSumQuery`. Assume there is some function `records_for_round(round)` that returns an iterable of records to use on a given round, and that `num_rounds` is defined.
```python
import tensorflow_privacy

dp_query = tensorflow_privacy.GaussianSumQuery(
    l2_norm_clip=1.0, stddev=1.0)

global_state = dp_query.initial_global_state()

for round in range(num_rounds):
  sample_params = dp_query.derive_sample_params(global_state)
  sample_state = dp_query.initial_sample_state()
  for record in records_for_round(round):
    sample_state = dp_query.accumulate_record(
        sample_params, sample_state, record)
  # get_noised_result returns a (result, new_global_state, event) tuple;
  # the event feeds privacy accounting.
  result, global_state, event = dp_query.get_noised_result(
      sample_state, global_state)
  metrics = dp_query.derive_metrics(global_state)
  # Do something with result and metrics...
```
Methods
accumulate_preprocessed_record
```python
@abc.abstractmethod
accumulate_preprocessed_record(
    sample_state, preprocessed_record
)
```
Accumulates a single preprocessed record into the sample state.
This method is intended to only do simple aggregation, typically just a sum. In the future, we might remove this method and replace it with a way to declaratively specify the type of aggregation required.
Args | |
---|---|
`sample_state` | The current sample state. In standard DP-SGD training, the accumulated sum of previous clipped microbatch gradients. |
`preprocessed_record` | The preprocessed record to accumulate. |

Returns | |
---|---|
The updated sample state. |
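
For sum-aggregated queries, this method typically reduces to an element-wise addition over the nested structures. A minimal sketch (not the library's implementation), assuming the sample state and preprocessed record have matching structures:

```python
import tensorflow as tf

def accumulate_preprocessed_record(self, sample_state, preprocessed_record):
  # Element-wise sum of the matching leaves of the two nested structures.
  return tf.nest.map_structure(tf.add, sample_state, preprocessed_record)
```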
accumulate_record
```python
accumulate_record(
    params, sample_state, record
)
```
Accumulates a single record into the sample state.
This is a helper method that simply delegates to `preprocess_record` and `accumulate_preprocessed_record` for the common case when both of those functions run on a single device. The accumulation itself is typically a simple sum.
Args | |
---|---|
`params` | The parameters for the sample. In standard DP-SGD training, the clipping norm for the sample's microbatch gradients (i.e., a maximum norm magnitude to which each gradient is clipped). |
`sample_state` | The current sample state. In standard DP-SGD training, the accumulated sum of previous clipped microbatch gradients. |
`record` | The record to accumulate. In standard DP-SGD training, the gradient computed for the examples in one microbatch, which may be the gradient for just one example (for microbatches of size 1). |

Returns | |
---|---|
The updated sample state. In standard DP-SGD training, the sum of the previous microbatch gradients with the addition of the record argument. |
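
Since this is a convenience helper, a straightforward implementation is just the composition of the two methods it delegates to. A sketch:

```python
def accumulate_record(self, params, sample_state, record):
  # Preprocess (e.g., clip) the record, then fold it into the running state.
  preprocessed_record = self.preprocess_record(params, record)
  return self.accumulate_preprocessed_record(sample_state, preprocessed_record)
```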
derive_metrics
```python
derive_metrics(
    global_state
)
```
Derives metric information from the current global state.
Any metrics returned should be derived only from privatized quantities.
Args | |
---|---|
`global_state` | The global state from which to derive metrics. |

Returns | |
---|---|
A `collections.OrderedDict` mapping string metric names to tensor values. |
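
As an illustration, a clipping-based query might report its current clipping norm, which is set by the mechanism rather than by the data. The `l2_norm_clip` field here is a hypothetical attribute of the global state, not part of the interface:

```python
import collections

def derive_metrics(self, global_state):
  # The clipping norm is a public mechanism parameter, not derived from raw
  # records, so it is safe to report. `l2_norm_clip` is hypothetical.
  return collections.OrderedDict(clip=global_state.l2_norm_clip)
```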
derive_sample_params
```python
derive_sample_params(
    global_state
)
```
Given the global state, derives parameters to use for the next sample.
For example, if the mechanism needs to clip records to bound the norm, the clipping norm should be part of the sample params. In a distributed context, this is the part of the state that would be sent to the workers so they can process records.
Args | |
---|---|
`global_state` | The current global state. |

Returns | |
---|---|
Parameters to use to process records in the next sample. |
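
For example, a clipping-based query might expose only the clipping norm to workers. A sketch, again assuming a hypothetical `l2_norm_clip` field on the global state:

```python
def derive_sample_params(self, global_state):
  # Workers only need the clipping norm to preprocess their records.
  return global_state.l2_norm_clip
```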
get_noised_result
```python
@abc.abstractmethod
get_noised_result(
    sample_state, global_state
)
```
Gets the query result after all records of sample have been accumulated.
The global state can also be updated for use in the next application of the DP mechanism.
Args | |
---|---|
`sample_state` | The sample state after all records have been accumulated. In standard DP-SGD training, the accumulated sum of clipped microbatch gradients (in the special case of microbatches of size 1, the clipped per-example gradients). |
`global_state` | The global state, storing long-term privacy bookkeeping. |
Returns | |
---|---|
A tuple `(result, new_global_state, event)`, where `result` is the noised result of the query, `new_global_state` is the updated global state to use in the next application, and `event` is a `DpEvent` describing this application of the mechanism for privacy accounting. |
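
A sketch of a Gaussian-noise version, assuming a global state with hypothetical `l2_norm_clip` and `stddev` fields and using the `dp_accounting` package for the returned event:

```python
import dp_accounting
import tensorflow as tf

def get_noised_result(self, sample_state, global_state):
  # Add independent Gaussian noise to every leaf of the accumulated sum.
  def add_noise(v):
    return v + tf.random.normal(tf.shape(v), stddev=global_state.stddev)

  result = tf.nest.map_structure(add_noise, sample_state)
  # The event records the effective noise multiplier for accounting.
  event = dp_accounting.GaussianDpEvent(
      noise_multiplier=global_state.stddev / global_state.l2_norm_clip)
  return result, global_state, event
```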
initial_global_state
```python
initial_global_state()
```
Returns the initial global state for the DPQuery.
The global state contains any state information that changes across repeated applications of the mechanism. The default implementation returns just an empty tuple for implementing classes that do not have any persistent state.
This object must be processable via `tf.nest.map_structure`.
Returns | |
---|---|
The global state. |
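
For example, a Gaussian sum mechanism might keep its clipping norm and noise scale in a flat namedtuple, which `tf.nest.map_structure` can traverse. A sketch with hypothetical field names:

```python
import collections

# Hypothetical persistent state: clipping norm and noise scale.
_GlobalState = collections.namedtuple('_GlobalState', ['l2_norm_clip', 'stddev'])

def initial_global_state(self):
  return _GlobalState(l2_norm_clip=1.0, stddev=1.0)
```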
initial_sample_state
```python
@abc.abstractmethod
initial_sample_state(
    template=None
)
```
Returns an initial state to use for the next sample.
For typical `DPQuery` classes that are aggregated by summation, this should return a nested structure of zero tensors of the appropriate shapes, to which processed records will be aggregated.
Args | |
---|---|
`template` | A nested structure of tensors, `TensorSpec`s, or numpy arrays used as a template to create the initial sample state. It is assumed that the leaves of the structure are Python scalars or some type that has properties `shape` and `dtype`. |
Returns | |
---|---|
An initial sample state. |
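
For a sum-aggregated query, a sketch that builds a structure of zeros matching the supplied template (assuming each leaf exposes `shape` and `dtype`):

```python
import tensorflow as tf

def initial_sample_state(self, template=None):
  # One zero tensor per leaf, matching the template's shapes and dtypes.
  return tf.nest.map_structure(
      lambda t: tf.zeros(t.shape, t.dtype), template)
```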
merge_sample_states
```python
@abc.abstractmethod
merge_sample_states(
    sample_state_1, sample_state_2
)
```
Merges two sample states into a single state.
This can be useful if aggregation is performed hierarchically, where multiple sample states are used to accumulate records and then hierarchically merged into the final accumulated state. Typically this will be a simple sum.
Args | |
---|---|
`sample_state_1` | The first sample state to merge. |
`sample_state_2` | The second sample state to merge. |

Returns | |
---|---|
The merged sample state. |
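
For sum aggregation, merging is again an element-wise addition. A minimal sketch:

```python
import tensorflow as tf

def merge_sample_states(self, sample_state_1, sample_state_2):
  # Merge two partial sums by adding their matching leaves.
  return tf.nest.map_structure(tf.add, sample_state_1, sample_state_2)
```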
preprocess_record
```python
preprocess_record(
    params, record
)
```
Preprocesses a single record.
This preprocessing is applied to one client's record, e.g. selecting vectors and clipping them to a fixed L2 norm. This method can be executed in a separate TF session, or even on a different machine, so it should not depend on any TF inputs other than those provided as input arguments. In particular, implementations should avoid accessing any TF tensors or variables that are stored in `self`.
Args | |
---|---|
`params` | The parameters for the sample. In standard DP-SGD training, the clipping norm for the sample's microbatch gradients (i.e., a maximum norm magnitude to which each gradient is clipped). |
`record` | The record to be processed. In standard DP-SGD training, the gradient computed for the examples in one microbatch, which may be the gradient for just one example (for microbatches of size 1). |

Returns | |
---|---|
A structure of tensors to be aggregated. |
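
A sketch of L2 clipping, assuming `params` is the clipping norm itself (as in the clipping-based queries):

```python
import tensorflow as tf

def preprocess_record(self, params, record):
  # Clip the record's global L2 norm to the bound carried in `params`.
  l2_norm_clip = params
  flat_record = tf.nest.flatten(record)
  clipped, _ = tf.clip_by_global_norm(flat_record, l2_norm_clip)
  return tf.nest.pack_sequence_as(record, clipped)
```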