Builds the tff.Computation
for heavy-hitters discovery with IBLT.
tff.analytics.heavy_hitters.iblt.build_iblt_computation(
*,
capacity: int = 1000,
max_string_length: int = 10,
repetitions: int = 3,
seed: int = 0,
max_heavy_hitters: Optional[int] = None,
max_words_per_user: Optional[int] = None,
k_anonymity: int = 1,
secure_sum_bitwidth: Optional[int] = None,
batch_size: int = 1,
multi_contribution: bool = True,
string_postprocessor: Optional[Callable[[tf.Tensor], tf.Tensor]] = None,
decode_iblt_fn: Optional[Callable[..., Tuple[tf.Tensor, tf.Tensor, tf.Tensor, tf.Tensor]]] = None
) -> tff.Computation
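As a rough usage sketch, the call below spells out every documented keyword argument and overrides two of the defaults. It assumes a TensorFlow Federated release that exposes this symbol and is importable as tensorflow_federated; the helper name build_heavy_hitters_computation and the overridden values (max_heavy_hitters=50, k_anonymity=2) are illustrative, not part of the library.

```python
# Keyword arguments taken from the documented signature. The two overridden
# values are illustrative assumptions, not recommendations.
IBLT_KWARGS = dict(
    capacity=1000,
    max_string_length=10,
    repetitions=3,            # documented minimum is 3
    seed=0,
    max_heavy_hitters=50,     # assumption: keep only the top 50 decoded items
    max_words_per_user=None,  # None: every client contributes all its words
    k_anonymity=2,            # assumption: require at least 2 clients per word
    secure_sum_bitwidth=None, # None: secure sum disabled
    batch_size=1,
    multi_contribution=True,
    string_postprocessor=None,
    decode_iblt_fn=None,
)


def build_heavy_hitters_computation():
    """Hypothetical helper: builds the computation with the kwargs above."""
    # Imported lazily so this module still loads where TFF is not installed.
    import tensorflow_federated as tff

    return tff.analytics.heavy_hitters.iblt.build_iblt_computation(**IBLT_KWARGS)
```

All parameters are keyword-only (note the leading * in the signature), so passing them by name as above is required rather than a style choice.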
Args |
capacity
|
The capacity of the IBLT sketch. Defaults to 1000.
|
max_string_length
|
The maximum length of a string in the IBLT. Must be positive. Defaults to
10.
|
repetitions
|
The number of repetitions in the IBLT data structure. Must be at least 3.
Defaults to 3.
|
seed
|
An integer seed for hash functions. Defaults to 0.
|
max_heavy_hitters
|
The maximum number of items to return. If more than this number of items
are decoded, they are sorted by estimated count in decreasing order and
only the top max_heavy_hitters items are returned. Defaults to None, which
returns all the heavy hitters in the result.
|
max_words_per_user
|
The maximum number of words each client is allowed to
contribute. If not None, must be a positive integer. Defaults to None,
which means every client contributes all of its words.
|
k_anonymity
|
Only return words contributed by at least k_anonymity clients. Must be a
positive integer. Defaults to 1.
|
secure_sum_bitwidth
|
The bitwidth used for federated secure sum. The default
value is None, which disables secure sum. If not None, must be in the
range [1, 62]. Note that when this parameter is not None, the IBLT
sketches are summed via federated_secure_modular_sum with modulus equal
to IBLT's default field size, and other values (client count, string count
tensor) are aggregated via federated_secure_sum with
max_input=2**secure_sum_bitwidth - 1.
|
batch_size
|
The number of elements in each batch of the dataset. Must be a positive
integer. Defaults to 1, which means the input dataset is processed by
tf.data.Dataset.batch(1).
|
multi_contribution
|
Whether each client is allowed to contribute multiple
counts or only a count of one for each unique word. Defaults to True.
|
string_postprocessor
|
A callable that is run after strings are
decoded from the IBLT in order to postprocess them. It should accept a
single string tensor and output a single string tensor of the same shape.
If None, no postprocessing is done.
|
decode_iblt_fn
|
A function to decode key-value pairs from an IBLT sketch.
Defaults to None, in which case decode_iblt_fn is set to
iblt.decode_iblt_tf.
|
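To make the interplay of max_words_per_user, multi_contribution, k_anonymity and max_heavy_hitters concrete, here is a plain-Python sketch of the semantics described above. It is not the library's implementation: it uses exact counting instead of an IBLT sketch, the function names are hypothetical, and truncating each client to its first max_words_per_user words is an assumption about how the cap is applied.

```python
from collections import Counter


def client_counts(words, max_words_per_user=None, multi_contribution=True):
    """One client's contribution as a word -> count mapping."""
    if max_words_per_user is not None:
        # Assumption: the cap keeps the first N words of the client's data.
        words = words[:max_words_per_user]
    counts = Counter(words)
    if not multi_contribution:
        # Each unique word counts once per client.
        counts = Counter({word: 1 for word in counts})
    return counts


def aggregate(clients, k_anonymity=1, max_heavy_hitters=None, **client_kwargs):
    """Exact-count stand-in for the federated heavy-hitters result."""
    total = Counter()      # summed count per word
    n_clients = Counter()  # number of distinct clients per word
    for words in clients:
        counts = client_counts(words, **client_kwargs)
        total.update(counts)
        n_clients.update(counts.keys())
    # Keep only words contributed by at least k_anonymity clients.
    kept = [(w, c) for w, c in total.items() if n_clients[w] >= k_anonymity]
    # Decreasing count; ties broken alphabetically for determinism.
    kept.sort(key=lambda item: (-item[1], item[0]))
    if max_heavy_hitters is not None:
        kept = kept[:max_heavy_hitters]
    return kept


clients = [["a", "a", "b"], ["a", "c"], ["b", "a"]]
print(aggregate(clients, k_anonymity=2))
# [('a', 4), ('b', 2)]
print(aggregate(clients, k_anonymity=2, multi_contribution=False))
# [('a', 3), ('b', 2)]
```

In the first call, "c" is dropped because only one client contributed it; in the second, each client adds at most one to a word's count, so "a" falls from 4 to 3.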
Raises |
ValueError
|
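If any argument violates the constraints documented above, for example
repetitions less than 3 or secure_sum_bitwidth outside the range [1, 62].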
|
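The documented constraints can be checked up front before building the computation. The sketch below is a hypothetical pre-flight validator, not the library's own checks: the function names and error messages are made up for illustration, and only constraints stated in the Args table are encoded.

```python
def check_iblt_args(
    max_string_length=10,
    repetitions=3,
    max_words_per_user=None,
    k_anonymity=1,
    secure_sum_bitwidth=None,
    batch_size=1,
):
    """Hypothetical validator mirroring the constraints in the Args table."""
    if max_string_length < 1:
        raise ValueError("max_string_length must be positive")
    if repetitions < 3:
        raise ValueError("repetitions must be at least 3")
    if max_words_per_user is not None and max_words_per_user < 1:
        raise ValueError("max_words_per_user must be None or positive")
    if k_anonymity < 1:
        raise ValueError("k_anonymity must be positive")
    if secure_sum_bitwidth is not None and not 1 <= secure_sum_bitwidth <= 62:
        raise ValueError("secure_sum_bitwidth must be None or in [1, 62]")
    if batch_size < 1:
        raise ValueError("batch_size must be positive")


def secure_sum_max_input(secure_sum_bitwidth):
    """max_input documented for federated_secure_sum: 2**bitwidth - 1."""
    check_iblt_args(secure_sum_bitwidth=secure_sum_bitwidth)
    return 2**secure_sum_bitwidth - 1


check_iblt_args()                 # defaults satisfy every constraint
print(secure_sum_max_input(8))    # 255
```

Running such a check before the build call surfaces a bad value early, with a message that names the offending parameter.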