tff.analytics.heavy_hitters.iblt.IbltDecoder
Decodes the strings and counts stored in an IBLT data structure.
tff.analytics.heavy_hitters.iblt.IbltDecoder(
iblt: tf.Tensor,
capacity: int,
string_max_bytes: int,
*,
encoding: tff.analytics.heavy_hitters.iblt.CharacterEncoding = tff.analytics.heavy_hitters.iblt.CharacterEncoding.UTF8,
seed: int = 0,
repetitions: int = DEFAULT_REPETITIONS,
hash_family: Optional[str] = None,
hash_family_params: Optional[dict[str, Union[int, float]]] = None,
field_size: int = DEFAULT_FIELD_SIZE
)
Args
| Name | Description |
|---|---|
| iblt | Tensor representing the IBLT computed by the IbltEncoder. |
| capacity | Number of distinct strings that we expect to be inserted. |
| string_max_bytes | Maximum length of a string in bytes that can be inserted. |
| encoding | The character encoding of the string data to decode. For non-character binary data or strings with unknown encoding, specify CharacterEncoding.UNKNOWN. Defaults to CharacterEncoding.UTF8. |
| seed | Integer seed for hash functions. Defaults to 0. |
| repetitions | Number of repetitions in IBLT data structure (must be >= 3). Defaults to 3. |
| hash_family | A str specifying the hash family to use to construct IBLT. Options include coupled or random; the default is chosen based on capacity. |
| hash_family_params | An optional dict of parameters that the hash family hasher expects. Defaults are chosen based on capacity. |
| field_size | The field size for all values in IBLT. Defaults to 2**31 - 1. |
Methods
decode_string_from_chunks
decode_string_from_chunks(
chunks
)
Computes a string from a sequence of ints, each encoding chunk_length bytes.
Inverse of IbltEncoder.compute_iblt.
Args
| Name | Description |
|---|---|
| chunks | A tf.Tensor of num_chunks integers. |

Returns
A tf.Tensor with the string encoded in the chunks.
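To make the chunk representation concrete, here is a minimal pure-Python sketch of packing a byte string into fixed-width integer chunks and recovering it. The helper names `pack_chunks` and `unpack_chunks`, the big-endian byte order, the zero padding, and the 4-byte chunk width are all assumptions made for illustration; the library's actual chunk layout may differ.

```python
def pack_chunks(data: bytes, chunk_length: int, num_chunks: int) -> list[int]:
    """Packs bytes into num_chunks integers, each holding chunk_length bytes.

    Hypothetical helper: zero-pads on the right and uses big-endian order.
    """
    total = chunk_length * num_chunks
    if len(data) > total:
        raise ValueError("data does not fit in chunk_length * num_chunks bytes")
    padded = data.ljust(total, b"\x00")
    return [
        int.from_bytes(padded[i:i + chunk_length], "big")
        for i in range(0, total, chunk_length)
    ]


def unpack_chunks(chunks: list[int], chunk_length: int) -> bytes:
    """Rebuilds the byte string from integer chunks and strips the zero padding."""
    raw = b"".join(c.to_bytes(chunk_length, "big") for c in chunks)
    return raw.rstrip(b"\x00")


# Round trip for a UTF-8 string that needs more than one chunk.
original = "héllo".encode("utf-8")  # 6 bytes in UTF-8
chunks = pack_chunks(original, chunk_length=4, num_chunks=2)
recovered = unpack_chunks(chunks, chunk_length=4)
print(chunks, recovered.decode("utf-8"))
```

Because the padding byte is zero, this sketch cannot represent strings that end in a null byte; a real encoding has to handle that case separately.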
get_freq_estimates
get_freq_estimates()
Decodes key-value pairs from an IBLT.
Note that this method works only for UTF-8 strings, and only when TensorFlow runs in eager mode.
Returns
A dictionary mapping each decoded key to its frequency.
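The general idea behind decoding key-value pairs from an IBLT can be sketched with a toy pure-Python table that is decoded by "peeling" cells that hold a single key. This is not the library's implementation: the `ToyIblt` class, the SHA-256 based hashing, the table width of `4 * capacity`, and the use of unbounded Python integers (instead of chunked values in a finite field of size `field_size`) are all assumptions chosen to keep the example short.

```python
import hashlib


def _hash(key: int, salt: str, modulus: int) -> int:
    """Deterministic hash of an integer key into [0, modulus)."""
    digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class ToyIblt:
    """Toy IBLT: `repetitions` sub-tables of [count, key_sum, check_sum] cells."""

    def __init__(self, capacity: int, repetitions: int = 3, seed: int = 0):
        self.repetitions = repetitions
        self.seed = seed
        self.width = 4 * capacity  # hypothetical sizing, not the library's
        self.cells = [
            [[0, 0, 0] for _ in range(self.width)] for _ in range(repetitions)
        ]

    def _check(self, key: int) -> int:
        return _hash(key, f"check-{self.seed}", 2**61 - 1)

    def _update(self, key: int, count: int) -> None:
        # Each key touches exactly one cell in every repetition.
        check = self._check(key)
        for r in range(self.repetitions):
            cell = self.cells[r][_hash(key, f"rep{r}-{self.seed}", self.width)]
            cell[0] += count
            cell[1] += count * key
            cell[2] += count * check

    def insert(self, text: str, count: int = 1) -> None:
        self._update(int.from_bytes(text.encode("utf-8"), "big"), count)

    def decode(self) -> tuple[dict[str, int], int]:
        """Peels pure cells until none remain. Destroys the table contents."""
        decoded = {}
        progress = True
        while progress:
            progress = False
            for row in self.cells:
                for cell in row:
                    count, key_sum, check_sum = cell
                    if count <= 0 or key_sum % count:
                        continue
                    key = key_sum // count
                    if self._check(key) * count != check_sum:
                        continue  # cell holds more than one distinct key
                    raw = key.to_bytes((key.bit_length() + 7) // 8, "big")
                    decoded[raw.decode("utf-8")] = count
                    self._update(key, -count)  # remove the key everywhere
                    progress = True
        # Whatever is left in the first sub-table could not be decoded.
        num_not_decoded = sum(cell[0] for cell in self.cells[0])
        return decoded, num_not_decoded


table = ToyIblt(capacity=8)
for word, freq in [("hello", 3), ("world", 2), ("iblt", 1)]:
    table.insert(word, freq)
frequencies, num_not_decoded = table.decode()
print(frequencies, num_not_decoded)
```

Peeling can stall when every remaining key shares all of its cells with other keys, which is why decoding reports how many items were left undecoded rather than assuming success.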
get_freq_estimates_tf
@tf.function
get_freq_estimates_tf() -> tuple[tf.Tensor, tf.Tensor, tf.Tensor]
Decodes key-value pairs from an IBLT.
Returns
A tuple (out_strings, out_counts, num_not_decoded), where out_strings is a tf.Tensor containing all the decoded strings, out_counts is a tf.Tensor containing the count of each string, and num_not_decoded is a tf.Tensor with the number of items in the IBLT that were not decoded.
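As a rough illustration of consuming this return value, the sketch below turns the three outputs into a dictionary plus a completeness flag. The helper `to_frequency_dict` is hypothetical, and the inputs are plain Python stand-ins for the tensor contents (for example, what calling `.numpy()` on each tensor in eager mode would yield); the sample values are made up.

```python
def to_frequency_dict(out_strings, out_counts, num_not_decoded, encoding="utf-8"):
    """Pairs decoded byte strings with their counts.

    Returns (frequencies, complete), where `complete` is True only if
    nothing was left undecoded in the IBLT.
    """
    if len(out_strings) != len(out_counts):
        raise ValueError("out_strings and out_counts must have the same length")
    frequencies = {
        s.decode(encoding): int(c) for s, c in zip(out_strings, out_counts)
    }
    return frequencies, int(num_not_decoded) == 0


# Hypothetical stand-in values for the contents of the three tensors.
strings = [b"hello", b"world"]
counts = [3, 2]
freqs, complete = to_frequency_dict(strings, counts, 0)
print(freqs, complete)
```

Checking `num_not_decoded` before trusting the counts matters because a partially decoded IBLT silently omits some strings.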
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-09-20 UTC.