
Decodes the strings and counts stored in an IBLT data structure.

iblt Tensor representing the IBLT computed by the IbltEncoder.
capacity Number of distinct strings that we expect to be inserted.
string_max_length Maximum length of a string that can be inserted.
seed Integer seed for hash functions. Defaults to 0.
repetitions Number of repetitions in IBLT data structure (must be >= 3). Defaults to 3.
hash_family A str specifying the hash family used to construct the IBLT. Options are coupled or random; the default is chosen based on capacity.
hash_family_params An optional dict of parameters that the hash family hasher expects. Defaults are chosen based on capacity.
dtype A TensorFlow data type that determines the type of the IBLT values.
field_size The field size for all values in IBLT. Defaults to 2**31 - 1.




Computes a string from a sequence of ints, each encoding chunk_length bytes.

Inverse of IbltEncoder.compute_iblt.

chunks A tf.Tensor of num_chunks integers.

A tf.Tensor with the UTF-8 string encoded in the chunks.
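The chunk-to-string conversion can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper names string_to_chunks and chunks_to_string are hypothetical, and the big-endian byte order and zero-byte padding are assumptions made only for this example.

```python
def string_to_chunks(text, chunk_length, num_chunks):
    """Packs a UTF-8 string into num_chunks ints of chunk_length bytes each.

    Assumption: big-endian packing, right-padded with zero bytes.
    """
    data = text.encode("utf-8")
    capacity = chunk_length * num_chunks
    if len(data) > capacity:
        raise ValueError("string does not fit in the available chunks")
    data = data.ljust(capacity, b"\x00")
    return [
        int.from_bytes(data[i:i + chunk_length], "big")
        for i in range(0, capacity, chunk_length)
    ]


def chunks_to_string(chunks, chunk_length):
    """Inverse of string_to_chunks: unpacks the ints and strips the padding."""
    data = b"".join(int(c).to_bytes(chunk_length, "big") for c in chunks)
    return data.rstrip(b"\x00").decode("utf-8")


chunks = string_to_chunks("héllo", chunk_length=4, num_chunks=3)
print(chunks_to_string(chunks, chunk_length=4))
```

Under this padding scheme, a string that ends in a zero byte would not round-trip, which is acceptable for an illustration but would need handling in real code.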



Decodes key-value pairs from an IBLT.

Note that this method works only when TensorFlow runs in eager mode.

A dictionary mapping each decoded key to its frequency.



Decodes key-value pairs from an IBLT.

A tuple (out_strings, out_counts, num_not_decoded), where out_strings is a tf.Tensor containing all the decoded strings, out_counts is a tf.Tensor containing the count of each string, and num_not_decoded is a tf.Tensor with the number of items in the IBLT that could not be decoded.
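How decoding can produce strings, counts, and a number of undecoded items is easiest to see in a toy IBLT. The following pure-Python sketch is an illustration only and does not reflect the library's internals: the functions new_table, insert, and decode are hypothetical, hashing uses SHA-256 rather than the coupled or random hash families, and sums are kept as unbounded Python ints instead of being reduced modulo field_size.

```python
import hashlib


def _hash(label, key):
    # Deterministic 64-bit hash of (label, key); illustrative choice only.
    digest = hashlib.sha256(f"{label}|{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def new_table(repetitions=3, cells_per_rep=32):
    # Each cell holds [count, key_sum, check_sum].
    return [[[0, 0, 0] for _ in range(cells_per_rep)] for _ in range(repetitions)]


def _update(table, key, delta):
    # Adds delta copies of key to one cell in every repetition.
    for rep, row in enumerate(table):
        cell = row[_hash(f"pos{rep}", key) % len(row)]
        cell[0] += delta
        cell[1] += delta * key
        cell[2] += delta * _hash("check", key)


def insert(table, text, count=1):
    _update(table, int.from_bytes(text.encode("utf-8"), "big"), count)


def decode(table):
    """Peels pure cells until none remain.

    Returns (out_strings, out_counts, num_not_decoded).
    """
    table = [[cell[:] for cell in row] for row in table]  # work on a copy
    out_strings, out_counts = [], []
    progress = True
    while progress:
        progress = False
        for row in table:
            for cell in row:
                count, key_sum, check_sum = cell
                if count <= 0 or key_sum % count:
                    continue
                key = key_sum // count
                if _hash("check", key) * count != check_sum:
                    continue  # cell mixes several keys
                # Pure cell: record the key and remove it everywhere.
                _update(table, key, -count)
                size = (key.bit_length() + 7) // 8
                out_strings.append(key.to_bytes(size, "big").decode("utf-8"))
                out_counts.append(count)
                progress = True
    # Every key occupies exactly one cell per repetition, so the counts
    # left in the first repetition equal the number of undecoded items.
    num_not_decoded = sum(cell[0] for cell in table[0])
    return out_strings, out_counts, num_not_decoded


table = new_table()
for text, count in [("apple", 3), ("pear", 1), ("fig", 2)]:
    insert(table, text, count)
print(decode(table))
```

When the table is too small for the number of distinct strings, peeling stalls before every cell is emptied and num_not_decoded is nonzero, which is why capacity and repetitions matter when the IBLT is constructed.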