
Hashes a string to a hyper-edge with coupled indices.

For a string, generates a set of indices such that the indices are close to each other, as described in

seed An integer seed for hash functions.
table_size The hash table size of the IBLT. Must be a positive integer.
repetitions The number of repetitions in IBLT data structure. Must be at least 3.
rescale_factor A float used to rescale table_size to table_size / rescale_factor + 1 (denoted z in the referenced construction). Must be non-negative and no greater than table_size - 1.

ValueError Raised if the arguments do not meet the constraints listed above.
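
The constraints above can be summarized as a small check. This is a minimal sketch, not the library's code: `validate_hasher_args` is a hypothetical helper, and the exact checks and error messages the real constructor uses are assumptions.

```python
def validate_hasher_args(seed, table_size, repetitions, rescale_factor):
    """Hypothetical helper mirroring the documented argument constraints."""
    if not isinstance(seed, int):
        raise ValueError(f"seed must be an integer, got {seed!r}")
    if not isinstance(table_size, int) or table_size < 1:
        raise ValueError(f"table_size must be a positive integer, got {table_size!r}")
    if not isinstance(repetitions, int) or repetitions < 3:
        raise ValueError(f"repetitions must be at least 3, got {repetitions!r}")
    # Documented bound: non-negative and no greater than table_size - 1.
    if not 0 <= rescale_factor <= table_size - 1:
        raise ValueError(
            f"rescale_factor must be in [0, {table_size - 1}], got {rescale_factor!r}"
        )


# Passes silently when every constraint holds.
validate_hasher_args(seed=0, table_size=100, repetitions=3, rescale_factor=0.25)
```

Calling it with `repetitions=2` or `rescale_factor=100` for a table of size 100 raises `ValueError`.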




Computes the indices to which the given strings are hashed in the IBLT.

data_strings A list of strings to be hashed.

hash_indices A vector of repetitions hash values for each data string, each value in {0,...,table_size-1}.
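
To give a feel for what "coupled" indices could look like, here is a toy illustration in plain Python. It is not the library's algorithm: the window scheme, the `window` parameter, the SHA-256 based `_stable_hash`, and the name `toy_coupled_indices` are all assumptions made for this sketch. It only reproduces the documented output contract (one index per repetition, each in {0,...,table_size-1}) and keeps the indices for one string close together.

```python
import hashlib


def _stable_hash(text):
    """Deterministic integer hash of a string (SHA-256, not the library's hash)."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def toy_coupled_indices(data_string, seed, table_size, repetitions, window):
    """Toy scheme: all indices for a string fall inside one window of the table."""
    if not 1 <= window <= table_size:
        raise ValueError("window must be in [1, table_size]")
    # Choose where this string's window starts.
    start = _stable_hash(f"{seed}|start|{data_string}") % (table_size - window + 1)
    # Each repetition picks its own offset inside that window.
    return [
        start + _stable_hash(f"{seed}|{r}|{data_string}") % window
        for r in range(repetitions)
    ]


indices = toy_coupled_indices("hello", seed=7, table_size=100, repetitions=3, window=10)
```

In this toy version the indices for one string always differ by less than `window`, but they are not guaranteed to be distinct; a real hyper-edge hasher would need to handle that.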



Returns a Tensor containing the hash position of each (input string, repetition) pair.

data_strings A tf.Tensor of strings.

A tf.Tensor of shape (input_length, repetitions, 3) containing value i at index (i, r, 0), value r at index (i, r, 1) and the hash-index of the i-th input string in repetition r at index (i, r, 2).
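
The layout of the returned tensor can be sketched with nested Python lists standing in for a tf.Tensor. The helper name `to_position_triples` and the sample hash indices are made up for illustration; only the (i, r, hash-index) layout comes from the description above.

```python
def to_position_triples(hash_indices):
    """Builds the (input_length, repetitions, 3) layout from per-string indices.

    hash_indices[i][r] is the hash index of the i-th string in repetition r.
    """
    return [
        [[i, r, index] for r, index in enumerate(row)]
        for i, row in enumerate(hash_indices)
    ]


# Two input strings, three repetitions each (sample values, not real hashes).
triples = to_position_triples([[4, 7, 5], [0, 2, 1]])
# triples[1][2] is [1, 2, 1]: string 1, repetition 2, hash index 1.
```

Each innermost triple reads as (string position, repetition, table index), so the result can be flattened into a list of coordinates when one is needed.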