tff.analytics.data_processing.get_top_multi_elements

Gets a multiset of the top max_user_contribution unique words from the input dataset.

This method returns the max_user_contribution most common unique words from the dataset as a multiset. That is, each selected word appears in the output as many times as it did in the dataset, but each unique word counts only once toward the max_user_contribution limit.

This differs from get_top_elements in that it returns a multiset rather than a set.

The input dataset must yield batched 1-d tensors. This function reads each coordinate of the tensor as an individual element and caps the number of unique elements returned at max_user_contribution. Note that the returned top words are not necessarily sorted.
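The multiset behavior described above can be modeled in plain Python. This is a minimal sketch, not the library's implementation: top_multi_elements_sketch is a hypothetical helper, lists of strings stand in for batched 1-d string tensors, and the handling of ties between equally common words is an assumption.

```python
from collections import Counter
from typing import Iterable, List


def top_multi_elements_sketch(
    batches: Iterable[Iterable[str]],
    max_user_contribution: int,
) -> List[str]:
    """Plain-Python model of the documented behavior (hypothetical helper)."""
    if max_user_contribution < 1:
        raise ValueError("max_user_contribution must be at least 1.")

    # Read every coordinate of every batch as an individual element.
    words = [word for batch in batches for word in batch]

    # Pick the most common unique words; each unique word counts once
    # toward the limit. Tie-breaking here is an assumption of this sketch.
    counts = Counter(words)
    kept = {word for word, _ in counts.most_common(max_user_contribution)}

    # Return a multiset: every occurrence of a kept word is preserved.
    return [word for word in words if word in kept]


batches = [["a", "b", "a"], ["c", "a", "b"]]
result = top_multi_elements_sketch(batches, max_user_contribution=2)
```

Here "a" occurs three times, "b" twice, and "c" once, so with a limit of 2 the result holds all five occurrences of "a" and "b" and drops "c". The order of the result should not be relied on.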

dataset: A tf.data.Dataset to extract top elements from. Element type must be tf.string.
max_user_contribution: The maximum number of unique elements to keep.
max_string_length: The maximum length (in bytes) of strings in the dataset. Strings longer than max_string_length will be truncated. Defaults to None, which means there is no limit on the string length.
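Because max_string_length is measured in bytes rather than characters, truncation can behave unexpectedly for non-ASCII text. The following sketch illustrates this with a hypothetical helper, truncate_bytes_sketch, assuming truncation simply keeps the leading max_string_length bytes; it is not the library's code.

```python
from typing import Optional


def truncate_bytes_sketch(value: bytes, max_string_length: Optional[int]) -> bytes:
    """Keeps at most max_string_length leading bytes (hypothetical helper)."""
    if max_string_length is None:
        return value  # None means no limit on the string length.
    if max_string_length < 1:
        raise ValueError("max_string_length must be at least 1 when set.")
    return value[:max_string_length]


word = "naïve".encode("utf-8")  # 5 characters, but 6 bytes in UTF-8
short = truncate_bytes_sketch(word, 3)  # cuts the 2-byte "ï" in half
```

Under this assumption, distinct strings that share a prefix, such as b"hello" and b"help" with a limit of 2, become identical after truncation.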

A rank-1 Tensor containing the top max_user_contribution unique elements of the input dataset. If the total number of unique words is less than or equal to max_user_contribution, returns the list of all unique elements.

ValueError: If the shape of elements in dataset is not rank 1, if max_user_contribution is less than 1, or if max_string_length is not None and is less than 1.
TypeError: If dataset.element_spec.dtype is not tf.string.