tft.bag_of_words

Computes a bag of "words" based on the specified ngram configuration.

tft.bag_of_words(
    tokens: tf.SparseTensor,
    ngram_range: Tuple[int, int],
    separator: str,
    name: Optional[str] = None
) -> tf.SparseTensor

A light wrapper around tft.ngrams. It first computes the ngrams of each row, then converts that ngram representation (list semantics, duplicates allowed) into a bag of words (set semantics) per row. Each output row holds the set of unique ngrams present in the corresponding input record.

See tft.ngrams for more information.
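
The list-to-set step described above can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: it uses plain lists instead of a `SparseTensor`, the helper names `ngrams_per_row` and `bag_of_words_sketch` are hypothetical, and the order in which ngrams are generated is an arbitrary choice made for the illustration.

```python
from typing import List, Set, Tuple


def ngrams_per_row(
    tokens: List[str], ngram_range: Tuple[int, int], separator: str
) -> List[str]:
    """Return every ngram of one row, duplicates included (list semantics).

    Hypothetical helper for illustration only; the generation order here
    is an assumption and is not claimed to match tft.ngrams.
    """
    min_size, max_size = ngram_range  # both bounds are inclusive
    result = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                result.append(separator.join(tokens[start:start + size]))
    return result


def bag_of_words_sketch(
    rows: List[List[str]], ngram_range: Tuple[int, int], separator: str
) -> List[Set[str]]:
    """Collapse each row's ngram list into its set of unique ngrams."""
    return [set(ngrams_per_row(row, ngram_range, separator)) for row in rows]


rows = [["one", "two", "one", "two"], ["a"]]
bags = bag_of_words_sketch(rows, ngram_range=(1, 2), separator=" ")
print(bags)
```

In this sketch the first row yields seven ngrams before deduplication and four unique ngrams afterwards. Python sets are used for the result to reflect that the output order carries no meaning.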

Args
`tokens`	A two-dimensional `SparseTensor` of dtype `tf.string` containing the tokens used to construct the bag of words.
`ngram_range`	A pair giving the inclusive range (minimum size, maximum size) of ngram sizes to compute.
`separator`	A string that is inserted between tokens when ngrams are constructed.
`name`	(Optional) A name for this operation.

Returns
A `SparseTensor` containing the unique set of ngrams from each row of the input. Note: the original order of the ngrams may not be preserved.
