Returns a lookup table based on the given dataset.
tf.data.experimental.table_from_dataset(
    dataset=None,
    num_oov_buckets=0,
    vocab_size=None,
    default_value=None,
    hasher_spec=lookup_ops.FastHashSpec,
    key_dtype=tf.dtypes.string,
    name=None
)
This operation constructs a lookup table from the given dataset of (key, value) pairs.
If num_oov_buckets is greater than zero, a lookup of an out-of-vocabulary token
returns a bucket ID based on the token's hash. Otherwise the lookup returns
default_value.
The bucket ID range is
[vocabulary size, vocabulary size + num_oov_buckets - 1].
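The fallback rule and the bucket ID range can be illustrated with a minimal pure-Python sketch. This is not TensorFlow's implementation: the `lookup` helper and the `vocab` dictionary are hypothetical names, and SHA-256 is only a stand-in for whatever hash `hasher_spec` selects, so the specific bucket a key lands in will differ from the real table.

```python
import hashlib


def lookup(table, key, num_oov_buckets=0, default_value=None):
    """Model of the lookup rule described above (illustrative only)."""
    if key in table:
        return table[key]
    if num_oov_buckets > 0:
        # Stand-in hash (assumption): TensorFlow uses the hash named by hasher_spec.
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") % num_oov_buckets
        # Bucket IDs start right after the in-vocabulary IDs.
        return len(table) + bucket
    return default_value


vocab = {"apple": 0, "banana": 1, "cherry": 2}

# With 4 OOV buckets, an unknown key maps into [3, 6].
oov_id = lookup(vocab, "durian", num_oov_buckets=4)

# With no OOV buckets, an unknown key falls back to default_value.
fallback = lookup(vocab, "durian", default_value=-1)
```

With a vocabulary of size 3 and `num_oov_buckets=4`, every out-of-vocabulary key receives an ID between 3 and 6 inclusive, which matches the range stated above.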
Sample Usages:
import tensorflow as tf

keys = tf.data.Dataset.range(100)
values = tf.data.Dataset.range(100).map(lambda x: tf.strings.as_string(x * 2))
ds = tf.data.Dataset.zip((keys, values))
table = tf.data.experimental.table_from_dataset(
    ds, default_value='n/a', key_dtype=tf.int64)
table.lookup(tf.constant([0, 1, 2], dtype=tf.int64)).numpy()
# array([b'0', b'2', b'4'], dtype=object)
| Returns |
|---|
| The lookup table based on the given dataset. |