tf.contrib.lookup.index_to_string_table_from_file

View source on GitHub

Returns a lookup table that maps a Tensor of indices into strings.

This operation constructs a lookup table to map int64 indices into string values. The table is initialized from a vocabulary file specified in vocabulary_file, where the whole line is the value and the zero-based line number is the index.

Any input which does not have a corresponding index in the vocabulary file (an out-of-vocabulary entry) is assigned the default_value

The underlying table must be initialized by calling session.run(tf.compat.v1.tables_initializer()) or session.run(table.init()) once.

To specify multi-column vocabulary files, use key_column_index and value_column_index and delimiter.

  • TextFileIndex.LINE_NUMBER means use the line number starting from zero, expects data type int64.
  • TextFileIndex.WHOLE_LINE means use the whole line content, expects data type string.
  • A value >=0 means use the index (starting at zero) of the split line based on delimiter.

Sample Usages:

If we have a vocabulary file "test.txt" with the following content:

emerson
lake
palmer
indices = tf.constant([1, 5], tf.int64)
table = tf.lookup.index_to_string_table_from_file(
    vocabulary_file="test.txt", default_value="UNKNOWN")
values = table.lookup(indices)
...
tf.compat.v1.tables_initializer().run()

values.eval() ==> ["lake", "UNKNOWN"]

vocabulary_file The vocabulary filename, may be a constant scalar Tensor.
vocab_size Number of the elements in the vocabulary, if known.
default_value The value to use for out-of-vocabulary indices.
name A name for this op (optional).
key_column_index The column index from the text file to get the key values from. The default is to use the line number, starting from zero.
value_column_index The column index from the text file to get the value values from. The default is to use the whole line content.
delimiter The delimiter to separate fields in a line.

The lookup table to map a string values associated to a given index int64 Tensors.

ValueError when vocabulary_file is empty.
ValueError when vocab_size is invalid.