Split elements of source based on delimiter. (deprecated arguments)
tf.compat.v1.string_split(
source,
sep=None,
skip_empty=True,
delimiter=None,
result_type='SparseTensor',
name=None
)
Let N be the size of source (typically N will be the batch size). Split each element of source based on delimiter and return a SparseTensor or RaggedTensor containing the split tokens. Empty tokens are ignored.
If sep is an empty string, each element of the source is split into individual strings, each containing one byte. (This includes splitting multibyte sequences of UTF-8.) If delimiter contains multiple bytes, it is treated as a set of delimiters with each considered a potential split point.
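The splitting rules described above can be modeled in plain Python. This is only a sketch of the documented behavior, not TensorFlow's implementation: the function name `split_bytes_model` is hypothetical, it works on Python `bytes` rather than tensors, and it always drops empty tokens.

```python
def split_bytes_model(source, sep):
    """Pure-Python model of the documented rules (hypothetical helper, not a TF API).

    source: list of bytes objects, one per batch element.
    sep:    bytes; b"" means "split into single bytes", otherwise every
            byte in sep is treated as its own delimiter.
    """
    rows = []
    for element in source:
        if sep == b"":
            # Empty separator: one token per byte, so multibyte UTF-8
            # sequences are broken into their individual bytes.
            tokens = [element[i:i + 1] for i in range(len(element))]
        else:
            # Multi-byte separator: each byte is a potential split point.
            delimiters = set(sep)
            tokens, current = [], bytearray()
            for byte in element:
                if byte in delimiters:
                    tokens.append(bytes(current))
                    current = bytearray()
                else:
                    current.append(byte)
            tokens.append(bytes(current))
        # Empty tokens are ignored.
        rows.append([t for t in tokens if t])
    return rows


# ',' and ';' each act as a delimiter; the empty token between them is dropped.
print(split_bytes_model([b"a,;b;c"], b",;"))
# 'é' is two bytes in UTF-8, so an empty separator yields two one-byte tokens.
print(split_bytes_model(["é".encode("utf-8")], b""))
```

Under this model a separator such as `b",;"` never matches as a two-byte unit. It behaves like a character class, which is the set-of-delimiters behavior the paragraph above describes.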
Examples:
print(tf.compat.v1.string_split(['hello world', 'a b c']))
SparseTensor(indices=tf.Tensor( [[0 0] [0 1] [1 0] [1 1] [1 2]], ...),
values=tf.Tensor([b'hello' b'world' b'a' b'b' b'c'], ...),
dense_shape=tf.Tensor([2 3], shape=(2,), dtype=int64))
print(tf.compat.v1.string_split(['hello world', 'a b c'],
result_type="RaggedTensor"))
<tf.RaggedTensor [[b'hello', b'world'], [b'a', b'b', b'c']]>
Raises | |
---|---|
ValueError | If delimiter is not a string. |
Returns | |
---|---|
A SparseTensor or RaggedTensor of rank 2, the strings split according to the delimiter. The first column of the indices corresponds to the row in source and the second column corresponds to the index of the split component in this row. |
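To make the index layout concrete, the following plain-Python sketch builds the three components of a rank-2 sparse result from already-split rows. The helper name `to_sparse_components` is hypothetical and the code does not call TensorFlow. It only mirrors the layout described above: the first index column is the row in source, the second is the token's position within that row.

```python
def to_sparse_components(rows):
    """Build (indices, values, dense_shape) from a list of token lists.

    Hypothetical helper for illustration; it mimics the layout of the
    SparseTensor shown in the examples, without using TensorFlow.
    """
    indices, values = [], []
    for row_index, tokens in enumerate(rows):
        for position, token in enumerate(tokens):
            # [row in source, index of the split component in that row]
            indices.append([row_index, position])
            values.append(token)
    # Dense shape is N rows by the longest row's token count.
    max_tokens = max((len(tokens) for tokens in rows), default=0)
    dense_shape = [len(rows), max_tokens]
    return indices, values, dense_shape


rows = [[b"hello", b"world"], [b"a", b"b", b"c"]]
indices, values, dense_shape = to_sparse_components(rows)
print(indices)      # [[0, 0], [0, 1], [1, 0], [1, 1], [1, 2]]
print(dense_shape)  # [2, 3]
```

These values match the `indices` and `dense_shape` printed in the SparseTensor example for `['hello world', 'a b c']`.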