tf.RaggedTensor

Represents a ragged tensor.

Compat aliases for migration

See Migration guide for more details.

tf.compat.v1.RaggedTensor, tf.compat.v2.RaggedTensor

tf.RaggedTensor(
    values, row_splits, cached_row_lengths=None, cached_value_rowids=None,
    cached_nrows=None, internal=False
)

A RaggedTensor is a tensor with one or more ragged dimensions, which are dimensions whose slices may have different lengths. For example, the inner (column) dimension of rt=[[3, 1, 4, 1], [], [5, 9, 2], [6], []] is ragged, since the column slices (rt[0, :], ..., rt[4, :]) have different lengths. Dimensions whose slices all have the same length are called uniform dimensions. The outermost dimension of a RaggedTensor is always uniform, since it consists of a single slice (and so there is no possibility for differing slice lengths).

The total number of dimensions in a RaggedTensor is called its rank, and the number of ragged dimensions in a RaggedTensor is called its ragged-rank. A RaggedTensor's ragged-rank is fixed at graph creation time: it can't depend on the runtime values of Tensors, and can't vary dynamically for different session runs.

Potentially Ragged Tensors

Many ops support both Tensors and RaggedTensors. The term "potentially ragged tensor" may be used to refer to a tensor that might be either a Tensor or a RaggedTensor. The ragged-rank of a Tensor is zero.

Documenting RaggedTensor Shapes

When documenting the shape of a RaggedTensor, ragged dimensions can be indicated by enclosing them in parentheses. For example, the shape of a 3-D RaggedTensor that stores the fixed-size word embedding for each word in a sentence, for each sentence in a batch, could be written as [num_sentences, (num_words), embedding_size]. The parentheses around (num_words) indicate that dimension is ragged, and that the length of each element list in that dimension may vary for each item.
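
As a rough illustration of this notation, the following pure-Python sketch derives a shape string from a nested list and wraps a dimension in parentheses when its slices differ in length. The helper name `describe_shape` and the `(varies)` placeholder are hypothetical and not part of TensorFlow. Note that a dimension whose slices happen to share one length is reported as uniform here, even if it is conceptually ragged.

```python
def describe_shape(nested):
    """Return a shape string for a nested list, marking ragged dims with ()."""
    dims = []
    level = [nested]  # every slice found at the current depth
    while level and all(isinstance(s, list) for s in level):
        lengths = {len(s) for s in level}
        # A single distinct length means the dimension is uniform.
        dims.append(str(lengths.pop()) if len(lengths) == 1 else "(varies)")
        # Descend one level by gathering every child slice.
        level = [child for s in level for child in s]
    return "[" + ", ".join(dims) + "]"


batch = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],  # sentence with 3 words
    [[0.7, 0.8]],                          # sentence with 1 word
]
shape_text = describe_shape(batch)
print(shape_text)  # [2, (varies), 2]
```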

Component Tensors

Internally, a RaggedTensor consists of a concatenated list of values that are partitioned into variable-length rows. In particular, each RaggedTensor consists of:

A values tensor, which concatenates the variable-length rows into a flattened list. For example, the values tensor for [[3, 1, 4, 1], [], [5, 9, 2], [6], []] is [3, 1, 4, 1, 5, 9, 2, 6].
A row_splits vector, which indicates how those flattened values are divided into rows. In particular, the values for row rt[i] are stored in the slice rt.values[rt.row_splits[i]:rt.row_splits[i+1]].

Example:

print(tf.RaggedTensor.from_row_splits(
    values=[3, 1, 4, 1, 5, 9, 2, 6],
    row_splits=[0, 4, 4, 7, 8, 8]))
<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9, 2], [6], []]>
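
The slicing rule above can be checked without TensorFlow. The following is a minimal pure-Python sketch; the helper name `rows_from_row_splits` is hypothetical, and plain lists stand in for tensors.

```python
def rows_from_row_splits(values, row_splits):
    """Rebuild the list of rows from flat values and row_splits."""
    # Row i is the slice values[row_splits[i]:row_splits[i + 1]].
    return [values[row_splits[i]:row_splits[i + 1]]
            for i in range(len(row_splits) - 1)]


values = [3, 1, 4, 1, 5, 9, 2, 6]
row_splits = [0, 4, 4, 7, 8, 8]
rows = rows_from_row_splits(values, row_splits)
print(rows)  # [[3, 1, 4, 1], [], [5, 9, 2], [6], []]
```

Repeated entries in `row_splits` (here `4, 4` and `8, 8`) are what encode the empty rows.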

Alternative Row-Partitioning Schemes

In addition to row_splits, ragged tensors provide support for four other row-partitioning schemes:

row_lengths: a vector with shape [nrows], which specifies the length of each row.
value_rowids and nrows: value_rowids is a vector with shape [nvals], corresponding one-to-one with values, which specifies each value's row index. In particular, the row rt[row] consists of the values rt.values[j] where value_rowids[j]==row. nrows is an integer scalar that specifies the number of rows in the RaggedTensor. (nrows is used to indicate trailing empty rows.)
row_starts: a vector with shape [nrows], which specifies the start offset of each row. Equivalent to row_splits[:-1].
row_limits: a vector with shape [nrows], which specifies the stop offset of each row. Equivalent to row_splits[1:].

Example: The following ragged tensors are equivalent, and all represent the nested list [[3, 1, 4, 1], [], [5, 9, 2], [6], []].

values = [3, 1, 4, 1, 5, 9, 2, 6]
rt1 = tf.RaggedTensor.from_row_splits(values, row_splits=[0, 4, 4, 7, 8, 8])
rt2 = tf.RaggedTensor.from_row_lengths(values, row_lengths=[4, 0, 3, 1, 0])
rt3 = tf.RaggedTensor.from_value_rowids(
    values, value_rowids=[0, 0, 0, 0, 2, 2, 2, 3], nrows=5)
rt4 = tf.RaggedTensor.from_row_starts(values, row_starts=[0, 4, 4, 7, 8])
rt5 = tf.RaggedTensor.from_row_limits(values, row_limits=[4, 4, 7, 8, 8])
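
To make the equivalence concrete, the sketch below derives each alternative encoding from `row_splits` using plain Python lists. The function name `schemes_from_row_splits` and the dictionary layout are hypothetical; this is an illustration of the definitions above, not TensorFlow's implementation.

```python
def schemes_from_row_splits(row_splits):
    """Derive the alternative row-partitioning encodings from row_splits."""
    row_starts = row_splits[:-1]   # start offset of each row
    row_limits = row_splits[1:]    # stop offset of each row
    row_lengths = [stop - start for start, stop in zip(row_starts, row_limits)]
    # Repeat each row index once per value stored in that row.
    value_rowids = [row for row, n in enumerate(row_lengths) for _ in range(n)]
    return {
        "row_lengths": row_lengths,
        "value_rowids": value_rowids,
        "nrows": len(row_lengths),
        "row_starts": row_starts,
        "row_limits": row_limits,
    }


schemes = schemes_from_row_splits([0, 4, 4, 7, 8, 8])
for name, encoding in schemes.items():
    print(name, encoding)
```

The printed encodings match the arguments used in the example above, including `nrows=5`, which `value_rowids` alone could not recover because the last row is empty.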

Multiple Ragged Dimensions

RaggedTensors with multiple ragged dimensions can be defined by using a nested RaggedTensor for the values tensor. Each nested RaggedTensor adds a single ragged dimension.

inner_rt = tf.RaggedTensor.from_row_splits(  # =rt1 from above
    values=[3, 1, 4, 1, 5, 9, 2, 6], row_splits=[0, 4, 4, 7, 8, 8])
outer_rt = tf.RaggedTensor.from_row_splits(
    values=inner_rt, row_splits=[0, 3, 3, 5])
print(outer_rt.to_list())
[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
print(outer_rt.ragged_rank)
2

The factory function RaggedTensor.from_nested_row_splits may be used to construct a RaggedTensor with multiple ragged dimensions directly, by providing a list of row_splits tensors:

tf.RaggedTensor.from_nested_row_splits(
    flat_values=[3, 1, 4, 1, 5, 9, 2, 6],
    nested_row_splits=([0, 3, 3, 5], [0, 4, 4, 7, 8, 8])).to_list()
[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
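
The nesting can be reproduced with plain lists by applying each `row_splits` vector in turn, starting from the innermost. This is a minimal sketch; the helper name `rows_from_nested_row_splits` is hypothetical and the code does not use TensorFlow.

```python
def rows_from_nested_row_splits(flat_values, nested_row_splits):
    """Apply row_splits from innermost to outermost to rebuild nested rows."""
    values = list(flat_values)
    # nested_row_splits is ordered outermost first, so walk it in reverse.
    for row_splits in reversed(nested_row_splits):
        values = [values[row_splits[i]:row_splits[i + 1]]
                  for i in range(len(row_splits) - 1)]
    return values


nested = rows_from_nested_row_splits(
    flat_values=[3, 1, 4, 1, 5, 9, 2, 6],
    nested_row_splits=([0, 3, 3, 5], [0, 4, 4, 7, 8, 8]))
print(nested)  # [[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
```

Each pass through the loop adds one ragged dimension, mirroring the statement that each nested RaggedTensor adds a single ragged dimension.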

Uniform Inner Dimensions

RaggedTensors with uniform inner dimensions can be defined by using a multidimensional Tensor for values.

rt = tf.RaggedTensor.from_row_splits(values=tf.ones([5, 3]),
                                     row_splits=[0, 2, 5])
print(rt.to_list())
[[[1, 1, 1], [1, 1, 1]],
 [[1, 1, 1], [1, 1, 1], [1, 1, 1]]]
print(rt.shape)
(2, ?, 3)

RaggedTensor Shape Restrictions

The shape of a RaggedTensor is currently restricted to have the following form:

A single uniform dimension
Followed by one or more ragged dimensions
Followed by zero or more uniform dimensions.

This restriction follows from the fact that each nested RaggedTensor replaces the uniform outermost dimension of its values with a uniform dimension followed by a ragged dimension.

Args
`values`	A potentially ragged tensor of any dtype and shape `[nvals, ...]`.
`row_splits`	A 1-D integer tensor with shape `[nrows+1]`.
`cached_row_lengths`	A 1-D integer tensor with shape `[nrows]`
`cached_value_rowids`	A 1-D integer tensor with shape `[nvals]`.
`cached_nrows`	An integer scalar tensor.
`internal`	True if the constructor is being called by one of the factory methods. If false, an exception will be raised.

Raises
`TypeError`	If a row partitioning tensor has an inappropriate dtype.
`TypeError`	Unless exactly one row partitioning argument is specified.
`ValueError`	If a row partitioning tensor has an inappropriate shape.
`ValueError`	If multiple partitioning arguments are specified.
`ValueError`	If nrows is specified but value_rowids is not None.

Attributes
`dtype`	The `DType` of values in this tensor.
`flat_values`	The innermost `values` tensor for this ragged tensor. Concretely, if `rt.values` is a `Tensor`, then `rt.flat_values` is `rt.values`; otherwise, `rt.flat_values` is `rt.values.flat_values`. Conceptually, `flat_values` is the tensor formed by flattening the outermost dimension and all of the ragged dimensions into a single dimension. `rt.flat_values.shape = [nvals] + rt.shape[rt.ragged_rank + 1:]` (where `nvals` is the number of items in the flattened dimensions). Example: `rt = tf.ragged.constant([[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]])` `print(rt.flat_values)` `tf.Tensor([3, 1, 4, 1, 5, 9, 2, 6])`
`nested_row_splits`	A tuple containing the `row_splits` tensors for all ragged dimensions in `rt`, ordered from outermost to innermost. In particular, `rt.nested_row_splits = (rt.row_splits,) + value_splits` where: `value_splits = ()` if `rt.values` is a `Tensor`; `value_splits = rt.values.nested_row_splits` otherwise. Example: `rt = tf.ragged.constant([[[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]])` `for i, splits in enumerate(rt.nested_row_splits):` `print('Splits for dimension %d: %s' % (i+1, splits))` `Splits for dimension 1: [0, 3]` `Splits for dimension 2: [0, 3, 3, 5]` `Splits for dimension 3: [0, 4, 4, 7, 8, 8]`
`ragged_rank`	The number of ragged dimensions in this ragged tensor.
`row_splits`	The row-split indices for this ragged tensor's `values`. `rt.row_splits` specifies where the values for each row begin and end in `rt.values`. In particular, the values for row `rt[i]` are stored in the slice `rt.values[rt.row_splits[i]:rt.row_splits[i+1]]`. Example: `rt = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])` `print(rt.row_splits)  # indices of row splits in rt.values` `tf.Tensor([0, 4, 4, 7, 8, 8])`
`shape`	The statically known shape of this ragged tensor.
`values`	The concatenated rows for this ragged tensor. `rt.values` is a potentially ragged tensor formed by flattening the two outermost dimensions of `rt` into a single dimension. `rt.values.shape = [nvals] + rt.shape[2:]` (where `nvals` is the number of items in the outer two dimensions of `rt`). `rt.values.ragged_rank = rt.ragged_rank - 1` Example: `rt = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])` `print(rt.values)` `tf.Tensor([3, 1, 4, 1, 5, 9, 2, 6])`
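
The relationship between `values`, `flat_values`, and `nested_row_splits` described in the table above can be sketched with plain nested lists. The helper name `decompose` is hypothetical, and the ragged rank is passed explicitly because it cannot be inferred reliably from a nested list alone.

```python
def decompose(nested, ragged_rank):
    """Split a nested list into (flat_values, nested_row_splits)."""
    values = nested
    nested_row_splits = []
    for _ in range(ragged_rank):
        row_splits = [0]
        flattened = []
        # Merge the two outermost dimensions, recording where each row ends.
        for row in values:
            flattened.extend(row)
            row_splits.append(len(flattened))
        nested_row_splits.append(row_splits)
        values = flattened  # plays the role of rt.values for the next pass
    return values, tuple(nested_row_splits)


rt = [[[3, 1, 4, 1], [], [5, 9, 2]], [], [[6], []]]
flat_values, splits = decompose(rt, ragged_rank=2)
print(flat_values)  # [3, 1, 4, 1, 5, 9, 2, 6]
print(splits)       # ([0, 3, 3, 5], [0, 4, 4, 7, 8, 8])
```

After one pass the intermediate `values` corresponds to `rt.values`; after `ragged_rank` passes it corresponds to `rt.flat_values`, and the collected splits are ordered from outermost to innermost.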

bounding_shape

Args
`axis`	An integer scalar or vector indicating which axes to return the bounding box for. If not specified, then the full bounding box is returned.
`name`	A name prefix for the returned tensor (optional).
`out_type`	`dtype` for the returned tensor. Defaults to `self.row_splits.dtype`.
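
These arguments appear to belong to `bounding_shape`. As a rough sketch of what a full bounding box means for the simplest case (one ragged dimension, no uniform inner dimensions), the hypothetical helper `bounding_shape_2d` below returns the number of rows and the longest row length. It ignores `axis`, `name`, and `out_type`, and is not TensorFlow's implementation.

```python
def bounding_shape_2d(row_lengths):
    """Return [nrows, longest row length] for a ragged-rank-1 layout."""
    nrows = len(row_lengths)
    ncols = max(row_lengths) if row_lengths else 0
    return [nrows, ncols]


box = bounding_shape_2d([4, 0, 3, 1, 0])
print(box)  # [5, 4]
```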

from_nested_row_lengths

Args
`flat_values`	A potentially ragged tensor.
`nested_row_lengths`	A list of 1-D integer tensors. The `i`th tensor is used as the `row_lengths` for the `i`th ragged dimension.
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`.

from_nested_value_rowids

Args
`flat_values`	A potentially ragged tensor.
`nested_value_rowids`	A list of 1-D integer tensors. The `i`th tensor is used as the `value_rowids` for the `i`th ragged dimension.
`nested_nrows`	A list of integer scalars. The `i`th scalar is used as the `nrows` for the `i`th ragged dimension.
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`.

from_row_lengths

Args
`values`	A potentially ragged tensor with shape `[nvals, ...]`.
`row_lengths`	A 1-D integer tensor with shape `[nrows]`. Must be nonnegative. `sum(row_lengths)` must be `nvals`.
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`.
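
The constraints on `row_lengths` (nonnegative, summing to `nvals`) can be sketched as a small check followed by a cumulative sum that yields the equivalent `row_splits`. The helper name `row_lengths_to_row_splits` is hypothetical, and raising `ValueError` is an assumption of this sketch rather than a statement about how TensorFlow reports invalid input.

```python
from itertools import accumulate


def row_lengths_to_row_splits(row_lengths, nvals):
    """Validate row_lengths and convert them to row_splits."""
    if any(n < 0 for n in row_lengths):
        raise ValueError("row_lengths must be nonnegative")
    if sum(row_lengths) != nvals:
        raise ValueError("sum(row_lengths) must equal nvals")
    # row_splits is the running total of the lengths, starting at zero.
    return [0] + list(accumulate(row_lengths))


row_splits = row_lengths_to_row_splits([4, 0, 3, 1, 0], nvals=8)
print(row_splits)  # [0, 4, 4, 7, 8, 8]
```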

from_row_limits

Args
`values`	A potentially ragged tensor with shape `[nvals, ...]`.
`row_limits`	A 1-D integer tensor with shape `[nrows]`. Must be sorted in ascending order. If `nrows>0`, then `row_limits[-1]` must be `nvals`.
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`.

from_row_starts

Args
`values`	A potentially ragged tensor with shape `[nvals, ...]`.
`row_starts`	A 1-D integer tensor with shape `[nrows]`. Must be nonnegative and sorted in ascending order. If `nrows>0`, then `row_starts[0]` must be zero.
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`.

from_sparse

Args
`st_input`	The sparse tensor to convert. Must have rank 2.
`name`	A name prefix for the returned tensors (optional).
`row_splits_dtype`	`dtype` for the returned `RaggedTensor`'s `row_splits` tensor. One of `tf.int32` or `tf.int64`.

from_tensor

Args
`tensor`	The `Tensor` to convert. Must have rank `ragged_rank + 1` or higher.
`lengths`	An optional set of row lengths, specified using a 1-D integer `Tensor` whose length is equal to `tensor.shape[0]` (the number of rows in `tensor`). If specified, then `output[row]` will contain `tensor[row][:lengths[row]]`. Negative lengths are treated as zero. You may optionally pass a list or tuple of lengths to this argument, which will be used as nested row lengths to construct a ragged tensor with multiple ragged dimensions.
`padding`	An optional padding value. If specified, then any row suffix consisting entirely of `padding` will be excluded from the returned RaggedTensor. `padding` is a `Tensor` with the same dtype as `tensor` and with `shape=tensor.shape[ragged_rank + 1:]`.
`ragged_rank`	Integer specifying the ragged rank for the returned `RaggedTensor`. Must be greater than zero.
`name`	A name prefix for the returned tensors (optional).
`row_splits_dtype`	`dtype` for the returned `RaggedTensor`'s `row_splits` tensor. One of `tf.int32` or `tf.int64`.
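
The effect of `lengths` and `padding` can be sketched on plain lists for the simplest case (`ragged_rank` of 1, scalar elements). The helper name `ragged_from_dense` is hypothetical. Rejecting calls that pass both `lengths` and `padding` is an assumption of this sketch.

```python
def ragged_from_dense(rows, lengths=None, padding=None):
    """Trim each dense row by an explicit length or by trailing padding."""
    if lengths is not None and padding is not None:
        raise ValueError("specify lengths or padding, but not both")
    result = []
    for i, row in enumerate(rows):
        keep = len(row)
        if lengths is not None:
            keep = max(lengths[i], 0)  # negative lengths are treated as zero
        elif padding is not None:
            # Drop only the suffix made up entirely of padding values.
            while keep > 0 and row[keep - 1] == padding:
                keep -= 1
        result.append(row[:keep])
    return result


dense = [[5, 7, 0], [0, 3, 0], [6, 0, 0]]
by_lengths = ragged_from_dense(dense, lengths=[1, 0, 3])
by_padding = ragged_from_dense(dense, padding=0)
print(by_lengths)  # [[5], [], [6, 0, 0]]
print(by_padding)  # [[5, 7], [0, 3], [6]]
```

Note that the leading `0` in the second row survives when trimming by `padding`, because only a trailing run of padding values is excluded.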

from_value_rowids

Args
`values`	A potentially ragged tensor with shape `[nvals, ...]`.
`value_rowids`	A 1-D integer tensor with shape `[nvals]`, which corresponds one-to-one with `values`, and specifies each value's row index. Must be nonnegative, and must be sorted in ascending order.
`nrows`	An integer scalar specifying the number of rows. This should be specified if the `RaggedTensor` may contain empty trailing rows. Must be greater than `value_rowids[-1]` (or zero if `value_rowids` is empty). Defaults to `value_rowids[-1] + 1` (or zero if `value_rowids` is empty).
`name`	A name prefix for the RaggedTensor (optional).
`validate`	If true, then use assertions to check that the arguments form a valid `RaggedTensor`.
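
The role of `nrows` is easiest to see by rebuilding rows from `value_rowids` on plain lists. In this sketch the helper name `rows_from_value_rowids` is hypothetical, and the default for `nrows` is assumed to be one more than the largest row id, which is the smallest value that satisfies the constraint that `nrows` exceed `value_rowids[-1]`.

```python
def rows_from_value_rowids(values, value_rowids, nrows=None):
    """Group values into rows according to each value's row index."""
    if nrows is None:
        # Assumed default: just enough rows to hold the largest row id.
        nrows = value_rowids[-1] + 1 if value_rowids else 0
    rows = [[] for _ in range(nrows)]
    for value, row in zip(values, value_rowids):
        rows[row].append(value)
    return rows


values = [3, 1, 4, 1, 5, 9, 2, 6]
value_rowids = [0, 0, 0, 0, 2, 2, 2, 3]
with_nrows = rows_from_value_rowids(values, value_rowids, nrows=5)
without_nrows = rows_from_value_rowids(values, value_rowids)
print(with_nrows)     # [[3, 1, 4, 1], [], [5, 9, 2], [6], []]
print(without_nrows)  # [[3, 1, 4, 1], [], [5, 9, 2], [6]]
```

Without `nrows`, the trailing empty row is lost, since no value refers to it.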

row_lengths

Args
`axis`	An integer constant indicating the axis whose row lengths should be returned.
`name`	A name prefix for the returned tensor (optional).

to_tensor

Args
`default_value`	Value to set for indices not specified in `self`. Defaults to zero. `default_value` must be broadcastable to `self.shape[self.ragged_rank + 1:]`.
`name`	A name prefix for the returned tensors (optional).
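
Filling unspecified indices with `default_value` amounts to padding every row out to the longest row. The sketch below shows this on plain lists for a single ragged dimension with scalar elements; the helper name `pad_rows` is hypothetical and broadcasting of `default_value` is not modeled.

```python
def pad_rows(rows, default_value=0):
    """Pad every row with default_value up to the longest row length."""
    width = max((len(row) for row in rows), default=0)
    return [row + [default_value] * (width - len(row)) for row in rows]


dense = pad_rows([[3, 1, 4, 1], [], [5, 9, 2], [6], []])
print(dense)
# [[3, 1, 4, 1], [0, 0, 0, 0], [5, 9, 2, 0], [6, 0, 0, 0], [0, 0, 0, 0]]
```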

Args
`x`	A `Tensor` or `SparseTensor` of type `float16`, `float32`, `float64`, `int32`, `int64`, `complex64` or `complex128`.
`name`	A name for the operation (optional).

Args
`x`	A `Tensor`. Must be one of the following types: `bfloat16`, `half`, `float32`, `float64`, `uint8`, `int8`, `int16`, `int32`, `int64`, `complex64`, `complex128`, `string`.
`y`	A `Tensor`. Must have the same type as `x`.
`name`	A name for the operation (optional).

Args
`x`	A `Tensor` of type `bool`.
`y`	A `Tensor` of type `bool`.
`name`	A name for the operation (optional).

Args
`x`	`Tensor` numerator of real numeric type.
`y`	`Tensor` denominator of real numeric type.
`name`	A name for the operation (optional).

__getitem__

Raises
`ValueError`	If `key` is out of bounds.
`ValueError`	If `key` is not supported.
`TypeError`	If the indices in `key` have an unsupported type.

Args
`x`	A `Tensor` of type `float16`, `float32`, `float64`, `int32`, `int64`, `complex64`, or `complex128`.
`y`	A `Tensor` of type `float16`, `float32`, `float64`, `int32`, `int64`, `complex64`, or `complex128`.
`name`	A name for the operation (optional).

Args
`x`	`Tensor` numerator of numeric type.
`y`	`Tensor` denominator of numeric type.
`name`	A name for the operation (optional).

Methods

bounding_shape
consumers
from_nested_row_lengths
from_nested_row_splits
from_nested_value_rowids
from_row_lengths
from_row_limits
from_row_splits
from_row_starts
from_sparse
from_tensor
from_value_rowids
nested_row_lengths
nested_value_rowids
nrows
row_lengths
row_limits
row_starts
to_list
to_sparse
to_tensor
value_rowids
with_flat_values
with_row_splits_dtype
with_values
__abs__
__add__
__and__
__bool__
__div__
__floordiv__
__ge__
__getitem__
__gt__
__invert__
__le__
__lt__
__mod__
__mul__
__neg__
__nonzero__
__or__
__pow__
__radd__
__rand__
__rdiv__
__rfloordiv__
__rmod__
__rmul__
__ror__
__rpow__
__rsub__
__rtruediv__
__rxor__
__sub__
__truediv__
__xor__