TString

public interface TString

String type.

This type can be used to store any arbitrary byte sequence of variable length.

Since the size of a tensor is fixed, creating a tensor of this type requires providing all of its values up front, so that TensorFlow can compute and allocate the right amount of memory. The data in the tensor is then initialized once and cannot be modified afterwards.
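
To illustrate why all values are needed up front, the following minimal sketch sums the UTF-8 encoded length of a set of strings using only the Java standard library. It is not TensorFlow code: `EncodedSizeSketch` and `payloadBytes` are hypothetical names, and the count covers only the string payload, ignoring any per-element bookkeeping TensorFlow may add internally.

```java
import java.nio.charset.StandardCharsets;

final class EncodedSizeSketch {

    // Hypothetical helper: total number of payload bytes needed to hold
    // every value once it is encoded with UTF-8.
    static long payloadBytes(String... values) {
        long total = 0;
        for (String value : values) {
            total += value.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    public static void main(String[] args) {
        // "a" takes 1 byte, U+00E9 takes 2 bytes, and each of the two
        // CJK characters takes 3 bytes in UTF-8.
        System.out.println(payloadBytes("a", "\u00e9", "\u65e5\u672c")); // prints 9
    }
}
```

Because each element can have a different encoded length, the total cannot be derived from the shape alone, which is consistent with the requirement to supply every value at creation time.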

Public Methods

abstract NdArray<byte[]>
asBytes()
Returns the tensor data as an n-dimensional array of raw byte sequences.
abstract static TString
scalarOf(String value)
Allocates a new tensor for storing a string scalar.
abstract static TString
tensorOf(Shape shape, DataBuffer<String> data)
Allocates a new tensor with the given shape and data.
abstract static TString
tensorOf(NdArray<String> src)
Allocates a new tensor which is a copy of a given array.
abstract static TString
tensorOf(Charset charset, Shape shape, DataBuffer<String> data)
Allocates a new tensor with the given shape and data.
abstract static TString
tensorOf(Charset charset, NdArray<String> src)
Allocates a new tensor which is a copy of a given array.
abstract static TString
tensorOfBytes(Shape shape, DataBuffer<byte[]> data)
Allocates a new tensor with the given shape and raw bytes.
abstract static TString
tensorOfBytes(NdArray<byte[]> src)
Allocates a new tensor which is a copy of a given array of raw bytes.
abstract TString
using(Charset charset)
Use a specific charset for decoding data from a string tensor, instead of the default UTF-8.
abstract static TString
vectorOf(String... values)
Allocates a new tensor for storing a vector of strings.

Public Methods

public abstract NdArray<byte[]> asBytes ()

Returns
  • the tensor data as an n-dimensional array of raw byte sequences.

public static abstract TString scalarOf (String value)

Allocates a new tensor for storing a string scalar.

The string is encoded into bytes using the UTF-8 charset.

Parameters
value scalar value to store in the new tensor
Returns
  • the new tensor

public static abstract TString tensorOf (Shape shape, DataBuffer<String> data)

Allocates a new tensor with the given shape and data.

The data will be copied from the provided buffer to the tensor after it is allocated. The strings are encoded into bytes using the UTF-8 charset.

Parameters
shape shape of the tensor
data buffer of strings to initialize the tensor with
Returns
  • the new tensor

public static abstract TString tensorOf (NdArray<String> src)

Allocates a new tensor which is a copy of a given array.

The tensor will have the same shape as the source array and its data will be copied. The strings are encoded into bytes using the UTF-8 charset.

Parameters
src the source array giving the shape and data to the new tensor
Returns
  • the new tensor

public static abstract TString tensorOf (Charset charset, Shape shape, DataBuffer<String> data)

Allocates a new tensor with the given shape and data.

The data will be copied from the provided buffer to the tensor after it is allocated. The strings are encoded into bytes using the given charset.

If the charset is not the default UTF-8, the same charset must also be passed explicitly when reading data from the tensor, by calling using(Charset):

// Given `originalStrings`, an initialized buffer of strings
TString tensor =
    TString.tensorOf(StandardCharsets.UTF_16, Shape.of(originalStrings.size()), originalStrings);
...
TString tensorStrings = tensor.using(StandardCharsets.UTF_16);
assertEquals(originalStrings.getObject(0), tensorStrings.getObject(0));
 

Parameters
charset charset to use for encoding the strings into bytes
shape shape of the tensor
data buffer of strings to initialize the tensor with
Returns
  • the new tensor
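
The need to read with the same charset that was used for writing can be shown without TensorFlow at all. The sketch below, which uses only the Java standard library, encodes a string with one charset and decodes the resulting bytes with another. `CharsetMismatchSketch` and `roundTrip` are hypothetical names introduced for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatchSketch {

    // Hypothetical helper: encode a string with one charset, then decode
    // the stored bytes with a (possibly different) charset.
    static String roundTrip(String value, Charset encodeWith, Charset decodeWith) {
        byte[] stored = value.getBytes(encodeWith);
        return new String(stored, decodeWith);
    }

    public static void main(String[] args) {
        String original = "TensorFlow";

        // Matching charsets: the original string is recovered.
        String matched = roundTrip(original, StandardCharsets.UTF_16, StandardCharsets.UTF_16);
        System.out.println(original.equals(matched)); // prints true

        // Mismatched charsets: UTF-16 bytes are not valid UTF-8 text for
        // this string, so the decoded value differs from the original.
        String mismatched = roundTrip(original, StandardCharsets.UTF_16, StandardCharsets.UTF_8);
        System.out.println(original.equals(mismatched)); // prints false
    }
}
```

The same reasoning applies to a tensor created with a non-default charset: the stored bytes are only meaningful when decoded with that charset.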

public static abstract TString tensorOf (Charset charset, NdArray<String> src)

Allocates a new tensor which is a copy of a given array.

The tensor will have the same shape as the source array and its data will be copied. The strings are encoded into bytes using the given charset.

If the charset is not the default UTF-8, the same charset must also be passed explicitly when reading data from the tensor, by calling using(Charset):

// Given `originalStrings`, an initialized vector of strings
TString tensor = TString.tensorOf(StandardCharsets.UTF_16, originalStrings);
...
TString tensorStrings = tensor.using(StandardCharsets.UTF_16);
assertEquals(originalStrings.getObject(0), tensorStrings.getObject(0));
 

Parameters
charset charset to use for encoding the strings into bytes
src the source array giving the shape and data to the new tensor
Returns
  • the new tensor

public static abstract TString tensorOfBytes (Shape shape, DataBuffer<byte[]> data)

Allocates a new tensor with the given shape and raw bytes.

The data will be copied from the provided buffer to the tensor after it has been allocated.

To read the data back as raw bytes as well, call asBytes() explicitly on the returned tensor:

byte[] bytes = tensor.asBytes().getObject(0);  // returns the first sequence of bytes in the tensor
 

Parameters
shape shape of the tensor to create
data buffer of raw byte sequences to initialize the tensor with
Returns
  • the new tensor
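
Reading raw bytes back as bytes matters because an arbitrary byte sequence is not necessarily valid text. As a minimal sketch using only the Java standard library, the code below checks whether a byte sequence is preserved when it is decoded to a `String` as UTF-8 and encoded again. `RawBytesSketch` and `survivesUtf8Decoding` are hypothetical names, not part of the TensorFlow API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class RawBytesSketch {

    // Hypothetical helper: true if decoding the bytes as UTF-8 and
    // re-encoding the result yields the same byte sequence.
    static boolean survivesUtf8Decoding(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        byte[] reencoded = decoded.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(raw, reencoded);
    }

    public static void main(String[] args) {
        // Valid UTF-8 text is preserved.
        byte[] text = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(survivesUtf8Decoding(text)); // prints true

        // 0xFF and 0xFE never occur in valid UTF-8, so decoding replaces
        // them with U+FFFD and the original bytes are lost.
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x01};
        System.out.println(survivesUtf8Decoding(binary)); // prints false
    }
}
```

For binary payloads such as the second case, going through a string view would corrupt the data, so the raw-byte view is the safer way to read them.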

public static abstract TString tensorOfBytes (NdArray<byte[]> src)

Allocates a new tensor which is a copy of a given array of raw bytes.

The tensor will have the same shape as the source array and its data will be copied.

To read the data back as raw bytes as well, call asBytes() explicitly on the returned tensor:

byte[] bytes = tensor.asBytes().getObject(0);  // returns the first sequence of bytes in the tensor
 

Parameters
src the source array giving the shape and data to the new tensor
Returns
  • the new tensor

public abstract TString using (Charset charset)

Use a specific charset for decoding data from a string tensor, instead of the default UTF-8.

The charset must match the one used for encoding the string values when the tensor was created. For example:

TString tensor =
    TString.tensorOf(StandardCharsets.UTF_16, NdArrays.scalarOfObject("TensorFlow"));

assertEquals("TensorFlow", tensor.using(StandardCharsets.UTF_16).getObject());
 

Parameters
charset charset to use
Returns
  • string tensor data using this charset

public static abstract TString vectorOf (String... values)

Allocates a new tensor for storing a vector of strings.

The strings are encoded into bytes using the UTF-8 charset.

Parameters
values values to store in the new tensor
Returns
  • the new tensor