A container for the input tensor metadata of BERT models.

model_buffer valid buffer of the model file.
ids_name name of the ids tensor, which represents the tokenized ids of the input text.
mask_name name of the mask tensor, which represents the mask with 1 for real tokens and 0 for padding tokens.
segment_name name of the segment ids tensor, where 0 stands for the first sequence and 1 stands for the second sequence, if one exists.
ids_md input ids tensor information.
mask_md input mask tensor information.
segment_ids_md input segment tensor information.
tokenizer_md information of the tokenizer used to process the input string, if any. Supported tokenizers are BertTokenizer and SentencePieceTokenizer. If the tokenizer is RegexTokenizer, refer to nl_classifier.MetadataWriter.
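
To make the roles of the three input tensors concrete, the following is a minimal sketch in plain Python of how ids, mask, and segment values line up for a pair of sequences. It does not use this library: the helper name `build_bert_inputs`, the `pad_id` default, the choice of segment 0 for padding positions, and the token ids in the usage lines are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def build_bert_inputs(
    first_ids: Sequence[int],
    second_ids: Sequence[int],
    max_len: int,
    pad_id: int = 0,  # assumed padding id, not taken from any real vocabulary
) -> Tuple[List[int], List[int], List[int]]:
    """Builds padded ids, mask, and segment lists for one or two sequences.

    Hypothetical helper: special tokens are assumed to be already present
    in first_ids and second_ids.
    """
    ids = list(first_ids) + list(second_ids)
    if len(ids) > max_len:
        raise ValueError("combined sequences exceed max_len")

    # Mask: 1 for real tokens, 0 for padding tokens.
    mask = [1] * len(ids)
    # Segment ids: 0 for the first sequence, 1 for the second (if any).
    segment = [0] * len(first_ids) + [1] * len(second_ids)

    pad = max_len - len(ids)
    # Padding positions get segment 0 here; this is an assumed convention.
    return ids + [pad_id] * pad, mask + [0] * pad, segment + [0] * pad


# Usage with made-up token ids.
ids, mask, segment = build_bert_inputs([11, 12, 13], [21, 22], max_len=8)
print(ids)
print(mask)
print(segment)
```

The three lists always have the same length, which is why the model expects three tensors of matching shape under the names given by ids_name, mask_name, and segment_name.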




Creates the input process unit metadata.



Creates the input metadata for the three input tensors.



Gets the associated files that are packed in the tokenizer.