tfm.nlp.ops.get_sentence_order_labels

Extract segments and labels for sentence order prediction (SOP) task.

Extracts the segments and labels for the sentence order prediction task defined in "ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations" (https://arxiv.org/pdf/1909.11942.pdf).

sentences A RaggedTensor of shape [batch, (num_sentences)] with string dtype.
random_threshold (optional) A float threshold between 0 and 1, used to determine whether to extract a random, out-of-batch sentence or a succeeding sentence. A higher value favors the succeeding sentence.
random_next_threshold (optional) A float threshold between 0 and 1, used to determine whether to extract either a random, out-of-batch, or succeeding sentence, or instead a preceding sentence. A higher value favors the preceding sentence.
random_fn (optional) An op used to generate random float values.

A tuple of (preceeding_or_random_next, is_suceeding_or_random) where:
preceeding_or_random_next: a RaggedTensor of strings with the same shape as sentences. Each element is either the preceding, the succeeding, or a random out-of-batch sentence relative to its counterpart in sentences, depending on its label in is_suceeding_or_random.
is_suceeding_or_random: a RaggedTensor of bool values with the same shape as sentences. An element is True if its corresponding sentence in preceeding_or_random_next is a random or succeeding sentence, and False otherwise.
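
To make the two thresholds concrete, here is a toy pure-Python sketch of the selection rule as the parameter descriptions suggest it. It is not the library's implementation. The function name sop_labels_sketch, the order in which the thresholds are checked, the fallback at row boundaries, and the use of sentences from other rows as the random pool are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Tuple


def sop_labels_sketch(
    docs: List[List[str]],
    random_threshold: float,
    random_next_threshold: float,
    random_fn: Callable[[], float] = random.random,
) -> Tuple[List[List[str]], List[List[bool]]]:
    """Hypothetical illustration of SOP labeling on nested Python lists."""
    segments: List[List[str]] = []
    labels: List[List[bool]] = []
    for d, doc in enumerate(docs):
        # Assumption: sentences from the other rows stand in for the
        # "random, out-of-batch" sentences mentioned in the docs.
        pool = [s for j, other in enumerate(docs) if j != d for s in other]
        seg_row: List[str] = []
        label_row: List[bool] = []
        for i in range(len(doc)):
            has_prev = i > 0
            has_next = i + 1 < len(doc)
            # Assumption: a draw below random_next_threshold selects the
            # preceding sentence, so a higher threshold favors it.
            if has_prev and random_fn() < random_next_threshold:
                seg_row.append(doc[i - 1])
                label_row.append(False)
            # Assumption: a draw below random_threshold selects the
            # succeeding sentence, so a higher threshold favors it.
            elif has_next and random_fn() < random_threshold:
                seg_row.append(doc[i + 1])
                label_row.append(True)
            else:
                if not pool:
                    raise ValueError("need sentences in at least one other row")
                seg_row.append(random.choice(pool))
                label_row.append(True)
        segments.append(seg_row)
        labels.append(label_row)
    return segments, labels


if __name__ == "__main__":
    batch = [["a1", "a2", "a3"], ["b1", "b2"]]
    segs, is_succ_or_rand = sop_labels_sketch(
        batch, random_threshold=0.5, random_next_threshold=0.5
    )
    print(segs)
    print(is_succ_or_rand)
```

Both outputs keep the ragged shape of the input, mirroring the "same shape as sentences" guarantee described above. Passing a deterministic random_fn (for example, lambda: 0.0) makes the sketch reproducible, which is presumably also why the real function exposes random_fn.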