Using text and neural network features


Welcome to the Intermediate Colab for TensorFlow Decision Forests (TF-DF). In this colab, you will learn about some of the more advanced capabilities of TF-DF, including how to handle natural language features.

This colab assumes you are familiar with the concepts presented in the Beginner colab, in particular the installation of TF-DF.

In this colab, you will:

  1. Train a Random Forest that consumes text features natively as categorical sets.

  2. Train a Random Forest that consumes text features through a TensorFlow Hub module. In this setting (transfer learning), the module has already been pre-trained on a large text corpus.

  3. Train a Gradient Boosted Decision Trees (GBDT) model and a Neural Network together. The GBDT will consume the output of the Neural Network.

Setup

# Install TensorFlow Decision Forests
pip install tensorflow_decision_forests

Install Wurlitzer. It can be used to display the detailed training logs. It is only needed in colabs.

pip install wurlitzer

Import the necessary libraries.

import tensorflow_decision_forests as tfdf

import os
import numpy as np
import pandas as pd
import tensorflow as tf
import math

try:
  from wurlitzer import sys_pipes
except:
  from colabtools.googlelog import CaptureLog as sys_pipes

from IPython.core.magic import register_line_magic
from IPython.display import Javascript
WARNING:root:Failure to load the custom c++ tensorflow ops. This error is likely caused the version of TensorFlow and TensorFlow Decision Forests are not compatible.
WARNING:root:TF Parameter Server distributed training not available.

The hidden code cell limits the output height in colab.

Use raw text as features

TF-DF can consume categorical-set features natively. Categorical sets represent text features as bags of words (or n-grams).

For example: "The little blue dog" -> {"the", "little", "blue", "dog"}
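The bag-of-words idea can be illustrated in plain Python. This is only a minimal sketch: the helper names `to_categorical_set` and `to_bigram_set` are hypothetical, and lowercasing plus whitespace splitting are assumptions made for the illustration, not a description of how TF-DF tokenizes internally.

```python
def to_categorical_set(sentence: str) -> set:
    """Turn a sentence into a set of word tokens (a bag of words)."""
    # Assumption: lowercase the text and split on whitespace.
    return set(sentence.lower().split())


def to_bigram_set(sentence: str) -> set:
    """Turn a sentence into a set of 2-grams (pairs of adjacent words)."""
    tokens = sentence.lower().split()
    return {" ".join(pair) for pair in zip(tokens, tokens[1:])}


words = to_categorical_set("The little blue dog")
bigrams = to_bigram_set("The little blue dog")
print(sorted(words))
print(sorted(bigrams))
```

Because the result is a set, word order and repetitions are discarded; n-grams recover a little local ordering at the cost of a larger vocabulary.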

In this example, you will train a Random Forest on the Stanford Sentiment Treebank (SST) dataset. The objective of this dataset is to classify sentences as carrying a positive or negative sentiment. You will use the binary classification version of the dataset curated in TensorFlow Datasets.

# Install the nightly TensorFlow Datasets package
# TODO: Remove when the release package is fixed.
pip install tfds-nightly -U --quiet
# Load the dataset
import tensorflow_datasets as tfds
all_ds = tfds.load("glue/sst2")

# Display the first 3 examples of the test fold.
for example in all_ds["test"].take(3):
  print({attr_name: attr_tensor.numpy() for attr_name, attr_tensor in example.items()})
{'idx': 163, 'label': -1, 'sentence': b'not even the hanson brothers can save it'}
{'idx': 131, 'label': -1, 'sentence': b'strong setup and ambitious goals fade as the film descends into unsophisticated scare tactics and b-film thuggery .'}
{'idx': 1579, 'label': -1, 'sentence': b'too timid to bring a sense of closure to an ugly chapter of the twentieth century .'}
2021-11-08 12:12:01.807072: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.

The dataset is modified as follows:

  1. The raw labels are integers in {-1, 1}, but the learning algorithm expects non-negative integer labels, e.g. {0, 1}. Therefore, the labels are transformed as follows: new_labels = (original_labels + 1) / 2.
  2. A batch size of 64 is applied to make reading the dataset more efficient.
  3. The sentence attribute needs to be tokenized, i.e. "hello world" -> ["hello", "world"].
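The three modifications can be checked on concrete values with plain Python. This is a sketch only: `remap_label`, `tokenize` and `batch` are hypothetical helpers that mirror the arithmetic, not the `tf.data` calls used in this colab.

```python
import math


def remap_label(original_label: int) -> int:
    """Map a label from {-1, 1} to {0, 1} with integer division."""
    return (original_label + 1) // 2


def tokenize(sentence: str) -> list:
    """Split a sentence on whitespace."""
    return sentence.split()


def batch(items: list, batch_size: int) -> list:
    """Group items into consecutive batches; the last one may be smaller."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


labels = [remap_label(label) for label in (-1, 1)]
tokens = tokenize("hello world")
batch_sizes = [len(b) for b in batch(list(range(130)), 64)]

# 67349 training examples in batches of 64 give 1053 batches, which
# matches the "Number of batches" line in the training log of this colab.
num_batches = math.ceil(67349 / 64)
print(labels, tokens, batch_sizes, num_batches)
```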

Details: Some decision forest learning algorithms do not need a validation dataset (e.g. Random Forests) while others do (e.g. Gradient Boosted Trees in some cases). Since each learning algorithm under TF-DF can use validation data differently, TF-DF handles train/validation splits internally. As a result, when you have a training set and a validation set, they can always be concatenated as input to the learning algorithm.

def prepare_dataset(example):
  label = (example["label"] + 1) // 2
  return {"sentence" : tf.strings.split(example["sentence"])}, label

train_ds = all_ds["train"].batch(64).map(prepare_dataset)
test_ds = all_ds["validation"].batch(64).map(prepare_dataset)

Finally, train and evaluate the model as usual. TF-DF automatically detects multi-valued categorical features as categorical sets.

%set_cell_height 300

# Specify the model.
model_1 = tfdf.keras.RandomForestModel(num_trees=30)

# Optionally, add evaluation metrics.
model_1.compile(metrics=["accuracy"])

# Train the model.
with sys_pipes():
  model_1.fit(x=train_ds)
<IPython.core.display.Javascript object>
1027/1053 [============================>.] - ETA: 0s
[INFO kernel.cc:736] Start Yggdrasil model training
[INFO kernel.cc:737] Collect training examples
[INFO kernel.cc:392] Number of batches: 1053
[INFO kernel.cc:393] Number of examples: 67349
[INFO data_spec_inference.cc:290] 12816 item(s) have been pruned (i.e. they are considered out of dictionary) for the column sentence (2000 item(s) left) because min_value_count=5 and max_number_of_unique_values=2000
[INFO kernel.cc:759] Dataset:
Number of records: 67349
Number of columns: 2

Number of columns by type:
    CATEGORICAL_SET: 1 (50%)
    CATEGORICAL: 1 (50%)

Columns:

CATEGORICAL_SET: 1 (50%)
    0: "sentence" CATEGORICAL_SET has-dict vocab-size:2001 num-oods:3595 (5.33787%) most-frequent:"the" 27205 (40.3941%)

CATEGORICAL: 1 (50%)
    1: "__LABEL" CATEGORICAL integerized vocab-size:3 no-ood-item

Terminology:
    nas: Number of non-available (i.e. missing) values.
    ood: Out of dictionary.
    manually-defined: Attribute which type is manually defined by the user i.e. the type was not automatically inferred.
    tokenized: The attribute value is obtained through tokenization.
    has-dict: The attribute is attached to a string dictionary e.g. a categorical attribute stored as a string.
    vocab-size: Number of unique values.

[INFO kernel.cc:762] Configure learner
[INFO kernel.cc:787] Training config:
learner: "RANDOM_FOREST"
features: "sentence"
label: "__LABEL"
task: CLASSIFICATION
[yggdrasil_decision_forests.model.random_forest.proto.random_forest_config] {
  num_trees: 30
  decision_tree {
    max_depth: 16
    min_examples: 5
    in_split_min_examples_check: true
    missing_value_policy: GLOBAL_IMPUTATION
    allow_na_conditions: false
    categorical_set_greedy_forward {
      sampling: 0.1
      max_num_items: -1
      min_item_frequency: 1
    }
    growing_strategy_local {
    }
    categorical {
      cart {
      }
    }
    num_candidate_attributes_ratio: -1
    axis_aligned_split {
    }
    internal {
      sorting_strategy: PRESORTED
    }
  }
  winner_take_all_inference: true
  compute_oob_performances: true
  compute_oob_variable_importances: false
  adapt_bootstrap_size_ratio_for_maximum_training_duration: false
}

[INFO kernel.cc:790] Deployment config:
num_threads: 6

[INFO kernel.cc:817] Train model
[INFO random_forest.cc:315] Training random forest on 67349 example(s) and 1 feature(s).
[INFO random_forest.cc:628] Training of tree  1/30 (tree index:1) done accuracy:0.7412 logloss:9.32811
[INFO random_forest.cc:628] Training of tree  4/30 (tree index:2) done accuracy:0.75669 logloss:5.54597
[INFO random_forest.cc:628] Training of tree  7/30 (tree index:7) done accuracy:0.779932 logloss:3.76263
[INFO random_forest.cc:628] Training of tree  9/30 (tree index:8) done accuracy:0.788283 logloss:3.14015
[INFO random_forest.cc:628] Training of tree  13/30 (tree index:13) done accuracy:0.803553 logloss:1.6681
[INFO random_forest.cc:628] Training of tree  15/30 (tree index:18) done accuracy:0.809139 logloss:1.48232
[INFO random_forest.cc:628] Training of tree  21/30 (tree index:20) done accuracy:0.817067 logloss:0.997885
[INFO random_forest.cc:628] Training of tree  23/30 (tree index:23) done accuracy:0.81845 logloss:0.944225
[INFO random_forest.cc:628] Training of tree  27/30 (tree index:26) done accuracy:0.821066 logloss:0.877389
[INFO random_forest.cc:628] Training of tree  29/30 (tree index:29) done accuracy:0.821571 logloss:0.861307
[INFO random_forest.cc:628] Training of tree  30/30 (tree index:28) done accuracy:0.821274 logloss:0.854486
[INFO random_forest.cc:696] Final OOB metrics: accuracy:0.821274 logloss:0.854486
[INFO kernel.cc:828] Export model in log directory: /tmp/tmpab1ap3d5
[INFO kernel.cc:836] Save model in resources
[INFO kernel.cc:988] Loading model from path
[INFO decision_forest.cc:590] Model loaded with 30 root(s), 43180 node(s), and 1 input feature(s).
[INFO abstract_model.cc:993] Engine "RandomForestGeneric" built
[INFO kernel.cc:848] Use fast generic engine
1053/1053 [==============================] - 233s 217ms/step

In the previous logs, note that sentence is a CATEGORICAL_SET feature.
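The log also reports that 12816 items were pruned from the dictionary of the sentence column because of min_value_count=5 and max_number_of_unique_values=2000. The sketch below illustrates that kind of pruning in plain Python. It is a simplified assumption of the idea, not the Yggdrasil implementation: the function names, the `<OOD>` placeholder and the choice to count every token occurrence are all hypothetical.

```python
from collections import Counter


def build_vocabulary(tokenized_sentences, min_value_count=5,
                     max_unique_values=2000):
    """Keep the most frequent tokens that occur at least min_value_count times."""
    # Assumption: every occurrence of a token counts once.
    counts = Counter(tok for sent in tokenized_sentences for tok in sent)
    frequent = [tok for tok, n in counts.most_common() if n >= min_value_count]
    return set(frequent[:max_unique_values])


def apply_vocabulary(sentence, vocabulary, ood_token="<OOD>"):
    """Replace tokens outside the vocabulary with one out-of-dictionary item."""
    return {tok if tok in vocabulary else ood_token for tok in sentence}


sentences = ([["good", "film"]] * 5
             + [["bad", "film"]] * 5
             + [["quirky", "film"]])
vocabulary = build_vocabulary(sentences)
encoded = apply_vocabulary(["quirky", "film"], vocabulary)
print(sorted(vocabulary), sorted(encoded))
```

Under this reading, the reported vocab-size of 2001 is presumably the 2000 retained items plus one out-of-dictionary entry.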

The model is evaluated as usual:

evaluation = model_1.evaluate(test_ds)

print(f"BinaryCrossentropyloss: {evaluation[0]}")
print(f"Accuracy: {evaluation[1]}")
14/14 [==============================] - 1s 3ms/step - loss: 0.0000e+00 - accuracy: 0.7638
BinaryCrossentropyloss: 0.0
Accuracy: 0.7637614607810974

The training logs look as follows:

import matplotlib.pyplot as plt

logs = model_1.make_inspector().training_logs()
plt.plot([log.num_trees for log in logs], [log.evaluation.accuracy for log in logs])
plt.xlabel("Number of trees")
plt.ylabel("Out-of-bag accuracy")
pass

[Figure: out-of-bag accuracy versus number of trees]

More trees would probably be beneficial (I am sure of it because I tried :p).

Use a pretrained text embedding

The previous example trained a Random Forest using raw text features. This example uses a pre-trained TF-Hub embedding to convert text features into a dense embedding, and then trains a Random Forest on top of it. In this situation, the Random Forest only "sees" the numerical output of the embedding (i.e. it does not see the raw text).

This experiment uses the Universal-Sentence-Encoder. Different pre-trained embeddings may suit different types of text (e.g. different languages, different tasks), and also other types of structured features (e.g. images).

The embedding module can be applied in one of two places:

  1. During the dataset preparation.
  2. In the pre-processing stage of the model.

The second option is often preferable: packaging the embedding in the model makes the model easier to use (and harder to misuse).
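The difference between the two options can be sketched without any library. Everything in this example is hypothetical: `toy_embed` stands in for a pre-trained embedding, and `NumericModel` and `ModelWithPreprocessing` are toy classes, not TF-DF or Keras APIs.

```python
def toy_embed(sentence: str) -> list:
    """Hypothetical stand-in for an embedding: two numeric features."""
    tokens = sentence.split()
    return [float(len(tokens)), float(sum(len(t) for t in tokens))]


class NumericModel:
    """Toy model that only accepts already-embedded (numeric) input."""

    def predict(self, features: list) -> int:
        return 1 if features[0] >= 3 else 0


class ModelWithPreprocessing:
    """Toy model that packages the embedding with the numeric model."""

    def __init__(self, preprocessing, model):
        self.preprocessing = preprocessing
        self.model = model

    def predict(self, sentence: str) -> int:
        return self.model.predict(self.preprocessing(sentence))


# Option 1: the caller has to remember to embed the text first.
option_1 = NumericModel().predict(toy_embed("a very good film"))

# Option 2: the caller passes raw text; the model embeds it internally.
packaged = ModelWithPreprocessing(toy_embed, NumericModel())
option_2 = packaged.predict("a very good film")
print(option_1, option_2)
```

Both options give the same prediction, but with option 2 the caller cannot forget the embedding step or apply a different one at serving time.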

First install TF-Hub:

pip install --upgrade tensorflow-hub

Unlike before, you don't need to tokenize the text.

def prepare_dataset(example):
  label = (example["label"] + 1) // 2
  return {"sentence" : example["sentence"]}, label

train_ds = all_ds["train"].batch(64).map(prepare_dataset)
test_ds = all_ds["validation"].batch(64).map(prepare_dataset)
%set_cell_height 300

import tensorflow_hub as hub
# NNLM (https://tfhub.dev/google/nnlm-en-dim128/2) is also a good choice.
hub_url = "http://tfhub.dev/google/universal-sentence-encoder/4"
embedding = hub.KerasLayer(hub_url)

sentence = tf.keras.layers.Input(shape=(), name="sentence", dtype=tf.string)
embedded_sentence = embedding(sentence)

raw_inputs = {"sentence": sentence}
processed_inputs = {"embedded_sentence": embedded_sentence}
preprocessor = tf.keras.Model(inputs=raw_inputs, outputs=processed_inputs)

model_2 = tfdf.keras.RandomForestModel(
    preprocessing=preprocessor,
    num_trees=100)
model_2.compile(metrics=["accuracy"])

with sys_pipes():
  model_2.fit(x=train_ds)
<IPython.core.display.Javascript object>
1053/1053 [==============================] - ETA: 0s
[INFO kernel.cc:736] Start Yggdrasil model training
[INFO kernel.cc:737] Collect training examples
[INFO kernel.cc:392] Number of batches: 1053
[INFO kernel.cc:393] Number of examples: 67349
[INFO kernel.cc:759] Dataset:
Number of records: 67349
Number of columns: 513

Number of columns by type:
    NUMERICAL: 512 (99.8051%)
    CATEGORICAL: 1 (0.194932%)

Columns:

NUMERICAL: 512 (99.8051%)
    0: "embedded_sentence.0" NUMERICAL mean:-0.00405803 min:-0.110598 max:0.113378 sd:0.0382544
    1: "embedded_sentence.1" NUMERICAL mean:0.0020755 min:-0.120324 max:0.106003 sd:0.0434171
    2: "embedded_sentence.10" NUMERICAL mean:0.0143459 min:-0.1118 max:0.118193 sd:0.039633
    3: "embedded_sentence.100" NUMERICAL mean:0.003884 min:-0.104019 max:0.127238 sd:0.0431
    4: "embedded_sentence.101" NUMERICAL mean:-0.0132592 min:-0.133774 max:0.125128 sd:0.0465773
    5: "embedded_sentence.102" NUMERICAL mean:0.00732224 min:-0.114158 max:0.135181 sd:0.0462208
    6: "embedded_sentence.103" NUMERICAL mean:-0.00316622 min:-0.115661 max:0.110651 sd:0.0393422
    7: "embedded_sentence.104" NUMERICAL mean:-0.000406039 min:-0.115186 max:0.115727 sd:0.0404569
    8: "embedded_sentence.105" NUMERICAL mean:0.01286 min:-0.10478 max:0.116059 sd:0.0408527
    9: "embedded_sentence.106" NUMERICAL mean:-0.0200857 min:-0.112344 max:0.115696 sd:0.0348447
    10: "embedded_sentence.107" NUMERICAL mean:-0.000881199 min:-0.117538 max:0.128118 sd:0.0397207
    11: "embedded_sentence.108" NUMERICAL mean:-0.0153816 min:-0.119853 max:0.111478 sd:0.0408014
    12: "embedded_sentence.109" NUMERICAL mean:0.0226631 min:-0.115775 max:0.109245 sd:0.0344709
    13: "embedded_sentence.11" NUMERICAL mean:7.16192e-05 min:-0.10631 max:0.107239 sd:0.0399338
    14: "embedded_sentence.110" NUMERICAL mean:-0.0117186 min:-0.12628 max:0.0972872 sd:0.043443
    15: "embedded_sentence.111" NUMERICAL mean:-0.0195 min:-0.138677 max:0.111032 sd:0.0530712
    16: "embedded_sentence.112" NUMERICAL mean:-0.00883525 min:-0.125434 max:0.115491 sd:0.039556
    17: "embedded_sentence.113" NUMERICAL mean:-0.0004395 min:-0.106039 max:0.1141 sd:0.0441183
    18: "embedded_sentence.114" NUMERICAL mean:-0.00404027 min:-0.131798 max:0.106558 sd:0.040391
    19: "embedded_sentence.115" NUMERICAL mean:0.0164961 min:-0.137229 max:0.11088 sd:0.0396261
    20: "embedded_sentence.116" NUMERICAL mean:-0.0163338 min:-0.109692 max:0.115104 sd:0.0396108
    21: "embedded_sentence.117" NUMERICAL mean:-0.000866382 min:-0.111258 max:0.110021 sd:0.0413076
    22: "embedded_sentence.118" NUMERICAL mean:0.00925641 min:-0.117275 max:0.109073 sd:0.0392531
    23: "embedded_sentence.119" NUMERICAL mean:0.0111224 min:-0.108271 max:0.11018 sd:0.0438516
    24: "embedded_sentence.12" NUMERICAL mean:-0.0115011 min:-0.115238 max:0.115996 sd:0.039107
    25: "embedded_sentence.120" NUMERICAL mean:-0.0109583 min:-0.117243 max:0.113314 sd:0.03753
    26: "embedded_sentence.121" NUMERICAL mean:0.0143342 min:-0.109885 max:0.121471 sd:0.0401907
    27: "embedded_sentence.122" NUMERICAL mean:-0.00603129 min:-0.111126 max:0.106422 sd:0.0401383
    28: "embedded_sentence.123" NUMERICAL mean:-0.00175511 min:-0.115219 max:0.103571 sd:0.0388962
    29: "embedded_sentence.124" NUMERICAL mean:-0.0119755 min:-0.119062 max:0.122632 sd:0.0447561
    30: "embedded_sentence.125" NUMERICAL mean:0.00210507 min:-0.116783 max:0.125758 sd:0.0469827
    31: "embedded_sentence.126" NUMERICAL mean:-0.0166424 min:-0.109771 max:0.13027 sd:0.0399639
    32: "embedded_sentence.127" NUMERICAL mean:-0.0462275 min:-0.137916 max:0.106133 sd:0.0478679
    33: "embedded_sentence.128" NUMERICAL mean:0.0101449 min:-0.134851 max:0.118003 sd:0.0415072
    34: "embedded_sentence.129" NUMERICAL mean:0.0119622 min:-0.106398 max:0.122529 sd:0.047894
    35: "embedded_sentence.13" NUMERICAL mean:-0.0179365 min:-0.133052 max:0.120982 sd:0.0461472
    36: "embedded_sentence.130" NUMERICAL mean:-0.0109302 min:-0.127096 max:0.102555 sd:0.0407236
    37: "embedded_sentence.131" NUMERICAL mean:-2.30421e-05 min:-0.0958128 max:0.116109 sd:0.0393919
    38: "embedded_sentence.132" NUMERICAL mean:0.00622466 min:-0.118524 max:0.171935 sd:0.0435631
    39: "embedded_sentence.133" NUMERICAL mean:0.00537511 min:-0.0999398 max:0.143991 sd:0.0431652
    40: "embedded_sentence.134" NUMERICAL mean:0.0111946 min:-0.101547 max:0.105716 sd:0.0365295
    41: "embedded_sentence.135" NUMERICAL mean:-0.0123165 min:-0.118347 max:0.113619 sd:0.0422525
    42: "embedded_sentence.136" NUMERICAL mean:0.00882626 min:-0.118642 max:0.115052 sd:0.0393646
    43: "embedded_sentence.137" NUMERICAL mean:0.0106701 min:-0.108036 max:0.109746 sd:0.0405698
    44: "embedded_sentence.138" NUMERICAL mean:-0.0130655 min:-0.148064 max:0.118745 sd:0.047092
    45: "embedded_sentence.139" NUMERICAL mean:0.00256777 min:-0.108547 max:0.102547 sd:0.0388182
    46: "embedded_sentence.14" NUMERICAL mean:0.00090757 min:-0.124092 max:0.111964 sd:0.0393761
    47: "embedded_sentence.140" NUMERICAL mean:-0.00255201 min:-0.113298 max:0.120327 sd:0.0469564
    48: "embedded_sentence.141" NUMERICAL mean:-0.0123127 min:-0.124039 max:0.110528 sd:0.047218
    49: "embedded_sentence.142" NUMERICAL mean:0.00659571 min:-0.106909 max:0.126327 sd:0.0444828
    50: "embedded_sentence.143" NUMERICAL mean:0.00838607 min:-0.121819 max:0.108286 sd:0.0409403
    51: "embedded_sentence.144" NUMERICAL mean:-0.00504916 min:-0.117741 max:0.109832 sd:0.0402179
    52: "embedded_sentence.145" NUMERICAL mean:-0.0135 min:-0.112358 max:0.108238 sd:0.0393695
    53: "embedded_sentence.146" NUMERICAL mean:-0.00551706 min:-0.108132 max:0.103118 sd:0.0375181
    54: "embedded_sentence.147" NUMERICAL mean:0.00226707 min:-0.109358 max:0.117688 sd:0.0416268
    55: "embedded_sentence.148" NUMERICAL mean:-0.0083477 min:-0.113886 max:0.105174 sd:0.0379074
    56: "embedded_sentence.149" NUMERICAL mean:-0.0029158 min:-0.104327 max:0.10898 sd:0.0394245
    57: "embedded_sentence.15" NUMERICAL mean:-0.0465314 min:-0.127274 max:0.115007 sd:0.0410307
    58: "embedded_sentence.150" NUMERICAL mean:-0.00857055 min:-0.11757 max:0.108206 sd:0.0416898
    59: "embedded_sentence.151" NUMERICAL mean:0.00697777 min:-0.104269 max:0.109967 sd:0.0353302
    60: "embedded_sentence.152" NUMERICAL mean:-0.0220037 min:-0.122602 max:0.105503 sd:0.0429071
    61: "embedded_sentence.153" NUMERICAL mean:-0.00103943 min:-0.109326 max:0.112115 sd:0.0413219
    62: "embedded_sentence.154" NUMERICAL mean:-0.010306 min:-0.106116 max:0.112624 sd:0.0392094
    63: "embedded_sentence.155" NUMERICAL mean:-0.0128503 min:-0.133511 max:0.129721 sd:0.0417087
    64: "embedded_sentence.156" NUMERICAL mean:-0.00796017 min:-0.10801 max:0.111555 sd:0.0401771
    65: "embedded_sentence.157" NUMERICAL mean:-0.0263644 min:-0.135057 max:0.131898 sd:0.0473006
    66: "embedded_sentence.158" NUMERICAL mean:0.0157188 min:-0.109795 max:0.13194 sd:0.0423631
    67: "embedded_sentence.159" NUMERICAL mean:0.00616692 min:-0.0996693 max:0.121898 sd:0.0405747
    68: "embedded_sentence.16" NUMERICAL mean:0.0122186 min:-0.132531 max:0.112023 sd:0.0412513
    69: "embedded_sentence.160" NUMERICAL mean:0.00140896 min:-0.125797 max:0.10415 sd:0.0422833
    70: "embedded_sentence.161" NUMERICAL mean:-0.00968098 min:-0.107129 max:0.109673 sd:0.0389125
    71: "embedded_sentence.162" NUMERICAL mean:0.0174977 min:-0.102559 max:0.117249 sd:0.0394065
    72: "embedded_sentence.163" NUMERICAL mean:-0.01559 min:-0.117529 max:0.132716 sd:0.0422287
    73: "embedded_sentence.164" NUMERICAL mean:0.0103332 min:-0.131635 max:0.117116 sd:0.0432647
    74: "embedded_sentence.165" NUMERICAL mean:0.0164754 min:-0.111395 max:0.106868 sd:0.03591
    75: "embedded_sentence.166" NUMERICAL mean:-0.0300909 min:-0.110079 max:0.138071 sd:0.0393771
    76: "embedded_sentence.167" NUMERICAL mean:-0.00284721 min:-0.113047 max:0.1113 sd:0.0402787
    77: "embedded_sentence.168" NUMERICAL mean:0.0128449 min:-0.123295 max:0.101678 sd:0.035443
    78: "embedded_sentence.169" NUMERICAL mean:-0.0018307 min:-0.113497 max:0.108755 sd:0.0385736
    79: "embedded_sentence.17" NUMERICAL mean:0.0112924 min:-0.118483 max:0.109047 sd:0.0411375
    80: "embedded_sentence.170" NUMERICAL mean:-0.0154471 min:-0.123997 max:0.0995884 sd:0.039095
    81: "embedded_sentence.171" NUMERICAL mean:-0.0115266 min:-0.135629 max:0.111586 sd:0.0564499
    82: "embedded_sentence.172" NUMERICAL mean:-0.00305818 min:-0.108149 max:0.125287 sd:0.0416153
    83: "embedded_sentence.173" NUMERICAL mean:-0.0192183 min:-0.128661 max:0.111586 sd:0.0445312
    84: "embedded_sentence.174" NUMERICAL mean:-0.00547071 min:-0.106778 max:0.107318 sd:0.0412694
    85: "embedded_sentence.175" NUMERICAL mean:0.00303105 min:-0.114183 max:0.11671 sd:0.037753
    86: "embedded_sentence.176" NUMERICAL mean:0.0200632 min:-0.119154 max:0.12262 sd:0.0449386
    87: "embedded_sentence.177" NUMERICAL mean:0.00830421 min:-0.106867 max:0.108159 sd:0.04212
    88: "embedded_sentence.178" NUMERICAL mean:0.00879771 min:-0.119236 max:0.0975505 sd:0.0365596
    89: "embedded_sentence.179" NUMERICAL mean:-0.0224472 min:-0.141699 max:0.121597 sd:0.0451563
    90: "embedded_sentence.18" NUMERICAL mean:0.0161367 min:-0.103659 max:0.106467 sd:0.0396646
    91: "embedded_sentence.180" NUMERICAL mean:0.00700458 min:-0.122243 max:0.106828 sd:0.0406674
    92: "embedded_sentence.181" NUMERICAL mean:0.015665 min:-0.123784 max:0.117493 sd:0.0423638
    93: "embedded_sentence.182" NUMERICAL mean:0.00455087 min:-0.130433 max:0.129947 sd:0.0468312
    94: "embedded_sentence.183" NUMERICAL mean:0.00469912 min:-0.105513 max:0.115268 sd:0.0422015
    95: "embedded_sentence.184" NUMERICAL mean:0.00118913 min:-0.132085 max:0.119005 sd:0.0425006
    96: "embedded_sentence.185" NUMERICAL mean:-0.0091211 min:-0.105384 max:0.107321 sd:0.0394833
    97: "embedded_sentence.186" NUMERICAL mean:0.00847289 min:-0.100142 max:0.11416 sd:0.0354507
    98: "embedded_sentence.187" NUMERICAL mean:0.00401229 min:-0.0997345 max:0.0985512 sd:0.0330015
    99: "embedded_sentence.188" NUMERICAL mean:0.0375059 min:-0.107009 max:0.147423 sd:0.0457626
    100: "embedded_sentence.189" NUMERICAL mean:-0.0108558 min:-0.158798 max:0.124698 sd:0.0429543
    101: "embedded_sentence.19" NUMERICAL mean:0.000475908 min:-0.126049 max:0.109106 sd:0.0416907
    102: "embedded_sentence.190" NUMERICAL mean:0.0055649 min:-0.102637 max:0.112907 sd:0.0428818
    103: "embedded_sentence.191" NUMERICAL mean:0.0115727 min:-0.0992453 max:0.114756 sd:0.0385606
    104: "embedded_sentence.192" NUMERICAL mean:0.0188207 min:-0.10799 max:0.126446 sd:0.0480458
    105: "embedded_sentence.193" NUMERICAL mean:-0.0231128 min:-0.125829 max:0.098485 sd:0.0413616
    106: "embedded_sentence.194" NUMERICAL mean:-0.0125518 min:-0.118983 max:0.111524 sd:0.0394032
    107: "embedded_sentence.195" NUMERICAL mean:-0.00734374 min:-0.140773 max:0.124731 sd:0.048662
    108: "embedded_sentence.196" NUMERICAL mean:0.0147101 min:-0.109208 max:0.114207 sd:0.0392372
    109: "embedded_sentence.197" NUMERICAL mean:0.00382817 min:-0.0960263 max:0.109744 sd:0.0343786
    110: "embedded_sentence.198" NUMERICAL mean:0.0148358 min:-0.121261 max:0.137886 sd:0.0396124
    111: "embedded_sentence.199" NUMERICAL mean:0.0139377 min:-0.133057 max:0.129123 sd:0.0434494
    112: "embedded_sentence.2" NUMERICAL mean:0.00763253 min:-0.102393 max:0.126418 sd:0.0391092
    113: "embedded_sentence.20" NUMERICAL mean:0.0067624 min:-0.117482 max:0.140442 sd:0.0473874
    114: "embedded_sentence.200" NUMERICAL mean:-0.022174 min:-0.135182 max:0.0998059 sd:0.0447171
    115: "embedded_sentence.201" NUMERICAL mean:0.00918432 min:-0.129768 max:0.104146 sd:0.0407455
    116: "embedded_sentence.202" NUMERICAL mean:6.68974e-05 min:-0.108528 max:0.112123 sd:0.039669
    117: "embedded_sentence.203" NUMERICAL mean:-0.0211792 min:-0.138447 max:0.151201 sd:0.0475548
    118: "embedded_sentence.204" NUMERICAL mean:0.0149458 min:-0.114192 max:0.121993 sd:0.0451805
    119: "embedded_sentence.205" NUMERICAL mean:-0.000877425 min:-0.106281 max:0.110069 sd:0.0399283
    120: "embedded_sentence.206" NUMERICAL mean:0.00135042 min:-0.122458 max:0.133155 sd:0.0490798
    121: "embedded_sentence.207" NUMERICAL mean:-0.00564686 min:-0.0980346 max:0.124534 sd:0.0381495
    122: "embedded_sentence.208" NUMERICAL mean:-0.0137386 min:-0.104712 max:0.116268 sd:0.0380542
    123: "embedded_sentence.209" NUMERICAL mean:-0.000932724 min:-0.120575 max:0.106782 sd:0.0389735
    124: "embedded_sentence.21" NUMERICAL mean:-0.0103802 min:-0.141084 max:0.11384 sd:0.0543033
    125: "embedded_sentence.210" NUMERICAL mean:-0.0221436 min:-0.11615 max:0.110612 sd:0.0375885
    126: "embedded_sentence.211" NUMERICAL mean:0.00739621 min:-0.107881 max:0.139283 sd:0.0380559
    127: "embedded_sentence.212" NUMERICAL mean:0.000771754 min:-0.130277 max:0.118151 sd:0.0457612
    128: "embedded_sentence.213" NUMERICAL mean:-0.00631693 min:-0.113811 max:0.122369 sd:0.0420019
    129: "embedded_sentence.214" NUMERICAL mean:-0.0190752 min:-0.130814 max:0.12256 sd:0.0462656
    130: "embedded_sentence.215" NUMERICAL mean:0.00351438 min:-0.119497 max:0.112531 sd:0.0389063
    131: "embedded_sentence.216" NUMERICAL mean:-0.00563816 min:-0.113327 max:0.108573 sd:0.0398438
    132: "embedded_sentence.217" NUMERICAL mean:-0.0128165 min:-0.152494 max:0.112129 sd:0.0435284
    133: "embedded_sentence.218" NUMERICAL mean:-0.000746105 min:-0.115932 max:0.103357 sd:0.0396475
    134: "embedded_sentence.219" NUMERICAL mean:0.00706257 min:-0.105737 max:0.115808 sd:0.0415758
    135: "embedded_sentence.22" NUMERICAL mean:0.00470285 min:-0.108062 max:0.127381 sd:0.0465233
    136: "embedded_sentence.220" NUMERICAL mean:0.000614336 min:-0.120866 max:0.10502 sd:0.036915
    137: "embedded_sentence.221" NUMERICAL mean:-0.00315481 min:-0.110209 max:0.126778 sd:0.0398762
    138: "embedded_sentence.222" NUMERICAL mean:-0.0055338 min:-0.112974 max:0.111057 sd:0.0367833
    139: "embedded_sentence.223" NUMERICAL mean:0.0129532 min:-0.108908 max:0.112232 sd:0.0406737
    140: "embedded_sentence.224" NUMERICAL mean:-0.0195448 min:-0.112833 max:0.122565 sd:0.0423641
    141: "embedded_sentence.225" NUMERICAL mean:0.00715641 min:-0.136763 max:0.123146 sd:0.0455536
    142: "embedded_sentence.226" NUMERICAL mean:0.0105978 min:-0.121166 max:0.125465 sd:0.0433322
    143: "embedded_sentence.227" NUMERICAL mean:-0.00822156 min:-0.131487 max:0.125193 sd:0.0440489
    144: "embedded_sentence.228" NUMERICAL mean:0.0119113 min:-0.109956 max:0.107868 sd:0.0382855
    145: "embedded_sentence.229" NUMERICAL mean:-0.00739044 min:-0.116468 max:0.109886 sd:0.0406385
    146: "embedded_sentence.23" NUMERICAL mean:0.00203851 min:-0.116632 max:0.116226 sd:0.0400387
    147: "embedded_sentence.230" NUMERICAL mean:0.00819752 min:-0.100016 max:0.125019 sd:0.041894
    148: "embedded_sentence.231" NUMERICAL mean:-0.00420582 min:-0.139816 max:0.138647 sd:0.0446602
    149: "embedded_sentence.232" NUMERICAL mean:0.00810722 min:-0.11301 max:0.106853 sd:0.0400325
    150: "embedded_sentence.233" NUMERICAL mean:0.0561205 min:-0.110581 max:0.182053 sd:0.0645425
    151: "embedded_sentence.234" NUMERICAL mean:0.0202212 min:-0.109987 max:0.116562 sd:0.0374199
    152: "embedded_sentence.235" NUMERICAL mean:-0.0125547 min:-0.104766 max:0.115993 sd:0.0383767
    153: "embedded_sentence.236" NUMERICAL mean:0.00228544 min:-0.126092 max:0.125991 sd:0.0403744
    154: "embedded_sentence.237" NUMERICAL mean:-0.00306858 min:-0.107907 max:0.109284 sd:0.0409564
    155: "embedded_sentence.238" NUMERICAL mean:-0.00930815 min:-0.156445 max:0.107558 sd:0.0437983
    156: "embedded_sentence.239" NUMERICAL mean:0.00958206 min:-0.112118 max:0.1195 sd:0.0451739
    157: "embedded_sentence.24" NUMERICAL mean:-0.000927636 min:-0.127188 max:0.105079 sd:0.042448
    158: "embedded_sentence.240" NUMERICAL mean:-0.00998686 min:-0.125181 max:0.107936 sd:0.0414998
    159: "embedded_sentence.241" NUMERICAL mean:-0.00128156 min:-0.103688 max:0.109599 sd:0.0377828
    160: "embedded_sentence.242" NUMERICAL mean:-0.000524396 min:-0.141003 max:0.114016 sd:0.050088
    161: "embedded_sentence.243" NUMERICAL mean:-0.000359091 min:-0.114483 max:0.130721 sd:0.0418654
    162: "embedded_sentence.244" NUMERICAL mean:0.0161613 min:-0.103932 max:0.116754 sd:0.0401808
    163: "embedded_sentence.245" NUMERICAL mean:0.0275608 min:-0.127227 max:0.143614 sd:0.0465002
    164: "embedded_sentence.246" NUMERICAL mean:-0.0199729 min:-0.107911 max:0.114303 sd:0.037755
    165: "embedded_sentence.247" NUMERICAL mean:-0.00782877 min:-0.104362 max:0.11543 sd:0.041834
    166: "embedded_sentence.248" NUMERICAL mean:-0.000544771 min:-0.159329 max:0.155847 sd:0.0543164
    167: "embedded_sentence.249" NUMERICAL mean:-0.0101255 min:-0.116432 max:0.107342 sd:0.0401119
    168: "embedded_sentence.25" NUMERICAL mean:0.0111641 min:-0.114852 max:0.110724 sd:0.0379149
    169: "embedded_sentence.250" NUMERICAL mean:0.0161291 min:-0.12229 max:0.109533 sd:0.0372791
    170: "embedded_sentence.251" NUMERICAL mean:-0.000411384 min:-0.118338 max:0.116215 sd:0.0459737
    171: "embedded_sentence.252" NUMERICAL mean:-0.00268351 min:-0.108327 max:0.109842 sd:0.037631
    172: "embedded_sentence.253" NUMERICAL mean:-0.00246653 min:-0.107393 max:0.114115 sd:0.0386872
    173: "embedded_sentence.254" NUMERICAL mean:0.00223856 min:-0.122731 max:0.140702 sd:0.0447316
    174: "embedded_sentence.255" NUMERICAL mean:0.00186748 min:-0.128662 max:0.107003 sd:0.0409741
    175: "embedded_sentence.256" NUMERICAL mean:0.00786944 min:-0.113685 max:0.118287 sd:0.0418721
    176: "embedded_sentence.257" NUMERICAL mean:-0.00450053 min:-0.117383 max:0.138567 sd:0.0535368
    177: "embedded_sentence.258" NUMERICAL mean:0.0128997 min:-0.109905 max:0.118147 sd:0.0393103
    178: "embedded_sentence.259" NUMERICAL mean:0.00794854 min:-0.10424 max:0.111261 sd:0.0380286
    179: "embedded_sentence.26" NUMERICAL mean:-0.00327954 min:-0.105336 max:0.104934 sd:0.0414663
    180: "embedded_sentence.260" NUMERICAL mean:-0.0117858 min:-0.116906 max:0.103426 sd:0.0360148
    181: "embedded_sentence.261" NUMERICAL mean:0.00338883 min:-0.113532 max:0.114904 sd:0.0432436
    182: "embedded_sentence.262" NUMERICAL mean:0.00238206 min:-0.149582 max:0.13639 sd:0.0507155
    183: "embedded_sentence.263" NUMERICAL mean:-0.0103074 min:-0.140884 max:0.117382 sd:0.0508164
    184: "embedded_sentence.264" NUMERICAL mean:0.00478302 min:-0.104717 max:0.125411 sd:0.0411592
    185: "embedded_sentence.265" NUMERICAL mean:0.00418632 min:-0.111659 max:0.125069 sd:0.0400184
    186: "embedded_sentence.266" NUMERICAL mean:-0.0065648 min:-0.115424 max:0.115422 sd:0.040284
    187: "embedded_sentence.267" NUMERICAL mean:-0.0108974 min:-0.140032 max:0.108537 sd:0.0416651
    188: "embedded_sentence.268" NUMERICAL mean:0.021397 min:-0.110922 max:0.120673 sd:0.0416704
    189: "embedded_sentence.269" NUMERICAL mean:-0.00266875 min:-0.108534 max:0.116014 sd:0.0454318
    190: "embedded_sentence.27" NUMERICAL mean:-0.00290058 min:-0.116482 max:0.113443 sd:0.0406192
    191: "embedded_sentence.270" NUMERICAL mean:0.00904486 min:-0.130418 max:0.158166 sd:0.0548252
    192: "embedded_sentence.271" NUMERICAL mean:0.00193987 min:-0.137558 max:0.14649 sd:0.0508115
    193: "embedded_sentence.272" NUMERICAL mean:-0.000186977 min:-0.116413 max:0.0989802 sd:0.0402487
    194: "embedded_sentence.273" NUMERICAL mean:0.006326 min:-0.115043 max:0.107482 sd:0.0416155
    195: "embedded_sentence.274" NUMERICAL mean:-0.000278915 min:-0.115695 max:0.105325 sd:0.0406986
    196: "embedded_sentence.275" NUMERICAL mean:-0.0102959 min:-0.099434 max:0.128947 sd:0.0361354
    197: "embedded_sentence.276" NUMERICAL mean:-0.0207918 min:-0.116139 max:0.110566 sd:0.0419115
    198: "embedded_sentence.277" NUMERICAL mean:-0.0146824 min:-0.127741 max:0.101543 sd:0.0430422
    199: "embedded_sentence.278" NUMERICAL mean:0.0187157 min:-0.109012 max:0.119525 sd:0.0469243
    200: "embedded_sentence.279" NUMERICAL mean:0.0080616 min:-0.117272 max:0.138517 sd:0.0500966
    201: "embedded_sentence.28" NUMERICAL mean:0.0028253 min:-0.110413 max:0.123963 sd:0.0427868
    202: "embedded_sentence.280" NUMERICAL mean:0.0017946 min:-0.129883 max:0.103422 sd:0.0466893
    203: "embedded_sentence.281" NUMERICAL mean:-0.00546588 min:-0.123351 max:0.122337 sd:0.044948
    204: "embedded_sentence.282" NUMERICAL mean:-0.00352354 min:-0.114364 max:0.122504 sd:0.0421913
    205: "embedded_sentence.283" NUMERICAL mean:0.00593286 min:-0.104898 max:0.11458 sd:0.0418491
    206: "embedded_sentence.284" NUMERICAL mean:-0.0136068 min:-0.112147 max:0.110563 sd:0.0402539
    207: "embedded_sentence.285" NUMERICAL mean:-0.0148682 min:-0.143126 max:0.121947 sd:0.0652969
    208: "embedded_sentence.286" NUMERICAL mean:0.00865603 min:-0.105883 max:0.116117 sd:0.0411941
    209: "embedded_sentence.287" NUMERICAL mean:0.00838776 min:-0.103808 max:0.118732 sd:0.0400033
    210: "embedded_sentence.288" NUMERICAL mean:-0.004587 min:-0.126515 max:0.110044 sd:0.0429655
    211: "embedded_sentence.289" NUMERICAL mean:0.022459 min:-0.101127 max:0.122341 sd:0.0412413
    212: "embedded_sentence.29" NUMERICAL mean:0.0282239 min:-0.104219 max:0.143075 sd:0.0487783
    213: "embedded_sentence.290" NUMERICAL mean:-0.010227 min:-0.104646 max:0.11767 sd:0.0391759
    214: "embedded_sentence.291" NUMERICAL mean:0.0479376 min:-0.118972 max:0.140115 sd:0.0441115
    215: "embedded_sentence.292" NUMERICAL mean:-0.012885 min:-0.13523 max:0.1102 sd:0.044191
    216: "embedded_sentence.293" NUMERICAL mean:-0.00582894 min:-0.118518 max:0.1084 sd:0.0424979
    217: "embedded_sentence.294" NUMERICAL mean:0.00673141 min:-0.123867 max:0.135324 sd:0.0469895
    218: "embedded_sentence.295" NUMERICAL mean:0.00592276 min:-0.109027 max:0.121098 sd:0.0376266
    219: "embedded_sentence.296" NUMERICAL mean:-0.000323969 min:-0.132564 max:0.106466 sd:0.0429391
    220: "embedded_sentence.297" NUMERICAL mean:0.00159954 min:-0.10937 max:0.112449 sd:0.0405972
    221: "embedded_sentence.298" NUMERICAL mean:0.0203997 min:-0.130037 max:0.102531 sd:0.0376077
    222: "embedded_sentence.299" NUMERICAL mean:0.00443814 min:-0.126552 max:0.0985593 sd:0.0406299
    223: "embedded_sentence.3" NUMERICAL mean:-0.000732218 min:-0.109626 max:0.10121 sd:0.0369311
    224: "embedded_sentence.30" NUMERICAL mean:0.0119399 min:-0.10224 max:0.123741 sd:0.0407582
    225: "embedded_sentence.300" NUMERICAL mean:0.00640362 min:-0.109722 max:0.113832 sd:0.042602
    226: "embedded_sentence.301" NUMERICAL mean:0.00300331 min:-0.10537 max:0.1057 sd:0.0400365
    227: "embedded_sentence.302" NUMERICAL mean:0.0105726 min:-0.125406 max:0.125337 sd:0.0386879
    228: "embedded_sentence.303" NUMERICAL mean:-0.00682487 min:-0.119722 max:0.122495 sd:0.0397744
    229: "embedded_sentence.304" NUMERICAL mean:0.0134615 min:-0.113637 max:0.104308 sd:0.0364568
    230: "embedded_sentence.305" NUMERICAL mean:0.00644908 min:-0.106984 max:0.118193 sd:0.0378877
    231: "embedded_sentence.306" NUMERICAL mean:0.00721292 min:-0.106136 max:0.112877 sd:0.0413748
    232: "embedded_sentence.307" NUMERICAL mean:-0.00382715 min:-0.104953 max:0.0990278 sd:0.0384972
    233: "embedded_sentence.308" NUMERICAL mean:6.43178e-05 min:-0.120151 max:0.118558 sd:0.0443767
    234: "embedded_sentence.309" NUMERICAL mean:0.00712577 min:-0.118636 max:0.108645 sd:0.0429865
    235: "embedded_sentence.31" NUMERICAL mean:-0.00879883 min:-0.106952 max:0.114961 sd:0.0397046
    236: "embedded_sentence.310" NUMERICAL mean:0.00668597 min:-0.10649 max:0.116392 sd:0.040195
    237: "embedded_sentence.311" NUMERICAL mean:-0.000903381 min:-0.119513 max:0.131158 sd:0.0420348
    238: "embedded_sentence.312" NUMERICAL mean:0.0107332 min:-0.113776 max:0.112523 sd:0.0408102
    239: "embedded_sentence.313" NUMERICAL mean:0.00918225 min:-0.103286 max:0.106814 sd:0.03942
    240: "embedded_sentence.314" NUMERICAL mean:0.00465102 min:-0.110279 max:0.117252 sd:0.0393894
    241: "embedded_sentence.315" NUMERICAL mean:-0.00789822 min:-0.107114 max:0.11401 sd:0.0388347
    242: "embedded_sentence.316" NUMERICAL mean:0.003646 min:-0.115399 max:0.102757 sd:0.0402218
    243: "embedded_sentence.317" NUMERICAL mean:0.015828 min:-0.115321 max:0.130694 sd:0.0440749
    244: "embedded_sentence.318" NUMERICAL mean:-0.0205412 min:-0.115586 max:0.144723 sd:0.0485943
    245: "embedded_sentence.319" NUMERICAL mean:0.00661137 min:-0.121465 max:0.11194 sd:0.0411842
    246: "embedded_sentence.32" NUMERICAL mean:-0.00641689 min:-0.109096 max:0.115278 sd:0.0395207
    247: "embedded_sentence.320" NUMERICAL mean:-0.0148287 min:-0.103164 max:0.116781 sd:0.0390764
    248: "embedded_sentence.321" NUMERICAL mean:-0.0216578 min:-0.124605 max:0.115269 sd:0.0434055
    249: "embedded_sentence.322" NUMERICAL mean:0.00985385 min:-0.100306 max:0.1268 sd:0.0390696
    250: "embedded_sentence.323" NUMERICAL mean:0.00628717 min:-0.0997497 max:0.119355 sd:0.0396103
    251: "embedded_sentence.324" NUMERICAL mean:-0.00196284 min:-0.121922 max:0.120337 sd:0.0459949
    252: "embedded_sentence.325" NUMERICAL mean:-0.00537022 min:-0.110575 max:0.123165 sd:0.0455996
    253: "embedded_sentence.326" NUMERICAL mean:0.00455174 min:-0.115791 max:0.104665 sd:0.0401681
    254: "embedded_sentence.327" NUMERICAL mean:-0.00533296 min:-0.130506 max:0.112283 sd:0.0453555
    255: "embedded_sentence.328" NUMERICAL mean:-0.00440578 min:-0.126272 max:0.103891 sd:0.041464
    256: "embedded_sentence.329" NUMERICAL mean:-0.0101936 min:-0.108874 max:0.111676 sd:0.0395482
    257: "embedded_sentence.33" NUMERICAL mean:0.00148918 min:-0.111798 max:0.115585 sd:0.0406788
    258: "embedded_sentence.330" NUMERICAL mean:0.00703036 min:-0.108652 max:0.103578 sd:0.0400975
    259: "embedded_sentence.331" NUMERICAL mean:0.000541923 min:-0.109862 max:0.10999 sd:0.0408574
    260: "embedded_sentence.332" NUMERICAL mean:0.0188891 min:-0.112872 max:0.118079 sd:0.0397373
    261: "embedded_sentence.333" NUMERICAL mean:-0.012192 min:-0.133506 max:0.13836 sd:0.0512842
    262: "embedded_sentence.334" NUMERICAL mean:-0.0265024 min:-0.126857 max:0.097852 sd:0.0420318
    263: "embedded_sentence.335" NUMERICAL mean:0.00215234 min:-0.111504 max:0.116062 sd:0.038159
    264: "embedded_sentence.336" NUMERICAL mean:-0.00825738 min:-0.125886 max:0.10212 sd:0.0376238
    265: "embedded_sentence.337" NUMERICAL mean:-0.0055194 min:-0.105159 max:0.110274 sd:0.0404973
    266: "embedded_sentence.338" NUMERICAL mean:0.0111058 min:-0.103003 max:0.134575 sd:0.0376746
    267: "embedded_sentence.339" NUMERICAL mean:0.00451027 min:-0.116598 max:0.114548 sd:0.0434438
    268: "embedded_sentence.34" NUMERICAL mean:-0.00225704 min:-0.116123 max:0.116634 sd:0.0410024
    269: "embedded_sentence.340" NUMERICAL mean:0.0209382 min:-0.109457 max:0.119971 sd:0.0448743
    270: "embedded_sentence.341" NUMERICAL mean:0.00896807 min:-0.121829 max:0.10898 sd:0.0399955
    271: "embedded_sentence.342" NUMERICAL mean:-0.00661843 min:-0.113602 max:0.112046 sd:0.0417717
    272: "embedded_sentence.343" NUMERICAL mean:-0.00921778 min:-0.112399 max:0.116532 sd:0.0399069
    273: "embedded_sentence.344" NUMERICAL mean:0.00135801 min:-0.121002 max:0.0829257 sd:0.0322146
    274: "embedded_sentence.345" NUMERICAL mean:0.00347003 min:-0.131471 max:0.101491 sd:0.0404394
    275: "embedded_sentence.346" NUMERICAL mean:-0.00118125 min:-0.14804 max:0.11391 sd:0.0423848
    276: "embedded_sentence.347" NUMERICAL mean:-0.00893261 min:-0.125488 max:0.109213 sd:0.0498338
    277: "embedded_sentence.348" NUMERICAL mean:-0.0112279 min:-0.119783 max:0.106986 sd:0.039883
    278: "embedded_sentence.349" NUMERICAL mean:0.00921196 min:-0.108645 max:0.124485 sd:0.0427417
    279: "embedded_sentence.35" NUMERICAL mean:0.0139088 min:-0.11982 max:0.117347 sd:0.0412498
    280: "embedded_sentence.350" NUMERICAL mean:-0.0064119 min:-0.11853 max:0.108147 sd:0.0396107
    281: "embedded_sentence.351" NUMERICAL mean:0.00046816 min:-0.133059 max:0.106031 sd:0.0419676
    282: "embedded_sentence.352" NUMERICAL mean:0.00143986 min:-0.119083 max:0.0987318 sd:0.0358907
    283: "embedded_sentence.353" NUMERICAL mean:0.00247002 min:-0.109389 max:0.118887 sd:0.0416032
    284: "embedded_sentence.354" NUMERICAL mean:0.000102879 min:-0.139157 max:0.0995683 sd:0.0394998
    285: "embedded_sentence.355" NUMERICAL mean:-0.00525663 min:-0.146684 max:0.104288 sd:0.0406929
    286: "embedded_sentence.356" NUMERICAL mean:-0.0884722 min:-0.132905 max:0.0538598 sd:0.0108211
    287: "embedded_sentence.357" NUMERICAL mean:0.00677648 min:-0.110339 max:0.110136 sd:0.0402613
    288: "embedded_sentence.358" NUMERICAL mean:0.00630266 min:-0.111695 max:0.115859 sd:0.0427588
    289: "embedded_sentence.359" NUMERICAL mean:0.00225805 min:-0.126003 max:0.117678 sd:0.0444635
    290: "embedded_sentence.36" NUMERICAL mean:-0.00414969 min:-0.117693 max:0.10138 sd:0.0421129
    291: "embedded_sentence.360" NUMERICAL mean:-0.00827234 min:-0.133543 max:0.115376 sd:0.0466799
    292: "embedded_sentence.361" NUMERICAL mean:-0.00625222 min:-0.10512 max:0.123856 sd:0.0418715
    293: "embedded_sentence.362" NUMERICAL mean:0.0651293 min:-0.11562 max:0.153915 sd:0.0359
    294: "embedded_sentence.363" NUMERICAL mean:0.00968887 min:-0.115793 max:0.11435 sd:0.0422501
    295: "embedded_sentence.364" NUMERICAL mean:0.00449241 min:-0.132071 max:0.103237 sd:0.0373741
    296: "embedded_sentence.365" NUMERICAL mean:-0.016221 min:-0.113495 max:0.106975 sd:0.0425689
    297: "embedded_sentence.366" NUMERICAL mean:0.0112515 min:-0.154925 max:0.151612 sd:0.0513015
    298: "embedded_sentence.367" NUMERICAL mean:-5.21384e-05 min:-0.11585 max:0.112307 sd:0.0391906
    299: "embedded_sentence.368" NUMERICAL mean:-0.00112394 min:-0.121213 max:0.126588 sd:0.044652
    300: "embedded_sentence.369" NUMERICAL mean:0.00485578 min:-0.106476 max:0.115632 sd:0.041426
    301: "embedded_sentence.37" NUMERICAL mean:0.00156116 min:-0.114707 max:0.128423 sd:0.0410256
    302: "embedded_sentence.370" NUMERICAL mean:-0.0174785 min:-0.114634 max:0.104434 sd:0.0382088
    303: "embedded_sentence.371" NUMERICAL mean:-0.00559737 min:-0.111149 max:0.115734 sd:0.0402863
    304: "embedded_sentence.372" NUMERICAL mean:-0.00348879 min:-0.108034 max:0.107825 sd:0.0403769
    305: "embedded_sentence.373" NUMERICAL mean:0.0188844 min:-0.127183 max:0.109232 sd:0.0404355
    306: "embedded_sentence.374" NUMERICAL mean:-0.00368462 min:-0.122589 max:0.124831 sd:0.0403308
    307: "embedded_sentence.375" NUMERICAL mean:-0.0106164 min:-0.118052 max:0.150001 sd:0.0432093
    308: "embedded_sentence.376" NUMERICAL mean:0.00311828 min:-0.106068 max:0.11577 sd:0.0400224
    309: "embedded_sentence.377" NUMERICAL mean:-0.0179061 min:-0.125819 max:0.111004 sd:0.0413477
    310: "embedded_sentence.378" NUMERICAL mean:-0.0129489 min:-0.126863 max:0.110993 sd:0.0434155
    311: "embedded_sentence.379" NUMERICAL mean:-0.00801256 min:-0.130591 max:0.112902 sd:0.04366
    312: "embedded_sentence.38" NUMERICAL mean:-0.00506909 min:-0.108533 max:0.113459 sd:0.0408111
    313: "embedded_sentence.380" NUMERICAL mean:-0.00901065 min:-0.109901 max:0.123667 sd:0.0397827
    314: "embedded_sentence.381" NUMERICAL mean:0.00213499 min:-0.117992 max:0.104067 sd:0.0396603
    315: "embedded_sentence.382" NUMERICAL mean:0.0139051 min:-0.116796 max:0.115264 sd:0.041444
    316: "embedded_sentence.383" NUMERICAL mean:0.0015667 min:-0.137801 max:0.121558 sd:0.0446806
    317: "embedded_sentence.384" NUMERICAL mean:0.00590388 min:-0.136462 max:0.15641 sd:0.0551146
    318: "embedded_sentence.385" NUMERICAL mean:-0.0225046 min:-0.125096 max:0.122088 sd:0.0425471
    319: "embedded_sentence.386" NUMERICAL mean:-0.0291993 min:-0.149865 max:0.12312 sd:0.0469557
    320: "embedded_sentence.387" NUMERICAL mean:0.0136623 min:-0.113261 max:0.107316 sd:0.0408869
    321: "embedded_sentence.388" NUMERICAL mean:0.0119563 min:-0.0992984 max:0.118811 sd:0.0415827
    322: "embedded_sentence.389" NUMERICAL mean:1.88279e-05 min:-0.103729 max:0.117051 sd:0.0396793
    323: "embedded_sentence.39" NUMERICAL mean:-0.000564469 min:-0.110207 max:0.123467 sd:0.0405413
    324: "embedded_sentence.390" NUMERICAL mean:0.00614745 min:-0.142472 max:0.132447 sd:0.0483092
    325: "embedded_sentence.391" NUMERICAL mean:-0.00252831 min:-0.111571 max:0.110414 sd:0.0407484
    326: "embedded_sentence.392" NUMERICAL mean:0.00560033 min:-0.106415 max:0.109868 sd:0.0411823
    327: "embedded_sentence.393" NUMERICAL mean:0.000437511 min:-0.115213 max:0.121544 sd:0.0406626
    328: "embedded_sentence.394" NUMERICAL mean:-0.00507897 min:-0.112722 max:0.112578 sd:0.0407212
    329: "embedded_sentence.395" NUMERICAL mean:-0.0104218 min:-0.106171 max:0.13395 sd:0.0412331
    330: "embedded_sentence.396" NUMERICAL mean:-0.025218 min:-0.121914 max:0.13782 sd:0.0420871
    331: "embedded_sentence.397" NUMERICAL mean:-0.00425221 min:-0.117618 max:0.106735 sd:0.0438757
    332: "embedded_sentence.398" NUMERICAL mean:-0.0112567 min:-0.136641 max:0.12107 sd:0.0385446
    333: "embedded_sentence.399" NUMERICAL mean:-0.00238481 min:-0.14689 max:0.132483 sd:0.0512686
    334: "embedded_sentence.4" NUMERICAL mean:0.0126995 min:-0.128462 max:0.120181 sd:0.0460968
    335: "embedded_sentence.40" NUMERICAL mean:0.00390461 min:-0.107059 max:0.128317 sd:0.036992
    336: "embedded_sentence.400" NUMERICAL mean:-0.00854602 min:-0.110339 max:0.123831 sd:0.0428819
    337: "embedded_sentence.401" NUMERICAL mean:-0.0120933 min:-0.110716 max:0.107581 sd:0.0391564
    338: "embedded_sentence.402" NUMERICAL mean:-0.00798588 min:-0.114245 max:0.109355 sd:0.0417294
    339: "embedded_sentence.403" NUMERICAL mean:-0.00715776 min:-0.110958 max:0.109412 sd:0.0426725
    340: "embedded_sentence.404" NUMERICAL mean:0.0421547 min:-0.0936097 max:0.14341 sd:0.0449053
    341: "embedded_sentence.405" NUMERICAL mean:0.0138744 min:-0.101141 max:0.110993 sd:0.0409959
    342: "embedded_sentence.406" NUMERICAL mean:0.0221997 min:-0.10012 max:0.12351 sd:0.0512918
    343: "embedded_sentence.407" NUMERICAL mean:0.00840243 min:-0.100731 max:0.108785 sd:0.0385815
    344: "embedded_sentence.408" NUMERICAL mean:-0.00995255 min:-0.119931 max:0.107382 sd:0.0397331
    345: "embedded_sentence.409" NUMERICAL mean:0.00122281 min:-0.123687 max:0.110221 sd:0.0419264
    346: "embedded_sentence.41" NUMERICAL mean:-0.00821721 min:-0.124872 max:0.101206 sd:0.0389586
    347: "embedded_sentence.410" NUMERICAL mean:0.00722765 min:-0.120324 max:0.118298 sd:0.0397953
    348: "embedded_sentence.411" NUMERICAL mean:0.00372596 min:-0.110838 max:0.104775 sd:0.0397102
    349: "embedded_sentence.412" NUMERICAL mean:0.00750692 min:-0.105861 max:0.113608 sd:0.0404272
    350: "embedded_sentence.413" NUMERICAL mean:0.00702045 min:-0.100497 max:0.109256 sd:0.040414
    351: "embedded_sentence.414" NUMERICAL mean:0.0129925 min:-0.104637 max:0.129069 sd:0.0476144
    352: "embedded_sentence.415" NUMERICAL mean:0.00895771 min:-0.103221 max:0.131867 sd:0.0416565
    353: "embedded_sentence.416" NUMERICAL mean:-0.0113754 min:-0.108457 max:0.108912 sd:0.039076
    354: "embedded_sentence.417" NUMERICAL mean:-0.00972072 min:-0.108896 max:0.120041 sd:0.039969
    355: "embedded_sentence.418" NUMERICAL mean:0.0103305 min:-0.115689 max:0.117791 sd:0.0438928
    356: "embedded_sentence.419" NUMERICAL mean:-0.011858 min:-0.110159 max:0.112286 sd:0.0405172
    357: "embedded_sentence.42" NUMERICAL mean:-0.0263568 min:-0.128555 max:0.12256 sd:0.0438572
    358: "embedded_sentence.420" NUMERICAL mean:0.0113019 min:-0.117355 max:0.110719 sd:0.0390142
    359: "embedded_sentence.421" NUMERICAL mean:-0.00325833 min:-0.11971 max:0.0998387 sd:0.0386342
    360: "embedded_sentence.422" NUMERICAL mean:-0.0175019 min:-0.121014 max:0.108533 sd:0.0430717
    361: "embedded_sentence.423" NUMERICAL mean:0.00661466 min:-0.121052 max:0.104438 sd:0.0401472
    362: "embedded_sentence.424" NUMERICAL mean:0.0157025 min:-0.119043 max:0.121705 sd:0.0455012
    363: "embedded_sentence.425" NUMERICAL mean:0.00671776 min:-0.119955 max:0.135544 sd:0.046337
    364: "embedded_sentence.426" NUMERICAL mean:0.00625655 min:-0.110938 max:0.120801 sd:0.0434661
    365: "embedded_sentence.427" NUMERICAL mean:0.0204839 min:-0.112639 max:0.12859 sd:0.0461795
    366: "embedded_sentence.428" NUMERICAL mean:-0.00954845 min:-0.131481 max:0.103867 sd:0.0409481
    367: "embedded_sentence.429" NUMERICAL mean:0.0227497 min:-0.114759 max:0.128784 sd:0.0461912
    368: "embedded_sentence.43" NUMERICAL mean:-0.00742056 min:-0.132266 max:0.135953 sd:0.0492916
    369: "embedded_sentence.430" NUMERICAL mean:-0.0143054 min:-0.116372 max:0.0982788 sd:0.0397653
    370: "embedded_sentence.431" NUMERICAL mean:0.00108119 min:-0.10975 max:0.113431 sd:0.0395805
    371: "embedded_sentence.432" NUMERICAL mean:-0.0124634 min:-0.128303 max:0.122121 sd:0.043612
    372: "embedded_sentence.433" NUMERICAL mean:-0.000974066 min:-0.127452 max:0.143976 sd:0.0512878
    373: "embedded_sentence.434" NUMERICAL mean:-0.000695708 min:-0.117519 max:0.132419 sd:0.048299
    374: "embedded_sentence.435" NUMERICAL mean:-0.00800422 min:-0.11716 max:0.106095 sd:0.0385783
    375: "embedded_sentence.436" NUMERICAL mean:-0.00449899 min:-0.119801 max:0.13136 sd:0.0450766
    376: "embedded_sentence.437" NUMERICAL mean:0.00152719 min:-0.101368 max:0.111586 sd:0.0373092
    377: "embedded_sentence.438" NUMERICAL mean:-0.00746199 min:-0.110446 max:0.107505 sd:0.0409118
    378: "embedded_sentence.439" NUMERICAL mean:-0.000542517 min:-0.126726 max:0.150725 sd:0.0498822
    379: "embedded_sentence.44" NUMERICAL mean:-0.0136633 min:-0.125995 max:0.100658 sd:0.0357859
    380: "embedded_sentence.440" NUMERICAL mean:0.0162618 min:-0.110413 max:0.112766 sd:0.039636
    381: "embedded_sentence.441" NUMERICAL mean:-0.0252852 min:-0.140847 max:0.123998 sd:0.045552
    382: "embedded_sentence.442" NUMERICAL mean:-0.00971423 min:-0.14093 max:0.115633 sd:0.0430468
    383: "embedded_sentence.443" NUMERICAL mean:-0.00171618 min:-0.130186 max:0.122902 sd:0.0446095
    384: "embedded_sentence.444" NUMERICAL mean:0.0108986 min:-0.114492 max:0.110956 sd:0.0418642
    385: "embedded_sentence.445" NUMERICAL mean:-0.00650931 min:-0.106713 max:0.126819 sd:0.0394136
    386: "embedded_sentence.446" NUMERICAL mean:7.68805e-05 min:-0.107121 max:0.104196 sd:0.0371536
    387: "embedded_sentence.447" NUMERICAL mean:-0.00166973 min:-0.106304 max:0.113193 sd:0.0417721
    388: "embedded_sentence.448" NUMERICAL mean:0.00143107 min:-0.112879 max:0.117707 sd:0.0438514
    389: "embedded_sentence.449" NUMERICAL mean:0.00577755 min:-0.114301 max:0.116267 sd:0.0413021
    390: "embedded_sentence.45" NUMERICAL mean:-0.00672393 min:-0.105793 max:0.106381 sd:0.0395707
    391: "embedded_sentence.450" NUMERICAL mean:0.00523777 min:-0.121324 max:0.109753 sd:0.0422962
    392: "embedded_sentence.451" NUMERICAL mean:0.00232381 min:-0.107421 max:0.116006 sd:0.0411045
    393: "embedded_sentence.452" NUMERICAL mean:0.0131371 min:-0.119915 max:0.110052 sd:0.0388742
    394: "embedded_sentence.453" NUMERICAL mean:0.00384022 min:-0.113448 max:0.103866 sd:0.0399839
    395: "embedded_sentence.454" NUMERICAL mean:0.00746132 min:-0.11867 max:0.107228 sd:0.0393659
    396: "embedded_sentence.455" NUMERICAL mean:0.0217711 min:-0.130108 max:0.130266 sd:0.0457751
    397: "embedded_sentence.456" NUMERICAL mean:-0.00486574 min:-0.125269 max:0.103216 sd:0.0417326
    398: "embedded_sentence.457" NUMERICAL mean:-0.00370284 min:-0.152411 max:0.118391 sd:0.0475716
    399: "embedded_sentence.458" NUMERICAL mean:-0.0252088 min:-0.129244 max:0.110772 sd:0.0447074
    400: "embedded_sentence.459" NUMERICAL mean:0.0196455 min:-0.107007 max:0.110025 sd:0.0371162
    401: "embedded_sentence.46" NUMERICAL mean:-0.00846792 min:-0.137635 max:0.111598 sd:0.0406422
    402: "embedded_sentence.460" NUMERICAL mean:0.00486969 min:-0.133702 max:0.117438 sd:0.0404765
    403: "embedded_sentence.461" NUMERICAL mean:0.00879324 min:-0.123721 max:0.109769 sd:0.0418885
    404: "embedded_sentence.462" NUMERICAL mean:0.00541842 min:-0.103881 max:0.115937 sd:0.040526
    405: "embedded_sentence.463" NUMERICAL mean:0.0112013 min:-0.129965 max:0.125135 sd:0.0445652
    406: "embedded_sentence.464" NUMERICAL mean:-0.00978469 min:-0.112536 max:0.136367 sd:0.0432779
    407: "embedded_sentence.465" NUMERICAL mean:-0.00372292 min:-0.132975 max:0.107404 sd:0.0434915
    408: "embedded_sentence.466" NUMERICAL mean:0.000832961 min:-0.106678 max:0.109534 sd:0.041454
    409: "embedded_sentence.467" NUMERICAL mean:0.0128707 min:-0.123202 max:0.108301 sd:0.036966
    410: "embedded_sentence.468" NUMERICAL mean:0.00143448 min:-0.109754 max:0.115596 sd:0.0410802
    411: "embedded_sentence.469" NUMERICAL mean:0.00821259 min:-0.0968573 max:0.116681 sd:0.037229
    412: "embedded_sentence.47" NUMERICAL mean:0.00542722 min:-0.107879 max:0.112788 sd:0.0407962
    413: "embedded_sentence.470" NUMERICAL mean:-0.0126405 min:-0.11236 max:0.104975 sd:0.0410705
    414: "embedded_sentence.471" NUMERICAL mean:0.00967789 min:-0.114741 max:0.113365 sd:0.0415494
    415: "embedded_sentence.472" NUMERICAL mean:0.0051147 min:-0.116287 max:0.123708 sd:0.038196
    416: "embedded_sentence.473" NUMERICAL mean:0.00460656 min:-0.117806 max:0.116034 sd:0.0417151
    417: "embedded_sentence.474" NUMERICAL mean:-0.00244138 min:-0.103319 max:0.116585 sd:0.0374234
    418: "embedded_sentence.475" NUMERICAL mean:-0.00797766 min:-0.112168 max:0.110854 sd:0.043268
    419: "embedded_sentence.476" NUMERICAL mean:-0.0123356 min:-0.118527 max:0.110389 sd:0.0415487
    420: "embedded_sentence.477" NUMERICAL mean:-0.00891097 min:-0.109911 max:0.114824 sd:0.0409558
    421: "embedded_sentence.478" NUMERICAL mean:0.0531792 min:-0.123494 max:0.14429 sd:0.0446525
    422: "embedded_sentence.479" NUMERICAL mean:0.00310177 min:-0.126525 max:0.135642 sd:0.0508086
    423: "embedded_sentence.48" NUMERICAL mean:0.00416469 min:-0.106566 max:0.110393 sd:0.0387239
    424: "embedded_sentence.480" NUMERICAL mean:0.00178777 min:-0.101512 max:0.111535 sd:0.0393616
    425: "embedded_sentence.481" NUMERICAL mean:0.000436977 min:-0.141595 max:0.116526 sd:0.0498771
    426: "embedded_sentence.482" NUMERICAL mean:0.0139387 min:-0.109079 max:0.125151 sd:0.0395955
    427: "embedded_sentence.483" NUMERICAL mean:-0.0190178 min:-0.116579 max:0.12211 sd:0.0404221
    428: "embedded_sentence.484" NUMERICAL mean:0.0111983 min:-0.115318 max:0.114151 sd:0.0407415
    429: "embedded_sentence.485" NUMERICAL mean:-0.0210413 min:-0.12817 max:0.102505 sd:0.0409111
    430: "embedded_sentence.486" NUMERICAL mean:0.00291598 min:-0.136717 max:0.132649 sd:0.0483605
    431: "embedded_sentence.487" NUMERICAL mean:0.0258506 min:-0.118507 max:0.139141 sd:0.0476916
    432: "embedded_sentence.488" NUMERICAL mean:0.00950834 min:-0.117085 max:0.104573 sd:0.0394689
    433: "embedded_sentence.489" NUMERICAL mean:-0.00655678 min:-0.113501 max:0.116317 sd:0.0412641
    434: "embedded_sentence.49" NUMERICAL mean:0.010748 min:-0.101981 max:0.119391 sd:0.0397083
    435: "embedded_sentence.490" NUMERICAL mean:0.0025444 min:-0.0976397 max:0.133059 sd:0.0391231
    436: "embedded_sentence.491" NUMERICAL mean:-0.00116524 min:-0.115012 max:0.108975 sd:0.0373331
    437: "embedded_sentence.492" NUMERICAL mean:-0.00805514 min:-0.112223 max:0.118394 sd:0.0409569
    438: "embedded_sentence.493" NUMERICAL mean:-0.00381922 min:-0.109779 max:0.113538 sd:0.0375221
    439: "embedded_sentence.494" NUMERICAL mean:0.0192517 min:-0.108658 max:0.118238 sd:0.0414103
    440: "embedded_sentence.495" NUMERICAL mean:-0.00252727 min:-0.118617 max:0.100404 sd:0.0398346
    441: "embedded_sentence.496" NUMERICAL mean:-0.000870086 min:-0.10941 max:0.119059 sd:0.043479
    442: "embedded_sentence.497" NUMERICAL mean:-0.00296294 min:-0.123757 max:0.109776 sd:0.0420959
    443: "embedded_sentence.498" NUMERICAL mean:0.0127804 min:-0.138546 max:0.154906 sd:0.0511673
    444: "embedded_sentence.499" NUMERICAL mean:-0.00481274 min:-0.104637 max:0.112387 sd:0.0419786
    445: "embedded_sentence.5" NUMERICAL mean:0.0120099 min:-0.120963 max:0.118971 sd:0.041685
    446: "embedded_sentence.50" NUMERICAL mean:0.0382225 min:-0.0980938 max:0.129267 sd:0.0373726
    447: "embedded_sentence.500" NUMERICAL mean:-0.012455 min:-0.109502 max:0.102241 sd:0.0402451
    448: "embedded_sentence.501" NUMERICAL mean:-0.0236005 min:-0.117228 max:0.124977 sd:0.0464432
    449: "embedded_sentence.502" NUMERICAL mean:0.00916425 min:-0.128705 max:0.110148 sd:0.0412428
    450: "embedded_sentence.503" NUMERICAL mean:-0.0099854 min:-0.179229 max:0.112813 sd:0.0666002
    451: "embedded_sentence.504" NUMERICAL mean:0.0140659 min:-0.124558 max:0.131239 sd:0.0459631
    452: "embedded_sentence.505" NUMERICAL mean:0.00529723 min:-0.119894 max:0.104362 sd:0.0399805
    453: "embedded_sentence.506" NUMERICAL mean:-0.00319069 min:-0.111178 max:0.108562 sd:0.040611
    454: "embedded_sentence.507" NUMERICAL mean:-0.00332249 min:-0.108088 max:0.118358 sd:0.0396039
    455: "embedded_sentence.508" NUMERICAL mean:-0.00396023 min:-0.11048 max:0.107852 sd:0.0375341
    456: "embedded_sentence.509" NUMERICAL mean:-0.00917504 min:-0.116661 max:0.100524 sd:0.0361387
    457: "embedded_sentence.51" NUMERICAL mean:-0.0244919 min:-0.143322 max:0.151466 sd:0.0569238
    458: "embedded_sentence.510" NUMERICAL mean:0.037723 min:-0.0965472 max:0.140981 sd:0.0479428
    459: "embedded_sentence.511" NUMERICAL mean:0.00788656 min:-0.116457 max:0.102988 sd:0.0402552
    460: "embedded_sentence.52" NUMERICAL mean:0.0137383 min:-0.119567 max:0.149818 sd:0.0480009
    461: "embedded_sentence.53" NUMERICAL mean:-0.00754001 min:-0.119613 max:0.139327 sd:0.0441231
    462: "embedded_sentence.54" NUMERICAL mean:-0.00119265 min:-0.117568 max:0.0984011 sd:0.0386896
    463: "embedded_sentence.55" NUMERICAL mean:-0.00382799 min:-0.113112 max:0.107257 sd:0.0435431
    464: "embedded_sentence.56" NUMERICAL mean:0.00818074 min:-0.145547 max:0.123275 sd:0.0429192
    465: "embedded_sentence.57" NUMERICAL mean:-0.00208038 min:-0.126433 max:0.101673 sd:0.0393041
    466: "embedded_sentence.58" NUMERICAL mean:0.00506083 min:-0.118728 max:0.13801 sd:0.0459501
    467: "embedded_sentence.59" NUMERICAL mean:-0.00110454 min:-0.111315 max:0.10866 sd:0.0384711
    468: "embedded_sentence.6" NUMERICAL mean:0.00266504 min:-0.107839 max:0.108908 sd:0.0381836
    469: "embedded_sentence.60" NUMERICAL mean:-0.00560149 min:-0.126673 max:0.142958 sd:0.0476651
    470: "embedded_sentence.61" NUMERICAL mean:-0.010492 min:-0.116135 max:0.117787 sd:0.0398593
    471: "embedded_sentence.62" NUMERICAL mean:-0.0196407 min:-0.143423 max:0.104133 sd:0.0483823
    472: "embedded_sentence.63" NUMERICAL mean:0.0072672 min:-0.134359 max:0.115527 sd:0.0442733
    473: "embedded_sentence.64" NUMERICAL mean:-0.00813338 min:-0.104328 max:0.11042 sd:0.0378631
    474: "embedded_sentence.65" NUMERICAL mean:0.0252276 min:-0.134246 max:0.126575 sd:0.0404105
    475: "embedded_sentence.66" NUMERICAL mean:0.0121496 min:-0.121565 max:0.115153 sd:0.0399014
    476: "embedded_sentence.67" NUMERICAL mean:0.000328628 min:-0.108976 max:0.10698 sd:0.0409231
    477: "embedded_sentence.68" NUMERICAL mean:0.0209823 min:-0.111598 max:0.12123 sd:0.0391018
    478: "embedded_sentence.69" NUMERICAL mean:0.00544792 min:-0.108988 max:0.126124 sd:0.0422695
    479: "embedded_sentence.7" NUMERICAL mean:-0.00274169 min:-0.104539 max:0.13168 sd:0.0381854
    480: "embedded_sentence.70" NUMERICAL mean:-0.000593016 min:-0.119492 max:0.113604 sd:0.0415354
    481: "embedded_sentence.71" NUMERICAL mean:-0.000604193 min:-0.128741 max:0.107355 sd:0.0426992
    482: "embedded_sentence.72" NUMERICAL mean:-0.00433507 min:-0.113435 max:0.102836 sd:0.0414469
    483: "embedded_sentence.73" NUMERICAL mean:-0.0101648 min:-0.10628 max:0.119432 sd:0.0400882
    484: "embedded_sentence.74" NUMERICAL mean:0.0132994 min:-0.123574 max:0.103854 sd:0.0381882
    485: "embedded_sentence.75" NUMERICAL mean:-0.00154112 min:-0.135068 max:0.106161 sd:0.0393081
    486: "embedded_sentence.76" NUMERICAL mean:-0.0107704 min:-0.106198 max:0.106547 sd:0.0380247
    487: "embedded_sentence.77" NUMERICAL mean:0.0151205 min:-0.0985188 max:0.107297 sd:0.0381537
    488: "embedded_sentence.78" NUMERICAL mean:0.00829679 min:-0.102936 max:0.116536 sd:0.0410818
    489: "embedded_sentence.79" NUMERICAL mean:0.00578581 min:-0.156252 max:0.125833 sd:0.0489822
    490: "embedded_sentence.8" NUMERICAL mean:0.0078143 min:-0.1422 max:0.125118 sd:0.0480273
    491: "embedded_sentence.80" NUMERICAL mean:-0.00466792 min:-0.10975 max:0.118669 sd:0.0422673
    492: "embedded_sentence.81" NUMERICAL mean:0.00499065 min:-0.0934409 max:0.115151 sd:0.0382445
    493: "embedded_sentence.82" NUMERICAL mean:-0.0120384 min:-0.115119 max:0.109741 sd:0.039712
    494: "embedded_sentence.83" NUMERICAL mean:-0.0116498 min:-0.107953 max:0.113206 sd:0.0408114
    495: "embedded_sentence.84" NUMERICAL mean:-0.0210408 min:-0.108707 max:0.0992159 sd:0.0386516
    496: "embedded_sentence.85" NUMERICAL mean:-0.00273396 min:-0.12944 max:0.12272 sd:0.0449487
    497: "embedded_sentence.86" NUMERICAL mean:0.00658216 min:-0.113506 max:0.112219 sd:0.039801
    498: "embedded_sentence.87" NUMERICAL mean:-0.00378743 min:-0.117676 max:0.109386 sd:0.0402421
    499: "embedded_sentence.88" NUMERICAL mean:-0.0205237 min:-0.107587 max:0.103141 sd:0.040405
    500: "embedded_sentence.89" NUMERICAL mean:-0.000411177 min:-0.119937 max:0.109877 sd:0.0421414
    501: "embedded_sentence.9" NUMERICAL mean:0.0295029 min:-0.128134 max:0.118291 sd:0.0394542
    502: "embedded_sentence.90" NUMERICAL mean:-0.00181531 min:-0.117795 max:0.106343 sd:0.0421115
    503: "embedded_sentence.91" NUMERICAL mean:-0.00550051 min:-0.127822 max:0.113907 sd:0.0399804
    504: "embedded_sentence.92" NUMERICAL mean:-0.00547455 min:-0.126723 max:0.119811 sd:0.0431932
    505: "embedded_sentence.93" NUMERICAL mean:0.014195 min:-0.105489 max:0.118567 sd:0.0413103
    506: "embedded_sentence.94" NUMERICAL mean:0.0188997 min:-0.104824 max:0.132286 sd:0.0497162
    507: "embedded_sentence.95" NUMERICAL mean:0.00497901 min:-0.108731 max:0.124192 sd:0.0414468
    508: "embedded_sentence.96" NUMERICAL mean:-0.0179242 min:-0.125507 max:0.10199 sd:0.0383211
    509: "embedded_sentence.97" NUMERICAL mean:0.00327183 min:-0.122499 max:0.123037 sd:0.0419092
    510: "embedded_sentence.98" NUMERICAL mean:0.0216785 min:-0.10081 max:0.116099 sd:0.0479454
    511: "embedded_sentence.99" NUMERICAL mean:0.019005 min:-0.125922 max:0.117505 sd:0.0429193

CATEGORICAL: 1 (0.194932%)
    512: "__LABEL" CATEGORICAL integerized vocab-size:3 no-ood-item

Terminology:
    nas: Number of non-available (i.e. missing) values.
    ood: Out of dictionary.
    manually-defined: Attribute which type is manually defined by the user i.e. the type was not automatically inferred.
    tokenized: The attribute value is obtained through tokenization.
    has-dict: The attribute is attached to a string dictionary e.g. a categorical attribute stored as a string.
    vocab-size: Number of unique values.

[INFO kernel.cc:762] Configure learner
[INFO kernel.cc:787] Training config:
learner: "RANDOM_FOREST"
features: "embedded_sentence\\.0"
features: "embedded_sentence\\.1"
features: "embedded_sentence\\.10"
features: "embedded_sentence\\.100"
features: "embedded_sentence\\.101"
features: "embedded_sentence\\.102"
features: "embedded_sentence\\.103"
features: "embedded_sentence\\.104"
features: "embedded_sentence\\.105"
features: "embedded_sentence\\.106"
features: "embedded_sentence\\.107"
features: "embedded_sentence\\.108"
features: "embedded_sentence\\.109"
features: "embedded_sentence\\.11"
features: "embedded_sentence\\.110"
features: "embedded_sentence\\.111"
features: "embedded_sentence\\.112"
features: "embedded_sentence\\.113"
features: "embedded_sentence\\.114"
features: "embedded_sentence\\.115"
features: "embedded_sentence\\.116"
features: "embedded_sentence\\.117"
features: "embedded_sentence\\.118"
features: "embedded_sentence\\.119"
features: "embedded_sentence\\.12"
features: "embedded_sentence\\.120"
features: "embedded_sentence\\.121"
features: "embedded_sentence\\.122"
features: "embedded_sentence\\.123"
features: "embedded_sentence\\.124"
features: "embedded_sentence\\.125"
features: "embedded_sentence\\.126"
features: "embedded_sentence\\.127"
features: "embedded_sentence\\.128"
features: "embedded_sentence\\.129"
features: "embedded_sentence\\.13"
features: "embedded_sentence\\.130"
features: "embedded_sentence\\.131"
features: "embedded_sentence\\.132"
features: "embedded_sentence\\.133"
features: "embedded_sentence\\.134"
features: "embedded_sentence\\.135"
features: "embedded_sentence\\.136"
features: "embedded_sentence\\.137"
features: "embedded_sentence\\.138"
features: "embedded_sentence\\.139"
features: "embedded_sentence\\.14"
features: "embedded_sentence\\.140"
features: "embedded_sentence\\.141"
features: "embedded_sentence\\.142"
features: "embedded_sentence\\.143"
features: "embedded_sentence\\.144"
features: "embedded_sentence\\.145"
features: "embedded_sentence\\.146"
features: "embedded_sentence\\.147"
features: "embedded_sentence\\.148"
features: "embedded_sentence\\.149"
features: "embedded_sentence\\.15"
features: "embedded_sentence\\.150"
features: "embedded_sentence\\.151"
features: "embedded_sentence\\.152"
features: "embedded_sentence\\.153"
features: "embedded_sentence\\.154"
features: "embedded_sentence\\.155"
features: "embedded_sentence\\.156"
features: "embedded_sentence\\.157"
features: "embedded_sentence\\.158"
features: "embedded_sentence\\.159"
features: "embedded_sentence\\.16"
features: "embedded_sentence\\.160"
features: "embedded_sentence\\.161"
features: "embedded_sentence\\.162"
features: "embedded_sentence\\.163"
features: "embedded_sentence\\.164"
features: "embedded_sentence\\.165"
features: "embedded_sentence\\.166"
features: "embedded_sentence\\.167"
features: "embedded_sentence\\.168"
features: "embedded_sentence\\.169"
features: "embedded_sentence\\.17"
features: "embedded_sentence\\.170"
features: "embedded_sentence\\.171"
features: "embedded_sentence\\.172"
features: "embedded_sentence\\.173"
features: "embedded_sentence\\.174"
features: "embedded_sentence\\.175"
features: "embedded_sentence\\.176"
features: "embedded_sentence\\.177"
features: "embedded_sentence\\.178"
features: "embedded_sentence\\.179"
features: "embedded_sentence\\.18"
features: "embedded_sentence\\.180"
features: "embedded_sentence\\.181"
features: "embedded_sentence\\.182"
features: "embedded_sentence\\.183"
features: "embedded_sentence\\.184"
features: "embedded_sentence\\.185"
features: "embedded_sentence\\.186"
features: "embedded_sentence\\.187"
features: "embedded_sentence\\.188"
features: "embedded_sentence\\.189"
features: "embedded_sentence\\.19"
features: "embedded_sentence\\.190"
features: "embedded_sentence\\.191"
features: "embedded_sentence\\.192"
features: "embedded_sentence\\.193"
features: "embedded_sentence\\.194"
features: "embedded_sentence\\.195"
features: "embedded_sentence\\.196"
features: "embedded_sentence\\.197"
features: "embedded_sentence\\.198"
features: "embedded_sentence\\.199"
features: "embedded_sentence\\.2"
features: "embedded_sentence\\.20"
features: "embedded_sentence\\.200"
features: "embedded_sentence\\.201"
features: "embedded_sentence\\.202"
features: "embedded_sentence\\.203"
features: "embedded_sentence\\.204"
features: "embedded_sentence\\.205"
features: "embedded_sentence\\.206"
features: "embedded_sentence\\.207"
features: "embedded_sentence\\.208"
features: "embedded_sentence\\.209"
features: "embedded_sentence\\.21"
features: "embedded_sentence\\.210"
features: "embedded_sentence\\.211"
features: "embedded_sentence\\.212"
features: "embedded_sentence\\.213"
features: "embedded_sentence\\.214"
features: "embedded_sentence\\.215"
features: "embedded_sentence\\.216"
features: "embedded_sentence\\.217"
features: "embedded_sentence\\.218"
features: "embedded_sentence\\.219"
features: "embedded_sentence\\.22"
features: "embedded_sentence\\.220"
features: "embedded_sentence\\.221"
features: "embedded_sentence\\.222"
features: "embedded_sentence\\.223"
features: "embedded_sentence\\.224"
features: "embedded_sentence\\.225"
features: "embedded_sentence\\.226"
features: "embedded_sentence\\.227"
features: "embedded_sentence\\.228"
features: "embedded_sentence\\.229"
features: "embedded_sentence\\.23"
features: "embedded_sentence\\.230"
features: "embedded_sentence\\.231"
features: "embedded_sentence\\.232"
features: "embedded_sentence\\.233"
features: "embedded_sentence\\.234"
features: "embedded_sentence\\.235"
features: "embedded_sentence\\.236"
features: "embedded_sentence\\.237"
features: "embedded_sentence\\.238"
features: "embedded_sentence\\.239"
features: "embedded_sentence\\.24"
features: "embedded_sentence\\.240"
features: "embedded_sentence\\.241"
features: "embedded_sentence\\.242"
features: "embedded_sentence\\.243"
features: "embedded_sentence\\.244"
features: "embedded_sentence\\.245"
features: "embedded_sentence\\.246"
features: "embedded_sentence\\.247"
features: "embedded_sentence\\.248"
features: "embedded_sentence\\.249"
features: "embedded_sentence\\.25"
features: "embedded_sentence\\.250"
features: "embedded_sentence\\.251"
features: "embedded_sentence\\.252"
features: "embedded_sentence\\.253"
features: "embedded_sentence\\.254"
features: "embedded_sentence\\.255"
features: "embedded_sentence\\.256"
features: "embedded_sentence\\.257"
features: "embedded_sentence\\.258"
features: "embedded_sentence\\.259"
features: "embedded_sentence\\.26"
features: "embedded_sentence\\.260"
features: "embedded_sentence\\.261"
features: "embedded_sentence\\.262"
features: "embedded_sentence\\.263"
features: "embedded_sentence\\.264"
features: "embedded_sentence\\.265"
features: "embedded_sentence\\.266"
features: "embedded_sentence\\.267"
features: "embedded_sentence\\.268"
features: "embedded_sentence\\.269"
features: "embedded_sentence\\.27"
features: "embedded_sentence\\.270"
features: "embedded_sentence\\.271"
features: "embedded_sentence\\.272"
features: "embedded_sentence\\.273"
features: "embedded_sentence\\.274"
features: "embedded_sentence\\.275"
features: "embedded_sentence\\.276"
features: "embedded_sentence\\.277"
features: "embedded_sentence\\.278"
features: "embedded_sentence\\.279"
features: "embedded_sentence\\.28"
features: "embedded_sentence\\.280"
features: "embedded_sentence\\.281"
features: "embedded_sentence\\.282"
features: "embedded_sentence\\.283"
features: "embedded_sentence\\.284"
features: "embedded_sentence\\.285"
features: "embedded_sentence\\.286"
features: "embedded_sentence\\.287"
features: "embedded_sentence\\.288"
features: "embedded_sentence\\.289"
features: "embedded_sentence\\.29"
features: "embedded_sentence\\.290"
features: "embedded_sentence\\.291"
features: "embedded_sentence\\.292"
features: "embedded_sentence\\.293"
features: "embedded_sentence\\.294"
features: "embedded_sentence\\.295"
features: "embedded_sentence\\.296"
features: "embedded_sentence\\.297"
features: "embedded_sentence\\.298"
features: "embedded_sentence\\.299"
features: "embedded_sentence\\.3"
features: "embedded_sentence\\.30"
features: "embedded_sentence\\.300"
features: "embedded_sentence\\.301"
features: "embedded_sentence\\.302"
features: "embedded_sentence\\.303"
features: "embedded_sentence\\.304"
features: "embedded_sentence\\.305"
features: "embedded_sentence\\.306"
features: "embedded_sentence\\.307"
features: "embedded_sentence\\.308"
features: "embedded_sentence\\.309"
features: "embedded_sentence\\.31"
features: "embedded_sentence\\.310"
features: "embedded_sentence\\.311"
features: "embedded_sentence\\.312"
features: "embedded_sentence\\.313"
features: "embedded_sentence\\.314"
features: "embedded_sentence\\.315"
features: "embedded_sentence\\.316"
features: "embedded_sentence\\.317"
features: "embedded_sentence\\.318"
features: "embedded_sentence\\.319"
features: "embedded_sentence\\.32"
features: "embedded_sentence\\.320"
features: "embedded_sentence\\.321"
features: "embedded_sentence\\.322"
features: "embedded_sentence\\.323"
features: "embedded_sentence\\.324"
features: "embedded_sentence\\.325"
features: "embedded_sentence\\.326"
features: "embedded_sentence\\.327"
features: "embedded_sentence\\.328"
features: "embedded_sentence\\.329"
features: "embedded_sentence\\.33"
features: "embedded_sentence\\.330"
features: "embedded_sentence\\.331"
features: "embedded_sentence\\.332"
features: "embedded_sentence\\.333"
features: "embedded_sentence\\.334"
features: "embedded_sentence\\.335"
features: "embedded_sentence\\.336"
features: "embedded_sentence\\.337"
features: "embedded_sentence\\.338"
features: "embedded_sentence\\.339"
features: "embedded_sentence\\.34"
features: "embedded_sentence\\.340"
features: "embedded_sentence\\.341"
features: "embedded_sentence\\.342"
features: "embedded_sentence\\.343"
features: "embedded_sentence\\.344"
features: "embedded_sentence\\.345"
features: "embedded_sentence\\.346"
features: "embedded_sentence\\.347"
features: "embedded_sentence\\.348"
features: "embedded_sentence\\.349"
features: "embedded_sentence\\.35"
features: "embedded_sentence\\.350"
features: "embedded_sentence\\.351"
features: "embedded_sentence\\.352"
features: "embedded_sentence\\.353"
features: "embedded_sentence\\.354"
features: "embedded_sentence\\.355"
features: "embedded_sentence\\.356"
features: "embedded_sentence\\.357"
features: "embedded_sentence\\.358"
features: "embedded_sentence\\.359"
features: "embedded_sentence\\.36"
features: "embedded_sentence\\.360"
features: "embedded_sentence\\.361"
features: "embedded_sentence\\.362"
features: "embedded_sentence\\.363"
features: "embedded_sentence\\.364"
features: "embedded_sentence\\.365"
features: "embedded_sentence\\.366"
features: "embedded_sentence\\.367"
features: "embedded_sentence\\.368"
features: "embedded_sentence\\.369"
features: "embedded_sentence\\.37"
features: "embedded_sentence\\.370"
features: "embedded_sentence\\.371"
features: "embedded_sentence\\.372"
features: "embedded_sentence\\.373"
features: "embedded_sentence\\.374"
features: "embedded_sentence\\.375"
features: "embedded_sentence\\.376"
features: "embedded_sentence\\.377"
features: "embedded_sentence\\.378"
features: "embedded_sentence\\.379"
features: "embedded_sentence\\.38"
features: "embedded_sentence\\.380"
features: "embedded_sentence\\.381"
features: "embedded_sentence\\.382"
features: "embedded_sentence\\.383"
features: "embedded_sentence\\.384"
features: "embedded_sentence\\.385"
features: "embedded_sentence\\.386"
features: "embedded_sentence\\.387"
features: "embedded_sentence\\.388"
features: "embedded_sentence\\.389"
features: "embedded_sentence\\.39"
features: "embedded_sentence\\.390"
features: "embedded_sentence\\.391"
features: "embedded_sentence\\.392"
features: "embedded_sentence\\.393"
features: "embedded_sentence\\.394"
features: "embedded_sentence\\.395"
features: "embedded_sentence\\.396"
features: "embedded_sentence\\.397"
features: "embedded_sentence\\.398"
features: "embedded_sentence\\.399"
features: "embedded_sentence\\.4"
features: "embedded_sentence\\.40"
features: "embedded_sentence\\.400"
features: "embedded_sentence\\.401"
features: "embedded_sentence\\.402"
features: "embedded_sentence\\.403"
features: "embedded_sentence\\.404"
features: "embedded_sentence\\.405"
features: "embedded_sentence\\.406"
features: "embedded_sentence\\.407"
features: "embedded_sentence\\.408"
features: "embedded_sentence\\.409"
features: "embedded_sentence\\.41"
features: "embedded_sentence\\.410"
features: "embedded_sentence\\.411"
features: "embedded_sentence\\.412"
features: "embedded_sentence\\.413"
features: "embedded_sentence\\.414"
features: "embedded_sentence\\.415"
features: "embedded_sentence\\.416"
features: "embedded_sentence\\.417"
features: "embedded_sentence\\.418"
features: "embedded_sentence\\.419"
features: "embedded_sentence\\.42"
features: "embedded_sentence\\.420"
features: "embedded_sentence\\.421"
features: "embedded_sentence\\.422"
features: "embedded_sentence\\.423"
features: "embedded_sentence\\.424"
features: "embedded_sentence\\.425"
features: "embedded_sentence\\.426"
features: "embedded_sentence\\.427"
features: "embedded_sentence\\.428"
features: "embedded_sentence\\.429"
features: "embedded_sentence\\.43"
features: "embedded_sentence\\.430"
features: "embedded_sentence\\.431"
features: "embedded_sentence\\.432"
features: "embedded_sentence\\.433"
features: "embedded_sentence\\.434"
features: "embedded_sentence\\.435"
features: "embedded_sentence\\.436"
features: "embedded_sentence\\.437"
features: "embedded_sentence\\.438"
features: "embedded_sentence\\.439"
features: "embedded_sentence\\.44"
features: "embedded_sentence\\.440"
features: "embedded_sentence\\.441"
features: "embedded_sentence\\.442"
features: "embedded_sentence\\.443"
features: "embedded_sentence\\.444"
features: "embedded_sentence\\.445"
features: "embedded_sentence\\.446"
features: "embedded_sentence\\.447"
features: "embedded_sentence\\.448"
features: "embedded_sentence\\.449"
features: "embedded_sentence\\.45"
features: "embedded_sentence\\.450"
features: "embedded_sentence\\.451"
features: "embedded_sentence\\.452"
features: "embedded_sentence\\.453"
features: "embedded_sentence\\.454"
features: "embedded_sentence\\.455"
features: "embedded_sentence\\.456"
features: "embedded_sentence\\.457"
features: "embedded_sentence\\.458"
features: "embedded_sentence\\.459"
features: "embedded_sentence\\.46"
features: "embedded_sentence\\.460"
features: "embedded_sentence\\.461"
features: "embedded_sentence\\.462"
features: "embedded_sentence\\.463"
features: "embedded_sentence\\.464"
features: "embedded_sentence\\.465"
features: "embedded_sentence\\.466"
features: "embedded_sentence\\.467"
features: "embedded_sentence\\.468"
features: "embedded_sentence\\.469"
features: "embedded_sentence\\.47"
features: "embedded_sentence\\.470"
features: "embedded_sentence\\.471"
features: "embedded_sentence\\.472"
features: "embedded_sentence\\.473"
features: "embedded_sentence\\.474"
features: "embedded_sentence\\.475"
features: "embedded_sentence\\.476"
features: "embedded_sentence\\.477"
features: "embedded_sentence\\.478"
features: "embedded_sentence\\.479"
features: "embedded_sentence\\.48"
features: "embedded_sentence\\.480"
features: "embedded_sentence\\.481"
features: "embedded_sentence\\.482"
features: "embedded_sentence\\.483"
features: "embedded_sentence\\.484"
features: "embedded_sentence\\.485"
features: "embedded_sentence\\.486"
features: "embedded_sentence\\.487"
features: "embedded_sentence\\.488"
features: "embedded_sentence\\.489"
features: "embedded_sentence\\.49"
features: "embedded_sentence\\.490"
features: "embedded_sentence\\.491"
features: "embedded_sentence\\.492"
features: "embedded_sentence\\.493"
features: "embedded_sentence\\.494"
features: "embedded_sentence\\.495"
features: "embedded_sentence\\.496"
features: "embedded_sentence\\.497"
features: "embedded_sentence\\.498"
features: "embedded_sentence\\.499"
features: "embedded_sentence\\.5"
features: "embedded_sentence\\.50"
features: "embedded_sentence\\.500"
features: "embedded_sentence\\.501"
features: "embedded_sentence\\.502"
features: "embedded_sentence\\.503"
features: "embedded_sentence\\.504"
features: "embedded_sentence\\.505"
features: "embedded_sentence\\.506"
features: "embedded_sentence\\.507"
features: "embedded_sentence\\.508"
features: "embedded_sentence\\.509"
features: "embedded_sentence\\.51"
features: "embedded_sentence\\.510"
features: "embedded_sentence\\.511"
features: "embedded_sentence\\.52"
features: "embedded_sentence\\.53"
features: "embedded_sentence\\.54"
features: "embedded_sentence\\.55"
features: "embedded_sentence\\.56"
features: "embedded_sentence\\.57"
features: "embedded_sentence\\.58"
features: "embedded_sentence\\.59"
features: "embedded_sentence\\.6"
features: "embedded_sentence\\.60"
features: "embedded_sentence\\.61"
features: "embedded_sentence\\.62"
features: "embedded_sentence\\.63"
features: "embedded_sentence\\.64"
features: "embedded_sentence\\.65"
features: "embedded_sentence\\.66"
features: "embedded_sentence\\.67"
features: "embedded_sentence\\.68"
features: "embedded_sentence\\.69"
features: "embedded_sentence\\.7"
features: "embedded_sentence\\.70"
features: "embedded_sentence\\.71"
features: "embedded_sentence\\.72"
features: "embedded_sentence\\.73"
features: "embedded_sentence\\.74"
features: "embedded_sentence\\.75"
features: "embedded_sentence\\.76"
features: "embedded_sentence\\.77"
features: "embedded_sentence\\.78"
features: "embedded_sentence\\.79"
features: "embedded_sentence\\.8"
features: "embedded_sentence\\.80"
features: "embedded_sentence\\.81"
features: "embedded_sentence\\.82"
features: "embedded_sentence\\.83"
features: "embedded_sentence\\.84"
features: "embedded_sentence\\.85"
features: "embedded_sentence\\.86"
features: "embedded_sentence\\.87"
features: "embedded_sentence\\.88"
features: "embedded_sentence\\.89"
features: "embedded_sentence\\.9"
features: "embedded_sentence\\.90"
features: "embedded_sentence\\.91"
features: "embedded_sentence\\.92"
features: "embedded_sentence\\.93"
features: "embedded_sentence\\.94"
features: "embedded_sentence\\.95"
features: "embedded_sentence\\.96"
features: "embedded_sentence\\.97"
features: "embedded_sentence\\.98"
features: "embedded_sentence\\.99"
label: "__LABEL"
task: CLASSIFICATION
[yggdrasil_decision_forests.model.random_forest.proto.random_forest_config] {
  num_trees: 100
  decision_tree {
    max_depth: 16
    min_examples: 5
    in_split_min_examples_check: true
    missing_value_policy: GLOBAL_IMPUTATION
    allow_na_conditions: false
    categorical_set_greedy_forward {
      sampling: 0.1
      max_num_items: -1
      min_item_frequency: 1
    }
    growing_strategy_local {
    }
    categorical {
      cart {
      }
    }
    num_candidate_attributes_ratio: -1
    axis_aligned_split {
    }
    internal {
      sorting_strategy: PRESORTED
    }
  }
  winner_take_all_inference: true
  compute_oob_performances: true
  compute_oob_variable_importances: false
  adapt_bootstrap_size_ratio_for_maximum_training_duration: false
}

[INFO kernel.cc:790] Deployment config:
num_threads: 6

[INFO kernel.cc:817] Train model
[INFO random_forest.cc:315] Training random forest on 67349 example(s) and 512 feature(s).
[INFO random_forest.cc:628] Training of tree  1/100 (tree index:1) done accuracy:0.743339 logloss:9.25099
[INFO random_forest.cc:628] Training of tree  11/100 (tree index:10) done accuracy:0.788438 logloss:1.97592
[INFO random_forest.cc:628] Training of tree  21/100 (tree index:20) done accuracy:0.82798 logloss:0.687896
[INFO random_forest.cc:628] Training of tree  31/100 (tree index:28) done accuracy:0.8427 logloss:0.466909
[INFO random_forest.cc:628] Training of tree  41/100 (tree index:40) done accuracy:0.851327 logloss:0.403339
[INFO random_forest.cc:628] Training of tree  51/100 (tree index:53) done accuracy:0.856553 logloss:0.379845
[INFO random_forest.cc:628] Training of tree  61/100 (tree index:59) done accuracy:0.859998 logloss:0.369493
[INFO random_forest.cc:628] Training of tree  71/100 (tree index:69) done accuracy:0.862864 logloss:0.365896
[INFO random_forest.cc:628] Training of tree  81/100 (tree index:79) done accuracy:0.864556 logloss:0.363075
[INFO random_forest.cc:628] Training of tree  91/100 (tree index:91) done accuracy:0.865596 logloss:0.361243
[INFO random_forest.cc:628] Training of tree  100/100 (tree index:99) done accuracy:0.866991 logloss:0.360368
[INFO random_forest.cc:696] Final OOB metrics: accuracy:0.866991 logloss:0.360368
[INFO kernel.cc:828] Export model in log directory: /tmp/tmpw2g04fbi
[INFO kernel.cc:836] Save model in resources
[INFO kernel.cc:988] Loading model from path
[INFO decision_forest.cc:590] Model loaded with 100 root(s), 561666 node(s), and 512 input feature(s).
[INFO abstract_model.cc:993] Engine "RandomForestOptPred" built
[INFO kernel.cc:848] Use fast generic engine
1053/1053 [==============================] - 75s 66ms/step
evaluation = model_2.evaluate(test_ds)

print(f"BinaryCrossentropyloss: {evaluation[0]}")
print(f"Accuracy: {evaluation[1]}")
14/14 [==============================] - 2s 16ms/step - loss: 0.0000e+00 - accuracy: 0.7821
BinaryCrossentropyloss: 0.0
Accuracy: 0.7821100950241089

Note that categorical sets represent text differently from a dense embedding, so it may be useful to use both strategies together.

Train a decision tree and neural network together

The previous example used a pre-trained Neural Network (NN) to process the text features before passing them to the Random Forest. This example trains both the Neural Network and the Random Forest from scratch.

TF-DF's Decision Forests do not back-propagate gradients (although this is the subject of ongoing research). Therefore, training happens in two stages:

  1. Train the neural network as a standard classification task:
example → [Normalize] → [Neural Network*] → [classification head] → prediction
*: Training.
  2. Replace the Neural Network's head (the last layer and the soft-max) with a Random Forest. Train the Random Forest as usual:
example → [Normalize] → [Neural Network] → [Random Forest*] → prediction
*: Training.
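
The two stages can be illustrated without any framework. The sketch below is a deliberately tiny stand-in, not TF-DF code: a one-weight network body with a logistic head plays the role of the neural network, and a single threshold split (a decision stump) plays the role of the Random Forest. All names (`fit_normalizer`, `train_nn`, `fit_stump`) and the data values are hypothetical.

```python
import math

# Toy data: one numerical feature and a binary label (hypothetical values).
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]

def fit_normalizer(values):
    """Returns a function that standardizes values with the training mean/std."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var)
    return lambda v: (v - mean) / std

def train_nn(features, labels, epochs=200, lr=0.5):
    """Stage 1: trains the body weight w and the head (a, b) by gradient descent.

    body(x) = tanh(w * x); head(h) = sigmoid(a * h + b).
    """
    w, a, b = 0.1, 0.1, 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            h = math.tanh(w * x)
            p = 1.0 / (1.0 + math.exp(-(a * h + b)))
            g = p - y  # derivative of the log loss with respect to the logit
            grad_w = g * a * (1 - h * h) * x
            grad_a, grad_b = g * h, g
            w, a, b = w - lr * grad_w, a - lr * grad_a, b - lr * grad_b
    return w  # only the body is kept; the head is discarded

def fit_stump(features, labels):
    """Stage 2: picks the threshold with the fewest training errors."""
    best = None
    for t in sorted(set(features)):
        errors = sum((f >= t) != bool(y) for f, y in zip(features, labels))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best[0]

normalize = fit_normalizer(xs)
normalized = [normalize(x) for x in xs]

w = train_nn(normalized, ys)                        # stage 1: the body is trained
body_out = [math.tanh(w * x) for x in normalized]   # the body is now frozen
threshold = fit_stump(body_out, ys)                 # stage 2: only the stump is fit

predictions = [int(h >= threshold) for h in body_out]
```

The point of the design is that no gradient ever flows through the stump: it only sees the outputs of the already-trained body, which mirrors how the Random Forest is fit on the outputs of the trained network.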

Prepare the dataset

This example uses the Palmer's Penguins dataset. See the Beginner colab for details.

First, download the raw data:

wget -q https://storage.googleapis.com/download.tensorflow.org/data/palmer_penguins/penguins.csv -O /tmp/penguins.csv

Load the dataset into a Pandas DataFrame.

dataset_df = pd.read_csv("/tmp/penguins.csv")

# Display the first 3 examples.
dataset_df.head(3)

Prepare the dataset for training.

label = "species"

# Replaces numerical NaN (representing missing values in Pandas Dataframe) with 0s.
# ...Neural Nets don't work well with numerical NaNs.
for col in dataset_df.columns:
  if dataset_df[col].dtype not in [str, object]:
    dataset_df[col] = dataset_df[col].fillna(0)
# Split the dataset into a training and testing dataset.

def split_dataset(dataset, test_ratio=0.30):
  """Splits a pandas DataFrame in two."""
  test_indices = np.random.rand(len(dataset)) < test_ratio
  return dataset[~test_indices], dataset[test_indices]

train_ds_pd, test_ds_pd = split_dataset(dataset_df)
print("{} examples in training, {} examples for testing.".format(
    len(train_ds_pd), len(test_ds_pd)))

# Convert the datasets into tensorflow datasets
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_ds_pd, label=label)
test_ds = tfdf.keras.pd_dataframe_to_tf_dataset(test_ds_pd, label=label)
252 examples in training, 92 examples for testing.
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow_decision_forests/keras/core.py:1612: FutureWarning: In a future version of pandas all arguments of DataFrame.drop except for the argument 'labels' will be keyword-only
  features_dataframe = dataframe.drop(label, 1)
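
Because `split_dataset` draws from NumPy's global random state without a seed, the 252/92 split shown above will differ from run to run. The sketch below illustrates the same mask-based idea with a seeded generator from the Python standard library, so the split is repeatable. `split_rows` and its `seed` parameter are illustrative names that are not part of the tutorial, and the integer rows are a stand-in for the 344 penguin examples.

```python
import random

def split_rows(rows, test_ratio=0.30, seed=1234):
    """Splits a list of rows in two using a reproducible random mask."""
    rng = random.Random(seed)  # private generator; the global state is untouched
    is_test = [rng.random() < test_ratio for _ in rows]
    train = [r for r, t in zip(rows, is_test) if not t]
    test = [r for r, t in zip(rows, is_test) if t]
    return train, test

rows = list(range(344))  # stand-in for the 344 penguin examples
train_a, test_a = split_rows(rows)
train_b, test_b = split_rows(rows)  # same seed, so the same split
```

With NumPy, a comparable effect can be obtained by drawing the mask from `np.random.default_rng(seed).random(len(dataset))` instead of `np.random.rand`.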

Build the model

Next, create the neural network model using Keras' functional style.

To keep the example simple, this model only uses two inputs.

input_1 = tf.keras.Input(shape=(1,), name="bill_length_mm", dtype="float")
input_2 = tf.keras.Input(shape=(1,), name="island", dtype="string")

nn_raw_inputs = [input_1, input_2]

Use preprocessing layers to convert the raw inputs into inputs appropriate for the neural network.

# Normalization.
Normalization = tf.keras.layers.Normalization
CategoryEncoding = tf.keras.layers.CategoryEncoding
StringLookup = tf.keras.layers.StringLookup

values = train_ds_pd["bill_length_mm"].values[:, tf.newaxis]
input_1_normalizer = Normalization()
input_1_normalizer.adapt(values)

values = train_ds_pd["island"].values
input_2_indexer = StringLookup(max_tokens=32)
input_2_indexer.adapt(values)

input_2_onehot = CategoryEncoding(output_mode="binary", max_tokens=32)

normalized_input_1 = input_1_normalizer(input_1)
normalized_input_2 = input_2_onehot(input_2_indexer(input_2))

nn_processed_inputs = [normalized_input_1, normalized_input_2]
WARNING:tensorflow:max_tokens is deprecated, please use num_tokens instead.
WARNING:tensorflow:max_tokens is deprecated, please use num_tokens instead.
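
To make the effect of these layers concrete, the following plain-Python sketch mimics what the pipeline computes for one example: standardization for `bill_length_mm`, and a string-to-index lookup followed by a fixed-width indicator vector for `island`. It is an approximation for illustration, not the Keras implementation. It assumes that index 0 is reserved for out-of-vocabulary strings (the `StringLookup` default) and uses made-up sample values; the helper names are hypothetical.

```python
import math
from collections import Counter

def adapt_normalizer(values):
    """Learns the mean and variance, as Normalization.adapt does conceptually."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

def normalize(value, mean, var):
    """Standardizes one value with the learned statistics."""
    return (value - mean) / math.sqrt(var)

def adapt_lookup(values):
    """Builds a string -> index table, most frequent first; 0 is kept for OOV."""
    ordered = [token for token, _ in Counter(values).most_common()]
    return {token: i + 1 for i, token in enumerate(ordered)}

def one_hot(index, width=32):
    """Fixed-width indicator vector, matching the width of 32 used above."""
    return [1.0 if i == index else 0.0 for i in range(width)]

# Hypothetical training values.
bill_lengths = [39.0, 41.0, 46.0, 50.0]
islands = ["Biscoe", "Biscoe", "Biscoe", "Dream", "Dream", "Torgersen"]

mean, var = adapt_normalizer(bill_lengths)
table = adapt_lookup(islands)

x1 = normalize(44.0, mean, var)          # 44.0 is the training mean
x2 = one_hot(table.get("Dream", 0))      # known island
x3 = one_hot(table.get("Atlantis", 0))   # unseen island falls into slot 0
nn_input = [x1] + x2                     # 1 + 32 = 33 values per example
```

The 33 values per example correspond to the `(None, 33)` output shape that the `Concatenate` layer reports in the model summary.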

Build the body of the neural network:

y = tf.keras.layers.Concatenate()(nn_processed_inputs)
y = tf.keras.layers.Dense(16, activation=tf.nn.relu6)(y)
last_layer = tf.keras.layers.Dense(8, activation=tf.nn.relu, name="last")(y)

# "3" for the three label classes. If it were a binary classification, the
# output dim would be 1.
classification_output = tf.keras.layers.Dense(3)(last_layer)

nn_model = tf.keras.models.Model(nn_raw_inputs, classification_output)

This nn_model directly produces classification logits.
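
Since the head has no activation, its three outputs are unnormalized scores, which is why the loss used for training is created with `from_logits=True`. If class probabilities are wanted, a softmax maps the logits to values that sum to one. A minimal sketch in plain Python, with made-up logits:

```python
import math

def softmax(logits):
    """Converts logits to probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 0.5, -1.0]  # hypothetical output of nn_model for one example
probs = softmax(logits)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

In TensorFlow, the same conversion is available as `tf.nn.softmax`.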

Next, create a decision forest model. It will operate on the high-level features that the neural network extracts in the last layer before the classification head.

# To reduce the risk of mistakes, group both the decision forest and the
# neural network in a single keras model.
nn_without_head = tf.keras.models.Model(inputs=nn_model.inputs, outputs=last_layer)
df_and_nn_model = tfdf.keras.RandomForestModel(preprocessing=nn_without_head)

Train and evaluate the models

The model will be trained in two stages. First, train the neural network with its own classification head:

%set_cell_height 300

nn_model.compile(
  optimizer=tf.keras.optimizers.Adam(),
  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
  metrics=["accuracy"])

nn_model.fit(x=train_ds, validation_data=test_ds, epochs=10)
nn_model.summary()
Epoch 1/10
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/keras/engine/functional.py:559: UserWarning: Input dict contained keys ['bill_depth_mm', 'flipper_length_mm', 'body_mass_g', 'sex', 'year'] which did not match any model input. They will be ignored by the model.
  inputs = self._flatten_to_reference_inputs(inputs)
4/4 [==============================] - 0s 53ms/step - loss: 1.0232 - accuracy: 0.3730 - val_loss: 1.0186 - val_accuracy: 0.3587
Epoch 2/10
4/4 [==============================] - 0s 7ms/step - loss: 1.0107 - accuracy: 0.3810 - val_loss: 1.0096 - val_accuracy: 0.3587
Epoch 3/10
4/4 [==============================] - 0s 7ms/step - loss: 1.0006 - accuracy: 0.3889 - val_loss: 1.0006 - val_accuracy: 0.3696
Epoch 4/10
4/4 [==============================] - 0s 7ms/step - loss: 0.9909 - accuracy: 0.3968 - val_loss: 0.9915 - val_accuracy: 0.3696
Epoch 5/10
4/4 [==============================] - 0s 7ms/step - loss: 0.9813 - accuracy: 0.3968 - val_loss: 0.9825 - val_accuracy: 0.3696
Epoch 6/10
4/4 [==============================] - 0s 7ms/step - loss: 0.9717 - accuracy: 0.4008 - val_loss: 0.9735 - val_accuracy: 0.3696
Epoch 7/10
4/4 [==============================] - 0s 7ms/step - loss: 0.9621 - accuracy: 0.4048 - val_loss: 0.9645 - val_accuracy: 0.4457
Epoch 8/10
4/4 [==============================] - 0s 7ms/step - loss: 0.9525 - accuracy: 0.6111 - val_loss: 0.9555 - val_accuracy: 0.6522
Epoch 9/10
4/4 [==============================] - 0s 8ms/step - loss: 0.9430 - accuracy: 0.7262 - val_loss: 0.9465 - val_accuracy: 0.6848
Epoch 10/10
4/4 [==============================] - 0s 7ms/step - loss: 0.9335 - accuracy: 0.7460 - val_loss: 0.9374 - val_accuracy: 0.7283
Model: "model_1"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 island (InputLayer)            [(None, 1)]          0           []                               
                                                                                                  
 bill_length_mm (InputLayer)    [(None, 1)]          0           []                               
                                                                                                  
 string_lookup (StringLookup)   (None, 1)            0           ['island[0][0]']                 
                                                                                                  
 normalization (Normalization)  (None, 1)            3           ['bill_length_mm[0][0]']         
                                                                                                  
 category_encoding (CategoryEnc  (None, 32)          0           ['string_lookup[0][0]']          
 oding)                                                                                           
                                                                                                  
 concatenate (Concatenate)      (None, 33)           0           ['normalization[0][0]',          
                                                                  'category_encoding[0][0]']      
                                                                                                  
 dense (Dense)                  (None, 16)           544         ['concatenate[0][0]']            
                                                                                                  
 dense_1 (Dense)                (None, 3)            51          ['dense[0][0]']                  
                                                                                                  
==================================================================================================
Total params: 598
Trainable params: 595
Non-trainable params: 3
__________________________________________________________________________________________________

The neural network layers are shared between the two models. So now that the neural network is trained, the decision forest model will be fit to the trained output of the neural network layers:

%set_cell_height 300

df_and_nn_model.compile(metrics=["accuracy"])
with sys_pipes():
  df_and_nn_model.fit(x=train_ds)
1/4 [======>.......................] - ETA: 0s
[INFO kernel.cc:736] Start Yggdrasil model training
[INFO kernel.cc:737] Collect training examples
[INFO kernel.cc:392] Number of batches: 4
[INFO kernel.cc:393] Number of examples: 252
[INFO kernel.cc:759] Dataset:
Number of records: 252
Number of columns: 9

Number of columns by type:
    NUMERICAL: 8 (88.8889%)
    CATEGORICAL: 1 (11.1111%)

Columns:

NUMERICAL: 8 (88.8889%)
    0: "model_2/last/Relu:0.0" NUMERICAL mean:0.0612511 min:0 max:1.05271 sd:0.1172
    1: "model_2/last/Relu:0.1" NUMERICAL mean:0.145744 min:0 max:0.357441 sd:0.140661
    2: "model_2/last/Relu:0.2" NUMERICAL mean:0.114429 min:0 max:0.527097 sd:0.0945893
    3: "model_2/last/Relu:0.3" NUMERICAL mean:0.0132481 min:0 max:0.124071 sd:0.0305115
    4: "model_2/last/Relu:0.4" NUMERICAL mean:0.0538435 min:0 max:0.446979 sd:0.110693
    5: "model_2/last/Relu:0.5" NUMERICAL mean:0.000560531 min:0 max:0.0364899 sd:0.00370266
    6: "model_2/last/Relu:0.6" NUMERICAL mean:0.0278776 min:0 max:0.449398 sd:0.0592763
    7: "model_2/last/Relu:0.7" NUMERICAL mean:0.0485136 min:0 max:0.319197 sd:0.104035

CATEGORICAL: 1 (11.1111%)
    8: "__LABEL" CATEGORICAL integerized vocab-size:4 no-ood-item

Terminology:
    nas: Number of non-available (i.e. missing) values.
    ood: Out of dictionary.
    manually-defined: Attribute which type is manually defined by the user i.e. the type was not automatically inferred.
    tokenized: The attribute value is obtained through tokenization.
    has-dict: The attribute is attached to a string dictionary e.g. a categorical attribute stored as a string.
    vocab-size: Number of unique values.

[INFO kernel.cc:762] Configure learner
[INFO kernel.cc:787] Training config:
learner: "RANDOM_FOREST"
features: "model_2/last/Relu:0\\.0"
features: "model_2/last/Relu:0\\.1"
features: "model_2/last/Relu:0\\.2"
features: "model_2/last/Relu:0\\.3"
features: "model_2/last/Relu:0\\.4"
features: "model_2/last/Relu:0\\.5"
features: "model_2/last/Relu:0\\.6"
features: "model_2/last/Relu:0\\.7"
label: "__LABEL"
task: CLASSIFICATION
[yggdrasil_decision_forests.model.random_forest.proto.random_forest_config] {
  num_trees: 300
  decision_tree {
    max_depth: 16
    min_examples: 5
    in_split_min_examples_check: true
    missing_value_policy: GLOBAL_IMPUTATION
    allow_na_conditions: false
    categorical_set_greedy_forward {
      sampling: 0.1
      max_num_items: -1
      min_item_frequency: 1
    }
    growing_strategy_local {
    }
    categorical {
      cart {
      }
    }
    num_candidate_attributes_ratio: -1
    axis_aligned_split {
    }
    internal {
      sorting_strategy: PRESORTED
    }
  }
  winner_take_all_inference: true
  compute_oob_performances: true
  compute_oob_variable_importances: false
  adapt_bootstrap_size_ratio_for_maximum_training_duration: false
}

[INFO kernel.cc:790] Deployment config:
num_threads: 6

[INFO kernel.cc:817] Train model
[INFO random_forest.cc:315] Training random forest on 252 example(s) and 8 feature(s).
[INFO random_forest.cc:628] Training of tree  1/300 (tree index:0) done accuracy:0.944444 logloss:2.00243
[INFO random_forest.cc:628] Training of tree  11/300 (tree index:10) done accuracy:0.948207 logloss:1.04535
[INFO random_forest.cc:628] Training of tree  21/300 (tree index:20) done accuracy:0.956349 logloss:0.763534
[INFO random_forest.cc:628] Training of tree  31/300 (tree index:30) done accuracy:0.952381 logloss:0.633103
[INFO random_forest.cc:628] Training of tree  41/300 (tree index:40) done accuracy:0.952381 logloss:0.634035
[INFO random_forest.cc:628] Training of tree  51/300 (tree index:49) done accuracy:0.952381 logloss:0.63407
[INFO random_forest.cc:628] Training of tree  61/300 (tree index:60) done accuracy:0.952381 logloss:0.632213
[INFO random_forest.cc:628] Training of tree  71/300 (tree index:69) done accuracy:0.952381 logloss:0.634892
[INFO random_forest.cc:628] Training of tree  81/300 (tree index:80) done accuracy:0.948413 logloss:0.634806
[INFO random_forest.cc:628] Training of tree  91/300 (tree index:90) done accuracy:0.948413 logloss:0.634308
[INFO random_forest.cc:628] Training of tree  101/300 (tree index:100) done accuracy:0.944444 logloss:0.63434
[INFO random_forest.cc:628] Training of tree  111/300 (tree index:110) done accuracy:0.944444 logloss:0.63474
[INFO random_forest.cc:628] Training of tree  121/300 (tree index:120) done accuracy:0.944444 logloss:0.634896
[INFO random_forest.cc:628] Training of tree  131/300 (tree index:130) done accuracy:0.948413 logloss:0.634515
[INFO random_forest.cc:628] Training of tree  141/300 (tree index:138) done accuracy:0.944444 logloss:0.635284
[INFO random_forest.cc:628] Training of tree  151/300 (tree index:150) done accuracy:0.944444 logloss:0.634902
[INFO random_forest.cc:628] Training of tree  161/300 (tree index:160) done accuracy:0.944444 logloss:0.633816
[INFO random_forest.cc:628] Training of tree  171/300 (tree index:170) done accuracy:0.944444 logloss:0.632936
[INFO random_forest.cc:628] Training of tree  181/300 (tree index:180) done accuracy:0.944444 logloss:0.632445
[INFO random_forest.cc:628] Training of tree  191/300 (tree index:189) done accuracy:0.944444 logloss:0.632614
[INFO random_forest.cc:628] Training of tree  201/300 (tree index:199) done accuracy:0.944444 logloss:0.632688
[INFO random_forest.cc:628] Training of tree  211/300 (tree index:206) done accuracy:0.944444 logloss:0.633056
[INFO random_forest.cc:628] Training of tree  221/300 (tree index:220) done accuracy:0.944444 logloss:0.633952
[INFO random_forest.cc:628] Training of tree  231/300 (tree index:231) done accuracy:0.944444 logloss:0.634217
[INFO random_forest.cc:628] Training of tree  241/300 (tree index:240) done accuracy:0.944444 logloss:0.634271
[INFO random_forest.cc:628] Training of tree  251/300 (tree index:244) done accuracy:0.944444 logloss:0.634761
[INFO random_forest.cc:628] Training of tree  261/300 (tree index:261) done accuracy:0.944444 logloss:0.634685
[INFO random_forest.cc:628] Training of tree  271/300 (tree index:268) done accuracy:0.944444 logloss:0.634395
[INFO random_forest.cc:628] Training of tree  281/300 (tree index:280) done accuracy:0.944444 logloss:0.633878
[INFO random_forest.cc:628] Training of tree  291/300 (tree index:291) done accuracy:0.944444 logloss:0.633605
[INFO random_forest.cc:628] Training of tree  300/300 (tree index:299) done accuracy:0.944444 logloss:0.633627
[INFO random_forest.cc:696] Final OOB metrics: accuracy:0.944444 logloss:0.633627
[INFO kernel.cc:828] Export model in log directory: /tmp/tmpb92rvbmj
[INFO kernel.cc:836] Save model in resources
[INFO kernel.cc:988] Loading model from path
[INFO decision_forest.cc:590] Model loaded with 300 root(s), 4148 node(s), and 8 input feature(s).
[INFO kernel.cc:848] Use fast generic engine
4/4 [==============================] - 0s 18ms/step
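
The two-stage idea used here (first train a feature extractor, then fit a tree-based learner on its frozen outputs) can be illustrated without TensorFlow. The following is only a toy sketch under stated assumptions: `frozen_features` is a hypothetical stand-in for the trained neural network layers, with fixed hand-picked weights instead of learned ones, and `fit_stump` is a single-split decision stump rather than the Random Forest that TF-DF actually trains. None of these names come from the TF-DF API.

```python
def frozen_features(x, weights):
    """Stage 1 (hypothetical): a fixed ReLU layer standing in for the trained network."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def fit_stump(features, labels):
    """Stage 2 (toy): exhaustively search one (feature, threshold) split on the features."""
    # best = (accuracy, feature index, threshold, label predicted above the threshold)
    best = (-1.0, 0, 0.0, 1)
    for j in range(len(features[0])):
        for t in sorted({f[j] for f in features}):
            for label_above in (0, 1):
                preds = [label_above if f[j] > t else 1 - label_above for f in features]
                acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
                if acc > best[0]:
                    best = (acc, j, t, label_above)
    return best


# Toy data: the label is 1 when x0 + x1 > 0, which no single raw input can decide alone.
inputs = [(float(a), float(b)) for a in (-2, -1, 1, 2) for b in (-2, -1, 1, 2)]
labels = [int(a + b > 0) for a, b in inputs]

# Assumed "pre-trained" weights: the first unit computes relu(x0 + x1).
weights = [[1.0, 1.0], [1.0, -1.0]]

# The stump never sees the raw inputs, only the frozen layer's outputs.
features = [frozen_features(x, weights) for x in inputs]
accuracy, feature_index, threshold, label_above = fit_stump(features, labels)
print("accuracy:", accuracy, "feature:", feature_index, "threshold:", threshold)
```

Because the first unit already encodes `x0 + x1`, one split on that feature separates the classes. This mirrors, in miniature, why a decision forest can benefit from consuming learned representations instead of raw inputs.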

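Now evaluate the composed model. `evaluate` returns the loss followed by the accuracy; the loss is reported as 0 here because `compile` was called without a loss: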

print("Evaluation:", df_and_nn_model.evaluate(test_ds))
2/2 [==============================] - 0s 5ms/step - loss: 0.0000e+00 - accuracy: 0.9565
Evaluation: [0.0, 0.95652174949646]

Compare it to the neural network alone:

print("Evaluation :", nn_model.evaluate(test_ds))
2/2 [==============================] - 0s 4ms/step - loss: 0.9374 - accuracy: 0.7283
Evaluation : [0.9373641610145569, 0.72826087474823]
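
To put the two results side by side, the accuracies can be converted into error rates. The snippet below is a small sketch that only reuses the two accuracy values printed above; the variable names are illustrative. Since both numbers come from a single, small test split, the gap should be read as indicative rather than as a precise estimate.

```python
# Accuracies copied from the two evaluation outputs above.
composed_accuracy = 0.95652174949646  # decision forest on top of the neural network
nn_accuracy = 0.72826087474823        # neural network alone

# Error rate = 1 - accuracy.
composed_error = 1.0 - composed_accuracy
nn_error = 1.0 - nn_accuracy

# Fraction of the neural network's errors removed by the composed model.
relative_error_reduction = (nn_error - composed_error) / nn_error

print(f"Composed model error: {composed_error:.4f}")
print(f"Neural network error: {nn_error:.4f}")
print(f"Relative error reduction: {relative_error_reduction:.1%}")
```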