BERT Question Answer with TensorFlow Lite Model Maker

The TensorFlow Lite Model Maker library simplifies the process of adapting and converting a TensorFlow model to particular input data when deploying the model for on-device ML applications.

This notebook shows an end-to-end example that uses the Model Maker library to illustrate the adaptation and conversion of a commonly-used question answer model for a question answer task.

Introduction to BERT Question Answer Task

The supported task in this library is the extractive question answer task, which means that given a passage and a question, the answer is a span in the passage. The image below shows an example of question answering.

Answers are spans in the passage (image credit: SQuAD blog)

For the question answer task, the model's inputs should be the already preprocessed passage and question pair, and its outputs should be the start logits and end logits for each token in the passage. The size of the input can be set and adjusted according to the length of the passage and the question.
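
For intuition, here is a minimal sketch of how start and end logits map back to an answer span. The toy logits below are made up for illustration and are not produced by the actual model.

import numpy as np

# Toy logits for a 6-token passage (illustrative values only).
start_logits = np.array([0.1, 0.2, 4.5, 0.3, 0.1, 0.2])
end_logits = np.array([0.1, 0.2, 0.3, 0.4, 5.1, 0.2])

# The predicted answer span runs from the most likely start token to the most
# likely end token at or after it.
start = int(np.argmax(start_logits))
end = start + int(np.argmax(end_logits[start:]))
print('Predicted answer span: tokens %d..%d' % (start, end))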

End-to-End Overview

The following code snippet shows how to get the model within a few lines of code. The overall process includes 5 steps: (1) choose a model, (2) load data, (3) retrain the model, (4) evaluate it, and (5) export it to the TensorFlow Lite format.

# Chooses a model specification that represents the model.
spec = model_spec.get('mobilebert_qa')

# Gets the training data and validation data.
train_data = DataLoader.from_squad(train_data_path, spec, is_training=True)
validation_data = DataLoader.from_squad(validation_data_path, spec, is_training=False)

# Fine-tunes the model.
model = question_answer.create(train_data, model_spec=spec)

# Gets the evaluation result.
metric = model.evaluate(validation_data)

# Exports the model to the TensorFlow Lite format with metadata in the export directory.
model.export(export_dir)

The following sections explain the code in more detail.

Prerequisites

To run this example, install the required packages, including the Model Maker package from the GitHub repo.

pip install -q tflite-model-maker

Import the required packages.

import numpy as np
import os

import tensorflow as tf
assert tf.__version__.startswith('2')

from tflite_model_maker import model_spec
from tflite_model_maker import question_answer
from tflite_model_maker.config import ExportFormat
from tflite_model_maker.question_answer import DataLoader
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow_addons/utils/ensure_tf_install.py:67: UserWarning: Tensorflow Addons supports using Python ops for all Tensorflow versions above or equal to 2.3.0 and strictly below 2.6.0 (nightly versions are not supported). 
 The versions of TensorFlow you are currently using is 2.6.0 and is not supported. 
Some things might work, some things might not.
If you were to encounter a bug, do not file an issue.
If you want to make sure you're using a tested and supported configuration, either change the TensorFlow version or the TensorFlow Addons's version. 
You can find the compatibility matrix in TensorFlow Addon's readme:
https://github.com/tensorflow/addons
  UserWarning,
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/numba/core/errors.py:154: UserWarning: Insufficiently recent colorama version found. Numba requires colorama >= 0.3.9
  warnings.warn(msg)

The "End-to-End Overview" above demonstrates a simple end-to-end example. The following sections walk through the example step by step to show more detail.

Choose a model_spec that represents a model for question answer

Each model_spec object represents a specific model for question answer. Model Maker currently supports MobileBERT and BERT-Base models.

Supported Model | Name of model_spec | Model Description
MobileBERT | 'mobilebert_qa' | 4.3x smaller and 5.5x faster than BERT-Base while achieving competitive results, suitable for on-device scenarios.
MobileBERT-SQuAD | 'mobilebert_qa_squad' | Same model architecture as the MobileBERT model; the initial model is already retrained on SQuAD1.1.
BERT-Base | 'bert_qa' | Standard BERT model widely used in NLP tasks.

In this tutorial, MobileBERT-SQuAD is used as an example. Since the model is already retrained on SQuAD1.1, it could converge faster for the question answer task.

spec = model_spec.get('mobilebert_qa_squad')
2021-08-12 11:59:51.438945: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:59:51.447414: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:59:51.448405: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

Load input data specific to an on-device ML app and preprocess the data.

TriviaQA is a reading comprehension dataset containing over 650K question-answer-evidence triples. In this tutorial, you will use a subset of this dataset to learn how to use the Model Maker library.

To load the data, convert the TriviaQA dataset to the SQuAD1.1 format by running the converter Python script with --sample_size=8000 and a set of web data. Modify the conversion code a little by:

  • Skipping the samples that couldn't find any answer in the context document;
  • Getting the original answer in the context without uppercasing or lowercasing it (see the sketch below).
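
The following is only a hypothetical sketch of those two modifications; the actual converter script is structured differently, and the sample fields used here are assumptions for illustration.

def keep_and_normalize_answer(sample):
    # `sample` is assumed to have 'context' and 'answer' string fields (hypothetical structure).
    context = sample['context']
    answer = sample['answer']
    # Modification 1: skip samples whose answer cannot be located in the context document.
    start = context.lower().find(answer.lower())
    if start == -1:
        return None
    # Modification 2: return the answer exactly as it appears in the context,
    # rather than an uppercased or lowercased copy.
    return context[start:start + len(answer)]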

Download the archived version of the already converted dataset.

train_data_path = tf.keras.utils.get_file(
    fname='triviaqa-web-train-8000.json',
    origin='https://storage.googleapis.com/download.tensorflow.org/models/tflite/dataset/triviaqa-web-train-8000.json')
validation_data_path = tf.keras.utils.get_file(
    fname='triviaqa-verified-web-dev.json',
    origin='https://storage.googleapis.com/download.tensorflow.org/models/tflite/dataset/triviaqa-verified-web-dev.json')
Downloading data from https://storage.googleapis.com/download.tensorflow.org/models/tflite/dataset/triviaqa-web-train-8000.json
32571392/32570663 [==============================] - 0s 0us/step
32579584/32570663 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/download.tensorflow.org/models/tflite/dataset/triviaqa-verified-web-dev.json
1171456/1167744 [==============================] - 0s 0us/step
1179648/1167744 [==============================] - 0s 0us/step

You can also train the MobileBERT model with your own dataset. If you are running this notebook on Colab, upload your data by using the left sidebar.

Upload file

If you prefer not to upload your data to the cloud, you can also run the library offline by following the guide.

Use the DataLoader.from_squad method to load and preprocess the SQuAD format data according to a specific model_spec. You can use either the SQuAD2.0 or SQuAD1.1 format. Setting the parameter version_2_with_negative to True means the format is SQuAD2.0; otherwise, the format is SQuAD1.1. By default, version_2_with_negative is False.
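
For instance, here is a minimal sketch of loading SQuAD2.0-format data; the file name my_squad2_train.json is hypothetical.

# Hypothetical SQuAD2.0-format file. version_2_with_negative=True tells the loader
# to expect SQuAD2.0 data, which can contain unanswerable questions.
squad2_train_data = DataLoader.from_squad(
    'my_squad2_train.json', spec, is_training=True, version_2_with_negative=True)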

train_data = DataLoader.from_squad(train_data_path, spec, is_training=True)
validation_data = DataLoader.from_squad(validation_data_path, spec, is_training=False)
2021-08-12 12:02:04.752380: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-08-12 12:02:04.753227: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 12:02:04.754341: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 12:02:04.755202: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 12:02:05.293390: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 12:02:05.294462: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 12:02:05.295445: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 12:02:05.296323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14648 MB memory:  -> device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0

Customize the TensorFlow Model

Create a custom question answer model based on the loaded data. The create function comprises the following steps:

  1. Creates the model for question answer according to model_spec.
  2. Trains the question answer model. The default epochs and the default batch size are set by the two variables default_training_epochs and default_batch_size in the model_spec object (you can inspect them as sketched below).
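
A minimal sketch of inspecting those defaults before training (the attribute names are the ones described above):

# Inspect the default training configuration stored in the model spec.
print(spec.default_training_epochs, spec.default_batch_size)
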
model = question_answer.create(train_data, model_spec=spec)
INFO:tensorflow:Retraining the models...
2021-08-12 12:02:17.450548: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/2
1067/1067 [==============================] - 423s 350ms/step - loss: 1.1346 - start_positions_loss: 1.1321 - end_positions_loss: 1.1371
Epoch 2/2
1067/1067 [==============================] - 373s 350ms/step - loss: 0.7933 - start_positions_loss: 0.7927 - end_positions_loss: 0.7939

Have a look at the detailed model structure.

model.summary()
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_word_ids (InputLayer)     [(None, 384)]        0                                            
__________________________________________________________________________________________________
input_mask (InputLayer)         [(None, 384)]        0                                            
__________________________________________________________________________________________________
input_type_ids (InputLayer)     [(None, 384)]        0                                            
__________________________________________________________________________________________________
hub_keras_layer_v1v2 (HubKerasL {'start_logits': (No 24582914    input_word_ids[0][0]             
                                                                 input_mask[0][0]                 
                                                                 input_type_ids[0][0]             
__________________________________________________________________________________________________
start_positions (Lambda)        (None, None)         0           hub_keras_layer_v1v2[0][1]       
__________________________________________________________________________________________________
end_positions (Lambda)          (None, None)         0           hub_keras_layer_v1v2[0][0]       
==================================================================================================
Total params: 24,582,914
Trainable params: 24,582,914
Non-trainable params: 0
__________________________________________________________________________________________________

Evaluate the Customized Model

Evaluate the model on the validation data and get a dict of metrics, including the f1 score and exact match. Note that the metrics are different for SQuAD1.1 and SQuAD2.0.

model.evaluate(validation_data)
INFO:tensorflow:Made predictions for 200 records.
INFO:tensorflow:Made predictions for 400 records.
INFO:tensorflow:Made predictions for 600 records.
INFO:tensorflow:Made predictions for 800 records.
INFO:tensorflow:Made predictions for 1000 records.
INFO:tensorflow:Made predictions for 1200 records.
{'exact_match': 0.5884353741496599, 'final_f1': 0.6621698029861295}
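
If you want to reuse these numbers programmatically, capture the returned dict; the keys below are the ones shown in the output above.

# Store the evaluation metrics instead of only displaying them.
metrics = model.evaluate(validation_data)
print('exact match: %.4f, F1: %.4f' % (metrics['exact_match'], metrics['final_f1']))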

Export to TensorFlow Lite Model

Convert the trained model to the TensorFlow Lite model format with metadata so that you can later use it in an on-device ML application. The vocab file is embedded in the metadata. The default TFLite filename is model.tflite.

In many on-device ML applications, the model size is an important factor. Therefore, it is recommended to quantize the model to make it smaller and potentially run faster. The default post-training quantization technique is dynamic range quantization for the BERT and MobileBERT models.

model.export(export_dir='.')
2021-08-12 12:16:05.811327: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
INFO:tensorflow:Assets written to: /tmp/tmp7t_bxd9h/saved_model/assets
2021-08-12 12:16:35.499794: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:351] Ignored output_format.
2021-08-12 12:16:35.499841: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:354] Ignored drop_control_dependency.
2021-08-12 12:16:35.499849: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:360] Ignored change_concat_input_ranges.
2021-08-12 12:16:35.501017: I tensorflow/cc/saved_model/reader.cc:38] Reading SavedModel from: /tmp/tmp7t_bxd9h/saved_model
2021-08-12 12:16:35.567920: I tensorflow/cc/saved_model/reader.cc:90] Reading meta graph with tags { serve }
2021-08-12 12:16:35.567966: I tensorflow/cc/saved_model/reader.cc:132] Reading SavedModel debug info (if present) from: /tmp/tmp7t_bxd9h/saved_model
2021-08-12 12:16:35.922151: I tensorflow/cc/saved_model/loader.cc:211] Restoring SavedModel bundle.
2021-08-12 12:16:37.787828: I tensorflow/cc/saved_model/loader.cc:195] Running initialization op on SavedModel bundle at path: /tmp/tmp7t_bxd9h/saved_model
2021-08-12 12:16:38.783520: I tensorflow/cc/saved_model/loader.cc:283] SavedModel load for tags { serve }; Status: success: OK. Took 3282507 microseconds.
2021-08-12 12:16:40.489883: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:210] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2021-08-12 12:16:43.756590: I tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1899] Estimated count of arithmetic ops: 18.380 G  ops, equivalently 9.190 G  MACs
2021-08-12 12:16:43.920701: I tensorflow/lite/tools/optimize/quantize_weights.cc:234] Skipping quantization of tensor bert/encoder/layer_0/attention/self/MatMul15 because it has no allocated buffer.
... (similar "Skipping quantization" messages repeated for the remaining attention MatMul tensors in layers 0 through 23) ...
2021-08-12 12:16:43.921923: I tensorflow/lite/tools/optimize/quantize_weights.cc:234] Skipping quantization of tensor bert/encoder/layer_23/attention/self/MatMul_131 because it has no allocated buffer.
INFO:tensorflow:Vocab file is inside the TFLite model with metadata.
INFO:tensorflow:Saved vocabulary in /tmp/tmpjncdf_eu/vocab.txt.
INFO:tensorflow:Finished populating metadata and associated file to the model:
INFO:tensorflow:./model.tflite
INFO:tensorflow:The associated file that has been been packed to the model is:
INFO:tensorflow:['vocab.txt']
INFO:tensorflow:TensorFlow Lite model exported successfully: ./model.tflite

You can use the TensorFlow Lite model file in the bert_qa reference app, using the BertQuestionAnswerer API in the TensorFlow Lite Task Library, by downloading it from the left sidebar on Colab.
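
As a rough sketch of the on-device side (not part of this notebook), the exported model can also be loaded with the Task Library's Python bindings, assuming the tflite-support package is installed; the passage and question strings below are made up for illustration.

from tflite_support.task import text

# Load the exported TFLite model (with embedded vocab and metadata) into the Task Library.
answerer = text.BertQuestionAnswerer.create_from_file('model.tflite')

# Hypothetical passage and question, for illustration only.
context = 'TensorFlow Lite Model Maker simplifies adapting TensorFlow models for on-device use.'
question = 'What does Model Maker simplify?'
print(answerer.answer(context, question))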

The allowed export formats can be one or a list of the following:

  • ExportFormat.TFLITE
  • ExportFormat.VOCAB
  • ExportFormat.SAVED_MODEL

By default, it just exports the TensorFlow Lite model with metadata. You can also selectively export different files. For instance, export only the vocab file as follows:

model.export(export_dir='.', export_format=ExportFormat.VOCAB)
INFO:tensorflow:Saved vocabulary in ./vocab.txt.

You can also evaluate the tflite model with the evaluate_tflite method. This step is expected to take a long time.

model.evaluate_tflite('model.tflite', validation_data)
INFO:tensorflow:Made predictions for 100 records.
INFO:tensorflow:Made predictions for 200 records.
INFO:tensorflow:Made predictions for 300 records.
INFO:tensorflow:Made predictions for 400 records.
INFO:tensorflow:Made predictions for 500 records.
INFO:tensorflow:Made predictions for 600 records.
INFO:tensorflow:Made predictions for 700 records.
INFO:tensorflow:Made predictions for 800 records.
INFO:tensorflow:Made predictions for 900 records.
INFO:tensorflow:Made predictions for 1000 records.
INFO:tensorflow:Made predictions for 1100 records.
INFO:tensorflow:Made predictions for 1200 records.
{'exact_match': 0.5918367346938775, 'final_f1': 0.6682598580557765}

Advanced Usage

The create function is the critical part of this library, in which the model_spec parameter defines the model specification. The BertQASpec class is currently supported. There are 2 models: the MobileBERT model and the BERT-Base model. The create function comprises the following steps:

  1. Creates the model for question answer according to model_spec.
  2. Trains the question answer model.

This section covers several advanced topics, including adjusting the model, tuning the training hyperparameters, and more.

Adjust the model

You can adjust the model infrastructure, such as the parameters seq_len and query_len, in the BertQASpec class.

Adjustable parameters for the model:

  • seq_len: Length of the passage to feed into the model.
  • query_len: Length of the question to feed into the model.
  • doc_stride: The stride when doing a sliding window approach to take chunks of the documents.
  • initializer_range: The stdev of the truncated_normal_initializer for initializing all weight matrices.
  • trainable: Boolean, whether the pre-trained layer is trainable.

Adjustable parameters for the training pipeline:

  • model_dir: The location of the model checkpoint files. If not set, a temporary directory will be used.
  • dropout_rate: The rate for dropout.
  • learning_rate: The initial learning rate for Adam.
  • predict_batch_size: Batch size for prediction.
  • tpu: TPU address to connect to. Only used if using a TPU.

For example, you can train the model with a longer sequence length. If you change the model, you must first construct a new model_spec.

new_spec = model_spec.get('mobilebert_qa')
new_spec.seq_len = 512

The remaining steps are the same. Note that you must rerun both the dataloader and create parts, as different model specs may have different preprocessing steps.
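
For instance, here is a minimal sketch of that rerun with the new spec, reusing the data paths downloaded above:

# Reload and preprocess the data with the new spec, then retrain.
new_train_data = DataLoader.from_squad(train_data_path, new_spec, is_training=True)
new_validation_data = DataLoader.from_squad(validation_data_path, new_spec, is_training=False)
new_model = question_answer.create(new_train_data, model_spec=new_spec)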

Tune training hyperparameters

You can also tune the training hyperparameters like epochs and batch_size, which affect the model performance. For instance,

  • epochs: more epochs could achieve better performance, but may lead to overfitting.
  • batch_size: number of samples to use in one training step.

For example, you can train with more epochs and a bigger batch size like:

model = question_answer.create(train_data, model_spec=spec, epochs=5, batch_size=64)

Change the Model Architecture

You can change the base model your data trains on by changing the model_spec. For example, to switch to the BERT-Base model, run:

spec = model_spec.get('bert_qa')

The remaining steps are the same.

Customize post-training quantization on the TensorFlow Lite model

Post-training quantization is a conversion technique that can reduce model size and inference latency, while also improving CPU and hardware accelerator inference speed, with little degradation in model accuracy. Thus, it is widely used to optimize the model.

The Model Maker library applies a default post-training quantization technique when exporting the model. If you want to customize post-training quantization, Model Maker supports multiple post-training quantization options using QuantizationConfig as well. Let's take float16 quantization as an example. First, define the quantization config.

from tflite_model_maker.config import QuantizationConfig

config = QuantizationConfig.for_float16()

Then we export the TensorFlow Lite model with such configuration.

model.export(export_dir='.', tflite_filename='model_fp16.tflite', quantization_config=config)

Read more

You can read our BERT Question and Answer example to learn the technical details. For more information, please refer to: