Integrate the BERT natural language classifier

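The Task Library BertNLClassifier API is very similar to NLClassifier, which classifies input text into different categories. The difference is that this API is designed specifically for Bert-related models that require Wordpiece or Sentencepiece tokenization outside the TFLite model.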

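Key features of the BertNLClassifier API

Takes a single string as input, performs classification on that string, and outputs <Label, Score> pairs as the classification results.
Performs out-of-graph Wordpiece or Sentencepiece tokenization on the input text.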
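
To illustrate what out-of-graph Wordpiece tokenization involves, here is a minimal Python sketch of greedy longest-match-first subword splitting. The vocabulary and the `wordpiece_tokenize` function name are hypothetical and exist only for illustration. The Task Library performs this step internally using the vocabulary packed with the model, and its actual implementation may differ.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one lowercase word into subword pieces, longest match first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate substring until it is found in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk_token]  # no piece matched, so the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, for illustration only.
TOY_VOCAB = {"charm", "##ing", "journey", "affect", "a", "and"}

print(wordpiece_tokenize("charming", TOY_VOCAB))   # ['charm', '##ing']
print(wordpiece_tokenize("affecting", TOY_VOCAB))  # ['affect', '##ing']
print(wordpiece_tokenize("movie", TOY_VOCAB))      # ['[UNK]']
```

Because this splitting happens outside the TFLite graph, the model itself only ever sees integer token ids, never raw text.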

Supported BertNLClassifier models

The following models are compatible with the BertNLClassifier API.

Bert models created by TensorFlow Lite Model Maker for text Classification.
Custom models that meet the model compatibility requirements.

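Run inference in Java

Step 1: Import the Gradle dependency and other settings

Copy the .tflite model file to the assets directory of the Android module where the model will run. Specify that the file should not be compressed, and add the TensorFlow Lite library to the module's build.gradle file: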

android {
    // Other settings

    // Specify tflite file should not be compressed for the app apk
    aaptOptions {
        noCompress "tflite"
    }

}

dependencies {
    // Other dependencies

    // Import the Task Text Library dependency (NNAPI is included)
    implementation 'org.tensorflow:tensorflow-lite-task-text:0.4.4'
}

Note: starting with version 4.1 of the Android Gradle plugin, .tflite is added to the noCompress list by default, so the aaptOptions block above is no longer needed.

Step 2: Run inference using the API

// Initialization
BertNLClassifierOptions options =
    BertNLClassifierOptions.builder()
        .setBaseOptions(BaseOptions.builder().setNumThreads(4).build())
        .build();
BertNLClassifier classifier =
    BertNLClassifier.createFromFileAndOptions(context, modelFile, options);

// Run inference
List<Category> results = classifier.classify(input);

See the source code for more details.

Run inference in Swift

Step 1: Import CocoaPods

Add the TensorFlowLiteTaskText pod to the Podfile:

target 'MySwiftAppWithTaskAPI' do
  use_frameworks!
  pod 'TensorFlowLiteTaskText', '~> 0.4.4'
end

Step 2: Run inference using the API

// Initialization
let bertNLClassifier = TFLBertNLClassifier.bertNLClassifier(
      modelPath: bertModelPath)

// Run inference
let categories = bertNLClassifier.classify(text: input)

See the source code for more details.

Run inference in C++

// Initialization
BertNLClassifierOptions options;
options.mutable_base_options()->mutable_model_file()->set_file_name(model_path);
std::unique_ptr<BertNLClassifier> classifier = BertNLClassifier::CreateFromOptions(options).value();

// Run inference with your input, `input_text`.
std::vector<core::Category> categories = classifier->Classify(input_text);

See the source code for more details.

Run inference in Python

Step 1: Install the pip package

pip install tflite-support

Step 2: Use the model

# Imports
from tflite_support.task import text

# Initialization
classifier = text.BertNLClassifier.create_from_file(model_path)

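# Run inference on `input_text`, the string to classify
# (named so that it does not shadow the imported `text` module)
text_classification_result = classifier.classify(input_text)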

See the source code for more options to configure BertNLClassifier.

Example results

Here is an example of the classification results for a movie review, using the MobileBert model from Model Maker.

Input: "it's a charming and often affecting journey"

Output:

category[0]: 'negative' : '0.00006'
category[1]: 'positive' : '0.99994'
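
As a sketch of how such label/score pairs can be ranked and printed, assume the results have already been copied out of the classifier into plain Python tuples. The `categories` list below simply repeats the values shown above, and `top_category` and `format_categories` are hypothetical helper names, not part of the Task Library.

```python
def top_category(categories):
    """Return the (label, score) pair with the highest score."""
    return max(categories, key=lambda pair: pair[1])


def format_categories(categories):
    """Render each pair in the same layout as the sample output."""
    return [
        f"category[{i}]: '{label}' : '{score:.5f}'"
        for i, (label, score) in enumerate(categories)
    ]


# Values copied from the sample output above.
categories = [("negative", 0.00006), ("positive", 0.99994)]

for line in format_categories(categories):
    print(line)

label, score = top_category(categories)
print(label)  # positive
```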

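Try out the simple CLI demo tool for BertNLClassifier with your own model and test data.

Model compatibility requirements

The BertNLClassifier API expects a TFLite model with mandatory TFLite Model Metadata.

The metadata should meet the following requirements:

input_process_units for the Wordpiece/Sentencepiece Tokenizer
3 input tensors named "ids", "mask" and "segment_ids" for the output of the tokenizer
1 output tensor of type float32, with an optionally attached label file. If a label file is attached, it should be a plain text file with one label per line, and the number of labels should match the number of categories that the model outputs.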
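
The following Python sketch illustrates the two data shapes these requirements describe: the three input tensors derived from tokenizer output, and a label file checked against the number of output categories. The helper names, the token ids, and the sequence length are hypothetical assumptions for illustration. The Task Library builds these tensors internally, and the truncation here is deliberately naive.

```python
def build_bert_inputs(token_ids, max_seq_len, pad_id=0):
    """Pad (or naively truncate) token ids and derive mask and segment_ids."""
    ids = list(token_ids[:max_seq_len])
    mask = [1] * len(ids)              # 1 marks a real token
    padding = max_seq_len - len(ids)
    ids += [pad_id] * padding
    mask += [0] * padding              # 0 marks a padding position
    segment_ids = [0] * max_seq_len    # single-sentence input: all segment 0
    return {"ids": ids, "mask": mask, "segment_ids": segment_ids}


def parse_label_file(contents, num_categories):
    """Read one label per line and check the count against the model output."""
    labels = [line.strip() for line in contents.splitlines() if line.strip()]
    if len(labels) != num_categories:
        raise ValueError(
            f"expected {num_categories} labels, found {len(labels)}"
        )
    return labels


# Hypothetical token ids and sequence length, for illustration only.
inputs = build_bert_inputs([101, 2009, 102], max_seq_len=5)
print(inputs["ids"])   # [101, 2009, 102, 0, 0]
print(inputs["mask"])  # [1, 1, 1, 0, 0]

print(parse_label_file("negative\npositive\n", num_categories=2))
# ['negative', 'positive']
```

A mismatch between the label count and the output size raises an error in this sketch, which mirrors the requirement that the two must agree.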