Weight clustering in Keras example

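Overview

Welcome to the end-to-end example for weight clustering, part of the TensorFlow Model Optimization Toolkit.

Other pages

For an introduction to what weight clustering is and how to decide whether you should use it (including what is supported), see the overview page.

To quickly find the APIs you need for your use case (beyond fully clustering a model with 16 clusters), see the comprehensive guide.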

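Contents

In this tutorial, you will:

  1. Train a tf.keras model for the MNIST dataset from scratch.
  2. Fine-tune the model by applying the weight clustering API and see the accuracy.
  3. Create 6x smaller TF and TFLite models from clustering.
  4. Create an 8x smaller TFLite model by combining weight clustering and post-training quantization.
  5. See the persistence of accuracy from TF to TFLite.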

Setup

You can run this Jupyter notebook in your local virtualenv or in Colab. For details on setting up dependencies, see the installation guide.

! pip install -q tensorflow-model-optimization
import tensorflow as tf
from tensorflow import keras

import numpy as np
import tempfile
import zipfile
import os

Train a tf.keras model for MNIST without clustering

# Load MNIST dataset
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the input images so that each pixel value is between 0 and 1.
train_images = train_images / 255.0
test_images  = test_images / 255.0

# Define the model architecture.
model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28)),
    keras.layers.Reshape(target_shape=(28, 28, 1)),
    keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(10)
])

# Train the digit classification model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(
    train_images,
    train_labels,
    validation_split=0.1,
    epochs=10
)
Epoch 1/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.3002 - accuracy: 0.9167 - val_loss: 0.1301 - val_accuracy: 0.9625
Epoch 2/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.1251 - accuracy: 0.9639 - val_loss: 0.0870 - val_accuracy: 0.9773
Epoch 3/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0890 - accuracy: 0.9740 - val_loss: 0.0697 - val_accuracy: 0.9812
Epoch 4/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0725 - accuracy: 0.9786 - val_loss: 0.0643 - val_accuracy: 0.9828
Epoch 5/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0621 - accuracy: 0.9809 - val_loss: 0.0574 - val_accuracy: 0.9857
Epoch 6/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0549 - accuracy: 0.9837 - val_loss: 0.0580 - val_accuracy: 0.9852
Epoch 7/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0492 - accuracy: 0.9848 - val_loss: 0.0578 - val_accuracy: 0.9840
Epoch 8/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0440 - accuracy: 0.9869 - val_loss: 0.0614 - val_accuracy: 0.9833
Epoch 9/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0412 - accuracy: 0.9871 - val_loss: 0.0548 - val_accuracy: 0.9857
Epoch 10/10
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0375 - accuracy: 0.9887 - val_loss: 0.0577 - val_accuracy: 0.9855
<tensorflow.python.keras.callbacks.History at 0x7f3dcbe7d588>

Evaluate the baseline model and save it for later use

_, baseline_model_accuracy = model.evaluate(
    test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)

_, keras_file = tempfile.mkstemp('.h5')
print('Saving model to: ', keras_file)
tf.keras.models.save_model(model, keras_file, include_optimizer=False)
Baseline test accuracy: 0.9807999730110168
Saving model to:  /tmp/tmpkenu8pu1.h5

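Fine-tune the pre-trained model with clustering

Apply the cluster_weights() API to the whole pre-trained model to demonstrate that it reduces the model size effectively once zip compression is applied, while keeping decent accuracy. For how best to balance accuracy and compression rate for your use case, see the per-layer example in the comprehensive guide.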
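
To make the idea concrete, here is a minimal sketch in plain Python of what clustering does to one layer's weights: the values are grouped into 16 clusters and each weight is replaced by its cluster's centroid, so the layer ends up with at most 16 distinct values. It is an illustration, not the toolkit's implementation. The helper names (linear_centroids, nearest, cluster) are hypothetical, the weights are random stand-ins, the evenly spaced starting centroids are an assumption about what a linear initialization means, and plain k-means updates stand in for the fine-tuning that the toolkit performs during training.

```python
import random


def linear_centroids(weights, k):
    """Place k centroids evenly between the smallest and largest weight."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (k - 1)
    return [lo + i * step for i in range(k)]


def nearest(centroids, w):
    """Index of the centroid closest to weight w."""
    return min(range(len(centroids)), key=lambda i: abs(centroids[i] - w))


def cluster(weights, k=16, iterations=10):
    """Plain 1-D k-means: returns (centroids, centroid index per weight)."""
    centroids = linear_centroids(weights, k)
    for _ in range(iterations):
        indices = [nearest(centroids, w) for w in weights]
        for c in range(k):
            members = [w for w, i in zip(weights, indices) if i == c]
            if members:  # an empty cluster keeps its current centroid
                centroids[c] = sum(members) / len(members)
    indices = [nearest(centroids, w) for w in weights]
    return centroids, indices


random.seed(0)
# Random stand-in for a layer kernel (assumption: not real model weights).
weights = [random.gauss(0.0, 0.1) for _ in range(2000)]
centroids, indices = cluster(weights)
clustered = [centroids[i] for i in indices]

print(len(set(weights)), "distinct values before clustering")
print(len(set(clustered)), "distinct values after clustering")
```

The clustered layer can be described by 16 floats plus one small index per weight, which is what makes the saved model compress well later on.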

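Define the model and apply the clustering API

Before you pass the model to the clustering API, make sure it has been trained and shows acceptable accuracy.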

import tensorflow_model_optimization as tfmot

cluster_weights = tfmot.clustering.keras.cluster_weights
CentroidInitialization = tfmot.clustering.keras.CentroidInitialization

clustering_params = {
  'number_of_clusters': 16,
  'cluster_centroids_init': CentroidInitialization.LINEAR
}

# Cluster a whole model
clustered_model = cluster_weights(model, **clustering_params)

# Use smaller learning rate for fine-tuning clustered model
opt = tf.keras.optimizers.Adam(learning_rate=1e-5)

clustered_model.compile(
  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
  optimizer=opt,
  metrics=['accuracy'])

clustered_model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
cluster_reshape (ClusterWeig (None, 28, 28, 1)         0         
_________________________________________________________________
cluster_conv2d (ClusterWeigh (None, 26, 26, 12)        136       
_________________________________________________________________
cluster_max_pooling2d (Clust (None, 13, 13, 12)        0         
_________________________________________________________________
cluster_flatten (ClusterWeig (None, 2028)              0         
_________________________________________________________________
cluster_dense (ClusterWeight (None, 10)                20306     
=================================================================
Total params: 20,442
Trainable params: 54
Non-trainable params: 20,388
_________________________________________________________________
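
The parameter counts in the summary can be reproduced by hand. The sketch below assumes that each clustered layer holds 16 trainable centroids alongside its trainable bias, and that one kernel-shaped tensor per layer is counted as non-trainable. That reading is inferred from the numbers, not from the library source, so treat it as one consistent explanation and not as a description of the internals.

```python
NUM_CLUSTERS = 16  # 'number_of_clusters' in clustering_params

# Conv2D: 12 filters of 3x3 over 1 input channel, plus one bias per filter.
conv_kernel = 3 * 3 * 1 * 12
conv_bias = 12

# Dense: 2028 flattened inputs to 10 outputs, plus one bias per output.
dense_kernel = 2028 * 10
dense_bias = 10

# Assumption: each wrapped layer adds NUM_CLUSTERS centroid values.
conv_total = conv_kernel + conv_bias + NUM_CLUSTERS
dense_total = dense_kernel + dense_bias + NUM_CLUSTERS

# Assumption: biases and centroids are trainable, kernel-shaped tensors are not.
trainable = conv_bias + dense_bias + 2 * NUM_CLUSTERS
non_trainable = conv_kernel + dense_kernel

print(conv_total, dense_total)    # 136 20306
print(trainable, non_trainable)   # 54 20388
print(trainable + non_trainable)  # 20442
```

Under this reading, fine-tuning adjusts only 54 values, which helps explain why a single epoch with a small learning rate is enough in this example.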

Fine-tune the model and evaluate the accuracy against the baseline

Fine-tune the model with clustering for 1 epoch.

# Fine-tune model
clustered_model.fit(
  train_images,
  train_labels,
  batch_size=500,
  epochs=1,
  validation_split=0.1)
108/108 [==============================] - 0s 4ms/step - loss: 0.0547 - accuracy: 0.9807 - val_loss: 0.0804 - val_accuracy: 0.9760
<tensorflow.python.keras.callbacks.History at 0x7f3e4116ab70>
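
The step counts in the progress bars (1688 per epoch for the baseline, 108 for fine-tuning) follow from the data size and the batch size. The short calculation below assumes that the MNIST training split holds 60,000 images and that Keras falls back to a batch size of 32 when fit() is called without batch_size; the function name steps_per_epoch is only illustrative.

```python
import math

TRAIN_IMAGES = 60000  # assumed size of the MNIST training split

# validation_split=0.1 holds out one tenth of the training images.
validation_samples = TRAIN_IMAGES // 10
train_samples = TRAIN_IMAGES - validation_samples


def steps_per_epoch(samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(samples / batch_size)


print(train_samples)                        # 54000
print(steps_per_epoch(train_samples, 32))   # 1688 (baseline training)
print(steps_per_epoch(train_samples, 500))  # 108 (clustering fine-tune)
```

So the fine-tuning pass consists of only 108 gradient updates at a learning rate of 1e-5.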

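For this example, the loss in test accuracy after clustering is small compared to the baseline: about 0.5 percentage points in the run shown below (0.9808 versus 0.9760).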

_, clustered_model_accuracy = clustered_model.evaluate(
  test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)
print('Clustered test accuracy:', clustered_model_accuracy)
Baseline test accuracy: 0.9807999730110168
Clustered test accuracy: 0.9760000109672546

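Create 6x smaller models from clustering

Both strip_clustering and a standard compression algorithm (for example, gzip) are needed to see the compression benefits of clustering.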
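
The reason compression matters is that a clustered tensor still stores one float32 per weight; it only becomes small once a compressor exploits the repetition. The following sketch, under stated assumptions, compares DEFLATE on two random stand-in tensors of the same raw size: one with unconstrained values and one whose values are drawn from 16 shared centroids. The data, the centroid values, and the helper name deflated_size are hypothetical and unrelated to the MNIST model.

```python
import random
import struct
import zlib

random.seed(0)
NUM_WEIGHTS = 20000

# Unclustered stand-in: practically every weight is a different value.
dense_weights = [random.gauss(0.0, 0.1) for _ in range(NUM_WEIGHTS)]

# Clustered stand-in: every weight is one of 16 shared values.
centroids = [-0.3 + 0.04 * i for i in range(16)]
clustered_weights = [random.choice(centroids) for _ in range(NUM_WEIGHTS)]


def deflated_size(values):
    """Return (raw, compressed) byte counts for values packed as float32."""
    raw = struct.pack("<%df" % len(values), *values)
    return len(raw), len(zlib.compress(raw, 9))


raw_dense, zip_dense = deflated_size(dense_weights)
raw_clustered, zip_clustered = deflated_size(clustered_weights)

print("raw bytes:     ", raw_dense, raw_clustered)
print("deflated bytes:", zip_dense, zip_clustered)
```

Both tensors occupy the same number of raw bytes, but the clustered one should shrink considerably more after compression.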

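First, create a compressible model for TensorFlow. Here, strip_clustering removes all variables that clustering needs only during training (for example, the tf.Variable objects that store the cluster centroids and the indices), which would otherwise add to the model size during inference.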

final_model = tfmot.clustering.keras.strip_clustering(clustered_model)

_, clustered_keras_file = tempfile.mkstemp('.h5')
print('Saving clustered model to: ', clustered_keras_file)
tf.keras.models.save_model(final_model, clustered_keras_file, 
                           include_optimizer=False)
Saving clustered model to:  /tmp/tmpsc3jb7v8.h5

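Then, create a compressible model for TFLite. You can convert the clustered model to a format that runs on your target backend. TensorFlow Lite is one example, which you can use to deploy to mobile devices.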

clustered_tflite_file = '/tmp/clustered_mnist.tflite'
converter = tf.lite.TFLiteConverter.from_keras_model(final_model)
tflite_clustered_model = converter.convert()
with open(clustered_tflite_file, 'wb') as f:
  f.write(tflite_clustered_model)
print('Saved clustered TFLite model to:', clustered_tflite_file)
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/tracking/tracking.py:111: Model.state_updates (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/tracking/tracking.py:111: Layer.updates (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
INFO:tensorflow:Assets written to: /tmp/tmp69qei5fh/assets
Saved clustered TFLite model to: /tmp/clustered_mnist.tflite

Define a helper function that actually compresses the models via gzip and measures the zipped size.

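def get_gzipped_model_size(file):
  # Returns the size in bytes of a zip archive (DEFLATE, the algorithm that
  # gzip also uses) containing the model file.
  import os
  import zipfile

  fd, zipped_file = tempfile.mkstemp('.zip')
  os.close(fd)  # mkstemp returns an open descriptor; close it to avoid a leak.
  with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
    f.write(file)

  return os.path.getsize(zipped_file)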

Compare the sizes: clustering makes the compressed models roughly 6x smaller than the baseline.

print("Size of gzipped baseline Keras model: %.2f bytes" % (get_gzipped_model_size(keras_file)))
print("Size of gzipped clustered Keras model: %.2f bytes" % (get_gzipped_model_size(clustered_keras_file)))
print("Size of gzipped clustered TFlite model: %.2f bytes" % (get_gzipped_model_size(clustered_tflite_file)))
Size of gzipped baseline Keras model: 78076.00 bytes
Size of gzipped clustered Keras model: 12728.00 bytes
Size of gzipped clustered TFlite model: 12126.00 bytes
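
The "6x" figure can be checked directly from the sizes printed above. The snippet below only redoes that division; the byte counts are copied from this run's output, and the function name size_ratio is illustrative.

```python
def size_ratio(baseline_bytes, compressed_bytes):
    """How many times smaller the compressed model is than the baseline."""
    return baseline_bytes / compressed_bytes


baseline = 78076          # gzipped baseline Keras model
clustered_keras = 12728   # gzipped clustered Keras model
clustered_tflite = 12126  # gzipped clustered TFLite model

print("clustered Keras:  %.2fx smaller" % size_ratio(baseline, clustered_keras))   # 6.13x
print("clustered TFLite: %.2fx smaller" % size_ratio(baseline, clustered_tflite))  # 6.44x
```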

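Create an 8x smaller TFLite model by combining weight clustering and post-training quantization

You can apply post-training quantization to the clustered model for additional benefits.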

converter = tf.lite.TFLiteConverter.from_keras_model(final_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()

_, quantized_and_clustered_tflite_file = tempfile.mkstemp('.tflite')

with open(quantized_and_clustered_tflite_file, 'wb') as f:
  f.write(tflite_quant_model)

print('Saved quantized and clustered TFLite model to:', quantized_and_clustered_tflite_file)
print("Size of gzipped baseline Keras model: %.2f bytes" % (get_gzipped_model_size(keras_file)))
print("Size of gzipped clustered and quantized TFlite model: %.2f bytes" % (get_gzipped_model_size(quantized_and_clustered_tflite_file)))
INFO:tensorflow:Assets written to: /tmp/tmpmzv1zby7/assets
INFO:tensorflow:Assets written to: /tmp/tmpmzv1zby7/assets
Saved quantized and clustered TFLite model to: /tmp/tmp5yu2mobb.tflite
Size of gzipped baseline Keras model: 78076.00 bytes
Size of gzipped clustered and quantized TFlite model: 9237.00 bytes
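
In this run the combined model is about 8.5x smaller than the baseline (78076 / 9237). One way to see why quantization stacks with clustering is that it stores each weight in one byte instead of four, while the 16 shared values stay shared after rounding. The sketch below uses a simple symmetric per-tensor int8 scheme as an assumption; the actual TFLite scheme may differ in details such as per-channel scales, and the helper names and centroid values are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    return [int(round(w / scale)) for w in weights], scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


# Stand-in for a clustered kernel: every weight is one of 16 shared values.
centroids = [-0.30 + 0.04 * i for i in range(16)]
weights = [centroids[(7 * i) % 16] for i in range(1000)]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

print("distinct float values:", len(set(weights)))
print("distinct int8 values: ", len(set(quantized)))
print("largest rounding error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

The rounding error per weight is bounded by half the quantization step, which is consistent with the small accuracy change reported for the quantized model.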

See the persistence of accuracy from TF to TFLite

Define a helper function to evaluate the TFLite model on the test dataset.

def eval_model(interpreter):
  input_index = interpreter.get_input_details()[0]["index"]
  output_index = interpreter.get_output_details()[0]["index"]

  # Run predictions on every image in the "test" dataset.
  prediction_digits = []
  for i, test_image in enumerate(test_images):
    if i % 1000 == 0:
      print('Evaluated on {n} results so far.'.format(n=i))
    # Pre-processing: add batch dimension and convert to float32 to match with
    # the model's input data format.
    test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
    interpreter.set_tensor(input_index, test_image)

    # Run inference.
    interpreter.invoke()

    # Post-processing: remove batch dimension and find the digit with the
    # highest score (the model outputs logits, not probabilities).
    output = interpreter.tensor(output_index)
    digit = np.argmax(output()[0])
    prediction_digits.append(digit)

  print('\n')
  # Compare prediction results with ground truth labels to calculate accuracy.
  prediction_digits = np.array(prediction_digits)
  accuracy = (prediction_digits == test_labels).mean()
  return accuracy
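
The helper picks the predicted digit with an argmax over the model output. Because the model ends in Dense(10) and is trained with from_logits=True, that output is a vector of logits, and taking the argmax of logits gives the same digit as taking the argmax of softmax probabilities. The plain-Python sketch below illustrates this and the accuracy calculation; the logits and labels are made-up values, and the function names are illustrative.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Fraction of predictions that equal the corresponding label."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical logits for three images over the ten digit classes.
batch_logits = [
    [0.1, 2.3, -1.0, 0.0, 0.5, -0.2, 0.3, 4.1, 0.9, -3.0],
    [5.0, 0.2, 0.1, -0.4, 0.0, 0.3, -1.2, 0.8, 0.6, 0.1],
    [0.0, 0.1, 0.2, 3.3, 0.4, 3.1, 0.6, 0.7, 0.8, 0.9],
]
labels = [7, 0, 5]

from_logits = [argmax(row) for row in batch_logits]
from_probs = [argmax(softmax(row)) for row in batch_logits]

print(from_logits)                    # [7, 0, 3]
print(from_logits == from_probs)      # True
print(accuracy(from_logits, labels))  # 2 of 3 predictions are correct
```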

Evaluate the clustered and quantized model to see that the accuracy from TensorFlow persists to the TFLite backend.

interpreter = tf.lite.Interpreter(model_content=tflite_quant_model)
interpreter.allocate_tensors()

test_accuracy = eval_model(interpreter)

print('Clustered and quantized TFLite test_accuracy:', test_accuracy)
print('Clustered TF test accuracy:', clustered_model_accuracy)
Evaluated on 0 results so far.
Evaluated on 1000 results so far.
Evaluated on 2000 results so far.
Evaluated on 3000 results so far.
Evaluated on 4000 results so far.
Evaluated on 5000 results so far.
Evaluated on 6000 results so far.
Evaluated on 7000 results so far.
Evaluated on 8000 results so far.
Evaluated on 9000 results so far.


Clustered and quantized TFLite test_accuracy: 0.9759
Clustered TF test accuracy: 0.9760000109672546

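Conclusion

In this tutorial, you saw how to create clustered models with the TensorFlow Model Optimization Toolkit API. More specifically, you worked through an end-to-end example that creates a model for MNIST that is about 8x smaller with minimal accuracy difference. We encourage you to try this new capability, which can be particularly important for deployment in resource-constrained environments.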