Warning: This project is deprecated. TensorFlow Addons has stopped development. The project will only provide minimal maintenance releases until May 2024. See the full announcement here or on GitHub.

TensorFlow Addons Optimizers: LazyAdam

Overview

This notebook demonstrates how to use the LazyAdam optimizer from the Addons package.

LazyAdam

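LazyAdam is a variant of the Adam optimizer that handles sparse updates more efficiently. The original Adam algorithm maintains two moving-average accumulators for each trainable variable, and these accumulators are updated at every step. This class handles gradient updates for sparse variables more lazily: it updates the moving-average accumulators only for the sparse variable indices that appear in the current batch, rather than for all indices. Compared with the original Adam optimizer, it can substantially improve model training throughput for some applications. However, its semantics differ slightly from the original Adam algorithm, which may lead to different empirical results.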
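
The difference between the two update rules can be illustrated with a small pure-Python sketch. This is not the TensorFlow Addons implementation; it is a simplified model that assumes one scalar per row of a sparse variable, and the names `adam_step` and `run` are hypothetical helpers introduced only for this illustration. Row 2 receives a gradient in the first batch but not in the second.

```python
import math


def adam_step(param, m, v, grads, step, lazy,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """Apply one Adam step in place (illustrative sketch, one scalar per row).

    grads maps row index -> gradient; rows missing from it have zero gradient.
    lazy=True touches only the rows present in grads (LazyAdam-style);
    lazy=False touches every row (dense Adam-style).
    """
    # Bias-corrected step size, as in standard Adam.
    lr_t = lr * math.sqrt(1 - beta2 ** step) / (1 - beta1 ** step)
    rows = list(grads) if lazy else range(len(param))
    for i in rows:
        g = grads.get(i, 0.0)
        m[i] = beta1 * m[i] + (1 - beta1) * g          # first-moment accumulator
        v[i] = beta2 * v[i] + (1 - beta2) * g * g      # second-moment accumulator
        param[i] -= lr_t * m[i] / (math.sqrt(v[i]) + eps)


def run(lazy):
    """Run two steps on a 4-row variable; row 2 appears only in the first batch."""
    param, m, v = [0.0] * 4, [0.0] * 4, [0.0] * 4
    batches = [{0: 1.0, 2: 1.0}, {0: 1.0}]
    for step, grads in enumerate(batches, start=1):
        adam_step(param, m, v, grads, step, lazy)
    return param, m


dense_param, dense_m = run(lazy=False)
lazy_param, lazy_m = run(lazy=True)

print("row 2 first moment  dense:", dense_m[2], " lazy:", lazy_m[2])
print("row 2 parameter     dense:", dense_param[2], " lazy:", lazy_param[2])
```

In the dense variant, row 2 keeps decaying its accumulators and keeps moving during the second step even though it received no gradient. In the lazy variant, row 2 is skipped entirely in the second step, which saves work on large sparse variables but yields a different parameter value. Rows that appear in every batch (row 0 here) end up identical under both rules.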

Setup

pip install -q -U tensorflow-addons

import tensorflow as tf
import tensorflow_addons as tfa

# Hyperparameters
batch_size = 64
epochs = 10

Build the Model

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, input_shape=(784,), activation='relu', name='dense_1'),
    tf.keras.layers.Dense(64, activation='relu', name='dense_2'),
    tf.keras.layers.Dense(10, activation='softmax', name='predictions'),
])

Prepare the Data

# Load MNIST dataset as NumPy arrays
dataset = {}
num_validation = 10000
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Preprocess the data
x_train = x_train.reshape(-1, 784).astype('float32') / 255
x_test = x_test.reshape(-1, 784).astype('float32') / 255

Train and Evaluate

Simply replace the typical Keras optimizer with the new TFA optimizer.

# Compile the model
model.compile(
    optimizer=tfa.optimizers.LazyAdam(0.001),  # Utilize TFA optimizer
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=['accuracy'])

# Train the network
history = model.fit(
    x_train,
    y_train,
    batch_size=batch_size,
    epochs=epochs)

Epoch 1/10
938/938 [==============================] - 2s 2ms/step - loss: 0.3188 - accuracy: 0.9079
Epoch 2/10
938/938 [==============================] - 2s 2ms/step - loss: 0.1316 - accuracy: 0.9607
Epoch 3/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0970 - accuracy: 0.9706
Epoch 4/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0783 - accuracy: 0.9760
Epoch 5/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0622 - accuracy: 0.9805
Epoch 6/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0516 - accuracy: 0.9843
Epoch 7/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0431 - accuracy: 0.9863
Epoch 8/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0392 - accuracy: 0.9873
Epoch 9/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0332 - accuracy: 0.9893
Epoch 10/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0283 - accuracy: 0.9909

# Evaluate the network
print('Evaluate on test data:')
results = model.evaluate(x_test, y_test, batch_size=128, verbose=2)
print('Test loss = {0}, Test acc: {1}'.format(results[0], results[1]))

Evaluate on test data:
79/79 - 0s - loss: 0.0843 - accuracy: 0.9765
Test loss = 0.08429064601659775, Test acc: 0.9764999747276306