Warning: This project is deprecated. TensorFlow Addons has stopped development; it will only be providing minimal maintenance releases until May 2024. See the full announcement here or on GitHub.
TensorFlow Addons Optimizers: LazyAdam
Overview
This notebook demonstrates how to use the LazyAdam optimizer from the Addons package.
LazyAdam
LazyAdam is a variant of the Adam optimizer that handles sparse updates more efficiently.
The original Adam algorithm maintains two moving-average accumulators for
each trainable variable; the accumulators are updated at every step.
This class provides lazier handling of gradient updates for sparse
variables. It only updates moving-average accumulators for sparse variable
indices that appear in the current batch, rather than updating the
accumulators for all indices. Compared with the original Adam optimizer,
it can provide large improvements in model training throughput for some
applications. However, it provides slightly different semantics than the
original Adam algorithm, and may lead to different empirical results.
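The MNIST model used below is dense, so it does not itself produce sparse gradients. Where the lazy behavior actually matters is with layers such as an embedding lookup, whose gradient is a sparse IndexedSlices object: LazyAdam then updates the accumulator rows only for the indices that were looked up in the batch. The following is a minimal sketch of that case (it runs after installing tensorflow-addons as shown in the Setup section below; the vocabulary size, embedding width, and token ids are illustrative only and are not part of this tutorial's model):

# Illustrative sketch: LazyAdam applied to a sparse (embedding) gradient.
import tensorflow as tf
import tensorflow_addons as tfa

embedding = tf.keras.layers.Embedding(input_dim=10000, output_dim=16)
optimizer = tfa.optimizers.LazyAdam(learning_rate=0.001)

token_ids = tf.constant([[1, 5, 42]])  # only these rows receive accumulator updates
with tf.GradientTape() as tape:
    vectors = embedding(token_ids)            # gather -> IndexedSlices gradient
    loss = tf.reduce_sum(tf.square(vectors))

grads = tape.gradient(loss, embedding.trainable_variables)
optimizer.apply_gradients(zip(grads, embedding.trainable_variables))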
Setup
pip install -U tensorflow-addons
import tensorflow as tf
import tensorflow_addons as tfa
# Hyperparameters
batch_size=64
epochs=10
Build the Model
model = tf.keras.Sequential([
tf.keras.layers.Dense(64, input_shape=(784,), activation='relu', name='dense_1'),
tf.keras.layers.Dense(64, activation='relu', name='dense_2'),
tf.keras.layers.Dense(10, activation='softmax', name='predictions'),
])
Prepare the Data
# Load MNIST dataset as NumPy arrays
# Note: these two values are defined but not used later in this tutorial
dataset = {}
num_validation = 10000
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
# Preprocess the data
x_train = x_train.reshape(-1, 784).astype('float32') / 255
x_test = x_test.reshape(-1, 784).astype('float32') / 255
Train and Evaluate
Simply replace a typical Keras optimizer with the new TFA optimizer:
# Compile the model
model.compile(
optimizer=tfa.optimizers.LazyAdam(0.001), # Utilize TFA optimizer
loss=tf.keras.losses.SparseCategoricalCrossentropy(),
metrics=['accuracy'])
# Train the network
history = model.fit(
x_train,
y_train,
batch_size=batch_size,
epochs=epochs)
Epoch 1/10
938/938 [==============================] - 2s 2ms/step - loss: 0.3141 - accuracy: 0.9086
Epoch 2/10
938/938 [==============================] - 2s 2ms/step - loss: 0.1447 - accuracy: 0.9574
Epoch 3/10
938/938 [==============================] - 2s 2ms/step - loss: 0.1064 - accuracy: 0.9681
Epoch 4/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0835 - accuracy: 0.9751
Epoch 5/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0665 - accuracy: 0.9798
Epoch 6/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0566 - accuracy: 0.9827
Epoch 7/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0472 - accuracy: 0.9852
Epoch 8/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0412 - accuracy: 0.9869
Epoch 9/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0353 - accuracy: 0.9890
Epoch 10/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0320 - accuracy: 0.9901
# Evaluate the network
print('Evaluate on test data:')
results = model.evaluate(x_test, y_test, batch_size=128, verbose=2)
print('Test loss = {0}, Test acc: {1}'.format(results[0], results[1]))
Evaluate on test data:
79/79 - 0s - loss: 0.0990 - accuracy: 0.9741 - 236ms/epoch - 3ms/step
Test loss = 0.09902079403400421, Test acc: 0.9740999937057495