مشاهده در TensorFlow.org | در Google Colab اجرا شود | مشاهده منبع در GitHub | دانلود دفترچه یادداشت |
در این آموزش، ما قصد داریم به آموزش مدل بازیابی همان است که ما در انجام اساسی بازیابی آموزش، اما با استراتژی توزیع.
ما قصد داریم به:
- داده های ما را دریافت کنید و آن ها را به یک مجموعه آموزشی و آزمایشی تقسیم کنید.
- دو GPU مجازی و TensorFlow MirroredStrategy را راه اندازی کنید.
- پیاده سازی یک مدل بازیابی با استفاده از MirroredStrategy.
- آن را با MirrorredStrategy هماهنگ کنید و آن را ارزیابی کنید.
واردات
بیایید ابتدا واردات خود را از سر راه برداریم.
pip install -q tensorflow-recommenders
pip install -q --upgrade tensorflow-datasets
import os
import pprint
import tempfile
from typing import Dict, Text
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds
import tensorflow_recommenders as tfrs
آماده سازی مجموعه داده
ما آماده مجموعه داده در دقیقا به همان شیوه که ما در انجام اساسی بازیابی آموزش.
# Ratings data.
ratings = tfds.load("movielens/100k-ratings", split="train")
# Features of all the available movies.
movies = tfds.load("movielens/100k-movies", split="train")
for x in ratings.take(1).as_numpy_iterator():
pprint.pprint(x)
for x in movies.take(1).as_numpy_iterator():
pprint.pprint(x)
ratings = ratings.map(lambda x: {
"movie_title": x["movie_title"],
"user_id": x["user_id"],
})
movies = movies.map(lambda x: x["movie_title"])
tf.random.set_seed(42)
shuffled = ratings.shuffle(100_000, seed=42, reshuffle_each_iteration=False)
train = shuffled.take(80_000)
test = shuffled.skip(80_000).take(20_000)
movie_titles = movies.batch(1_000)
user_ids = ratings.batch(1_000_000).map(lambda x: x["user_id"])
unique_movie_titles = np.unique(np.concatenate(list(movie_titles)))
unique_user_ids = np.unique(np.concatenate(list(user_ids)))
unique_movie_titles[:10]
{'bucketized_user_age': 45.0, 'movie_genres': array([7]), 'movie_id': b'357', 'movie_title': b"One Flew Over the Cuckoo's Nest (1975)", 'raw_user_age': 46.0, 'timestamp': 879024327, 'user_gender': True, 'user_id': b'138', 'user_occupation_label': 4, 'user_occupation_text': b'doctor', 'user_rating': 4.0, 'user_zip_code': b'53211'} 2021-10-14 11:16:44.748468: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead. {'movie_genres': array([4]), 'movie_id': b'1681', 'movie_title': b'You So Crazy (1994)'} 2021-10-14 11:16:45.396856: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead. array([b"'Til There Was You (1997)", b'1-900 (1994)', b'101 Dalmatians (1996)', b'12 Angry Men (1957)', b'187 (1997)', b'2 Days in the Valley (1996)', b'20,000 Leagues Under the Sea (1954)', b'2001: A Space Odyssey (1968)', b'3 Ninjas: High Noon At Mega Mountain (1998)', b'39 Steps, The (1935)'], dtype=object)
دو GPU مجازی راه اندازی کنید
اگر شتابدهندههای GPU را به Colab خود اضافه نکردهاید، لطفاً زمان اجرا Colab را قطع کنید و این کار را هماکنون انجام دهید. برای اجرای کد زیر به GPU نیاز داریم:
gpus = tf.config.list_physical_devices("GPU")
if gpus:
# Create 2 virtual GPUs with 1GB memory each
try:
tf.config.set_logical_device_configuration(
gpus[0],
[tf.config.LogicalDeviceConfiguration(memory_limit=1024),
tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
logical_gpus = tf.config.list_logical_devices("GPU")
print(len(gpus), "Physical GPU,", len(logical_gpus), "Logical GPUs")
except RuntimeError as e:
# Virtual devices must be set before GPUs have been initialized
print(e)
strategy = tf.distribute.MirroredStrategy()
Virtual devices cannot be modified after being initialized INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',) INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
پیاده سازی یک مدل
ما در بر user_model، movie_model، سنجش و کار را در همان راه پیاده سازی که ما در انجام اساسی بازیابی آموزش، اما ما آنها را در حوزه استراتژی توزیع بسته بندی:
embedding_dimension = 32
with strategy.scope():
user_model = tf.keras.Sequential([
tf.keras.layers.StringLookup(
vocabulary=unique_user_ids, mask_token=None),
# We add an additional embedding to account for unknown tokens.
tf.keras.layers.Embedding(len(unique_user_ids) + 1, embedding_dimension)
])
movie_model = tf.keras.Sequential([
tf.keras.layers.StringLookup(
vocabulary=unique_movie_titles, mask_token=None),
tf.keras.layers.Embedding(len(unique_movie_titles) + 1, embedding_dimension)
])
metrics = tfrs.metrics.FactorizedTopK(
candidates=movies.batch(128).map(movie_model)
)
task = tfrs.tasks.Retrieval(
metrics=metrics
)
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',). INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
اکنون میتوانیم همه آنها را در یک مدل قرار دهیم. این دقیقا همان در پایه بازیابی آموزش.
class MovielensModel(tfrs.Model):
def __init__(self, user_model, movie_model):
super().__init__()
self.movie_model: tf.keras.Model = movie_model
self.user_model: tf.keras.Model = user_model
self.task: tf.keras.layers.Layer = task
def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
# We pick out the user features and pass them into the user model.
user_embeddings = self.user_model(features["user_id"])
# And pick out the movie features and pass them into the movie model,
# getting embeddings back.
positive_movie_embeddings = self.movie_model(features["movie_title"])
# The task computes the loss and the metrics.
return self.task(user_embeddings, positive_movie_embeddings)
تناسب و ارزیابی
اکنون مدل را در محدوده استراتژی توزیع نمونه سازی و کامپایل می کنیم.
توجه داشته باشید که ما با استفاده از بهینه ساز آدم در اینجا به جای Adagrad در پایه بازیابی آموزش از Adagrad در اینجا پشتیبانی نمی شود.
with strategy.scope():
model = MovielensModel(user_model, movie_model)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.1))
سپس دادههای آموزش و ارزیابی را به هم بزنید، دستهای و کش کنید.
cached_train = train.shuffle(100_000).batch(8192).cache()
cached_test = test.batch(4096).cache()
سپس مدل را آموزش دهید:
model.fit(cached_train, epochs=3)
2021-10-14 11:16:50.692190: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:461] The `assert_cardinality` transformation is currently not handled by the auto-shard rewrite and will be removed. Epoch 1/3 10/10 [==============================] - 8s 328ms/step - factorized_top_k/top_1_categorical_accuracy: 5.0000e-05 - factorized_top_k/top_5_categorical_accuracy: 8.2500e-04 - factorized_top_k/top_10_categorical_accuracy: 0.0025 - factorized_top_k/top_50_categorical_accuracy: 0.0220 - factorized_top_k/top_100_categorical_accuracy: 0.0537 - loss: 70189.8047 - regularization_loss: 0.0000e+00 - total_loss: 70189.8047 Epoch 2/3 10/10 [==============================] - 3s 329ms/step - factorized_top_k/top_1_categorical_accuracy: 3.3750e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0113 - factorized_top_k/top_10_categorical_accuracy: 0.0251 - factorized_top_k/top_50_categorical_accuracy: 0.1268 - factorized_top_k/top_100_categorical_accuracy: 0.2325 - loss: 66736.4560 - regularization_loss: 0.0000e+00 - total_loss: 66736.4560 Epoch 3/3 10/10 [==============================] - 3s 332ms/step - factorized_top_k/top_1_categorical_accuracy: 0.0012 - factorized_top_k/top_5_categorical_accuracy: 0.0198 - factorized_top_k/top_10_categorical_accuracy: 0.0417 - factorized_top_k/top_50_categorical_accuracy: 0.1834 - factorized_top_k/top_100_categorical_accuracy: 0.3138 - loss: 64871.2997 - regularization_loss: 0.0000e+00 - total_loss: 64871.2997 <keras.callbacks.History at 0x7fb74c479190>
می توانید از گزارش آموزشی ببینید که TFRS از هر دو GPU مجازی استفاده می کند.
در نهایت، میتوانیم مدل خود را در مجموعه آزمایشی ارزیابی کنیم:
model.evaluate(cached_test, return_dict=True)
2021-10-14 11:17:05.371963: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:461] The `assert_cardinality` transformation is currently not handled by the auto-shard rewrite and will be removed. 5/5 [==============================] - 4s 193ms/step - factorized_top_k/top_1_categorical_accuracy: 5.0000e-05 - factorized_top_k/top_5_categorical_accuracy: 0.0013 - factorized_top_k/top_10_categorical_accuracy: 0.0043 - factorized_top_k/top_50_categorical_accuracy: 0.0639 - factorized_top_k/top_100_categorical_accuracy: 0.1531 - loss: 32404.8092 - regularization_loss: 0.0000e+00 - total_loss: 32404.8092 {'factorized_top_k/top_1_categorical_accuracy': 4.999999873689376e-05, 'factorized_top_k/top_5_categorical_accuracy': 0.0013000000035390258, 'factorized_top_k/top_10_categorical_accuracy': 0.00430000014603138, 'factorized_top_k/top_50_categorical_accuracy': 0.06385000050067902, 'factorized_top_k/top_100_categorical_accuracy': 0.1530500054359436, 'loss': 29363.98046875, 'regularization_loss': 0, 'total_loss': 29363.98046875}
این بازیابی با آموزش استراتژی توزیع به پایان می رسد.