Save and load Keras models

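Introduction

A Keras model consists of multiple components:

  • An architecture, or configuration, which specifies what layers the model contains and how they are connected.
  • A set of weight values (the "state of the model").
  • An optimizer (defined by compiling the model).
  • A set of losses and metrics (defined by compiling the model or by calling add_loss() or add_metric()).

The Keras API makes it possible to save all of these pieces to disk at once, or to save only some of them selectively:

  • Save everything into a single archive in the TensorFlow SavedModel format (or in the older Keras H5 format). This is the standard practice.
  • Save only the architecture/configuration, typically as a JSON file.
  • Save only the weight values. This is generally used when training the model.

Let's take a look at each of these options: when would you use one or the other, and how do they work?

The short answer to saving and loading

If you only have 10 seconds to read this guide, here's what you need to know.

Saving a Keras model: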

 model = ...  # Get model (Sequential, Functional Model, or Model subclass)
model.save('path/to/location')
 

Loading the model back:

 from tensorflow import keras
model = keras.models.load_model('path/to/location')
 

Now, let's look at the details.

Setup

 import numpy as np
import tensorflow as tf
from tensorflow import keras
 

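Whole-model saving and loading

You can save an entire model to a single artifact. It will include:

  • The model's architecture/config
  • The model's weight values (which were learned during training)
  • The model's compilation information (if compile() was called)
  • The optimizer and its state, if any (this enables you to restart training where you left off)

APIs

There are two formats you can use to save an entire model to disk: the TensorFlow SavedModel format and the older Keras H5 format. The recommended format is SavedModel. It is the default when you use model.save().

You can switch to the H5 format by:

  • Passing save_format='h5' to save().
  • Passing a filename that ends in .h5 or .keras to save().

SavedModel format

Example: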

 def get_model():
    # Create a simple model.
    inputs = keras.Input(shape=(32,))
    outputs = keras.layers.Dense(1)(inputs)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


model = get_model()

# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

# Calling `save('my_model')` creates a SavedModel folder `my_model`.
model.save("my_model")

# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_model")

# Let's check:
np.testing.assert_allclose(
    model.predict(test_input), reconstructed_model.predict(test_input)
)

# The reconstructed model is already compiled and has retained the optimizer
# state, so training can resume:
reconstructed_model.fit(test_input, test_target)
 
4/4 [==============================] - 0s 1ms/step - loss: 1.1917
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
INFO:tensorflow:Assets written to: my_model/assets
4/4 [==============================] - 0s 1ms/step - loss: 1.0581

<tensorflow.python.keras.callbacks.History at 0x7f13700096d8>

What the SavedModel contains

Calling model.save('my_model') creates a folder named my_model, containing the following:

ls my_model
assets  saved_model.pb  variables

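The model architecture and training configuration (including the optimizer, losses, and metrics) are stored in saved_model.pb. The weights are saved in the variables/ directory.

For detailed information on the SavedModel format, see the SavedModel guide (The SavedModel format on disk).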
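
As a quick sanity check of the folder layout described above, the sketch below tests whether a directory has the entries shown in the `ls` output. The helper name `looks_like_saved_model` is hypothetical and uses only the standard library. It checks names on disk, not whether TensorFlow can actually load the contents.

```python
import tempfile
from pathlib import Path


def looks_like_saved_model(directory):
    """Return True if `directory` has the entries a SavedModel folder is described as containing."""
    root = Path(directory)
    return (root / "saved_model.pb").is_file() and (root / "variables").is_dir()


# Demonstrate on a throwaway folder that only mimics the layout (no real model inside).
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp) / "my_model"
    (fake / "variables").mkdir(parents=True)
    (fake / "assets").mkdir()
    (fake / "saved_model.pb").touch()
    layout_ok = looks_like_saved_model(fake)
    # The parent folder has neither saved_model.pb nor variables/.
    empty_ok = looks_like_saved_model(tmp)

print(layout_ok, empty_ok)
```

A check like this can be useful before calling a loader in a script, so that a wrong path is reported early.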

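How SavedModel handles custom objects

When saving the model and its layers, the SavedModel format stores the class name, call function, losses, and weights (and the config, if implemented). The call function defines the computation graph of the model/layer.

In the absence of the model/layer config, the call function is used to create a model that behaves like the original model and can be trained, evaluated, and used for inference.

Nevertheless, it is always good practice to define the get_config and from_config methods when writing a custom model or layer class. This lets you easily update the computation later if needed. See the section on custom objects for more information.

Below is an example of what happens when loading custom layers from the SavedModel format without overriding the config methods.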

 class CustomModel(keras.Model):
    def __init__(self, hidden_units):
        super(CustomModel, self).__init__()
        self.dense_layers = [keras.layers.Dense(u) for u in hidden_units]

    def call(self, inputs):
        x = inputs
        for layer in self.dense_layers:
            x = layer(x)
        return x


model = CustomModel([16, 16, 10])
# Build the model by calling it
input_arr = tf.random.uniform((1, 5))
outputs = model(input_arr)
model.save("my_model")

# Delete the custom-defined model class to ensure that the loader does not have
# access to it.
del CustomModel

loaded = keras.models.load_model("my_model")
np.testing.assert_allclose(loaded(input_arr), outputs)

print("Original model:", model)
print("Loaded model:", loaded)
 
INFO:tensorflow:Assets written to: my_model/assets
WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
Original model: <__main__.CustomModel object at 0x7f1370081550>
Loaded model: <tensorflow.python.keras.saving.saved_model.load.CustomModel object at 0x7f1328722e48>

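As seen in the example above, the loader dynamically creates a new model class that acts like the original model.

Keras H5 format

Keras also supports saving a single HDF5 file containing the model's architecture, weight values, and compile() information. It is a lightweight alternative to SavedModel.

Example: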

 model = get_model()

# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

# Calling `save('my_h5_model.h5')` creates an H5 file `my_h5_model.h5`.
model.save("my_h5_model.h5")

# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_h5_model.h5")

# Let's check:
np.testing.assert_allclose(
    model.predict(test_input), reconstructed_model.predict(test_input)
)

# The reconstructed model is already compiled and has retained the optimizer
# state, so training can resume:
reconstructed_model.fit(test_input, test_target)
 
4/4 [==============================] - 0s 1ms/step - loss: 4.1064
4/4 [==============================] - 0s 1ms/step - loss: 3.8469

<tensorflow.python.keras.callbacks.History at 0x7f14171934a8>

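Limitations

Compared to the SavedModel format, there are two things that don't get included in the H5 file:

  • External losses and metrics added via model.add_loss() and model.add_metric() are not saved (unlike SavedModel). If you have such losses and metrics on your model and you want to resume training, you need to add them back yourself after loading the model. Note that this does not apply to losses/metrics created inside layers via self.add_loss() and self.add_metric(). As long as the layer gets loaded, these losses and metrics are kept, since they are part of the call method of the layer.
  • The computation graph of custom objects, such as custom layers, is not included in the saved file. At loading time, Keras will need access to the Python classes/functions of these objects in order to reconstruct the model. See Custom objects.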

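Saving the architecture

The model's configuration (or architecture) specifies what layers the model contains and how these layers are connected*. If you have the configuration of a model, then the model can be created with a freshly initialized state for the weights and no compilation information.

*Note that this only applies to models defined using the Functional or Sequential APIs, not subclassed models.

Configuration of a Sequential model or Functional API model

These types of models are explicit graphs of layers: their configuration is always available in a structured form.

APIs

get_config() and from_config()

Calling config = model.get_config() returns a Python dict containing the configuration of the model. The same model can then be reconstructed via Sequential.from_config(config) (for a Sequential model) or Model.from_config(config) (for a Functional API model).

The same workflow also works for any serializable layer.

Layer example: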

 layer = keras.layers.Dense(3, activation="relu")
layer_config = layer.get_config()
new_layer = keras.layers.Dense.from_config(layer_config)
 

Sequential model example:

 model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
config = model.get_config()
new_model = keras.Sequential.from_config(config)
 

Functional model example:

 inputs = keras.Input((32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)
config = model.get_config()
new_model = keras.Model.from_config(config)
 

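to_json() and tf.keras.models.model_from_json()

This is similar to get_config / from_config, except that it turns the model into a JSON string, which can then be loaded without the original model class. It is also specific to models; it isn't meant for layers.

Example: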

 model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
json_config = model.to_json()
new_model = keras.models.model_from_json(json_config)
 

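Custom objects

Models and layers

The architecture of subclassed models and layers is defined in the methods __init__ and call. They are considered Python bytecode, which cannot be serialized into a JSON-compatible config. You could try serializing the bytecode (e.g. via pickle), but it is completely unsafe and means your model cannot be loaded on a different system.

In order to save/load a model with custom-defined layers, or a subclassed model, you should override the get_config and, optionally, from_config methods. Additionally, you should register the custom object so that Keras is aware of it.

Custom functions

Custom-defined functions (e.g. an activation, loss, or initialization function) do not need a get_config method. The function name is sufficient for loading as long as it is registered as a custom object.

Loading the TensorFlow graph only

It's possible to load the TensorFlow graph generated by Keras. If you do so, you won't need to provide any custom_objects. You can do so like this: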

 model.save("my_model")
tensorflow_graph = tf.saved_model.load("my_model")
x = np.random.uniform(size=(4, 32)).astype(np.float32)
predicted = tensorflow_graph(x).numpy()
 
INFO:tensorflow:Assets written to: my_model/assets

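Note that this method has several drawbacks:

  • For traceability reasons, you should always have access to the custom objects that were used. You wouldn't want to put into production a model that you cannot re-create.
  • The object returned by tf.saved_model.load isn't a Keras model, so it's not as easy to use. For example, you won't have access to .predict() or .fit().

Even though its use is discouraged, it can help you if you're in a tight spot, for example, if you lost the code of your custom objects or have issues loading the model with tf.keras.models.load_model().

You can find out more on the page about tf.saved_model.load.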

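Defining the config methods

Specifications:

  • get_config should return a JSON-serializable dictionary in order to be compatible with the Keras architecture- and model-saving APIs.
  • from_config(config) (a classmethod) should return a new layer or model object that is created from the config. The default implementation returns cls(**config).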
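
The two requirements above can be sketched without Keras. The `Scale` class below is a hypothetical stand-in for a custom layer, not a Keras class; it only illustrates the contract: `get_config` returns plain Python types that survive `json.dumps`, and `from_config` rebuilds an equivalent object via `cls(**config)`.

```python
import json


class Scale:
    """Hypothetical stand-in for a custom layer; not a Keras class."""

    def __init__(self, factor=1.0, name="scale"):
        self.factor = factor
        self.name = name

    def get_config(self):
        # Only plain Python types, so the dict survives json.dumps.
        return {"factor": self.factor, "name": self.name}

    @classmethod
    def from_config(cls, config):
        # Same behavior the text describes as the default: cls(**config).
        return cls(**config)


original = Scale(factor=2.5, name="double_and_a_half")

# Round-trip through a JSON string, as an architecture-saving API would.
as_json = json.dumps(original.get_config())
restored = Scale.from_config(json.loads(as_json))

print(restored.get_config())
```

Keeping the keys of `get_config` identical to the constructor's argument names is what makes the default `cls(**config)` sufficient.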

Example:

 class CustomLayer(keras.layers.Layer):
    def __init__(self, a):
        # Initialize the base Layer before assigning attributes.
        super(CustomLayer, self).__init__()
        self.var = tf.Variable(a, name="var_a")

    def call(self, inputs, training=False):
        if training:
            return inputs * self.var
        else:
            return inputs

    def get_config(self):
        return {"a": self.var.numpy()}

    # There's actually no need to define `from_config` here, since returning
    # `cls(**config)` is the default behavior.
    @classmethod
    def from_config(cls, config):
        return cls(**config)


layer = CustomLayer(5)
layer.var.assign(2)

serialized_layer = keras.layers.serialize(layer)
new_layer = keras.layers.deserialize(
    serialized_layer, custom_objects={"CustomLayer": CustomLayer}
)
 

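Registering the custom object

Keras keeps a note of which class generated the config. From the example above, tf.keras.layers.serialize generates a serialized form of the custom layer: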

 {'class_name': 'CustomLayer', 'config': {'a': 2} }
 

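Keras keeps a master list of all built-in layer, model, optimizer, and metric classes, which is used to find the correct class on which to call from_config. If the class can't be found, an error is raised (ValueError: Unknown layer). There are a few ways to register custom classes to this list:

  1. Setting the custom_objects argument in the loading function (see the example in the section above, "Defining the config methods").
  2. tf.keras.utils.custom_object_scope or tf.keras.utils.CustomObjectScope
  3. tf.keras.utils.register_keras_serializable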
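
The lookup described above can be modeled in a few lines of plain Python. This is a sketch of the idea, not Keras's implementation: `BUILT_IN`, `deserialize`, and the two stand-in classes are hypothetical, and letting custom objects shadow built-in names is an assumption of the sketch.

```python
class Dense:
    """Hypothetical stand-in for a built-in layer class."""

    def __init__(self, units):
        self.units = units

    @classmethod
    def from_config(cls, config):
        return cls(**config)


class CustomLayer(Dense):
    """Hypothetical stand-in for a user-defined layer class."""


BUILT_IN = {"Dense": Dense}  # hypothetical "master list" of known classes


def deserialize(serialized, custom_objects=None):
    """Resolve class_name to a class, then call its from_config."""
    lookup = dict(BUILT_IN)
    lookup.update(custom_objects or {})  # assumption: custom objects take priority
    name = serialized["class_name"]
    if name not in lookup:
        raise ValueError("Unknown layer: " + name)
    return lookup[name].from_config(serialized["config"])


serialized = {"class_name": "CustomLayer", "config": {"units": 2}}

# Without registration the class name cannot be resolved.
try:
    deserialize(serialized)
    error_message = None
except ValueError as err:
    error_message = str(err)

# Passing the class explicitly makes the lookup succeed.
layer = deserialize(serialized, custom_objects={"CustomLayer": CustomLayer})
print(error_message, type(layer).__name__, layer.units)
```

All three registration options in the list serve the same purpose: they add an entry to the name-to-class mapping before the lookup happens.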

Custom layer and function example

 class CustomLayer(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(CustomLayer, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="random_normal", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        config = super(CustomLayer, self).get_config()
        config.update({"units": self.units})
        return config


def custom_activation(x):
    return tf.nn.tanh(x) ** 2


# Make a model with the CustomLayer and custom_activation
inputs = keras.Input((32,))
x = CustomLayer(32)(inputs)
outputs = keras.layers.Activation(custom_activation)(x)
model = keras.Model(inputs, outputs)

# Retrieve the config
config = model.get_config()

# At loading time, register the custom objects with a `custom_object_scope`:
custom_objects = {"CustomLayer": CustomLayer, "custom_activation": custom_activation}
with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.Model.from_config(config)
 

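In-memory model cloning

You can also do in-memory cloning of a model via tf.keras.models.clone_model(). This is equivalent to getting the config and then recreating the model from its config (so it does not preserve compilation information or layer weight values).

Example: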

 with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.models.clone_model(model)
 

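Saving and loading only the model's weight values

You can choose to save and load only a model's weights. This can be useful if:

  • You only need the model for inference: in this case you won't need to restart training, so you don't need the compilation information or optimizer state.
  • You are doing transfer learning: in this case you will be training a new model that reuses the state of a prior model, so you don't need the compilation information of the prior model.

APIs for in-memory weight transfer

Weights can be copied between different objects by using get_weights and set_weights.

Examples follow.

Transferring weights from one layer to another, in memory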

 def create_layer():
    layer = keras.layers.Dense(64, activation="relu", name="dense_2")
    layer.build((None, 784))
    return layer


layer_1 = create_layer()
layer_2 = create_layer()

# Copy weights from layer 1 to layer 2
layer_2.set_weights(layer_1.get_weights())
 

Transferring weights from one model to another model with a compatible architecture, in memory

 # Create a simple functional model
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")

# Define a subclassed model with the same architecture
class SubclassedModel(keras.Model):
    def __init__(self, output_dim, name=None):
        super(SubclassedModel, self).__init__(name=name)
        self.output_dim = output_dim
        self.dense_1 = keras.layers.Dense(64, activation="relu", name="dense_1")
        self.dense_2 = keras.layers.Dense(64, activation="relu", name="dense_2")
        self.dense_3 = keras.layers.Dense(output_dim, name="predictions")

    def call(self, inputs):
        x = self.dense_1(inputs)
        x = self.dense_2(x)
        x = self.dense_3(x)
        return x

    def get_config(self):
        return {"output_dim": self.output_dim, "name": self.name}


subclassed_model = SubclassedModel(10)
# Call the subclassed model once to create the weights.
subclassed_model(tf.ones((1, 784)))

# Copy weights from functional_model to subclassed_model.
subclassed_model.set_weights(functional_model.get_weights())

assert len(functional_model.weights) == len(subclassed_model.weights)
for a, b in zip(functional_model.weights, subclassed_model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())
 

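The case of stateless layers

Because stateless layers do not change the order or number of weights, models can have compatible architectures even if there are extra or missing stateless layers.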

 inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")

inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)

# Add a dropout layer, which does not contain any weights.
x = keras.layers.Dropout(0.5)(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model_with_dropout = keras.Model(
    inputs=inputs, outputs=outputs, name="3_layer_mlp"
)

functional_model_with_dropout.set_weights(functional_model.get_weights())
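
Why the Dropout layer does not get in the way can be seen by looking only at weight shapes. The sketch below is plain Python and assumes that an in-memory transfer matches weights purely by position and shape; `shapes_compatible` is a hypothetical helper, and the shapes are those of the Dense layers used in the models above.

```python
def shapes_compatible(source_shapes, target_shapes):
    """True if both weight lists have the same length and matching shapes, in order."""
    return len(source_shapes) == len(target_shapes) and all(
        s == t for s, t in zip(source_shapes, target_shapes)
    )


# (kernel, bias) shapes for dense_1, dense_2, and predictions.
dense_stack = [(784, 64), (64,), (64, 64), (64,), (64, 10), (10,)]

# Dropout owns no weights, so it contributes nothing to the list.
dropout = []
with_dropout = dense_stack[:4] + dropout + dense_stack[4:]

# A model whose output layer has 5 units would not line up.
different_head = dense_stack[:4] + [(64, 5), (5,)]

same_with_dropout = shapes_compatible(dense_stack, with_dropout)
same_with_new_head = shapes_compatible(dense_stack, different_head)
print(same_with_dropout, same_with_new_head)
```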
 

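APIs for saving weights to disk and loading them back

Weights can be saved to disk by calling model.save_weights in the following formats:

  • TensorFlow Checkpoint
  • HDF5

The default format for model.save_weights is TensorFlow checkpoint. There are two ways to specify the save format:

  1. save_format argument: set the value to save_format="tf" or save_format="h5".
  2. path argument: if the path ends with .h5 or .hdf5, then the HDF5 format is used. Other suffixes result in a TensorFlow checkpoint unless save_format is set.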
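
The selection rule in the list above can be written out as a small decision function. This is a sketch of the rule as stated, not the Keras source; `resolve_weights_format` is a hypothetical name. The list does not say what happens when an HDF5-looking suffix is combined with save_format="tf", so the sketch rejects that combination rather than guessing, which is an assumption.

```python
def resolve_weights_format(path, save_format=None):
    """Return "h5" or "tf" following the rule described in the list above."""
    looks_like_h5 = path.endswith((".h5", ".hdf5"))
    if save_format is None:
        # No explicit format: the suffix decides, with checkpoint as the default.
        return "h5" if looks_like_h5 else "tf"
    if save_format not in ("tf", "h5"):
        raise ValueError("save_format must be 'tf' or 'h5'")
    if save_format == "tf" and looks_like_h5:
        # Assumption of this sketch: refuse the contradictory combination.
        raise ValueError("path looks like HDF5 but save_format='tf' was given")
    return save_format


results = {
    "ckpt": resolve_weights_format("ckpt"),
    "weights.h5": resolve_weights_format("weights.h5"),
    "weights.hdf5": resolve_weights_format("weights.hdf5"),
    "weights.bin + h5": resolve_weights_format("weights.bin", save_format="h5"),
}
print(results)
```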

There is also an option of retrieving weights as in-memory NumPy arrays. Each API has its pros and cons, which are detailed below.

TF Checkpoint format

Example:

 # Runnable example
sequential_model = keras.Sequential(
    [
        keras.Input(shape=(784,), name="digits"),
        keras.layers.Dense(64, activation="relu", name="dense_1"),
        keras.layers.Dense(64, activation="relu", name="dense_2"),
        keras.layers.Dense(10, name="predictions"),
    ]
)
sequential_model.save_weights("ckpt")
load_status = sequential_model.load_weights("ckpt")

# `assert_consumed` can be used as validation that all variable values have been
# restored from the checkpoint. See `tf.train.Checkpoint.restore` for other
# methods in the Status object.
load_status.assert_consumed()
 
<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f1416793ba8>

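Format details

The TensorFlow Checkpoint format saves and restores the weights using object attribute names. For instance, consider the tf.keras.layers.Dense layer. The layer contains two weights: dense.kernel and dense.bias. When the layer is saved to the tf format, the resulting checkpoint contains the keys "kernel" and "bias" and their corresponding weight values. For more information, see "Loading mechanics" in the TF Checkpoint guide.

Note that the attribute/graph edge is named after the name used in the parent object, not the name of the variable. Consider the CustomLayer in the example below. The variable CustomLayer.var is saved with "var" as part of the key, not "var_a".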

 class CustomLayer(keras.layers.Layer):
    def __init__(self, a):
        # Initialize the base Layer before assigning attributes.
        super(CustomLayer, self).__init__()
        self.var = tf.Variable(a, name="var_a")


layer = CustomLayer(5)
layer_ckpt = tf.train.Checkpoint(layer=layer).save("custom_layer")

ckpt_reader = tf.train.load_checkpoint(layer_ckpt)

ckpt_reader.get_variable_to_dtype_map()
 
{'save_counter/.ATTRIBUTES/VARIABLE_VALUE': tf.int64,
 '_CHECKPOINTABLE_OBJECT_GRAPH': tf.string,
 'layer/var/.ATTRIBUTES/VARIABLE_VALUE': tf.int32}
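
The key `layer/var/...` in the output above comes from the attribute path, not from the variable's own name. The plain-Python sketch below models that naming idea only; `Variable`, `CustomLayer`, and `checkpoint_keys` are hypothetical stand-ins, and the real checkpoint format tracks more than this walk does.

```python
class Variable:
    """Hypothetical stand-in for tf.Variable: a value plus its own name."""

    def __init__(self, value, name):
        self.value = value
        self.name = name


class CustomLayer:
    """Hypothetical stand-in for the layer above; stores a variable as `self.var`."""

    def __init__(self, a):
        self.var = Variable(a, name="var_a")


def checkpoint_keys(prefix, obj):
    """Walk attributes and build one key per Variable from the attribute path."""
    if isinstance(obj, Variable):
        return [prefix + "/.ATTRIBUTES/VARIABLE_VALUE"]
    if not hasattr(obj, "__dict__"):
        return []  # plain values such as ints have nothing to walk
    keys = []
    for attr, child in sorted(vars(obj).items()):
        keys.extend(checkpoint_keys(prefix + "/" + attr, child))
    return keys


# The keyword used for the root ("layer") and the attribute name ("var")
# form the key; the variable's own name "var_a" never appears.
keys = checkpoint_keys("layer", CustomLayer(5))
print(keys)
```

A practical consequence is that renaming an attribute in the class changes the key, while changing the `name=` passed to the variable does not.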

Transfer learning example

Essentially, as long as two models have the same architecture, they are able to share the same checkpoint.

Example:

 inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(10, name="predictions")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")

# Extract a portion of the functional model defined in the Setup section.
# The following lines produce a new model that excludes the final output
# layer of the functional model.
pretrained = keras.Model(
    functional_model.inputs, functional_model.layers[-1].input, name="pretrained_model"
)
# Randomly assign "trained" weights.
for w in pretrained.weights:
    w.assign(tf.random.normal(w.shape))
pretrained.save_weights("pretrained_ckpt")
pretrained.summary()

# Assume this is a separate program where only 'pretrained_ckpt' exists.
# Create a new functional model with a different output dimension.
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = keras.layers.Dense(5, name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="new_model")

# Load the weights from pretrained_ckpt into model.
model.load_weights("pretrained_ckpt")

# Check that all of the pretrained weights have been loaded.
for a, b in zip(pretrained.weights, model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())

print("\n", "-" * 50)
model.summary()

# Example 2: Sequential model
# Recreate the pretrained model, and load the saved weights.
inputs = keras.Input(shape=(784,), name="digits")
x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
pretrained_model = keras.Model(inputs=inputs, outputs=x, name="pretrained")

# Sequential example:
model = keras.Sequential([pretrained_model, keras.layers.Dense(5, name="predictions")])
model.summary()

pretrained_model.load_weights("pretrained_ckpt")

# Warning! Calling `model.load_weights('pretrained_ckpt')` won't throw an error,
# but will *not* work as expected. If you inspect the weights, you'll see that
# none of the weights will have loaded. `pretrained_model.load_weights()` is the
# correct method to call.
 
Model: "pretrained_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
digits (InputLayer)          [(None, 784)]             0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                50240     
_________________________________________________________________
dense_2 (Dense)              (None, 64)                4160      
=================================================================
Total params: 54,400
Trainable params: 54,400
Non-trainable params: 0
_________________________________________________________________

 --------------------------------------------------
Model: "new_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
digits (InputLayer)          [(None, 784)]             0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                50240     
_________________________________________________________________
dense_2 (Dense)              (None, 64)                4160      
_________________________________________________________________
predictions (Dense)          (None, 5)                 325       
=================================================================
Total params: 54,725
Trainable params: 54,725
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
pretrained (Model)           (None, 64)                54400     
_________________________________________________________________
predictions (Dense)          (None, 5)                 325       
=================================================================
Total params: 54,725
Trainable params: 54,725
Non-trainable params: 0
_________________________________________________________________

<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f1416704278>

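It is generally recommended to stick to the same API for building models. If you switch between Sequential and Functional, or between Functional and subclassed, always rebuild the pretrained model and load the pretrained weights into that model.

The next question is: how can weights be saved and loaded into different models if the model architectures are quite different? The solution is to use tf.train.Checkpoint to save and restore the exact layers/variables.

Example: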

 # Create a subclassed model that essentially uses functional_model's first
# and last layers.
# First, save the weights of functional_model's first and last dense layers.
first_dense = functional_model.layers[1]
last_dense = functional_model.layers[-1]
ckpt_path = tf.train.Checkpoint(
    dense=first_dense, kernel=last_dense.kernel, bias=last_dense.bias
).save("ckpt")

# Define the subclassed model.
class ContrivedModel(keras.Model):
    def __init__(self):
        super(ContrivedModel, self).__init__()
        self.first_dense = keras.layers.Dense(64)
        self.kernel = self.add_variable("kernel", shape=(64, 10))
        self.bias = self.add_variable("bias", shape=(10,))

    def call(self, inputs):
        x = self.first_dense(inputs)
        return tf.matmul(x, self.kernel) + self.bias


model = ContrivedModel()
# Call model on inputs to create the variables of the dense layer.
_ = model(tf.ones((1, 784)))

# Create a Checkpoint with the same structure as before, and load the weights.
tf.train.Checkpoint(
    dense=model.first_dense, kernel=model.kernel, bias=model.bias
).restore(ckpt_path).assert_consumed()
 
WARNING:tensorflow:From <ipython-input-21-eec1d28bc826>:15: Layer.add_variable (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.add_weight` method instead.

<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f1416713358>

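HDF5 format

The HDF5 format contains weights grouped by layer names. The weights are lists ordered by concatenating the list of trainable weights to the list of non-trainable weights (same as layer.weights). Thus, a model can use an HDF5 checkpoint if it has the same layers and trainable statuses as saved in the checkpoint.

Example: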

 # Runnable example
sequential_model = keras.Sequential(
    [
        keras.Input(shape=(784,), name="digits"),
        keras.layers.Dense(64, activation="relu", name="dense_1"),
        keras.layers.Dense(64, activation="relu", name="dense_2"),
        keras.layers.Dense(10, name="predictions"),
    ]
)
sequential_model.save_weights("weights.h5")
sequential_model.load_weights("weights.h5")
 

Note that changing layer.trainable may result in a different layer.weights ordering when the model contains nested layers.

 class NestedDenseLayer(keras.layers.Layer):
    def __init__(self, units, name=None):
        super(NestedDenseLayer, self).__init__(name=name)
        self.dense_1 = keras.layers.Dense(units, name="dense_1")
        self.dense_2 = keras.layers.Dense(units, name="dense_2")

    def call(self, inputs):
        return self.dense_2(self.dense_1(inputs))


nested_model = keras.Sequential([keras.Input((784,)), NestedDenseLayer(10, "nested")])
variable_names = [v.name for v in nested_model.weights]
print("variables: {}".format(variable_names))

print("\nChanging trainable status of one of the nested layers...")
nested_model.get_layer("nested").dense_1.trainable = False

variable_names_2 = [v.name for v in nested_model.weights]
print("\nvariables: {}".format(variable_names_2))
print("variable ordering changed:", variable_names != variable_names_2)
 
variables: ['nested/dense_1/kernel:0', 'nested/dense_1/bias:0', 'nested/dense_2/kernel:0', 'nested/dense_2/bias:0']

Changing trainable status of one of the nested layers...

variables: ['nested/dense_2/kernel:0', 'nested/dense_2/bias:0', 'nested/dense_1/kernel:0', 'nested/dense_1/bias:0']
variable ordering changed: True
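
The reordering printed above follows from weights being the trainable list concatenated with the non-trainable list. The sketch below reproduces it in plain Python. The `Layer` class is a hypothetical stand-in that models only this concatenation rule, and the weight names omit the `:0` suffix.

```python
class Layer:
    """Hypothetical stand-in that models only the weight-ordering rule."""

    def __init__(self, name, own_weights=(), sublayers=()):
        self.name = name
        self.trainable = True
        self.own_weights = list(own_weights)
        self.sublayers = list(sublayers)

    def _all_weights(self):
        out = list(self.own_weights)
        for sub in self.sublayers:
            out.extend(sub._all_weights())
        return out

    @property
    def trainable_weights(self):
        if not self.trainable:
            return []
        out = list(self.own_weights)
        for sub in self.sublayers:
            out.extend(sub.trainable_weights)
        return out

    @property
    def non_trainable_weights(self):
        if not self.trainable:
            return self._all_weights()
        out = []
        for sub in self.sublayers:
            out.extend(sub.non_trainable_weights)
        return out

    @property
    def weights(self):
        # Trainable weights first, then non-trainable ones.
        return self.trainable_weights + self.non_trainable_weights


dense_1 = Layer("dense_1", ["nested/dense_1/kernel", "nested/dense_1/bias"])
dense_2 = Layer("dense_2", ["nested/dense_2/kernel", "nested/dense_2/bias"])
nested = Layer("nested", sublayers=[dense_1, dense_2])

before = nested.weights
dense_1.trainable = False  # freeze one nested layer
after = nested.weights

print(before)
print(after)
```

Because HDF5 loading relies on this order, freezing a nested layer between saving and loading can pair weights with the wrong variables.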

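Transfer learning example

When loading pretrained weights from HDF5, it is recommended to load the weights into the original checkpointed model, and then extract the desired weights/layers into a new model.

Example: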

 def create_functional_model():
    inputs = keras.Input(shape=(784,), name="digits")
    x = keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
    x = keras.layers.Dense(64, activation="relu", name="dense_2")(x)
    outputs = keras.layers.Dense(10, name="predictions")(x)
    return keras.Model(inputs=inputs, outputs=outputs, name="3_layer_mlp")


functional_model = create_functional_model()
functional_model.save_weights("pretrained_weights.h5")

# In a separate program:
pretrained_model = create_functional_model()
pretrained_model.load_weights("pretrained_weights.h5")

# Create a new model by extracting layers from the original model:
extracted_layers = pretrained_model.layers[:-1]
extracted_layers.append(keras.layers.Dense(5, name="dense_3"))
model = keras.Sequential(extracted_layers)
model.summary()
 
Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 64)                50240     
_________________________________________________________________
dense_2 (Dense)              (None, 64)                4160      
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 325       
=================================================================
Total params: 54,725
Trainable params: 54,725
Non-trainable params: 0
_________________________________________________________________