
The Sequential model


Setup

 import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
 

When to use a Sequential model

A Sequential model is appropriate for a plain stack of layers where each layer has exactly one input tensor and one output tensor.

Schematically, the following Sequential model:

 # Define Sequential model with 3 layers
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)
# Call model on a test input
x = tf.ones((3, 3))
y = model(x)
 

is equivalent to this function:

 # Create 3 layers
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")

# Call layers on a test input
x = tf.ones((3, 3))
y = layer3(layer2(layer1(x)))
 

A Sequential model is not appropriate when:

  • Your model has multiple inputs or multiple outputs
  • Any of your layers has multiple inputs or multiple outputs
  • You need to do layer sharing
  • You want non-linear topology (e.g. a residual connection, a multi-branch model)
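To make the last case concrete, here is a minimal sketch (a hypothetical example, not from this guide) of a tiny residual block built with the Functional API. The `add` operation takes two input tensors, so this topology cannot be expressed as a plain stack of layers:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# A minimal non-linear topology: a residual connection.
inputs = keras.Input(shape=(32,))
x = layers.Dense(32, activation="relu")(inputs)
x = layers.Dense(32)(x)
# The add layer has two inputs -- impossible in a Sequential model.
outputs = layers.add([inputs, x])
model = keras.Model(inputs=inputs, outputs=outputs)
```

In cases like this, you would use the Functional API instead.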

Creating a Sequential model

You can create a Sequential model by passing a list of layers to the Sequential constructor:

 model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)
 

Its layers are accessible via the layers attribute:

 model.layers
 
[<tensorflow.python.keras.layers.core.Dense at 0x7f37ffe66668>,
 <tensorflow.python.keras.layers.core.Dense at 0x7f37f553fc50>,
 <tensorflow.python.keras.layers.core.Dense at 0x7f37680de2b0>]

You can also create a Sequential model incrementally via the add() method:

 model = keras.Sequential()
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(4))
 

Note that there's also a corresponding pop() method to remove layers: a Sequential model behaves very much like a list of layers.

 model.pop()
print(len(model.layers))  # 2
 
2

Also note that the Sequential constructor accepts a name argument, just like any layer or model in Keras. This is useful to annotate TensorBoard graphs with semantically meaningful names.

 model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))
 

Specifying the input shape in advance

Generally, all layers in Keras need to know the shape of their inputs in order to be able to create their weights. So when you create a layer like this, initially, it has no weights:

 layer = layers.Dense(3)
layer.weights  # Empty
 
[]

It creates its weights the first time it is called on an input, since the shape of the weights depends on the shape of the inputs:

 # Call layer on a test input
x = tf.ones((1, 4))
y = layer(x)
layer.weights  # Now it has weights, of shape (4, 3) and (3,)
 
[<tf.Variable 'dense_6/kernel:0' shape=(4, 3) dtype=float32, numpy=
 array([[-0.8131663 , -0.49988765, -0.02397203],
        [-0.3190418 ,  0.01101786,  0.85226357],
        [-0.602435  , -0.10381919,  0.63280225],
        [-0.3388477 ,  0.11859643, -0.10677373]], dtype=float32)>,
 <tf.Variable 'dense_6/bias:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]

Naturally, this also applies to Sequential models. When you instantiate a Sequential model without an input shape, it isn't "built": it has no weights (and calling model.weights results in an error stating just this). The weights are created when the model first sees some input data:

 model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)  # No weights at this stage!

# At this point, you can't do this:
# model.weights

# You also can't do this:
# model.summary()

# Call the model on a test input
x = tf.ones((1, 4))
y = model(x)
print("Number of weights after calling the model:", len(model.weights))  # 6
 
Number of weights after calling the model: 6

Once a model is "built", you can call its summary() method to display its contents:

 model.summary()
 
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_7 (Dense)              multiple                  10        
_________________________________________________________________
dense_8 (Dense)              multiple                  9         
_________________________________________________________________
dense_9 (Dense)              multiple                  16        
=================================================================
Total params: 35
Trainable params: 35
Non-trainable params: 0
_________________________________________________________________

However, it can be very useful when building a Sequential model incrementally to be able to display the summary of the model so far, including the current output shape. In this case, you should start your model by passing an Input object to it, so that it knows its input shape from the start:

 model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))

model.summary()
 
Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_10 (Dense)             (None, 2)                 10        
=================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________

Note that the Input object is not displayed as part of model.layers, since it isn't a layer:

 model.layers
 
[<tensorflow.python.keras.layers.core.Dense at 0x7f37680deb00>]

A simple alternative is to just pass an input_shape argument to your first layer:

 model = keras.Sequential()
model.add(layers.Dense(2, activation="relu", input_shape=(4,)))

model.summary()
 
Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_11 (Dense)             (None, 2)                 10        
=================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________

Models built with a predefined input shape like this always have weights (even before seeing any data) and always have a defined output shape.

In general, it's a recommended best practice to always specify the input shape of a Sequential model in advance if you know what it is.
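As a quick sanity check (a hypothetical snippet, not part of the original guide), declaring the Input up front means the model is built at construction time, so its weights and summary are available before it has seen any data:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# With the input shape declared up front, the model is built
# immediately: weights exist before any data is seen.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Dense(2, activation="relu"),
])
print(len(model.weights))  # 2: the Dense layer's kernel and bias
model.summary()  # works right away, no call on data needed
```

Compare this with the earlier example, where both model.weights and model.summary() failed until the model was called on a test input.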

A common debugging workflow: add() + summary()

When building a new Sequential architecture, it's useful to incrementally stack layers with add() and frequently print model summaries. For instance, this enables you to monitor how a stack of Conv2D and MaxPooling2D layers is downsampling image feature maps:

 model = keras.Sequential()
model.add(keras.Input(shape=(250, 250, 3)))  # 250x250 RGB images
model.add(layers.Conv2D(32, 5, strides=2, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))

# Can you guess what the current output shape is at this point? Probably not.
# Let's just print it:
model.summary()

# The answer was: (40, 40, 32), so we can keep downsampling...

model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(2))

# And now?
model.summary()

# Now that we have 4x4 feature maps, time to apply global max pooling.
model.add(layers.GlobalMaxPooling2D())

# Finally, we add a classification layer.
model.add(layers.Dense(10))
 
Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 123, 123, 32)      2432      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 121, 121, 32)      9248      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 40, 40, 32)        0         
=================================================================
Total params: 11,680
Trainable params: 11,680
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 123, 123, 32)      2432      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 121, 121, 32)      9248      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 40, 40, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 38, 38, 32)        9248      
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 36, 36, 32)        9248      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 12, 12, 32)        0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 10, 10, 32)        9248      
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 8, 8, 32)          9248      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 32)          0         
=================================================================
Total params: 48,672
Trainable params: 48,672
Non-trainable params: 0
_________________________________________________________________

Very practical, right?

What to do once you have a model

Once your model architecture is ready, you will want to:

  • Train your model, evaluate it, and run inference
  • Save your model to disk and restore it
  • Speed up model training by leveraging multiple GPUs

Feature extraction with a Sequential model

Once a Sequential model has been built, it behaves like a Functional API model. This means that every layer has an input and an output attribute. These attributes can be used to do neat things, like quickly creating a model that extracts the outputs of all intermediate layers in a Sequential model:

 initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)

# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)
 

Here's a similar example that only extracts features from one layer:

 initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu", name="my_intermediate_layer"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=initial_model.get_layer(name="my_intermediate_layer").output,
)
# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)
 

Transfer learning with a Sequential model

Transfer learning consists of freezing the bottom layers in a model and only training the top layers. If you aren't familiar with it, make sure to read our guide to transfer learning.

Here are two common transfer learning blueprints involving Sequential models.

First, let's say that you have a Sequential model and you want to freeze all layers except the last one. In this case, you would simply iterate over model.layers and set layer.trainable = False on each layer, except the last one. Like this:

 model = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(10),
])

# Presumably you would want to first load pre-trained weights.
model.load_weights(...)

# Freeze all layers except the last one.
for layer in model.layers[:-1]:
  layer.trainable = False

# Recompile and train (this will only update the weights of the last layer).
model.compile(...)
model.fit(...)
 

Another common blueprint is to use a Sequential model to stack a pre-trained model and some freshly initialized classification layers. Like this:

 # Load a convolutional base with pre-trained weights
base_model = keras.applications.Xception(
    weights='imagenet',
    include_top=False,
    pooling='avg')

# Freeze the base model
base_model.trainable = False

# Use a Sequential model to add a trainable classifier on top
model = keras.Sequential([
    base_model,
    layers.Dense(1000),
])

# Compile & train
model.compile(...)
model.fit(...)
 

If you do transfer learning, you will probably find yourself frequently using these two patterns.

That's about all you need to know about Sequential models!

To learn more about building models in Keras, see:

  • Guide to the Functional API
  • Guide to making new layers & models via subclassing