ALBERT (https://arxiv.org/abs/1909.11942) text encoder network.
tfm.nlp.networks.AlbertEncoder(
vocab_size,
embedding_width=128,
hidden_size=768,
num_layers=12,
num_attention_heads=12,
max_sequence_length=512,
type_vocab_size=16,
intermediate_size=3072,
activation=tfm.utils.activations.gelu,
dropout_rate=0.1,
attention_dropout_rate=0.1,
initializer=tf.keras.initializers.TruncatedNormal(stddev=0.02),
dict_outputs=False,
**kwargs
)
This network implements the encoder described in the paper "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations" (https://arxiv.org/abs/1909.11942).
Compared with BERT (https://arxiv.org/abs/1810.04805), ALBERT refactorizes embedding parameters into two smaller matrices and shares parameters across layers.
The default values for this object are taken from the ALBERT-Base implementation described in the paper.
Methods
call
call(
inputs, training=None, mask=None
)
Calls the model on new inputs and returns the outputs as tensors.
In this case, `call()` just reapplies all ops in the graph to the new inputs (i.e., builds a new computational graph from the provided inputs).
Args | |
---|---|
`inputs` | Input tensor, or dict/list/tuple of input tensors. |
`training` | Boolean or boolean scalar tensor, indicating whether to run the network in training mode or inference mode. |
`mask` | A mask or list of masks. A mask can be either a boolean tensor or `None` (no mask). |

Returns | |
---|---|
A tensor if there is a single output, or a list of tensors if there is more than one output. |
get_embedding_table
get_embedding_table()