The RetinaNet model class.
tfm.vision.models.RetinaNetModel(
backbone: tf.keras.Model,
decoder: tf.keras.Model,
head: tf.keras.layers.Layer,
detection_generator: tf.keras.layers.Layer,
min_level: Optional[int] = None,
max_level: Optional[int] = None,
num_scales: Optional[int] = None,
aspect_ratios: Optional[List[float]] = None,
anchor_size: Optional[float] = None,
**kwargs
)
Args:
  backbone: tf.keras.Model, a backbone network.
  decoder: tf.keras.Model, a decoder network.
  head: RetinaNetHead, the RetinaNet head.
  detection_generator: the detection generator.
  min_level: Minimum level in output feature maps.
  max_level: Maximum level in output feature maps.
  num_scales: A number representing intermediate scales added on each
    level. For instance, num_scales=2 adds one additional intermediate
    anchor scale, yielding scales [2^0, 2^0.5] on each level.
  aspect_ratios: A list representing the aspect ratios of the anchors
    added on each level. Each number indicates the ratio of width to
    height. For instance, aspect_ratios=[1.0, 2.0, 0.5] adds three
    anchors on each scale level.
  anchor_size: A number representing the scale of the base anchor size
    relative to the feature stride 2^level.
  **kwargs: keyword arguments to be passed.
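For example, a model can be assembled from the companion TF Vision components. This is a minimal construction sketch: the component classes used below (tfm.vision.backbones.ResNet, tfm.vision.decoders.FPN, tfm.vision.heads.RetinaNetHead, tfm.vision.layers.MultilevelDetectionGenerator) and all hyperparameter values shown are illustrative assumptions, not the only valid configuration.

import tensorflow_models as tfm

min_level, max_level = 3, 7
num_scales = 3
aspect_ratios = [1.0, 2.0, 0.5]

backbone = tfm.vision.backbones.ResNet(model_id=50)
decoder = tfm.vision.decoders.FPN(
    input_specs=backbone.output_specs,
    min_level=min_level,
    max_level=max_level)
head = tfm.vision.heads.RetinaNetHead(
    min_level=min_level,
    max_level=max_level,
    num_classes=80,
    # One anchor per (scale, aspect ratio) pair at each location.
    num_anchors_per_location=num_scales * len(aspect_ratios))
detection_generator = tfm.vision.layers.MultilevelDetectionGenerator(
    max_num_detections=100)

model = tfm.vision.models.RetinaNetModel(
    backbone=backbone,
    decoder=decoder,
    head=head,
    detection_generator=detection_generator,
    min_level=min_level,
    max_level=max_level,
    num_scales=num_scales,
    aspect_ratios=aspect_ratios,
    anchor_size=4.0)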
Attributes:
  backbone
  checkpoint_items: Returns a dictionary of items to be additionally
    checkpointed.
  decoder
  detection_generator
  head
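As a hedged usage sketch, the checkpoint_items dictionary can be merged into a tf.train.Checkpoint so those items are tracked alongside the model itself; the checkpoint path below is a hypothetical placeholder, and this is one possible pattern rather than the prescribed one.

import tensorflow as tf

# Track the additionally checkpointed items together with the model.
ckpt = tf.train.Checkpoint(model=model, **model.checkpoint_items)
ckpt.save('/tmp/retinanet/ckpt')  # hypothetical path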
Methods
call
call(
images: Union[tf.Tensor, Sequence[tf.Tensor]],
image_shape: Optional[tf.Tensor] = None,
anchor_boxes: Optional[Mapping[str, tf.Tensor]] = None,
output_intermediate_features: bool = False,
training: bool = None
) -> Mapping[str, tf.Tensor]
Forward pass of the RetinaNet model.
Args:
  images: Tensor or a sequence of Tensor, the input batched images to
    the backbone network, whose shape(s) is [batch, height, width, 3].
    If it is a sequence of Tensor, the anchors are assumed to be
    generated based on the shape of the first image.
  image_shape: Tensor, the actual shape of the input images, whose
    shape is [batch, 2] where the last dimension is [height, width].
    Note that this is the actual image shape excluding padding; for
    example, images in the batch may be resized into different shapes
    before being padded to the fixed size.
  anchor_boxes: a dict of tensors which includes multilevel anchors.
    - key: str, the level of the multilevel predictions.
    - values: Tensor, the anchor coordinates of a particular feature
      level, whose shape is [height_l, width_l, num_anchors_per_location].
  output_intermediate_features: bool indicating whether to return the
    intermediate feature maps generated by the backbone and decoder.
  training: bool, indicating whether the model is in training mode.
Returns:
  scores: a dict of tensors which includes scores of the predictions.
    - key: str, the level of the multilevel predictions.
    - values: Tensor, the box scores predicted from a particular feature
      level, whose shape is
      [batch, height_l, width_l, num_classes * num_anchors_per_location].
  boxes: a dict of tensors which includes coordinates of the predictions.
    - key: str, the level of the multilevel predictions.
    - values: Tensor, the box coordinates predicted from a particular
      feature level, whose shape is
      [batch, height_l, width_l, 4 * num_anchors_per_location].
  attributes: a dict of (attribute_name, attribute_predictions). Each
    attribute prediction is a dict that includes:
    - key: str, the level of the multilevel predictions.
    - values: Tensor, the attribute predictions from a particular feature
      level, whose shape is
      [batch, height_l, width_l, att_size * num_anchors_per_location].
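To make the calling convention concrete, here is a minimal forward-pass sketch continuing the construction example above. The 640x640 resolution, the batch contents, and the image_shape values are illustrative assumptions, and the exact keys in the returned mapping depend on the training flag and the detection generator configuration.

import tensorflow as tf

images = tf.zeros([2, 640, 640, 3], dtype=tf.float32)
# Per-image (height, width) before padding: here the second image was
# resized to 512x640, then padded to the fixed 640x640 input size.
image_shape = tf.constant([[640, 640], [512, 640]], dtype=tf.int32)

# Training mode returns the raw multilevel score/box predictions
# described above; anchors are generated from the input image shape
# because anchor_boxes is not supplied.
raw_outputs = model(images, training=True)

# Inference mode additionally runs the detection generator, which uses
# image_shape to clip boxes to the unpadded image regions.
detections = model(images, image_shape=image_shape, training=False)
print(sorted(detections.keys()))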