
tfm.vision.layers.MultilevelROIGenerator

Proposes RoIs for the second stage processing.

Args
pre_nms_top_k An int, the number of top-scoring proposals to keep before applying NMS.
pre_nms_score_threshold A float, the score threshold applied before NMS. Proposals whose scores are below this threshold are discarded.
pre_nms_min_size_threshold A float, the minimum length of each side of a box (w.r.t. the scaled image). Proposals with any side below this threshold are discarded.
nms_iou_threshold A float in [0, 1], the NMS IoU threshold.
num_proposals An int, the final number of proposals to generate.
test_pre_nms_top_k An int, the number of top-scoring proposals to keep before applying NMS in testing.
test_pre_nms_score_threshold A float, the score threshold applied before NMS in testing. Proposals whose scores are below this threshold are discarded.
test_pre_nms_min_size_threshold A float, the minimum length of each side of a box (w.r.t. the scaled image) in testing. Proposals with any side below this threshold are discarded.
test_nms_iou_threshold A float in [0, 1], the NMS IoU threshold in testing.
test_num_proposals An int, the final number of proposals to generate in testing.
use_batched_nms A bool, whether or not to use tf.image.combined_non_max_suppression.
**kwargs Additional keyword arguments passed to Layer.
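
As a rough illustration of how the two parameter sets relate, the sketch below assumes the `test_*` values simply replace their training counterparts whenever `training` is false. `NmsSettings` and `select_settings` are hypothetical names that are not part of the library, and the numeric values are illustrative only, not documented defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NmsSettings:
    """Hypothetical container for one mode's proposal settings."""
    pre_nms_top_k: int
    pre_nms_score_threshold: float
    pre_nms_min_size_threshold: float
    nms_iou_threshold: float
    num_proposals: int


def select_settings(train: NmsSettings, test: NmsSettings,
                    training: bool) -> NmsSettings:
    """Return the training settings when training, else the test_* settings."""
    return train if training else test


# Illustrative values only (assumption): keep more candidates during training.
train_settings = NmsSettings(2000, 0.0, 0.0, 0.7, 1000)
test_settings = NmsSettings(1000, 0.0, 0.0, 0.7, 1000)

active = select_settings(train_settings, test_settings, training=False)
print(active.pre_nms_top_k)
```

Keeping the two sets separate lets a model keep a larger candidate pool while training and a smaller, cheaper one at inference time.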

Methods

call

Proposes RoIs given a group of candidates from different FPN levels.

The following describes the steps:

  1. For each individual level:
     a. Apply sigmoid transform if specified.
     b. Decode boxes if specified.
     c. Clip boxes if specified.
     d. Filter out small boxes and those that fall outside the image if specified.
     e. Apply pre-NMS filtering, including pre-NMS top k and score thresholding.
     f. Apply NMS.
  2. Aggregate post-NMS boxes from each level.
  3. Apply an overall top k to generate the final selected RoIs.
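
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the layer's implementation: boxes are assumed to be already decoded and clipped, given as `(y1, x1, y2, x2)` tuples, NMS is a simple greedy loop, and `iou`, `propose_level`, and `propose` are hypothetical helper names.

```python
import math


def iou(a, b):
    """Intersection over union of two (y1, x1, y2, x2) boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def propose_level(boxes, logits, top_k, score_thr, min_size, iou_thr):
    """Per-level steps: sigmoid, size/score filtering, top k, greedy NMS."""
    scores = [1.0 / (1.0 + math.exp(-z)) for z in logits]  # sigmoid transform
    # Drop boxes with a side below min_size or a score below score_thr.
    cands = [(s, b) for s, b in zip(scores, boxes)
             if s >= score_thr
             and (b[2] - b[0]) >= min_size and (b[3] - b[1]) >= min_size]
    # Pre-NMS top k: keep the highest-scoring candidates.
    cands.sort(key=lambda c: c[0], reverse=True)
    cands = cands[:top_k]
    # Greedy NMS: keep a box only if it does not overlap a kept box too much.
    kept = []
    for s, b in cands:
        if all(iou(b, kb) <= iou_thr for _, kb in kept):
            kept.append((s, b))
    return kept


def propose(levels, num_proposals, **level_kwargs):
    """Aggregate post-NMS boxes from every level, then apply an overall top k."""
    merged = []
    for boxes, logits in levels.values():
        merged.extend(propose_level(boxes, logits, **level_kwargs))
    merged.sort(key=lambda c: c[0], reverse=True)
    return merged[:num_proposals]


# Two toy FPN levels, each mapping to (boxes, logits).
levels = {
    2: ([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)], [2.0, 1.0, 0.5]),
    3: ([(0, 0, 2, 2), (40, 40, 60, 60)], [3.0, -1.0]),
}
rois = propose(levels, num_proposals=2, top_k=10, score_thr=0.0,
               min_size=5.0, iou_thr=0.5)
print([box for _, box in rois])  # [(0, 0, 10, 10), (20, 20, 30, 30)]
```

In this toy input, the second box on level 2 is suppressed by NMS because it overlaps the first, the 2x2 box on level 3 is removed by the size filter, and the overall top k then keeps the two highest-scoring survivors.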

Args
raw_boxes A dict with keys representing FPN levels and values representing box tensors of shape [batch, feature_h, feature_w, num_anchors * 4].
raw_scores A dict with keys representing FPN levels and values representing logit tensors of shape [batch, feature_h, feature_w, num_anchors].
anchor_boxes A dict with keys representing FPN levels and values representing anchor box tensors of shape [batch, feature_h * feature_w * num_anchors, 4].
image_shape A tf.Tensor of shape [batch, 2] whose last dimension is [height, width] of the scaled image.
training A bool that indicates whether it is in training mode.
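
The per-level shapes above are related: the spatial layout of raw_boxes and raw_scores flattens to the same number of anchors that anchor_boxes lists. The following sketch checks that relationship on plain shape tuples; `check_level_shapes` is a hypothetical helper written for illustration, not a library function.

```python
def check_level_shapes(raw_boxes_shape, raw_scores_shape, anchor_boxes_shape):
    """Return True if one FPN level's three shapes are mutually consistent."""
    batch, feature_h, feature_w, boxes_depth = raw_boxes_shape
    if boxes_depth % 4 != 0:
        raise ValueError("last dimension of raw_boxes must be num_anchors * 4")
    num_anchors = boxes_depth // 4
    expected_scores = (batch, feature_h, feature_w, num_anchors)
    expected_anchors = (batch, feature_h * feature_w * num_anchors, 4)
    return (tuple(raw_scores_shape) == expected_scores
            and tuple(anchor_boxes_shape) == expected_anchors)


# An 8x8 feature map with 3 anchors per location and a batch of 2.
print(check_level_shapes((2, 8, 8, 12), (2, 8, 8, 3), (2, 192, 4)))  # True
```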

Returns
roi_boxes A tf.Tensor of shape [batch, num_proposals, 4], the proposed ROIs in scaled image coordinates.
roi_scores A tf.Tensor of shape [batch, num_proposals], scores of the proposed ROIs.