tfm.vision.spatial_transform_ops.multilevel_crop_and_resize

Crop and resize on a multilevel feature pyramid.

Generates an (output_size, output_size) grid of feature samples for each input box by first assigning the box to the appropriate feature level, and then cropping and resizing it using the corresponding feature map of that level.
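The description above does not state how a box is matched to a level. As a rough illustration only, the sketch below uses the common FPN-style heuristic (level = floor(log2(sqrt(area) / 224)) + 4, clipped to the available levels). The helper name `assign_level`, the canonical size 224, the canonical level 4, and the default level range are assumptions made for this example, not the documented behavior of this function.

```python
import math


def assign_level(box, min_level=2, max_level=5,
                 canonical_size=224.0, canonical_level=4):
    """Pick a pyramid level for one [y1, x1, y2, x2] box.

    Hypothetical helper: the constants follow the usual FPN heuristic
    and are assumptions, not values taken from the documentation.
    """
    y1, x1, y2, x2 = box
    height = max(y2 - y1, 0.0)
    width = max(x2 - x1, 0.0)
    scale = math.sqrt(height * width)  # side length of an equal-area square
    if scale <= 0.0:
        # Degenerate boxes fall back to the finest available level.
        return min_level
    level = math.floor(math.log2(scale / canonical_size)) + canonical_level
    # Clip to the levels actually present in the feature dictionary.
    return int(min(max(level, min_level), max_level))


# Larger boxes map to coarser (higher) levels, smaller boxes to finer ones.
levels = [assign_level(b) for b in ([0, 0, 224, 224], [0, 0, 112, 112])]
```

Under these assumptions, a 224 x 224 box lands on level 4 and a 112 x 112 box on level 3; very small or very large boxes are clipped to the lowest or highest level in the pyramid.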

features A dictionary that maps each pyramid level to its feature map. Each feature map has shape [batch_size, height_l, width_l, num_filters].
boxes A 3-D Tensor of shape [batch_size, num_boxes, 4]. Each row represents a box with [y1, x1, y2, x2] in un-normalized coordinates.
output_size A scalar indicating the output crop size; each crop is output_size by output_size.
sample_offset A float in [0, 1] indicating the subpixel sample offset from each grid point.
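To make the role of sample_offset concrete, here is a minimal one-dimensional sketch. It assumes the box extent is divided into output_size equal bins and that one sample is taken per bin at `start + (i + sample_offset) * extent / output_size`; the helper name `sample_positions` and this exact formula are assumptions for illustration, not a statement of how the library computes its grid.

```python
def sample_positions(start, end, output_size, sample_offset=0.5):
    """Return the 1-D sample coordinates along one axis of a box.

    Hypothetical helper: assumes one sample per bin, placed at a
    fraction `sample_offset` of the way through each bin.
    """
    if not 0.0 <= sample_offset <= 1.0:
        raise ValueError("sample_offset must lie in [0, 1]")
    bin_size = (end - start) / output_size
    return [start + (i + sample_offset) * bin_size for i in range(output_size)]


# sample_offset=0.5 samples at bin centers; 0.0 samples at each bin's start.
centers = sample_positions(0.0, 8.0, output_size=4, sample_offset=0.5)
starts = sample_positions(0.0, 8.0, output_size=4, sample_offset=0.0)
```

With these inputs, `centers` is `[1.0, 3.0, 5.0, 7.0]` and `starts` is `[0.0, 2.0, 4.0, 6.0]`. The same computation would apply independently to the y and x axes, and fractional positions would then be resolved by interpolating between neighboring feature-map cells.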

A 5-D tensor of feature crops with shape [batch_size, num_boxes, output_size, output_size, num_filters].