tfm.vision.preprocess_ops.random_crop_image_with_boxes_and_labels
Crops a random slice from the input image.
tfm.vision.preprocess_ops.random_crop_image_with_boxes_and_labels(
img,
boxes,
labels,
min_scale,
aspect_ratio_range,
min_overlap_params,
max_retry
)
The function also recomputes the bounding boxes for the cropped region and
drops boxes whose centers fall outside the crop, along with their labels.
References:
[1] End-to-End Object Detection with Transformers
https://arxiv.org/abs/2005.12872
The preprocessing steps:
- Sample a minimum IoU overlap.
- For each trial, sample the new image width, height, and top-left corner.
- Compute the IoUs of bounding boxes with the cropped image and retry if
the maximum IoU is below the sampled threshold.
- Find the boxes whose centers are in the cropped image.
- Recompute the coordinates of those boxes relative to the cropped region
and keep only their labels.
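The steps above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `random_crop_sketch`, `iou_with_window`, and `overlap_choices` are hypothetical names, boxes are assumed to be in pixel coordinates, the aspect ratio is assumed to mean width / height, the minimum overlap is drawn from a fixed list instead of `min_overlap_params`, and retrying when no box center lands in the crop is also an assumption.

```python
import random


def iou_with_window(box, window):
    """IoU of one (ymin, xmin, ymax, xmax) box with the crop window."""
    inter_h = max(0.0, min(box[2], window[2]) - max(box[0], window[0]))
    inter_w = max(0.0, min(box[3], window[3]) - max(box[1], window[1]))
    inter = inter_h * inter_w
    box_area = (box[2] - box[0]) * (box[3] - box[1])
    win_area = (window[2] - window[0]) * (window[3] - window[1])
    return inter / (box_area + win_area - inter)


def clamp(value, low, high):
    return max(low, min(value, high))


def random_crop_sketch(img, boxes, labels, min_scale=0.3,
                       aspect_ratio_range=(0.5, 2.0),
                       overlap_choices=(0.1, 0.3, 0.5, 0.7, 0.9),
                       max_retry=50, rng=None):
    """Hypothetical sketch; img is a nested list indexed as img[y][x]."""
    if rng is None:
        rng = random.Random()
    height, width = len(img), len(img[0])

    # Step 1: sample a minimum IoU overlap (simplified: fixed choices).
    min_overlap = rng.choice(overlap_choices)

    for _ in range(max_retry):
        # Step 2: sample the new height, width, and top-left corner.
        new_h = max(1, int(height * rng.uniform(min_scale, 1.0)))
        new_w = max(1, int(width * rng.uniform(min_scale, 1.0)))
        ratio = new_w / new_h  # assumed definition: width / height
        if not aspect_ratio_range[0] <= ratio <= aspect_ratio_range[1]:
            continue
        top = rng.randint(0, height - new_h)
        left = rng.randint(0, width - new_w)
        bottom, right = top + new_h, left + new_w
        window = (top, left, bottom, right)

        # Step 3: retry if the best IoU is below the sampled threshold.
        ious = [iou_with_window(box, window) for box in boxes]
        if not ious or max(ious) < min_overlap:
            continue

        # Step 4: keep boxes whose centers lie inside the crop.
        kept = [
            i for i, box in enumerate(boxes)
            if top < (box[0] + box[2]) / 2 < bottom
            and left < (box[1] + box[3]) / 2 < right
        ]
        if not kept:
            continue  # assumption: an empty result counts as a failed trial

        # Step 5: clip kept boxes to the crop and shift to its origin.
        new_boxes = [
            (clamp(boxes[i][0], top, bottom) - top,
             clamp(boxes[i][1], left, right) - left,
             clamp(boxes[i][2], top, bottom) - top,
             clamp(boxes[i][3], left, right) - left)
            for i in kept
        ]
        crop = [row[left:right] for row in img[top:bottom]]
        return crop, new_boxes, [labels[i] for i in kept]

    # Retries exhausted: return the inputs unchanged.
    return img, boxes, labels
```

Passing a seeded `random.Random` as `rng` makes the sketch reproducible, which helps when checking that every returned box stays inside the cropped image.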
Args
img | a 'Tensor' of shape [height, width, 3] representing the input image.
boxes | a 'Tensor' of shape [N, 4] representing the ground-truth bounding boxes with (ymin, xmin, ymax, xmax).
labels | a 'Tensor' of shape [N] representing the class labels of the boxes.
min_scale | a 'float' in [0.0, 1.0) indicating the lower bound of the random scale variable.
aspect_ratio_range | a list of two 'float' values specifying the lower and upper bounds of the random aspect ratio.
min_overlap_params | a list of four 'float' values representing the min value, max value, step size, and offset used to sample the minimum overlap.
max_retry | an 'int' representing the number of cropping trials. If the trials are exhausted, no cropping is performed.
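The four values in `min_overlap_params` can be easier to picture with a small example. The sketch below shows one plausible reading, which is an assumption and is not confirmed by this page: draw a uniform value between the min and max, round it down to a multiple of the step size, then subtract the offset. The helper name `sample_min_overlap` and the parameter values are hypothetical.

```python
import math
import random


def sample_min_overlap(min_overlap_params, rng=None):
    """Hypothetical reading of (min value, max value, step size, offset)."""
    if rng is None:
        rng = random.Random()
    minval, maxval, step, offset = min_overlap_params
    u = rng.uniform(minval, maxval)
    # Quantize down to a multiple of `step`, then shift by `offset`.
    return math.floor(u / step) * step - offset


# Illustrative values only; they are not taken from this page.
rng = random.Random(0)
samples = [sample_min_overlap((0.0, 1.4, 0.2, 0.1), rng) for _ in range(5)]
print(samples)
```

Under this reading, a negative threshold accepts any crop, since an IoU is never negative, while a threshold above 1 can never be met, so the trials run out and the original image is returned.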
Returns
img | a Tensor representing the randomly cropped image. This can be the original image if max_retry is exhausted.
boxes | a Tensor representing the bounding boxes in the cropped image.
labels | a Tensor representing the labels of the new bounding boxes.
Last updated 2024-02-02 UTC.