Request: ability to reject augmented images #1788

td-anne · 2024-06-14T11:27:44Z

Feature description

I am trying to use augmentation to train an object detection model. I would like to ensure that every bounding box is 100% visible in the augmented result, and that the model is never presented with a training image in which an object is, say, 99% visible but has no bounding box, so that the model is penalised for claiming a detection. (In fact, it would be acceptable to pass augmented images in which each bounding box is either 100% included or 0% included, but nothing in between.)

Concretely, I would like to be able to use min_visibility or a custom criterion to say "sorry this augmented image is invalid, try another".

Motivation and context

Presenting a machine learning model with an augmented image where the bounding box has been dropped results in teaching the model that the object of interest is not present in the image. Presenting it with an image with the bounding box teaches the model that the object is present and recognisable. For complex objects, it is very difficult to tell in an automated way whether a shrunken bounding box contains enough information that the model should recognise it. Therefore I would like to not present the model with such questionable examples.

Possible implementation

Provide an optional argument that accepts a callable. The callable inspects the results of augmentation and returns True if the augmented image is acceptable and False otherwise. If it returns False, the augmentation is retried until it returns True. This might not cover situations like Cutout, where an internal part of the bounding box is removed.

Alternatively code could be integrated with min_visibility, providing an alternative handling where the augmentation is retried if min_visibility is not met instead of the bounding box being dropped.
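The retry behaviour described above could look roughly like the following. This is only a sketch of the idea, not existing albumentations API: `augment_until_accepted`, `accept` and `max_attempts` are hypothetical names, and `transform` stands for any callable that takes keyword arguments and returns a dict (the calling convention used by `A.Compose`).

```python
from typing import Any, Callable, Dict


def augment_until_accepted(
    transform: Callable[..., Dict[str, Any]],
    accept: Callable[[Dict[str, Any]], bool],
    max_attempts: int = 20,
    **data: Any,
) -> Dict[str, Any]:
    """Re-run `transform` on the same inputs until `accept` approves the result.

    Hypothetical helper for illustration; not part of albumentations.
    """
    for _ in range(max_attempts):
        # Each call is assumed to draw fresh random augmentation parameters.
        result = transform(**data)
        if accept(result):
            return result
    # Guard against pipelines that can never satisfy the criterion.
    raise RuntimeError(f"no acceptable augmentation in {max_attempts} attempts")
```

A criterion such as `lambda r: len(r["bboxes"]) == len(bboxes)` would then reject any result in which a box was dropped. The `max_attempts` limit is an addition of this sketch: without it, a pipeline that always violates the criterion would loop forever.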

Alternatives

The user could simply not apply any operations that risk removing part of a bounding box; this limits the range of transforms that can be applied (for example, Cutout can't be used at all).

RandomBboxSafeCrop provides this functionality, but only for cropping.

The user could write a custom wrapper for the albumentations transform generator and interpose it between albumentations and the machine learning framework that they are using. Such a wrapper will not have access to information about which transforms were applied, or whether the bounding box was changed, but in combination with min_visibility it could probably be made to work, albeit in a framework-dependent way.
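A wrapper of this kind might be sketched as a plain Python generator placed between the dataset and the training loop. Everything here is an assumption for illustration: `reject_partial_boxes` and `max_attempts` are made-up names, `transform` is assumed to be an `A.Compose` whose `A.BboxParams` sets a high `min_visibility` and declares `label_fields=["class_labels"]`, and a dropped box is detected only by comparing the number of boxes before and after.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Sequence, Tuple

# One dataset sample: (image, bboxes, labels). The layout is an assumption.
Sample = Tuple[Any, List[Sequence[float]], List[Any]]


def reject_partial_boxes(
    samples: Iterable[Sample],
    transform: Callable[..., Dict[str, Any]],
    max_attempts: int = 20,
    label_field: str = "class_labels",
) -> Iterator[Dict[str, Any]]:
    """Yield augmented samples in which no bounding box was dropped.

    Hypothetical wrapper; `transform` is called the way an A.Compose
    pipeline is called and must return a dict with a "bboxes" entry.
    """
    for image, bboxes, labels in samples:
        for _ in range(max_attempts):
            out = transform(image=image, bboxes=bboxes, **{label_field: labels})
            # With min_visibility set, a shorter list means at least one box
            # fell below the threshold and was removed by the pipeline.
            if len(out["bboxes"]) == len(bboxes):
                yield out
                break
        else:
            # No attempt kept every box; fail loudly rather than skip silently.
            raise RuntimeError(f"no valid augmentation in {max_attempts} attempts")
```

This check is stricter than the request: it also rejects results where an object has left the frame entirely (0% included), and it cannot notice damage inside a box, such as that caused by Cutout.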

td-anne added the "enhancement" (New feature or request) label on Jun 14, 2024
