Feature description
I am trying to use augmentation to train an object detection model. I would like to ensure that every bounding box in an augmented image is 100% visible, and that the model is never presented with a training image where the object is, say, 99% visible but its bounding box has been dropped, so that the model is penalised for claiming a detection. (In fact, it would be acceptable to pass augmented images in which each bounding box is either 100% or 0% included, but nothing in between.)
Concretely, I would like to be able to use min_visibility or a custom criterion to say "sorry this augmented image is invalid, try another".
Motivation and context
Presenting a machine learning model with an augmented image where the bounding box has been dropped results in teaching the model that the object of interest is not present in the image. Presenting it with an image with the bounding box teaches the model that the object is present and recognisable. For complex objects, it is very difficult to tell in an automated way whether a shrunken bounding box contains enough information that the model should recognise it. Therefore I would like to not present the model with such questionable examples.
Possible implementation
Provide an optional argument that takes a callable. The callable inspects the result of augmentation and returns True if the augmented image is acceptable and False otherwise. If it returns False, the augmentation is retried until it returns True. This might not cover situations like Cutout, where an internal part of the bounding box is removed.
Alternatively, the code could be integrated with min_visibility, providing an alternative mode in which the augmentation is retried if min_visibility is not met, instead of the bounding box being dropped.
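The callable-based retry described above could look roughly like the sketch below. It is only an illustration of the proposal, not existing albumentations API: `augment_until_valid`, `keeps_every_box` and the `max_tries` cap are hypothetical names, and `toy_crop` is a stand-in for a real pipeline. The sketch assumes the transform is called with keyword arguments and returns a dict with a `"bboxes"` entry.

```python
import random


def augment_until_valid(transform, is_acceptable, max_tries=50, **data):
    """Re-run `transform` until `is_acceptable(original, result)` returns True.

    Hypothetical helper: `max_tries` is an added safeguard so that a
    criterion that can never be met does not loop forever.
    """
    for _ in range(max_tries):
        result = transform(**data)
        if is_acceptable(data, result):
            return result
    raise RuntimeError(f"no acceptable augmentation in {max_tries} tries")


def toy_crop(image, bboxes):
    """Stand-in transform (assumption): randomly drops boxes, imitating a
    pipeline that removes boxes falling below a visibility threshold."""
    kept = [box for box in bboxes if random.random() < 0.5]
    return {"image": image, "bboxes": kept}


def keeps_every_box(original, result):
    """Example criterion: accept only results in which no box was dropped."""
    return len(result["bboxes"]) == len(original["bboxes"])


random.seed(0)
sample = augment_until_valid(
    toy_crop,
    keeps_every_box,
    image="placeholder-image",
    bboxes=[(0, 0, 10, 10), (5, 5, 20, 20)],
)
```

Raising after `max_tries` is one possible design choice; returning the unaugmented sample would be another.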
Alternatives
The user could simply not apply any operations that risk removing part of a bounding box; this limits the range of transforms that can be applied (for example, Cutout can't be used at all).
RandomBboxSafeCrop provides this functionality, but only for cropping.
The user could write a custom wrapper for the albumentations transform generator and interpose it between albumentations and the machine learning framework that they are using. The wrapper would not have access to information about which transforms were applied, or whether a bounding box was changed, but in combination with min_visibility it could probably be made to work, albeit in a framework-dependent way.
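A minimal sketch of such a wrapper follows. It assumes the wrapped pipeline is configured with min_visibility, so that a box below the threshold is dropped from the output, and that the pipeline is called with keyword arguments and returns a dict containing `"bboxes"`. `RetryIfBoxesDropped` and `FlakyTransform` are hypothetical names; the latter is a stand-in used only to make the example runnable.

```python
class RetryIfBoxesDropped:
    """Drop-in wrapper: retries the wrapped transform while any box is missing.

    It only compares box counts before and after, so it cannot tell a box
    that left the image entirely from one that was merely below the
    visibility threshold; both cases trigger a retry.
    """

    def __init__(self, transform, max_tries=20):
        self.transform = transform
        self.max_tries = max_tries

    def __call__(self, **data):
        expected = len(data["bboxes"])
        for _ in range(self.max_tries):
            result = self.transform(**data)
            if len(result["bboxes"]) == expected:
                return result
        # Fallback (a design assumption): hand back the unaugmented sample.
        return dict(data)


class FlakyTransform:
    """Stand-in pipeline (assumption): drops the last box on its first
    `failures` calls, then returns every box unchanged."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self, image, bboxes):
        self.calls += 1
        if self.calls <= self.failures:
            return {"image": image, "bboxes": list(bboxes[:-1])}
        return {"image": image, "bboxes": list(bboxes)}


boxes = [(0, 0, 10, 10), (5, 5, 20, 20)]
pipeline = FlakyTransform(failures=3)
safe_pipeline = RetryIfBoxesDropped(pipeline, max_tries=10)
augmented = safe_pipeline(image="placeholder-image", bboxes=boxes)
```

Because the wrapper has the same calling convention as the pipeline it wraps, the dataset or data loader code that calls the transform would not need to change.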