Add CutAndPaste #1297

Closed
wants to merge 12 commits into from


i-aki-y
Contributor

@i-aki-y i-aki-y commented Sep 23, 2022

About PR

This PR implements the Cut And Paste augmentation described in "Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation".
It also includes several image blending methods from "Poisson Image Editing".

See also: #1225

Blending Demo

The A.paste function supports the following four image blending methods.

BaseAndPasteImages
BlendComparison

Figure: Results of the four blending methods, left to right: GAUSSIAN, NORMAL_CLONE, MIXED_CLONE, MONOCHROME_TRANSFER.
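The description does not show how the GAUSSIAN mode works internally, so the following is only a sketch of one common approach: soften the object's alpha mask with a Gaussian blur and alpha-composite the object onto the base image. The function names (gaussian_kernel_1d, blur_mask, gaussian_blend) and the sigma parameter are assumptions made for illustration, not the PR's API. The other three mode names coincide with the flags of OpenCV's cv2.seamlessClone, which may be what they map to.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel covering about +/- 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def blur_mask(mask: np.ndarray, sigma: float) -> np.ndarray:
    """Blur a 2-D float mask with a separable Gaussian (rows, then columns)."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    # Edge padding keeps the output the same size as the input.
    padded = np.pad(mask, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def gaussian_blend(base: np.ndarray, obj_rgb: np.ndarray,
                   obj_alpha: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Composite obj_rgb onto base using a Gaussian-softened alpha mask.

    base, obj_rgb: uint8 arrays of shape (H, W, 3).
    obj_alpha: array of shape (H, W) with values in [0, 1].
    """
    alpha = blur_mask(obj_alpha.astype(np.float64), sigma)
    alpha = np.clip(alpha, 0.0, 1.0)[..., None]  # broadcast over channels
    blended = alpha * obj_rgb.astype(np.float64) + (1.0 - alpha) * base.astype(np.float64)
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)
```

Blurring the mask rather than the image keeps the object's interior sharp and only feathers the boundary, which is the usual motivation for this kind of blend.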

CutAndPaste Transform Demo

The following example shows how to use the CutAndPaste transform.

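import albumentations as A

# object_dir and get_label_from_path are supplied by the user (described below).
transform = A.Compose(
    [
        A.CutAndPaste(
            p=1.0,
            paste_image_dir=object_dir,  # directory of single-object images
            get_label_from_path=get_label_from_path,  # maps a file path to a label
            num_object_limit=(1, 3),
            blend_method="GAUSSIAN",
        )
    ],
    bbox_params=A.BboxParams(format="coco", label_fields=["labels"]),
)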

The transform requires the paste_image_dir parameter, a directory containing the object images to be pasted. It randomly selects multiple files (objects) from this directory, so in this version the user needs to prepare the images so that each one contains a single object.
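As a rough illustration of the selection step, the sketch below draws a random number of object files from a directory. The helper name sample_object_paths, the "*.png" glob pattern, sampling without replacement, and treating num_object_limit as an inclusive (min, max) range are all assumptions; the PR's actual selection logic may differ.

```python
import random
from pathlib import Path


def sample_object_paths(paste_image_dir, num_object_limit=(1, 3), rng=None):
    """Pick a random number of object files from paste_image_dir.

    Hypothetical helper: the count is drawn uniformly from the inclusive
    range num_object_limit and files are sampled without replacement.
    """
    rng = rng or random.Random()
    paths = sorted(Path(paste_image_dir).glob("*.png"))
    if not paths:
        raise ValueError(f"no PNG object files found in {paste_image_dir}")
    low, high = num_object_limit
    # Never ask for more files than the directory contains.
    count = min(rng.randint(low, high), len(paths))
    return rng.sample(paths, count)
```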

The transform also requires a function parameter, get_label_from_path, which returns the label for a given object file path. Because the transform has no other way to know the labels of the randomly selected objects, the user must encode the label in each object's file path and provide a function that extracts it.

The following are the results of different trials with a fixed base image.
RandomExamples_no_annots

RandomExamples_with_annots

Sample Notebook

You can reproduce the same result in this notebook on Colab:
https://colab.research.google.com/drive/1sFCAhS8FTyp7dLIdJgUJwbB5JnGBYoq3

Note and Limitation

1. The user needs to prepare object files as RGBA PNG images.

As described above, the user needs to prepare object files (images to be pasted) in advance.

2. The object file path should include label information

Since the transform reads the label from the object path, the user needs to include label information in the path and pass a function, get_label_from_path, that extracts it.
For example, when object paths look like path/to/object/{object_id}_{label_id}.png, get_label_from_path could be:

get_label_from_path = lambda image_path: int(image_path.stem.split("_")[-1])
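The lambda above assumes image_path is a pathlib.Path. A slightly more defensive variant, sketched here under the same {object_id}_{label_id}.png naming assumption, also accepts plain strings and fails loudly on names that do not follow the pattern. The error handling is an addition for illustration, not something the PR requires.

```python
from pathlib import Path


def get_label_from_path(image_path) -> int:
    """Extract the integer label from a name like {object_id}_{label_id}.png."""
    stem = Path(image_path).stem
    # Split on the last underscore so object ids may themselves contain "_".
    _object_id, sep, label_id = stem.rpartition("_")
    if not sep or not label_id.isdigit():
        raise ValueError(f"cannot extract a label from {image_path!r}")
    return int(label_id)
```

For example, get_label_from_path("path/to/object/cat_0042_7.png") yields 7.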

3. Returned masks become binary masks

Even if the input masks are non-binary, the transformed masks are binary (the values are 0 or 1).
I think I can remove this limitation with extra work.

About Implementation

1. A new rotation method, rotate_bound, is added.

A new rotation function, rotate_bound, is introduced to rotate object images.
While the standard rotate function causes unwanted crops, rotate_bound expands the output shape according to the rotation angle. See the following examples.

CompareRotation
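The geometry behind a crop-free rotation can be sketched as follows: the output canvas is enlarged to the bounding box of the rotated rectangle, and the rotation matrix is shifted so the image center lands at the center of the new canvas. The helper name rotate_bound_params is hypothetical and this is not the PR's code; the 2x3 matrix uses the same layout as cv2.getRotationMatrix2D.

```python
import math


def rotate_bound_params(height: int, width: int, angle_deg: float):
    """Return (matrix, new_width, new_height) for a rotation without cropping."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    # Bounding box of the rotated (width x height) rectangle.
    new_w = int(round(height * abs(sin_t) + width * abs(cos_t)))
    new_h = int(round(height * abs(cos_t) + width * abs(sin_t)))

    # Rotate about the original center, then translate that center
    # to the center of the enlarged canvas.
    cx, cy = width / 2.0, height / 2.0
    matrix = [
        [cos_t, sin_t, (1 - cos_t) * cx - sin_t * cy + (new_w / 2.0 - cx)],
        [-sin_t, cos_t, sin_t * cx + (1 - cos_t) * cy + (new_h / 2.0 - cy)],
    ]
    return matrix, new_w, new_h
```

With OpenCV available, the result could be applied as cv2.warpAffine(image, np.array(matrix), (new_w, new_h)). A 100x200 (height x width) image rotated by 90 degrees then becomes 200x100 instead of losing its ends.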

2. Mask augmentation is actually done inside get_params_dependent_on_targets

Augmented masks are needed to compute the augmented bboxes, but I could not find a better place where both the bboxes and the masks are accessible than get_params_dependent_on_targets.
So I implemented the mask augmentation in get_params_dependent_on_targets.
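To illustrate why the bboxes depend on the augmented masks, here is a minimal sketch: a pasted object occludes part of the existing masks, so each box has to be recomputed from the mask pixels that remain visible. The helper names are hypothetical, and how the PR treats fully occluded objects is not stated; this sketch simply returns None for them.

```python
import numpy as np


def mask_to_coco_bbox(mask: np.ndarray):
    """Return [x_min, y_min, width, height] for a binary mask, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    x_min, x_max = int(xs.min()), int(xs.max())
    y_min, y_max = int(ys.min()), int(ys.max())
    return [x_min, y_min, x_max - x_min + 1, y_max - y_min + 1]


def paste_and_update(masks, pasted_mask: np.ndarray):
    """Remove the pasted region from existing masks and recompute all bboxes."""
    pasted = pasted_mask.astype(bool)
    # Existing objects lose every pixel covered by the pasted object.
    updated = [(m.astype(bool) & ~pasted).astype(np.uint8) for m in masks]
    updated.append(pasted.astype(np.uint8))
    bboxes = [mask_to_coco_bbox(m) for m in updated]
    return updated, bboxes
```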


  • 2022-10-18: Update description and examples.

@i-aki-y i-aki-y changed the title from "[WIP] Add CutAndPaste" to "Add CutAndPaste" Oct 19, 2022
@i-aki-y
Contributor Author

i-aki-y commented Oct 19, 2022

Most of the work has been completed.
@Dipet @ternaus I would appreciate it if you reviewed this.

@i-aki-y
Contributor Author

i-aki-y commented Nov 1, 2022

Sorry for the additional correction.
The current implementation assumes that the total number of objects in an image is less than 255 ("the length of target masks" + "the number of pasted objects" < 255).
So I added some validation code and documentation about the limitation.
I think 255 is sufficiently large for typical use cases, but it can be increased if needed.
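A check of this kind might look like the sketch below. The function name, the constant name, and the choice of ValueError are assumptions for illustration; only the condition itself ("target masks" + "pasted objects" < 255) comes from the comment above.

```python
MAX_OBJECTS = 255  # bound stated in the comment above


def validate_object_count(num_target_masks: int, num_pasted_objects: int,
                          limit: int = MAX_OBJECTS) -> int:
    """Return the total object count, or raise if it is not below the limit."""
    total = num_target_masks + num_pasted_objects
    if total >= limit:
        raise ValueError(
            f"{num_target_masks} target masks + {num_pasted_objects} pasted "
            f"objects = {total}, but the total must be less than {limit}"
        )
    return total
```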
