Can --image-weights and DDP training be compatible by building image-weighted dataset? #13415

Open
1 task done
Scorbinwen opened this issue Nov 13, 2024 · 4 comments
Labels
detect Object Detection issues, PR's enhancement New feature or request question Further information is requested

Comments

@Scorbinwen

Search before asking

Question

I have searched the YOLOv5 issues and discussions (e.g. #3275). The latest official code still seems to be incompatible with combining "--image-weights" and DDP training, but I need both for my task because my dataset is highly class-imbalanced.
So I implemented an image-weighted dataset by estimating a repeat count for each image:

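def expand_indices(indices, image_weight):
    """Repeat each item of `indices` int(weight) times, keeping the original order."""
    expanded_indices = []
    for idx, weight in zip(indices, image_weight):
        # count is the number of times this item is repeated;
        # int() truncates, so a weight below 1 drops the item entirely
        count = int(weight)
        expanded_indices.extend([idx] * count)

    return expanded_indices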

class LoadImagesAndLabelsAndMasks(LoadImagesAndLabels):  # for training/testing
    def __init__(
            self,
            path,
            img_size=640,
            batch_size=16,
            num_classes=None,
            augment=False,
            hyp=None,
            rect=False,
            cache_images=False,
            single_cls=False,
            stride=32,
            pad=0,
            prefix="",
            downsample_ratio=1,
            overlap=False,
            usegt=False,
            use_gray=False,
            gt_type='input-concat'
    ):
        super().__init__(path, img_size, batch_size, augment, hyp, rect, cache_images, single_cls,
                         stride, pad, prefix, usegt, gt_type)
        self.downsample_ratio = downsample_ratio
        self.overlap = overlap
        if num_classes is not None:
            self.num_classes = num_classes
            cw = labels_to_class_weights(self.labels, self.num_classes) * (1 - np.zeros(self.num_classes)) ** 2 # class weights
            iw = labels_to_image_weights(self.labels, nc=self.num_classes, class_weights=cw)  # image weights
            min_iw = torch.min(iw)
            iw = torch.round(iw / min_iw)
            repeat_tensor = lambda rep_list: expand_indices(rep_list, iw.int().tolist()) if rep_list is not None else None
            self.indices = repeat_tensor(self.indices)
            self.ims = repeat_tensor(self.ims)
            self.im_files = repeat_tensor(self.im_files)
            self.npy_files = repeat_tensor(self.npy_files)
            self.labels = repeat_tensor(self.labels)
            self.segments = repeat_tensor(self.segments)

I hope someone can help me double-check the implementation. If it is OK, I will be glad to contribute it to the YOLOv5 community.
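For anyone reviewing the idea, here is a minimal, framework-free sketch of the repeat-count step. It is not the author's code and not part of YOLOv5: `repeat_counts` and its `max_repeat` cap are hypothetical names introduced only for illustration. It assumes the image weights are non-negative floats and that an image with weight 0 (for example, one with no labels) should still be kept once rather than dropped or allowed to cause a division by zero.

```python
import numpy as np


def repeat_counts(image_weights, max_repeat=10):
    """Turn float image weights into integer repeat counts (hypothetical helper).

    The smallest positive weight maps to 1 and every image is kept at least once.
    """
    w = np.asarray(image_weights, dtype=np.float64)
    positive = w[w > 0]
    if positive.size == 0:  # no positive weights at all: fall back to uniform
        return np.ones(len(w), dtype=np.int64)
    counts = np.round(w / positive.min())
    # Zero-weight images are kept once; very large ratios are capped at max_repeat.
    return np.clip(counts, 1, max_repeat).astype(np.int64)


def expand_indices(indices, counts):
    """Repeat indices[i] counts[i] times, preserving order."""
    return np.repeat(np.asarray(indices), counts).tolist()


weights = [0.0, 0.5, 1.0, 2.6]
counts = repeat_counts(weights)
expanded = expand_indices(range(len(weights)), counts)
```

Two points may be worth checking against the snippet in the question. First, dividing by the minimum weight fails when that minimum is 0. Second, if the base dataset's `__getitem__` resolves each position through `self.indices` (as I believe `LoadImagesAndLabels` does), then expanding `self.indices` together with `self.labels`, `self.im_files` and the other per-image lists could leave the index values pointing at the wrong entries. Expanding only the index list, or rebuilding it as a plain range over the expanded lists, would avoid that.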

Additional

No response

@Scorbinwen Scorbinwen added the question Further information is requested label Nov 13, 2024
@UltralyticsAssistant UltralyticsAssistant added detect Object Detection issues, PR's enhancement New feature or request labels Nov 13, 2024
@UltralyticsAssistant
Member

👋 Hello @Scorbinwen, thank you for your interest in YOLOv5 🚀! Your question about combining --image-weights with DDP training is indeed a thoughtful one, especially considering your need to handle a highly class-unbalanced dataset.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it. For implementation-related questions like yours, providing comprehensive context helps us and others in the community to assist you better.

If this is a custom training ❓ Question, make sure to include any relevant dataset image examples and training logs. Additionally, it might be helpful to ensure you're following our tips for achieving the best training results.

Requirements

Ensure you have Python>=3.8.0 with all required libraries installed, including PyTorch>=1.8.

Environments

YOLOv5 can be run in several environments:

  • Notebooks with free GPU like Gradient, Colab, and Kaggle.
  • Google Cloud Deep Learning VM.
  • Amazon Deep Learning AMI.
  • Docker Image.

Each of these options comes with dependencies such as CUDA, CUDNN, and others preinstalled.

Status

Check the YOLOv5 CI for the current status of Continuous Integration tests. This helps to confirm that the code is functioning correctly on various systems.

This is an automated response, but an Ultralytics engineer will also assist you soon. Feel free to engage with the community in the discussions tab for more insights and support. 😊

@pderrenger
Member

@Scorbinwen your approach to implementing an image-weighted dataset for DDP training looks promising. However, please ensure that your modifications align with the latest YOLOv5 updates to avoid compatibility issues. If you believe your implementation is robust and beneficial, consider submitting a pull request for review by the community. This way, it can be evaluated and potentially integrated into the main repository. Thank you for your contribution!

@Scorbinwen
Author

Thank you for your reply, I'll check my implementation.

@pderrenger
Member

@Scorbinwen you're welcome! When submitting your PR, please include benchmark results comparing DDP training with and without image weights to demonstrate the performance impact. Looking forward to reviewing your contribution.
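Alongside accuracy benchmarks, a quick sanity check is to compare how often each class is seen per rank with and without the repeated indices. The sketch below is only an illustration under stated assumptions: `shard` is a hypothetical helper that mimics the strided, padded, non-shuffled split a distributed sampler performs (it is not the PyTorch class), and `class_exposure`, the toy `image_classes` data and the repeat counts are all made up for the example.

```python
from collections import Counter


def shard(indices, rank, world_size):
    """Strided split of an index list across ranks, padded so each rank gets the same count."""
    total = -(-len(indices) // world_size) * world_size  # round up to a multiple of world_size
    padded = indices + indices[: total - len(indices)]   # wrap around to fill the gap
    return padded[rank:total:world_size]


def class_exposure(indices, image_classes):
    """Count how many label instances of each class a list of image indices exposes."""
    seen = Counter()
    for i in indices:
        seen.update(image_classes[i])
    return seen


# Toy data: class 1 is rare and appears only in image 3.
image_classes = [[0], [0], [0, 0], [1]]
plain = [0, 1, 2, 3]
weighted = [0, 1, 2, 3, 3, 3]  # image 3 repeated three times (hypothetical counts)

for name, idx in (("plain", plain), ("weighted", weighted)):
    for rank in range(2):
        print(name, rank, dict(class_exposure(shard(idx, rank, 2), image_classes)))
```

In this toy case the rare class reaches only one rank with the plain index list, while the expanded list exposes it on both ranks. Real training would also shuffle the indices each epoch, so the per-rank counts would vary from epoch to epoch.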
