🐾 Prompt-Free Conditional Diffusion for Multi-object Image Augmentation

IJCAI 2025

News

2025-07: 🔥 Code and weights have been released!
2025-04: 🤗 Our paper has been accepted to IJCAI 2025.

Prepare Environment

conda create -n env_name python=3.11 -y
conda activate env_name
pip install -r requirments.txt  # file name as spelled in the repository

Prepare Dataset

Download the COCO dataset.
Unzip the dataset and place it in the data folder.
Change the data_root in dataset/coco.py to the path of your dataset.
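Before training, it can help to confirm that the dataset folder is complete. The following is a minimal sketch, not part of this repository: the helper name missing_coco_entries and the expected sub-folder names (the usual COCO 2017 layout) are assumptions, so adjust them to match your copy of the dataset and whatever dataset/coco.py actually reads.

```python
from pathlib import Path

# Assumed COCO 2017 layout; change these names if your copy differs.
EXPECTED_ENTRIES = ("train2017", "val2017", "annotations")


def missing_coco_entries(data_root, expected=EXPECTED_ENTRIES):
    """Return the expected sub-paths that do not exist under data_root."""
    root = Path(data_root)
    return [name for name in expected if not (root / name).exists()]
```

Calling missing_coco_entries with the same path you set as data_root returns an empty list when every expected entry is present.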

Prepare Pipeline

Run inference/inference_image_variation_sdxl.py to organize the pre-trained model weights.

Training Scripts

To train the model, you can use the provided training script:

bash scripts/sdxl512/train.sh

This starts training with the configuration defined in the script. Edit train.sh to adjust training parameters such as batch size, learning rate, and number of epochs.

We also provide the trained model weights for the COCO dataset. You can download them from here.

Evaluation

Downstream Task Evaluation

Generate training split images using the trained model:

bash scripts/generate/generate_train.sh

Label the generated images using Grounding DINO and SAM:

bash scripts/labeling/label_train.sh
bash scripts/labeling/label_train_seg.sh
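To make the labeling output usable by a COCO-style trainer, each detected box has to end up as a COCO annotation record. The sketch below illustrates that record shape only. It is not the code in scripts/labeling, and it assumes the detector's boxes are already in absolute-pixel corner format (x_min, y_min, x_max, y_max). The function name is hypothetical.

```python
def xyxy_to_coco_annotation(box_xyxy, category_id, image_id, annotation_id):
    """Build one COCO-style detection annotation from a corner-format box."""
    x_min, y_min, x_max, y_max = box_xyxy
    # Clamp to zero so a degenerate box never yields a negative size.
    width = max(0.0, x_max - x_min)
    height = max(0.0, y_max - y_min)
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores boxes as [x_min, y_min, width, height] in pixels.
        "bbox": [x_min, y_min, width, height],
        # Box area is used here; with segmentation masks the mask area is stored instead.
        "area": width * height,
        "iscrowd": 0,
    }
```

For segmentation labels, the same record would additionally carry a "segmentation" field holding the mask as polygons or RLE.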

Train the downstream task model on the labeled images with Detectron2.

Generation Quality Evaluation

Generate validation split images using the trained model:

bash scripts/generate/generate_val.sh

Evaluate the generated images using FID, Diversity Score (LPIPS), and Image Quantity Score (IQS):

python utils/metrics.py
python iqs/evaluation.py
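For reference, FID is conventionally defined as the Fréchet distance between two Gaussians fitted to Inception features of the real and generated images. The expression below is the standard definition and is not taken from utils/metrics.py, whose implementation details may differ. Here \mu_r, \Sigma_r and \mu_g, \Sigma_g denote the feature mean and covariance of the real and generated sets.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2 \left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one.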

Note: To evaluate with YOLOv8, you need to modify the dataset structure to match the YOLOv8 format.
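The main change in that conversion is the box encoding: COCO stores [x_min, y_min, width, height] in pixels, while YOLO label files use one line per object with a class index followed by the box center and size normalized by the image dimensions. The sketch below shows only this step, under the assumption that you have already mapped COCO category IDs (which are not contiguous) to zero-based class indices. The function name is hypothetical and this is not code from this repository.

```python
def coco_bbox_to_yolo_line(bbox_xywh, image_width, image_height, class_index):
    """Convert one COCO pixel box into a YOLO label line."""
    x_min, y_min, width, height = bbox_xywh
    # YOLO expects the box center, normalized to the [0, 1] range.
    x_center = (x_min + width / 2) / image_width
    y_center = (y_min + height / 2) / image_height
    norm_width = width / image_width
    norm_height = height / image_height
    return (
        f"{class_index} {x_center:.6f} {y_center:.6f} "
        f"{norm_width:.6f} {norm_height:.6f}"
    )
```

Each image then gets a text file with the same base name containing one such line per object.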

Gradio Demo

To run the Gradio demo, you can use the following command:

python app_gradio.py --checkpoint_path path/to/ckpt

Acknowledgements

This repository is built upon the following projects:

Thanks to the authors for their great work!

Citation

If you find this code useful for your research, please consider citing our paper:

@inproceedings{wang2025prompt,
  title={Prompt-Free Conditional Diffusion for Multi-object Image Augmentation},
  author={Wang, Haoyu and Zhang, Lei and Wei, Wei and Ding, Chen and Zhang, Yanning},
  booktitle={Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI)},
  year={2025},
}
