by Zhi-Yi Chin
This repository is the implementation of homework 2 for IOC5008 Selected Topics in Visual Recognition using Deep Learning, a course offered in the fall 2021 semester at National Yang Ming Chiao Tung University.
In this homework, we participate in the SVHN detection competition hosted on CodaLab. The Street View House Numbers (SVHN) dataset contains 33,402 training images and 13,068 testing images. We are required to train a digit detector that is not only accurate but also fast. Submissions must follow the COCO results format. To measure the detection model's speed, we must benchmark it in the Google Colab environment and take a screenshot of the results.
You can download a copy of all the files by cloning this repository:
git clone https://github.com/joycenerd/yolov5-svhn-detection.git
You need to have Anaconda or Miniconda already installed in your environment. To install requirements:
conda create --name detect python=3
conda activate detect
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch
cd yolov5
pip install -r requirements.txt
You can download the raw data after you have registered for the challenge mentioned above.
python mat2yolo.py --data-root <path_to_data_root_dir>
- input: your data root directory. It should contain train/, which holds all the training images, and digitStruct.mat, which is the original label file.
- output: <path_to_data_root_dir>/labels/all_train/, which contains one text file per training image, named after the image. These are the YOLO-format annotations.
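The conversion in this step can be illustrated with a minimal sketch. It is not the code in mat2yolo.py; the function name `svhn_box_to_yolo` is hypothetical, and it assumes each SVHN box is given as left, top, width, and height in pixels, and that the digit 0 (stored under label 10 in SVHN) is mapped to class 0.

```python
def svhn_box_to_yolo(left, top, width, height, img_w, img_h, digit):
    """Convert one SVHN box (pixels, top-left origin) to a YOLO label line."""
    # SVHN stores the digit 0 under label 10; map it back to class 0.
    # Assumption: this matches how the repository numbers its classes.
    cls = 0 if int(digit) == 10 else int(digit)
    # YOLO expects the box center and size, normalized by the image size.
    x_center = (left + width / 2) / img_w
    y_center = (top + height / 2) / img_h
    return (f"{cls} {x_center:.6f} {y_center:.6f} "
            f"{width / img_w:.6f} {height / img_h:.6f}")


line = svhn_box_to_yolo(left=20, top=10, width=40, height=60,
                        img_w=200, img_h=100, digit=10)
print(line)  # 0 0.200000 0.400000 0.200000 0.600000
```

Each line of a YOLO label file describes one box, so an image with three digits would produce a text file with three such lines.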
python train_valid_split.py --data-root <path_to_data_root_dir> --ratio 0.2
- input: the same data root directory as in the previous step, now also containing that step's output
- output:
  - <path_to_data_root_dir>/images/: contains two subfolders, train/ (training images) and valid/ (validation images)
  - <path_to_data_root_dir>/labels/train/: text files that contain the training labels
  - <path_to_data_root_dir>/labels/valid/: text files that contain the validation labels
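As a rough illustration of the split, the sketch below divides a list of image names into training and validation sets. It is not the code in train_valid_split.py; the function name `split_train_valid` and the fixed seed are hypothetical, and it assumes that --ratio is the fraction of images held out for validation.

```python
import random


def split_train_valid(names, ratio=0.2, seed=0):
    """Return (train, valid) lists; `ratio` is the validation fraction."""
    shuffled = sorted(names)  # sort first so the result is reproducible
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * ratio)
    # The first n_valid shuffled names go to validation, the rest to training.
    return shuffled[n_valid:], shuffled[:n_valid]


train, valid = split_train_valid([f"{i}.png" for i in range(1, 11)], ratio=0.2)
print(len(train), len(valid))  # 8 2
```

Each image and its label file must end up in the same split, so in practice the same name list would drive both the images/ and labels/ copies.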
Go to yolov5/data/custom-data.yaml and modify the path, train, val, and test entries so they point to your data.
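For orientation, the sketch below prints what such a dataset configuration might look like. The keys path, train, val, test, nc, and names are the ones YOLOv5 dataset files use; everything else is an assumption: the helper `build_dataset_yaml` is hypothetical, the image folders follow the layout produced by the split step, the test folder is a placeholder you must supply, and the ten class names are assumed to be the digits 0 to 9.

```python
def build_dataset_yaml(data_root, test_dir):
    """Return the text of a YOLOv5-style dataset config for SVHN digits."""
    names = [str(d) for d in range(10)]  # assumption: one class per digit
    lines = [
        f"path: {data_root}",      # dataset root directory
        "train: images/train",     # training images, relative to path
        "val: images/valid",       # validation images, relative to path
        f"test: {test_dir}",       # hypothetical location of the test images
        f"nc: {len(names)}",       # number of classes
        f"names: {names}",         # class names as a YAML flow sequence
    ]
    return "\n".join(lines) + "\n"


# Both arguments are placeholders; replace them with your own paths.
print(build_dataset_yaml("/data/svhn", "test"))
```

Compare the printed text with the file shipped in the repository before copying anything over, since the repository's own file is the reference.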
You need a GPU to train the model. For reference, we trained on 2 NVIDIA GTX 1080 Ti GPUs for 14 hours.
Recommended training command:
cd yolov5
python train.py --weights <yolov5s.pt_file> --cfg models/yolov5s.yaml --data data/custom-data.yaml --epochs 150 --cache --device <gpu_ids> --workers 4 --project <train_log_dir> --save-period 5
There are more arguments you can tune in yolov5/train.py. We recommend sticking with the default settings at first.
- input: the pre-trained yolov5s.pt, downloaded from https://github.com/ultralytics/yolov5/releases/tag/v6.0
- output: the logging directory <train_log_dir>. Note: your first experiment is saved in a subdirectory named exp/, your second in exp2/, and so on. Inside this logging directory you can find:
  - weights/: all the training checkpoints. A checkpoint is saved every 5 epochs; best.pt holds the current best model and last.pt holds the latest model.
  - TensorBoard event files
  - Miscellaneous information about the data and the current hyperparameters
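Because each run lands in a new exp/, exp2/, ... subdirectory, it can be handy to locate the newest one programmatically. The helper below is a minimal sketch under that naming assumption; `latest_run_dir` is a hypothetical name and is not part of YOLOv5 or this repository.

```python
import re
from pathlib import Path


def latest_run_dir(log_dir):
    """Return the newest exp/, exp2/, ... directory under log_dir, or None."""
    runs = []
    for child in Path(log_dir).iterdir():
        match = re.fullmatch(r"exp(\d*)", child.name)
        if child.is_dir() and match:
            # A bare "exp" is the first run; "expN" is the N-th run.
            runs.append((int(match.group(1) or 1), child))
    if not runs:
        return None
    # Compare by run number, not by name, so exp10 sorts after exp2.
    return max(runs, key=lambda run: run[0])[1]


# Example (paths are placeholders):
# best_ckpt = latest_run_dir("<train_log_dir>") / "weights" / "best.pt"
```

The resulting checkpoint path is what the validation and testing commands expect as <ckpt_path>.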
You can validate your training results with the following recommended command:
cd yolov5
python val.py --data data/custom-data.yaml --weights <ckpt_path> --device <gpu_ids> --project <val_log_dir>
- input: your model checkpoint path
You can run detection on the testing set with the following recommended command:
cd yolov5
python detect.py --weights <ckpt_path> --source <test_data_dir_path> --save-txt --device <gpu_id> --save-conf --nosave
- input:
  - trained model checkpoint
  - testing images
- output: yolov5/runs/detect/exp<X>/labels/ will be generated. It contains one text file per testing image, named after the image, holding the detection results for that image in YOLO format.
Alternatively, the following command produces COCO-format results directly, so no post-processing is needed afterward:
cd yolov5
python val.py --data data/custom-data.yaml --weights <ckpt_path> --device <gpu_id> --project <test_log_dir> --task test --save-txt --save-conf --save-json
- input: trained model checkpoint
- output: <test_log_dir>/exp<X>/<ckpt_name>.json, the COCO-format detection results for the test set
Turn YOLO format detection results into COCO format.
python yolo2coco.py --yolo-path <detect_label_dir>
- input: detection results in the testing step.
- output:
answer.json
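The core of this conversion can be sketched as follows. This is not the code in yolo2coco.py; the function name `yolo_line_to_coco` is hypothetical. It assumes each detection line has the form `class x_center y_center width height confidence` with normalized coordinates (the layout written when --save-txt and --save-conf are both set), that the image size is known, and that the class index can be used directly as the category id, which may differ from what the challenge expects.

```python
def yolo_line_to_coco(line, image_id, img_w, img_h):
    """Convert one `class xc yc w h conf` line into a COCO results entry."""
    cls, xc, yc, w, h, conf = line.split()
    # Scale the normalized size back to pixels.
    box_w, box_h = float(w) * img_w, float(h) * img_h
    return {
        "image_id": image_id,
        "category_id": int(cls),  # assumption: class index equals category id
        # COCO boxes are [x_min, y_min, width, height] in pixels.
        "bbox": [float(xc) * img_w - box_w / 2,
                 float(yc) * img_h - box_h / 2,
                 box_w, box_h],
        "score": float(conf),
    }


entry = yolo_line_to_coco("3 0.5 0.5 0.25 0.5 0.9",
                          image_id=117, img_w=200, img_h=100)
print(entry["bbox"])  # [75.0, 25.0, 50.0, 50.0]
```

A COCO results file is a JSON list of such entries, one per detected box across all test images.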
Run this command to compress your submission file:
zip answer.zip answer.json
You can upload answer.zip to the challenge to get your testing score.
Go to Releases. Under My YOLOv5s model, download yolov5_best.pt. This pre-trained model achieves a score of 0.4217 on the SVHN testing set.
To reproduce our results, run this command:
cd yolov5
python val.py --data data/custom-data.yaml --weights <yolov5_best.pt_path> --device <gpu_id> --project <test_log_dir> --task test --save-txt --save-conf --save-json
Open inference.ipynb in Google Colab and follow the instructions in it.
To reproduce our submission without retraining, follow these steps:
- Getting the code
- Install the dependencies
- Download the data and data pre-processing
- Download pre-trained models
- Inference
- Submit the results
- Testing score:
| conf_thres | 0.25 | 0.01 | 0.001 |
|---|---|---|---|
| score | 0.4067 | 0.4172 | 0.4217 |
- Detection speed: 22.7ms per image
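A per-image latency like the one above (22.7 ms, or roughly 44 images per second) is typically obtained by timing the detector over a set of images and averaging. The sketch below shows that idea only; it is not the code in inference.ipynb, and `mean_latency_ms` and `detect_fn` are hypothetical names. When timing on a GPU, the device must also be synchronized before reading the clock, which this sketch leaves to `detect_fn`.

```python
import time


def mean_latency_ms(detect_fn, images, warmup=1):
    """Average wall-clock time per image, in milliseconds."""
    if not images:
        raise ValueError("need at least one image to time")
    # Warm-up calls are excluded so one-time setup cost is not measured.
    for image in images[:warmup]:
        detect_fn(image)
    start = time.perf_counter()
    for image in images:
        detect_fn(image)
    elapsed = time.perf_counter() - start
    return elapsed * 1000 / len(images)


# Example with a stand-in detector that does nothing:
print(mean_latency_ms(lambda image: None, [0, 1, 2]) >= 0)  # True
```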
We thank the authors of these repositories:
If you find our work useful in your project, please cite:
@misc{
title = {yolov5-svhn-detection},
author = {Zhi-Yi Chin},
url = {https://github.com/joycenerd/yolov5-svhn-detection},
year = {2021}
}
If you'd like to contribute or have any suggestions, you can contact us or open an issue on this GitHub repository.
All contributions welcome! All content in this repository is licensed under the MIT license.