
HRNet Swimming Pose Estimation

Overview

This project focuses on adapting a High-Resolution Network (HRNet) for pose estimation in an underwater swimming environment. Pose estimation underwater presents unique challenges, including light refraction, motion distortion, and occlusions caused by water turbulence. Leveraging HRNet's ability to maintain high-resolution representations throughout the process, this project aims to overcome these obstacles and achieve accurate keypoint detection.

Data Collection

The dataset for this project was collected from the University of South Carolina Swim and Dive Team; athletes were recorded during training sessions to build a comprehensive dataset for underwater pose estimation.
The training data can be found in the data directory. It contains several dataset directories, one per source video. Each dataset directory holds a directory of frames and a COCO-format JSON annotation file, and each frame is annotated with 13 keypoints representing swimmer biomechanics.
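As a rough sketch of how the annotations are laid out, the snippet below parses a minimal COCO-style record. The field names follow the COCO keypoint convention (a flat `[x, y, visibility]` list per annotation); the exact file and frame names here are illustrative, not taken from the repository.

```python
import json

# Illustrative COCO-style annotation record; the real files live under
# the data directory (file names here are made up for the example).
annotation_json = json.dumps({
    "images": [{"id": 1, "file_name": "frame_0001.jpg"}],
    "annotations": [{
        "image_id": 1,
        # 13 keypoints flattened as [x, y, visibility] triples
        "keypoints": [float(i) for i in range(13 * 3)],
        "num_keypoints": 13,
    }],
})

data = json.loads(annotation_json)
ann = data["annotations"][0]
# Regroup the flat list into (x, y, visibility) triples, one per keypoint
kps = ann["keypoints"]
triples = [(kps[i], kps[i + 1], kps[i + 2]) for i in range(0, len(kps), 3)]
print(len(triples))  # 13 keypoints per frame
```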

Implementation Details

This project was developed based on the research paper "Deep High-Resolution Representation Learning for Human Pose Estimation" by Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. The paper introduces HRNet, which maintains high-resolution representations throughout the process, achieving superior accuracy in pose estimation tasks. This project's implementation draws inspiration from the authors' official HRNet repository on GitHub.

Installation

  1. Clone this repository to your local machine:

    git clone https://github.com/csce585-mlsystems/FancyBear.git
  2. Navigate into the project directory:

    cd FancyBear/Swimming_Pose_Estimation
  3. Install the required packages:

    pip install -r requirements.txt

Usage

1. Testing the Model on a Swimming Video

First, open Tester.py and set the model and input video file paths on lines 182 and 188.
Then run the script:

python Tester.py

This script will estimate keypoints found in the video and output a new video with the estimated keypoints.
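The per-frame step can be sketched as follows. This is an illustrative stand-in, not Tester.py itself: the `predict_keypoints` function fakes the HRNet forward pass with fixed outputs, and the marker drawing uses plain NumPy instead of a video library.

```python
import numpy as np

# Hypothetical sketch of the per-frame overlay step. In Tester.py the
# model predicts keypoints for each video frame; here a stand-in "model"
# returns a fixed (x, y) for each of the 13 keypoints.
def predict_keypoints(frame):
    h, w = frame.shape[:2]
    return [(w // 2, h // 2)] * 13  # stand-in for the HRNet forward pass

def draw_keypoints(frame, keypoints, radius=2):
    out = frame.copy()
    for x, y in keypoints:
        # paint a small white square marker at each keypoint
        out[max(0, y - radius):y + radius + 1,
            max(0, x - radius):x + radius + 1] = 255
    return out

frame = np.zeros((64, 64, 3), dtype=np.uint8)  # fake black frame
annotated = draw_keypoints(frame, predict_keypoints(frame))
print(annotated.max())  # markers were drawn in white
```

In the real script this loop runs over every frame of the input video, and the annotated frames are written back out as a new video.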

2. Training the Model

Training the model requires an NVIDIA GPU. To modify the learning rate, batch size, and number of epochs, edit lines 142 and 145 of hrnetTrainer.py. Whenever a new minimum loss is found, the trainer saves the model as checkpoint.pth.
To start training, run hrnetTrainer.py:

python hrnetTrainer.py
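The "save on new minimum" behavior amounts to tracking the best loss seen so far. The sketch below shows that logic in isolation; the hyperparameter values and the fake loss sequence are illustrative, not the actual defaults on lines 142 and 145.

```python
# Illustrative values only, not the trainer's actual defaults
learning_rate = 1e-3
batch_size = 8
num_epochs = 3

best_loss = float("inf")
saved_epochs = []

val_losses = [0.9, 0.7, 0.8]  # pretend per-epoch losses
for epoch, loss in enumerate(val_losses):
    if loss < best_loss:
        best_loss = loss
        # in the real trainer this is where the checkpoint is written,
        # e.g. torch.save(model.state_dict(), "checkpoint.pth")
        saved_epochs.append(epoch)

print(saved_epochs)  # epochs where a new minimum was found
```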

3. System Testers

There are a few scripts to test various components of the system.

Test the Model on One Frame:

The hrnetTester.py script checks the model's accuracy on a single frame. It loads the frame into the chosen model and displays it with the predicted keypoints, the ground-truth position of each keypoint (GT keypoint n), each predicted keypoint (Pred Keypoint n) with its confidence level, and the Euclidean distance from each ground-truth keypoint to its prediction.
First, open hrnetTester.py and set the desired model and frame on lines 165 and 174. Then run hrnetTester.py:

python hrnetTester.py
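The error metric reported per keypoint is a plain Euclidean distance between the ground-truth and predicted coordinates, which can be sketched as below (the function name and the sample coordinates are illustrative):

```python
import math

# Per-keypoint error: Euclidean distance between ground truth and prediction
def keypoint_distances(gt, pred):
    return [math.dist(g, p) for g, p in zip(gt, pred)]

gt = [(10.0, 10.0), (20.0, 20.0)]    # illustrative ground-truth (x, y)
pred = [(13.0, 14.0), (20.0, 25.0)]  # illustrative predictions
print(keypoint_distances(gt, pred))  # [5.0, 5.0]
```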

Test the Data Augmentation Functionality:

The testerAugmentation.py script tests three data augmentation techniques: horizontal flip, rotation, and translation. It selects a random frame from the dataset, applies the augmentations, and shows the before and after frames with the corresponding keypoints.
To run the augmentation tester:

python testerAugmentation.py
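The key subtlety these augmentations share is that the keypoints must be transformed together with the image. For the horizontal-flip case, that means mirroring each x coordinate across the frame width; the sketch below illustrates only the coordinate remap (function name is made up, and a full implementation would also swap left/right keypoint labels):

```python
# Horizontal flip of keypoints for a frame of the given width:
# x becomes width - 1 - x, y is unchanged.
def hflip_keypoints(keypoints, width):
    return [(width - 1 - x, y) for x, y in keypoints]

kps = [(10, 30), (50, 40)]           # illustrative (x, y) keypoints
print(hflip_keypoints(kps, width=64))  # [(53, 30), (13, 40)]
```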

Test the Data Loader:

The testerDATALOADER.py script tests the functionality of the dataloader by loading the dataset and displaying a batch of 4 frames with their file names and corresponding keypoints.
To run the dataloader tester:

python testerDATALOADER.py
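The batching the tester exercises can be illustrated without the real dataset: group frame names into chunks of 4, with a smaller final batch if the count does not divide evenly (the actual project presumably uses a PyTorch DataLoader; this stand-in and its frame names are illustrative).

```python
# Minimal stand-in for batch_size=4 batching over a list of frames
def batches(items, batch_size=4):
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

frames = [f"frame_{i:04d}.jpg" for i in range(10)]  # made-up frame names
out = batches(frames)
print([len(b) for b in out])  # [4, 4, 2]
```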
