SKETCH

This is the official implementation of "Doodle to Detect: A Goofy but Powerful Approach to Skeleton-based Hand Gesture Recognition".

Overview

SKETCH is a novel skeleton-based hand gesture recognition framework that directly transforms raw 4D skeleton sequences into interpretable visual graph representations. It introduces a learnable Dynamic Range Embedding (DRE) to preserve motion magnitudes and applies per-window normalization for robustness to viewpoint and body variations. By leveraging pre-trained vision backbones, SKETCH achieves state-of-the-art accuracy and generalization on the SHREC’19 and SHREC’22 benchmarks.

Figure 1. Overall architecture of SKETCH.
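
To make the idea of per-window normalization concrete, here is a minimal sketch under stated assumptions. It is not taken from this repository: the function name `normalize_window`, the nested-list input layout (frames x joints x axes), and the choice of min-max scaling per axis are all illustrative assumptions. The sketch also returns the per-axis range that normalization discards, which is the kind of magnitude information the overview says DRE is meant to preserve.

```python
from typing import List, Sequence, Tuple

# Assumed layout: one frame is a list of joints, one joint is a list of coordinates.
Frame = Sequence[Sequence[float]]


def normalize_window(
    window: Sequence[Frame],
) -> Tuple[List[List[List[float]]], List[float]]:
    """Min-max normalize each coordinate axis over a whole window.

    Returns the normalized window and the per-axis range (max - min).
    Illustrative assumption only, not the repository's implementation.
    """
    n_axes = len(window[0][0])
    # Extremes are taken over every joint of every frame in the window.
    lows = [min(j[a] for f in window for j in f) for a in range(n_axes)]
    highs = [max(j[a] for f in window for j in f) for a in range(n_axes)]
    ranges = [hi - lo for lo, hi in zip(lows, highs)]
    normalized = [
        [
            [
                # A constant axis has zero range; map it to 0.0 to avoid dividing by zero.
                (j[a] - lows[a]) / ranges[a] if ranges[a] > 0 else 0.0
                for a in range(n_axes)
            ]
            for j in f
        ]
        for f in window
    ]
    return normalized, ranges
```

After this step every window spans [0, 1] on each non-constant axis, so a small and a large motion look alike unless the range is carried along separately.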

Development Progress

Core Implementation

Demo script
Add training pipeline (SHREC'22)
Release pretrained model (SHREC'22)
Add post processing pipeline (SHREC'22)

Results

Figure 2. Visualization of joint-level attention for gestures in the SHREC’22 dataset.

Comparison on SHREC’19

Method	DR↑	FP↓	Time (s)	FPS	JCD	FD	Adj	Ref	Plot
PSUMNet	0.64	0.22	0.0250	40	✓	✓
MS-G3D	0.69	0.25	0.0303	33	✓
SeS-GCN	0.75	0.12	0.0020	500			✓
SW 3-cent	0.76	0.19	0.0030	333		✓		✓
DSTA	0.81	0.08	0.0088	114	✓
DG-STA	0.81	0.07	0.0042	238			✓
DDNet	0.82	0.10	0.0022	455	✓	✓
uDeepGRU	0.85	0.10	0.0030	333		✓		✓
OO-dMVMT	0.88	0.05	0.0058	172	✓	✓
DS-GCN	0.80	0.05	-	-			✓	✓
BlockGCN	0.83	0.04	-	-		✓	✓	✓
ProtoGCN	0.86	0.05	0.0334	30			✓	✓
SKETCH (V-L-16-384)	0.90	0.03	0.0176	57					✓
SKETCH (S-S-4-7-224)	0.88	0.04	0.0039	256					✓
SKETCH (S-B-4-12-384)	0.91	0.03	0.0091	110					✓
SKETCH (S-L-16-384)	0.92	0.02	0.0142	70					✓
SKETCH (S-L-16-384) + Aug	0.93	0.01	0.0142	70					✓
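
The FPS column in the table above appears to be the rounded reciprocal of the Time (s) column. The short check below reproduces a few rows under that assumption. The helper name `fps_from_time` is hypothetical, and treating Time (s) as the time for one inference call is an assumption as well.

```python
def fps_from_time(seconds_per_inference: float) -> int:
    """Convert a per-inference time in seconds to a rounded frames-per-second value."""
    return round(1.0 / seconds_per_inference)


# Time (s) values copied from the SHREC'19 table above.
times = {
    "SKETCH (V-L-16-384)": 0.0176,
    "SKETCH (S-S-4-7-224)": 0.0039,
    "SKETCH (S-B-4-12-384)": 0.0091,
    "SKETCH (S-L-16-384)": 0.0142,
}

for method, seconds in times.items():
    print(f"{method}: {fps_from_time(seconds)} FPS")
```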

Comparison on SHREC’22

Method	DR↑	FP↓	JI↑	Delay (fr.)	Time (s)	JCD	FD	Adj	Plot
DeepGRU	0.26	0.25	0.21	8.0	0.0031
DG-STA	0.51	0.32	0.40	8.0	0.0042			✓
SeS-GCN	0.60	0.16	0.53	8.0	0.0018			✓
PSUMNet	0.62	0.24	0.52	8.0	0.0244	✓	✓
MS-G3D	0.68	0.21	0.57	8.0	0.0293	✓		✓
Stronger	0.72	0.34	0.59	14.8	0.1000	✓	✓
DSTA	0.73	0.24	0.61	8.0	0.0092	✓		✓
2ST-GCN+5F	0.74	0.23	0.61	13.3	0.0021	✓	✓
TN-FSM+JD	0.77	0.23	0.63	10.0	0.0046	✓		✓
Causal TCN	0.80	0.29	0.68	19.0	0.0280			✓
DDNet	0.88	0.16	0.78	8.0	0.0022	✓	✓
OO-dMVMT	0.92	0.09	0.85	8.0	0.0041	✓	✓
SKETCH (S-B-4-12-384)	0.91	0.06	0.86	8.0	0.0097				✓
SKETCH (S-L-4-12-384)	0.92	0.07	0.87	8.0	0.0124				✓
SKETCH (S-L-4-12-384) + Aug	0.95	0.06	0.91	8.0	0.0124				✓

Installation

conda create -n sketch python=3.7 -y
conda activate sketch
git clone https://github.com/capableofanything/SKETCH.git
cd SKETCH
pip install -r requirements.txt

Dataset Structure

SHREC'22

The dataset should be organized as follows:

SKETCH/
└── shrec22/
    └── SHREC2022/        # SHREC'22 dataset root
        ├── shrec2022_training_set/
        │   ├── annotations.txt
        │   ├── 1.txt
        │   ├── 2.txt
        │   └── ...
        └── shrec2022_test_set/
            ├── annotations.txt
            ├── 1.txt
            ├── 2.txt
            └── ...
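
Before running the pipeline, it can help to confirm that the layout above is in place. The following is a small sketch that only checks the paths shown in the tree. The function name `missing_shrec22_paths` is hypothetical and is not part of this repository.

```python
from pathlib import Path
from typing import List


def missing_shrec22_paths(repo_root: str = ".") -> List[str]:
    """Return the expected SHREC'22 paths that are missing under repo_root."""
    dataset_root = Path(repo_root) / "shrec22" / "SHREC2022"
    missing: List[str] = []
    for split in ("shrec2022_training_set", "shrec2022_test_set"):
        split_dir = dataset_root / split
        if not split_dir.is_dir():
            missing.append(str(split_dir))
            continue
        if not (split_dir / "annotations.txt").is_file():
            missing.append(str(split_dir / "annotations.txt"))
        # Sequence files are numbered: 1.txt, 2.txt, ...
        if not any(p.stem.isdigit() for p in split_dir.glob("*.txt")):
            missing.append(str(split_dir / "<N>.txt"))
    return missing
```

Calling `missing_shrec22_paths(".")` from the SKETCH directory returns an empty list when both splits, their annotation files, and at least one numbered sequence file are present.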

Demo Script

The demo.py script generates a small sample of SHREC'22 windows and corresponding stacked coordinate images.

python demo.py --num_files 2 --window 16 --stride 1
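
The --window and --stride flags suggest a standard sliding-window split over the frames of each sequence. The sketch below shows that technique in isolation. It is an assumption about what the flags mean, not the code in demo.py; in particular, how the script treats sequences shorter than one window or trailing frames is not stated here, and this sketch simply drops them.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(
    frames: Sequence[T], window: int = 16, stride: int = 1
) -> List[Sequence[T]]:
    """Split a frame sequence into fixed-length, possibly overlapping windows.

    Only full windows are returned; a sequence shorter than `window`
    yields an empty list (an assumption made for this sketch).
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(frames) - window
    return [frames[s : s + window] for s in range(0, last_start + 1, stride)]


# A 20-frame sequence with window=16 and stride=1 gives starts 0, 1, 2, 3, 4.
windows = sliding_windows(list(range(20)), window=16, stride=1)
print(len(windows))  # 5
```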

Full Pipeline Execution

To reproduce the full SHREC'22 workflow, run the following scripts in order:

python prepare_window/shrec22_prepare_window.py
python draw/shrec22_draw_3stack.py
python train.py
python post_processing/shrec22/post_processing.py
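
If you prefer a single entry point, the four steps can be chained with a small runner that stops at the first failing step. This is a sketch, not a script shipped with the repository: the name `run_pipeline` and the injectable `runner` parameter are hypothetical, and the script paths are copied from the commands above.

```python
import subprocess
import sys
from typing import Callable, Sequence

# Script paths from the list above. Forward slashes work on Windows, Linux, and macOS.
PIPELINE_STEPS = (
    "prepare_window/shrec22_prepare_window.py",
    "draw/shrec22_draw_3stack.py",
    "train.py",
    "post_processing/shrec22/post_processing.py",
)


def run_pipeline(
    steps: Sequence[str] = PIPELINE_STEPS,
    runner: Callable[..., object] = subprocess.run,
) -> None:
    """Run each script in order with the current Python interpreter.

    With the default runner, check=True raises CalledProcessError on a
    non-zero exit code, so later steps never run on stale outputs.
    """
    for script in steps:
        print(f"Running {script}")
        runner([sys.executable, script], check=True)
```

Call `run_pipeline()` from the SKETCH directory so that the relative script paths resolve.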

Pretrained Models

We provide pretrained weights for SHREC'22.
You can download them from the following link:

🔗 Download Pretrained Models (Google Drive)

Community & Contribution

We encourage you to apply SKETCH to other human action recognition tasks
and to share your results or insights with the community!

If you adapt or extend our framework, feel free to open an issue,
start a discussion, or submit a pull request. We’d love to hear from you.

Citation

@inproceedings{handoodle,
  title={Doodle to Detect: A Goofy but Powerful Approach to Skeleton-based Hand Gesture Recognition},
  author={Han, Sang Hoon and Lee, Seonho and Nam, Hyeok and Park, Jae Hyeon and Cha, Min Hee and Kim, Min Geol and Lee, Hyunse and Ahn, Sangyeon and Cho, Sung In and others},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems}
}

License

This project is licensed under the MIT License.

Contact

For questions, collaborations, or feedback, please contact:
SangHoon Han (Co-first Author): [email protected]
Seonho Lee (Co-first Author): [email protected]
Hyeok Nam (Co-first Author): [email protected]
Sung In Cho (Corresponding Author): [email protected]
