This repository contains the software side of a gesture-recognition pipeline that targets a custom SoC on the Arty S7-25 FPGA. A webcam on my MacBook Pro collects hand landmarks with MediaPipe, a lightweight MLP classifies gestures, and the quantized model is exported so the FPGA can drive a BiStable robot via an ESP32 WiFi link. The long-term goal is a secure, low-latency loop where the FPGA validates a four-gesture passcode through an enclave block, updates a host-facing display, and streams motion commands to the robot.
- Capture – `gesture-pipelines/gesture-webcam.py` streams frames, extracts 21-point landmarks, and soft-validates predictions with the quantized weights.
- Train – `training/mlp-training.py` fits the MLP on curated datasets (`data/landmarks_filtered.csv`) to refresh `models/gesture_mlp_model.h5`.
- Quantize/Export – `scripts/quantizing.py` converts per-layer CSV weights into `models/quantized_weights.bin` for FPGA consumption.
- Deploy – FPGA logic consumes the binary weights, runs inference, evaluates the gesture passcode in a secure enclave, and relays unlock + control signals to the robot through the ESP32 peripheral.
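Each captured hand yields 21 MediaPipe landmarks, each an (x, y, z) triple, so the classifier sees a 63-element feature vector. A minimal sketch of the flattening step (the exact ordering used by `gesture-webcam.py` is an assumption; verify before reusing):

```python
def flatten_landmarks(landmarks):
    """Flatten 21 (x, y, z) landmark triples into a 63-element feature vector.

    `landmarks` is any iterable of 21 (x, y, z) tuples. The interleaved
    ordering (x0, y0, z0, x1, ...) is an assumption -- match whatever the
    capture script actually emits before training on it.
    """
    flat = [coord for point in landmarks for coord in point]
    if len(flat) != 63:
        raise ValueError(f"expected 21 landmarks x 3 coords, got {len(flat)} values")
    return flat
```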
- `data/` – Landmark CSV datasets (`landmarks.csv`, `landmarks_filtered.csv`).
- `data-modifications/` – CSV utilities for combining, pruning, and labeling landmark data.
- `training/` – Model training and evaluation scripts (`mlp-training.py`, `gesture-logger.py`).
- `gesture-pipelines/` – MediaPipe-based webcam and still-image prototypes.
- `models/` – Trained assets (`gesture_mlp_model.h5`, `gesture_recognizer.task`, `quantized_weights.bin`, `layer_*_{weights,bias}.csv`).
- `scripts/` – Utility scripts (`quantizing.py`).
- `images/` – Sample gesture reference images.
- Use Python 3.10+ and create a virtual environment (`python -m venv gesture-env`).
- Activate the environment (`source gesture-env/bin/activate` on macOS/Linux).
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  # or install mediapipe, tensorflow, scikit-learn, opencv-python, matplotlib, pandas, numpy
  ```

- Plug in a webcam and verify MediaPipe access before running the pipelines.
Run the MLP training loop from the repository root:
```bash
python training/mlp-training.py
```

The script stratifies the dataset, computes class weights, and reports validation accuracy. It also writes `models/scaler_params.npz`, which captures the normalization statistics used during training; keep this file with the exported weights so inference on the FPGA or host matches your preprocessing. Keep accuracy above 0.90 to maintain reliable unlock sequences; adjust preprocessing or class weights if performance drops. Use `training/gesture-logger.py` to collect new samples and extend the dataset when onboarding new gestures.
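When running inference outside the training script, the same normalization must be reapplied to each landmark vector. A hedged sketch, assuming `scaler_params.npz` stores the scaler's statistics under the keys `mean` and `scale` (check what `mlp-training.py` actually saves and adjust the keys):

```python
import numpy as np

def normalize_landmarks(landmarks, params_path="models/scaler_params.npz"):
    """Apply the training-time standardization to a flat landmark vector.

    Assumes the .npz archive holds "mean" and "scale" arrays, mirroring a
    scikit-learn StandardScaler's mean_ and scale_ attributes. These key
    names are an assumption, not confirmed against the training script.
    """
    params = np.load(params_path)
    x = np.asarray(landmarks, dtype=np.float32)
    return (x - params["mean"]) / params["scale"]
```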
After training, regenerate the FPGA-ready weights:
```bash
python scripts/quantizing.py
```

This script expects the latest `layer_*` CSV exports in the `models/` directory and rewrites `models/quantized_weights.bin`. Consume this binary inside your HDL/SoC project to initialize BRAM or ROM blocks. Track the checksum or git hash of each binary when flashing the Arty S7-25 to keep the hardware configuration auditable.
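The actual binary layout is defined by `scripts/quantizing.py`. As an illustration only, a symmetric per-tensor int8 scheme (one common choice for FPGA-friendly weights, not necessarily the one used here) looks like:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (illustrative sketch).

    Maps the tensor's largest magnitude to 127 and rounds everything else
    to the nearest int8 step. The real script may use a different scheme
    (per-channel scales, fixed-point shifts, etc.).
    """
    scale = float(np.abs(weights).max()) / 127.0
    if scale == 0.0:
        scale = 1.0  # avoid division by zero for all-zero tensors
    q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
    return q, scale
```

On the FPGA side, the dequantized value is simply `q * scale`, so only the int8 tensor and one scale per layer need to live in BRAM.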
Use the webcam prototype to verify predictions end-to-end:
```bash
python gesture-pipelines/gesture-webcam.py
```

The script loads `models/gesture_recognizer.task`, applies the saved scaler parameters, feeds frames through MediaPipe, and evaluates the quantized weights in Python. Confirm latency and class stability here before synthesizing FPGA builds. When experimenting with new display peripherals, mirror FPGA output in the console to speed up debugging.
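Before committing the math to HDL, it helps to keep a small software reference of the forward pass to diff against hardware outputs. A minimal NumPy sketch, assuming dense layers with ReLU on the hidden layers and raw logits at the output (verify the activation choices against `mlp-training.py`):

```python
import numpy as np

def mlp_forward(x, layers):
    """Dense MLP forward pass: ReLU on hidden layers, raw logits at the end.

    `layers` is a list of (weights, bias) pairs with weights shaped
    (in_features, out_features); this layout is an assumption and should
    match how the layer_* CSVs are exported.
    """
    h = np.asarray(x, dtype=np.float32)
    for i, (w, b) in enumerate(layers):
        h = h @ w + b
        if i < len(layers) - 1:  # activation on hidden layers only
            h = np.maximum(h, 0.0)
    return h
```

The predicted gesture is then `np.argmax(mlp_forward(features, layers))`, which is the value the FPGA core must reproduce.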
- Reserve BRAM for the three dense layers and plan a streaming interface for 21 landmark triplets produced by the host.
- The unlock flow requires buffering four gestures; implement a state machine that mirrors the secure enclave logic.
- Use the ESP32 WROOM module as a WiFi co-processor: define a narrow command protocol (`FORWARD`, `LEFT`, etc.) and expose diagnostics on UART for bring-up.
- Plan to surface FPGA status (locked/unlocked, last gesture, radio link health) back to the Mac for operator visibility.
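The four-gesture unlock buffering above can be prototyped in software before writing the HDL state machine. A minimal sketch using hypothetical gesture labels and a reset-on-mismatch policy (the enclave's real policy, including prefix handling and lockout timing, may differ):

```python
class PasscodeFSM:
    """Software mirror of the four-gesture unlock state machine.

    Advances one state per matching gesture and resets on a mismatch
    (re-entering the first gesture restarts the sequence). Gesture names
    and the reset policy are illustrative assumptions.
    """

    def __init__(self, passcode=("fist", "open", "peace", "point")):
        self.passcode = passcode
        self.index = 0  # how many gestures of the code have matched so far

    def feed(self, gesture):
        """Consume one classified gesture; return True when the code completes."""
        if gesture == self.passcode[self.index]:
            self.index += 1
        elif gesture == self.passcode[0]:
            self.index = 1  # mismatch, but it restarts the sequence
        else:
            self.index = 0
        if self.index == len(self.passcode):
            self.index = 0  # relock for the next attempt
            return True
        return False
```

Because the state is just an index saturating at the code length, the HDL version reduces to a small counter plus a comparator per stored gesture.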
Short-term objectives include documenting the passcode enclave interface, scripting an automated export flow (train → quantize → package), and measuring end-to-end latency. Mid-term, add HDL testbenches for the MLP core and integrate display peripherals. Long-term, secure the communication path (host ↔ FPGA ↔ ESP32) and finalize robot control behaviors.