A React app that reads text aloud from a camera feed, controlled by hand gestures and designed for legally blind users.
Built on MediaPipe, OCR, and custom gesture detection, this tool lets visually impaired users access printed text in their environment.
✨ Features
Real-time camera capture and processing
Hand gesture recognition (e.g. “point left”, “O”, open palm)
OCR (text recognition) on camera frames
Text-to-speech output to read recognized words aloud
Lightweight fallback and buffering to avoid flicker errors
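The buffering idea above can be sketched as a small class that only emits a gesture after it has been predicted for several consecutive frames, suppressing single-frame flicker. This is an illustrative sketch, not the project's actual implementation; the class and parameter names are hypothetical.

```python
from collections import deque

class GestureBuffer:
    """Emit a gesture only after `required` consecutive identical
    predictions (hypothetical sketch of the flicker-avoidance idea)."""

    def __init__(self, required=5):
        self.required = required
        # Keep only the most recent `required` predictions.
        self.history = deque(maxlen=required)

    def update(self, prediction):
        """Feed one per-frame prediction; return the gesture once stable,
        otherwise None."""
        self.history.append(prediction)
        if len(self.history) == self.required and len(set(self.history)) == 1:
            return prediction  # same gesture for every buffered frame
        return None
```

A smaller `required` reacts faster but flickers more; a larger one is steadier but adds latency.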
🏗 Architecture & Components
| Component | Responsibility |
| --- | --- |
| Frontend (React / Next.js / “use client”) | Captures video, draws landmarks, sends gestures |
| Gesture Recognizer (MediaPipe Tasks–Vision) | Detects hand landmarks & base gesture categories |
| Custom Gesture Overrides | Rules-based detection for “O”, “point left”, etc. |
| Stable Gesture Buffering | Avoids flicker by requiring consistent predictions |
| Keypress Simulation | Emits synthetic key events mapped to gestures |
| Backend / OCR / TTS (Flask or similar) | Processes camera frames, runs OCR, reads text aloud |
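A rules-based override like the “O” gesture can be expressed as a distance check between MediaPipe hand landmarks (index 4 is the thumb tip, index 8 the index fingertip in MediaPipe's 21-point hand model). The threshold and the exact rule below are assumptions for illustration, not the project's actual logic.

```python
import math

# MediaPipe hand-landmark indices: 4 = thumb tip, 8 = index fingertip.
THUMB_TIP, INDEX_TIP = 4, 8

def is_o_gesture(landmarks, threshold=0.05):
    """Hypothetical rules-based check: if the thumb tip and index
    fingertip nearly touch, the hand is forming an 'O'.
    `landmarks` is a list of 21 (x, y) points in normalized image
    coordinates; `threshold` is an assumed tuning constant."""
    tx, ty = landmarks[THUMB_TIP]
    ix, iy = landmarks[INDEX_TIP]
    return math.hypot(tx - ix, ty - iy) < threshold
```

Similar per-landmark rules (e.g. comparing fingertip x-coordinates for “point left”) can layer on top of the base categories MediaPipe reports.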
🛠️ Setup & Run
1. Clone the repo

```bash
git clone https://github.com/groffbo/sight-to-speech.git
cd sight-to-speech
```

2. Create and activate a virtual environment

```bash
python3 -m venv venv
source venv/bin/activate   # On macOS/Linux
venv\Scripts\activate      # On Windows PowerShell
```