SignSense is a real-time American Sign Language (ASL) recognition desktop application built in Python. Designed as a precise, developer-focused tooling interface, it uses a live webcam feed to detect, classify, and track static alphabet signs.
The visual identity is minimal, functional, and clean. It features interactive overlay panels, real-time probability confidence scoring, left- and right-hand tracking, and a dedicated Word Builder mode for text assembly. Built without any external ML models, its recognition engine relies on MediaPipe's 3D hand landmark spatial tracking combined with heuristic geometric rules.
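As a rough illustration of the heuristic approach (a minimal sketch only, not the application's actual tuned rules), a finger can be treated as "extended" when its tip lies farther from the wrist than its middle (PIP) joint in 3D:

```python
import math

# MediaPipe hand landmark indices: wrist = 0; index-finger PIP = 6, tip = 8.
WRIST, INDEX_PIP, INDEX_TIP = 0, 6, 8

def is_extended(landmarks, pip, tip, margin=1.1):
    """Heuristic: a finger counts as extended if its tip is clearly
    farther from the wrist than its PIP joint (3D Euclidean distance)."""
    wrist = landmarks[WRIST]
    return math.dist(landmarks[tip], wrist) > margin * math.dist(landmarks[pip], wrist)

# Synthetic (x, y, z) landmarks describing an extended index finger.
pts = {WRIST: (0.0, 0.0, 0.0), INDEX_PIP: (0.0, 0.4, 0.0), INDEX_TIP: (0.0, 0.8, 0.0)}
print(is_extended(pts, INDEX_PIP, INDEX_TIP))  # → True
```

Rules like this, composed per finger, are how static letters can be classified from geometry alone, without a trained model.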
This project is built using the following core technologies:

- Python 3.12: The core programming language for the application logic. (Note: Python 3.12 is explicitly required because MediaPipe builds for Python 3.13+ drop support for the legacy `solutions` API used in this project.)
- OpenCV (`opencv-python`): Interfaces with the webcam, captures real-time video frames, and renders the custom heads-up display (HUD) overlay and interface elements.
- Google MediaPipe: Uses the legacy `mediapipe.solutions.hands` API to extract 21 3D hand landmarks in real time. Pinned to `<0.10.15` to ensure API and namespace compatibility.
- NumPy: Provides high-performance mathematical operations, matrix manipulations, and Euclidean distance calculations between hand landmarks.
- uv (Astral): Recommended for lightning-fast virtual environment creation and dependency resolution.
Follow these steps to get the application running on your local machine:
Ensure you have the following installed on your system:
- Python 3.12 (highly recommended to avoid MediaPipe compatibility issues)
- A working webcam connected to your computer
```
git clone https://github.com/yourusername/SignSense.git
cd SignSense
```

It is highly recommended to isolate the project dependencies. If you use uv, run:

```
uv venv --python 3.12
```

Alternatively, using standard Python venv:

```
python3.12 -m venv .venv
```

Then activate the environment:

- Windows (PowerShell):

  ```
  .\.venv\Scripts\Activate.ps1
  ```

- macOS / Linux:

  ```
  source .venv/bin/activate
  ```
Install the required packages strictly from `requirements.txt` to avoid breaking API changes:

```
pip install -r requirements.txt
```

(Or, if using uv: `uv pip install -r requirements.txt`)
Start the ASL recognizer by running the main script:

```
python sign_language_recognition.py
```

Note: The application requires camera permissions from your OS. It defaults to camera index `0`.
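If you need a different webcam, the index could be made configurable with a small argument parser. This is a hypothetical sketch: the shipped script hardcodes index 0 and has no `--camera` flag.

```python
import argparse

def parse_camera_index(argv=None):
    """Parse an optional --camera flag, defaulting to index 0
    (the value the application uses out of the box)."""
    parser = argparse.ArgumentParser(description="SignSense launcher options")
    parser.add_argument("--camera", type=int, default=0,
                        help="OpenCV camera index to open")
    return parser.parse_args(argv).camera

print(parse_camera_index([]))                  # → 0
print(parse_camera_index(["--camera", "1"]))   # → 1
```

The returned index would then be handed to OpenCV's capture constructor in place of the hardcoded 0.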
While the webcam window is focused, you can use the following standard keyboard shortcuts:
| Key | Action |
|---|---|
| Q / ESC | Quit the application cleanly |
| TAB | Toggle Word Builder Mode (text assembly) on/off |
| SPACE | Pause feed (Standard) OR add word to sentence (Word Builder Mode) |
| BACKSPACE | Clear history (Standard) OR delete last letter (Word Builder Mode) |
| S | Save a timestamped screenshot of the current frame to `./screenshots/` |
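The shortcut table above could be dispatched roughly as follows. This is a simplified sketch using OpenCV-style `waitKey` key codes; the application's real handler may differ in detail.

```python
# OpenCV waitKey-style codes: ESC = 27, TAB = 9, SPACE = 32, BACKSPACE = 8.
QUIT_KEYS = {ord('q'), 27}

def handle_key(key, state):
    """Mutate a simple state dict according to the shortcut table."""
    if key in QUIT_KEYS:
        state["running"] = False
    elif key == 9:                        # TAB: toggle Word Builder mode
        state["word_builder"] = not state["word_builder"]
    elif key == 32:                       # SPACE
        if state["word_builder"]:
            state["sentence"].append(state["word"])
            state["word"] = ""
        else:
            state["paused"] = not state["paused"]
    elif key == 8:                        # BACKSPACE
        if state["word_builder"]:
            state["word"] = state["word"][:-1]
        else:
            state["history"].clear()
    elif key == ord('s'):
        state["screenshot_requested"] = True
    return state

state = {"running": True, "word_builder": False, "paused": False,
         "word": "HI", "sentence": [], "history": ["A", "B"],
         "screenshot_requested": False}
handle_key(9, state)      # TAB: Word Builder on
handle_key(32, state)     # SPACE now commits the current word
print(state["sentence"])  # → ['HI']
```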
The application supports most static ASL alphabet letters and some common gestures.
| Category | Letters |
|---|---|
| Single finger | I (pinky), D (index), X (hooked index) |
| Two fingers | H, U, V, R, K, L |
| Three fingers | W |
| Four fingers | B |
| Full hand | C, O, E, S, A |
| Thumb combos | Y, L, T, G |
| Pinches | F, O, D |
| Special | ILY 🤟 (thumb + index + pinky) |
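The pinch-based letters in the table (F, O, D) can be detected heuristically by thresholding the thumb-tip-to-index-tip distance. This is a minimal sketch with an assumed threshold; the application's tuned rules are more involved.

```python
import math

THUMB_TIP, INDEX_TIP = 4, 8  # MediaPipe hand landmark indices

def is_pinching(landmarks, threshold=0.05):
    """Treat thumb tip and index tip as 'pinched' when their 3D Euclidean
    distance falls below a threshold in normalized image coordinates."""
    return math.dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP]) < threshold

pinched = {THUMB_TIP: (0.50, 0.50, 0.0), INDEX_TIP: (0.52, 0.51, 0.0)}
apart   = {THUMB_TIP: (0.30, 0.50, 0.0), INDEX_TIP: (0.60, 0.20, 0.0)}
print(is_pinching(pinched), is_pinching(apart))  # → True False
```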
⚠️ Note: Dynamic signs (such as J and Z), which require fluid hand motion to form correctly, are excluded by design: the engine analyzes only static geometric frames.
While the application mitigates flat planar tracking issues by using MediaPipe's true 3D (Z-axis) vector measurements and left-hand mirroring, keep the following limitations in mind:
- Finger crossovers: The R sign (crossed index and middle fingers) is approximated on a best-effort basis. Because classification keys largely on the index and middle fingers being extended, R is easily confused with H and U.
- Camera angle: For maximum accuracy, keep your hand flat and squarely facing the camera so that lateral overlaps (e.g., a crossed thumb) are clearly visible to the MediaPipe tracker.
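The left-hand mirroring mentioned above can be illustrated simply: reflecting each landmark's x-coordinate (MediaPipe landmarks are normalized to [0, 1]) makes a left hand look like a right hand, so one set of right-hand geometric rules can serve both. A sketch of the general idea, not the exact code:

```python
def mirror_landmarks(landmarks):
    """Reflect normalized (x, y, z) landmarks across the vertical axis
    so a left hand can be classified by right-hand-only rules."""
    return [(1.0 - x, y, z) for (x, y, z) in landmarks]

left = [(0.25, 0.5, 0.0), (0.5, 0.4, -0.1)]
print(mirror_landmarks(left))  # → [(0.75, 0.5, 0.0), (0.5, 0.4, -0.1)]
```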