sprc9034/Aegis

Aegis AI: Cryptographic & Machine Learning Fraud Prevention for PwD Certificates

Aegis AI is a secure, dual-authority verification system designed to eliminate disability certificate fraud in high-stakes competitive examinations (such as JEE and NEET).

By combining zero-trust cryptographic vaults with Natural Language Processing (NLP), Machine Learning (ML), and real-time Computer Vision, Aegis AI addresses key loopholes including:

Photographic/Photoshop Forgeries: Countered by encrypting each certificate with a high-entropy key that is stored in a shared key vault (vault.json) rather than on the document itself.

Clinical Inflation (Exaggerating minor impairments to cross the 40% benchmark): Audited using an NLP-driven RandomForestRegressor trained on medical board standards to predict true impairment percentage from raw clinical text.

Impersonation/Scribe Abuse: Flagged in real time by a secondary MediaPipe-based behavioral tracking script.

🏗️ System Architecture Flow

[ Medical Authority (Port 5001) ] ────────► Generates secure PDF ───► Logs key to vault.json
                                                                           │
┌──────────────────────────────────────────────────────────────────────────┘
│
▼
[ Candidate Portal ] ──────────────────────► Uploads encrypted PDF & App ID
                                                                           │
┌──────────────────────────────────────────────────────────────────────────┘
│
▼
[ Exam Authority (Port 5002) ] ────────────► Fetches key, Decrypts PDF in-memory
                                            ► OCR Text Extraction (Tesseract)
                                            ► ML Model prediction (Random Forest)
                                            ► Flags anomalies instantly if deviation > 15%

📁 Repository Structure

Aegis_AI/
├── exam_authority/
│   ├── dashboard.html       # Exam verification portal UI
│   └── main.py              # Port 5002: Unified server & ML forensics engine
├── medical_authority/
│   ├── cert_generate.py     # ReportLab and PyPDF2 encryption logic
│   ├── dashboard.html       # Hospital certificate generation UI
│   └── main.py              # Port 5001: Certificate generator server
├── poppler-26.02.0/         # Local Poppler binaries (Windows layout)
├── app.py                   # OCR Backend Flask Server
├── behavior_analysis.py     # MediaPipe Computer Vision tracker
├── medical_cases.csv        # Professional clinical training dataset (80+ cases)
├── vault.json               # Shared key database (automatically generated)
├── requirements.txt         # Python dependencies
└── README.md                # Project documentation (this file)

🛠️ Prerequisites & Installation

To run Aegis AI, your system requires both Python libraries and underlying system-level OCR/PDF rendering binaries.

  1. System Dependencies (Crucial)

A. Tesseract OCR (Text Extraction Engine)

Windows: Download and run the 64-bit installer from the UB-Mannheim Tesseract Wiki.

Ensure it is installed to the default path: C:\Program Files\Tesseract-OCR\tesseract.exe.

macOS: Install via Homebrew:

brew install tesseract

Linux (Ubuntu/Debian):

sudo apt-get install tesseract-ocr
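Before starting the servers, it can help to confirm that the Tesseract binary is actually reachable. The following is a minimal sketch using only the standard library; `find_tesseract` is a hypothetical helper that is not part of the repository, and the default path is the Windows location named above.

```python
import os
import shutil

# Default Windows install location named in this README.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(default_path=DEFAULT_WINDOWS_PATH):
    """Return a path to the Tesseract executable, or None if it cannot be found.

    Hypothetical helper for illustration; not part of the Aegis AI code base.
    """
    on_path = shutil.which("tesseract")  # covers typical Homebrew and apt installs
    if on_path:
        return on_path
    if os.path.isfile(default_path):
        return default_path
    return None


if __name__ == "__main__":
    print(find_tesseract() or "Tesseract not found")
```

If the function returns a path, that same string is what would be assigned to `pytesseract.pytesseract.tesseract_cmd` when Tesseract is installed in a non-default location.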

B. Poppler (PDF-to-Image Conversion)

Windows:

The system expects a local Poppler folder named poppler-26.02.0 in your root project directory.

Make sure the executable file pdftoppm.exe is located at poppler-26.02.0/Library/bin/pdftoppm.exe or poppler-26.02.0/bin/pdftoppm.exe.

macOS: Install via Homebrew:

brew install poppler

Linux (Ubuntu/Debian):

sudo apt-get install poppler-utils
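On Windows, the two accepted Poppler locations listed above can be checked programmatically. This is a sketch under stated assumptions: `find_poppler_bin` and `CANDIDATE_DIRS` are hypothetical names, and the actual lookup logic in the repository may differ.

```python
from pathlib import Path

# Candidate locations taken from the Windows instructions above,
# relative to the project root.
CANDIDATE_DIRS = ("poppler-26.02.0/Library/bin", "poppler-26.02.0/bin")


def find_poppler_bin(project_root):
    """Return the first candidate directory that contains pdftoppm.exe, else None."""
    root = Path(project_root)
    for rel in CANDIDATE_DIRS:
        candidate = root / rel
        if (candidate / "pdftoppm.exe").is_file():
            return candidate
    return None
```

If the project converts PDFs with pdf2image (an assumption; this README does not name the library), the returned directory is the kind of value its `poppler_path` argument expects.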

  2. Python Environment Setup

Clone the repository and navigate to your project directory:

git clone https://github.com/your-username/Aegis_AI.git
cd Aegis_AI

Create a virtual environment:

python -m venv venv

Activate the virtual environment:

Windows (PowerShell):

.\venv\Scripts\activate

macOS / Linux:

source venv/bin/activate

Install the mandatory project requirements:

pip install -r requirements.txt

(Optional) If you plan to run the Computer Vision tracking script (behavior_analysis.py), install its dependencies:

pip install opencv-python mediapipe

🚦 How to Run the Project

Step 1: Start the Medical Authority Server (Port 5001)

In your first terminal tab (with your virtual environment active):

python medical_authority/main.py

What this does: Initializes the medical issuance pipeline, ready to generate secure password-protected certificates and write decryption passwords to vault.json.
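To illustrate the key-logging step, here is a minimal sketch of how an Application ID and a high-entropy password could be generated and written to vault.json. The flat `{application_id: password}` layout and every function name here are assumptions for illustration; the real schema written by medical_authority/main.py is not shown in this README.

```python
import json
import secrets
from pathlib import Path


def new_application_id():
    """Build an ID in the AEG-123456 style used in this README (six random digits)."""
    return f"AEG-{secrets.randbelow(10**6):06d}"


def store_key(vault_path, application_id, password):
    """Add or overwrite one application's password in the JSON vault file."""
    path = Path(vault_path)
    vault = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    vault[application_id] = password
    path.write_text(json.dumps(vault, indent=2), encoding="utf-8")


def issue_credentials(vault_path="vault.json"):
    """Generate an ID plus a high-entropy password and record them in the vault."""
    application_id = new_application_id()
    password = secrets.token_urlsafe(32)  # 32 random bytes encoded as URL-safe text
    store_key(vault_path, application_id, password)
    return application_id, password
```

The `secrets` module is used instead of `random` because it draws from the operating system's cryptographically secure source, which suits passwords that protect certificates.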

Step 2: Start the Exam Authority Server (Port 5002)

In your second terminal tab (with your virtual environment active):

python exam_authority/main.py

What this does: Automatically loads the clinical dataset from medical_cases.csv, trains the TF-IDF feature extractor and RandomForestRegressor, and opens up the verification API endpoint.
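The training step described above can be sketched with scikit-learn as a TF-IDF vectorizer feeding a RandomForestRegressor. The four training rows below are made up purely for illustration; the real data comes from medical_cases.csv, whose column layout is not shown here, and the hyperparameters are assumptions rather than the project's settings.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy, made-up rows for illustration only (not taken from medical_cases.csv).
texts = [
    "mild weakness in left hand, full range of motion",
    "complete loss of function in right arm below elbow",
    "moderate hearing loss in both ears",
    "slight stiffness in knee after prolonged walking",
]
percentages = [15.0, 80.0, 45.0, 10.0]

# TF-IDF turns each clinical description into a sparse numeric vector;
# the random forest then regresses the impairment percentage on it.
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestRegressor(n_estimators=50, random_state=0),
)
model.fit(texts, percentages)

# Cast to a native float so the value is safe to serialize as JSON later.
predicted = float(model.predict(["severe loss of function in right arm"])[0])
print(f"Predicted impairment: {predicted:.1f}%")
```

Wrapping both stages in one pipeline keeps the vectorizer's vocabulary tied to the model, so text extracted by OCR at verification time is transformed the same way as the training text.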

🔄 Step-by-Step User Workflow

Certificate Generation:

Open medical_authority/dashboard.html in your web browser.

Fill out the form fields with a candidate's information, disability type, percentage, and clinical description.

Click Generate & Encrypt.

Copy the returned Application ID (e.g., AEG-123456) and download the secure PDF certificate.

Run the Forensic Audit:

Open exam_authority/dashboard.html in your browser.

Upload the downloaded PDF certificate.

Paste the copied Application ID.

Click Run AI Forensic Check.

The backend will fetch the matching decryption key from vault.json, decrypt the file, run OCR, extract data, and output whether the document is VERIFIED or FLAGGED (deviation > 15%).
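The final decision in the audit reduces to comparing the percentage claimed on the certificate with the model's prediction. The sketch below assumes that "deviation" means the absolute difference in percentage points; `audit_certificate` and the shape of the returned dictionary are hypothetical, not the server's actual response format.

```python
DEVIATION_THRESHOLD = 15.0  # percentage points, per the "> 15%" rule above


def audit_certificate(claimed_pct, predicted_pct, threshold=DEVIATION_THRESHOLD):
    """Compare claimed and model-predicted impairment and return a verdict."""
    claimed = float(claimed_pct)
    predicted = float(predicted_pct)
    deviation = abs(claimed - predicted)
    return {
        "claimed": claimed,
        "predicted": predicted,
        "deviation": deviation,
        # Strictly greater than the threshold is flagged; equal is still verified.
        "status": "FLAGGED" if deviation > threshold else "VERIFIED",
    }
```

For example, a certificate claiming 45% against a predicted 20% deviates by 25 points and would be flagged, while 45% against 40% would be verified.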

Behavioral Proctoring Check (Optional):

Run the pose-tracking file in your terminal:

python behavior_analysis.py

It will open your webcam feed and dynamically flag high-velocity joint movements on the candidate's monitored "impaired" arm relative to the certificate threshold.
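The idea of flagging high-velocity joint movement can be illustrated without a webcam. The sketch below takes a sequence of 2D positions for one joint (for example, normalized image coordinates of a wrist landmark) and checks frame-to-frame speed against a limit. The function names and the `max_speed` parameter are hypothetical; how behavior_analysis.py derives its threshold from the certificate is not described in this README.

```python
import math


def joint_speeds(positions, timestamps):
    """Return the speed between consecutive samples of one joint.

    positions: list of (x, y) coordinates; timestamps: matching times in seconds.
    """
    speeds = []
    samples = zip(positions, positions[1:], timestamps, timestamps[1:])
    for (x0, y0), (x1, y1), t0, t1 in samples:
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order frames
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return speeds


def flag_fast_movement(positions, timestamps, max_speed):
    """Return True if any frame-to-frame speed exceeds max_speed."""
    return any(speed > max_speed for speed in joint_speeds(positions, timestamps))
```

In a live setting the positions would come from the pose tracker on each frame, and a real implementation would likely smooth the signal to avoid flagging single-frame tracking jitter.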

🛡️ Troubleshooting

Poppler / PDF-to-Image Errors: If you get an error saying Unable to get page count, make sure your local Poppler folder is named exactly poppler-26.02.0 and that it contains the Library/bin folder.

Tesseract Errors: If the application fails to locate Tesseract, verify that tesseract.exe is in your standard Windows program files directory (C:\Program Files\Tesseract-OCR\tesseract.exe). If you installed it elsewhere, modify the variable at the top of exam_authority/main.py:

pytesseract.pytesseract.tesseract_cmd = r'Your\Custom\Path\tesseract.exe'

NumPy/JSON Serialization Error: Ensure you are using the latest version of exam_authority/main.py where all statistical metrics are cast to native Python types (int, float, bool) before being serialized to JSON.
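The serialization error arises because `json.dumps` does not accept NumPy scalar types such as `numpy.float64` or `numpy.bool_`. A minimal sketch of the casting step follows; `to_native` is a hypothetical helper that relies on the `tolist()` method NumPy scalars and arrays provide, and it is not necessarily how exam_authority/main.py performs the conversion.

```python
import json


def to_native(value):
    """Recursively convert NumPy-style values into plain Python types."""
    if isinstance(value, dict):
        return {str(key): to_native(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_native(item) for item in value]
    if hasattr(value, "tolist"):  # NumPy scalars and arrays both provide tolist()
        return to_native(value.tolist())
    return value


def to_json(payload):
    """Serialize a result dictionary after casting it to native types."""
    return json.dumps(to_native(payload))
```

Applying the conversion once, just before serialization, avoids scattering `float(...)` and `bool(...)` casts throughout the scoring code.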

About

An ML-based fraud detection system for PwD certificates
