A comprehensive medical report processing system that uses multiple AI agents to parse, analyze, and simplify medical reports for patients.
- Multi-Agent Pipeline: 7 specialized AI agents working together
- OCR Support: Extract text from images and PDFs
- Medical Report Processing: Parse structured medical data
- Patient-Friendly Translation: Convert medical jargon to simple language
- Safety Checking: Identify potential risks and inconsistencies
- Doctor Review: Simulate medical professional oversight
- Modern Web Interface: React-based frontend with real-time processing
- 🩺 DrStructuraAgent - Parses raw medical inputs into structured JSON
- 🧾 ReportGeneratorAgent - Generates professional medical reports
- 💬 LaymanTranslatorAgent - Simplifies medical jargon for patients
- 🌐 RegionalTranslatorAgent - Translates to patient-friendly language
- ⚠️ SafetyCheckerAgent - Checks for safety and critical alerts
- 👨‍⚕️ DoctorReviewAgent - Doctor reviews and approves the report
- 🧠 GenieDocOrchestrator - Orchestrates the entire pipeline
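
The agent list above could be wired together roughly as follows. This is a minimal sketch, not the project's actual orchestrator: the stage order is assumed to follow the list, the names `make_stub`, `PIPELINE`, and `orchestrate` are hypothetical, and the stubs stand in for the real Gemini-backed agents.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Stage = Tuple[str, Callable[[State], State]]


def make_stub(name: str, output_key: str) -> Stage:
    """Build a placeholder stage that only records its output (no model call)."""
    def run(state: State) -> State:
        new_state = dict(state)  # copy so earlier states stay untouched
        new_state[output_key] = f"<{name} output>"
        return new_state
    return name, run


# Assumed order: it simply follows the agent list in this README.
PIPELINE: List[Stage] = [
    make_stub("DrStructuraAgent", "structured"),
    make_stub("ReportGeneratorAgent", "report"),
    make_stub("LaymanTranslatorAgent", "layman"),
    make_stub("RegionalTranslatorAgent", "regional"),
    make_stub("SafetyCheckerAgent", "safety"),
    make_stub("DoctorReviewAgent", "review"),
]


def orchestrate(raw_text: str, stages: List[Stage] = PIPELINE) -> State:
    """Run every stage in order, threading one shared state through them."""
    state: State = {"raw_text": raw_text, "completed": []}
    for name, run in stages:
        state = run(state)
        state["completed"] = [*state["completed"], name]
    return state
```

Calling `orchestrate("Hb 9.2 g/dL")` returns a dictionary holding the raw text, one placeholder entry per stage, and the list of completed stage names.
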
- React TypeScript application
- Real-time file upload and processing
- Progress tracking and result visualization
- Tabbed interface for different report views
- Python 3.8+
- Node.js 16+
- Tesseract OCR
- Gemini API Key
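
Before installing, the prerequisites above can be checked with a short script. This is a sketch under stated assumptions: `check_prerequisites` is a hypothetical helper, it assumes the executables are named `node` and `tesseract`, and it only confirms that they are on the PATH (it does not verify the Node.js version or the Gemini API key).

```python
import shutil
import sys
from typing import Callable, Dict, Optional, Tuple


def check_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
    python_version: Tuple[int, int] = tuple(sys.version_info[:2]),
) -> Dict[str, bool]:
    """Report which prerequisites appear to be available on this machine."""
    return {
        "python>=3.8": tuple(python_version) >= (3, 8),
        # Presence on PATH only; the Node.js 16+ requirement is not checked.
        "node": which("node") is not None,
        "tesseract": which("tesseract") is not None,
    }
```

Running `print(check_prerequisites())` shows at a glance which entries are `False` and still need attention.
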
- Install Python dependencies:
    cd backend
    pip install -r requirements.txt
- Run setup script:
    python setup.py
- Configure API Key:
  - Get your Gemini API key from Google AI Studio
  - Edit the backend/.env file and replace your_gemini_api_key_here with your actual API key
- Install Tesseract OCR:
  - Windows: Download from UB-Mannheim
  - macOS: brew install tesseract
  - Ubuntu: sudo apt-get install tesseract-ocr
- Start the backend:
    python run_backend.py
  The backend will run on http://localhost:8000
- Install Node.js dependencies:
    cd frontend
    npm install
- Start the frontend:
    npm start
  The frontend will run on http://localhost:3000
Create a .env file in the backend directory:
# Required: Get from https://makersuite.google.com/app/apikey
GEMINI_API_KEY=your_actual_api_key_here
# Optional
DEBUG=True

- GET /api/health - Health check
- POST /api/ocr_extract - Extract text from images/PDFs
- POST /api/agent/dr_structura - Parse medical data
- POST /api/agent/report_generator - Generate reports
- POST /api/agent/layman_translator - Simplify language
- POST /api/agent/regional_translator - Patient-friendly translation
- POST /api/agent/safety_checker - Safety analysis
- POST /api/agent/doctor_review - Medical review
- POST /api/agent/master_agent - Orchestrate pipeline
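
As a quick smoke test, the health endpoint can be called from Python with the standard library alone. The sketch below assumes the backend is listening on http://localhost:8000 as described above; `agent_url` and `check_health` are hypothetical helpers, and the response body of /api/health is not documented here, so only the status code is inspected.

```python
import urllib.error
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"

# Agent endpoint names as listed in this README.
AGENT_ENDPOINTS = (
    "dr_structura",
    "report_generator",
    "layman_translator",
    "regional_translator",
    "safety_checker",
    "doctor_review",
    "master_agent",
)


def agent_url(agent: str, base_url: str = BASE_URL) -> str:
    """Build the POST URL for one of the listed agent endpoints."""
    if agent not in AGENT_ENDPOINTS:
        raise ValueError(f"unknown agent endpoint: {agent}")
    return f"{base_url}/api/agent/{agent}"


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> Optional[int]:
    """GET /api/health; return the HTTP status, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except OSError:
        return None  # connection refused, timeout, DNS failure, ...
```

A return value of `200` from `check_health()` suggests the backend is up, while `None` usually means it is not running or is listening on a different port.
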
- "Please configure GEMINI_API_KEY"
  - Make sure you have a valid Gemini API key in backend/.env
  - Get your key from Google AI Studio
- "Tesseract not found"
  - Install Tesseract OCR for your operating system
  - Make sure it's in your system PATH
- "Module not found" errors
  - Run pip install -r requirements.txt in the backend directory
  - Make sure you're using Python 3.8+
- Frontend can't connect to backend
  - Ensure backend is running on port 8000
  - Check that proxy is configured in frontend/package.json
  - Verify CORS settings in backend/main.py
- OCR extraction fails
  - Check file format (supported: .txt, .pdf, .jpg, .jpeg, .png, .bmp, .tiff)
  - Ensure Tesseract is properly installed
  - Check file size (large files may time out)
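
To rule out an unsupported format before uploading, the extension list above can be checked locally. This is only an illustrative sketch: `is_supported` is a hypothetical helper, and treating extensions as case-insensitive is an assumption about how the backend behaves.

```python
from pathlib import Path

# Extensions listed as supported in this README.
SUPPORTED_EXTENSIONS = {".txt", ".pdf", ".jpg", ".jpeg", ".png", ".bmp", ".tiff"}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension is in the supported list."""
    # Lower-casing is an assumption; the backend may be stricter.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported("scan.PNG")` is `True` under this assumption, while `is_supported("notes.docx")` is `False`.
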
Set DEBUG=True in your .env file to see detailed error messages.
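
A common cause of the GEMINI_API_KEY error is a .env file that still contains the placeholder value. The sketch below parses simple KEY=value lines and flags that case; `parse_env` and `find_problems` are hypothetical helpers, and the backend's own loader may handle quoting and comments differently.

```python
from typing import Dict, List

# Placeholder values that appear in this README.
PLACEHOLDERS = {"your_gemini_api_key_here", "your_actual_api_key_here"}


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    env: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def find_problems(env: Dict[str, str]) -> List[str]:
    """Return human-readable problems found in the parsed settings."""
    problems: List[str] = []
    key = env.get("GEMINI_API_KEY", "")
    if not key:
        problems.append("GEMINI_API_KEY is missing")
    elif key in PLACEHOLDERS:
        problems.append("GEMINI_API_KEY still holds the placeholder value")
    return problems
```

Passing the contents of backend/.env to `parse_env` and then to `find_problems` returns an empty list when the key looks configured.
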
google comm/
├── backend/
│ ├── agents/ # AI agent modules
│ ├── main.py # FastAPI application
│ ├── requirements.txt # Python dependencies
│ ├── setup.py # Setup script
│ └── .env # Environment variables
├── frontend/
│ ├── src/
│ │ ├── components/ # React components
│ │ └── App.tsx # Main application
│ ├── package.json # Node.js dependencies
│ └── tsconfig.json # TypeScript config
└── run_backend.py # Backend launcher
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
This project is licensed under the MIT License.
This is a demonstration system and should not be used for actual medical diagnosis or treatment. Always consult with qualified healthcare professionals for medical advice.