Enterprise-grade RAG (Retrieval-Augmented Generation) application built on Azure. Upload documents, ingest and index them into Azure AI Search, then chat with your data through a conversational AI that answers strictly from your sources — with citations.
Based on Microsoft's Chat With Your Data Solution Accelerator, customized for production use.
Four deployable components:
| Component | Stack | Role |
|---|---|---|
| Web App | React (Vite/TypeScript) + Flask (Python) | Chat UI and conversation API |
| Admin App | Streamlit (Python) | Data ingestion, exploration, deletion, prompt config |
| Backend | Azure Functions (Python) | Document processing pipeline, embedding generation |
| Infrastructure | Azure Bicep (IaC) | Full Azure resource provisioning via azd |
Users → Web App (Flask + React) → Azure OpenAI (GPT-3.5/4)
                                → Azure AI Search (hybrid vector + keyword)
                                → Azure Blob Storage (raw documents)
Admin → Streamlit App → Azure Functions → Document Intelligence (OCR)
                                        → Azure AI Search (indexing)
                                        → Azure Blob Storage
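To illustrate what "hybrid vector + keyword" means in the flow above: Azure AI Search merges the vector ranking and the keyword ranking on the service side, and its documentation names Reciprocal Rank Fusion (RRF) as the method. The snippet below is a local sketch of that idea only, not code from this repository. The function name, the sample document ids, and the constant `k=60` (a conventional choice) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top result of that list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties fall back to the id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, best match first.
vector_hits = ["doc3", "doc1", "doc2"]
keyword_hits = ["doc1", "doc4", "doc3"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why a hybrid query can surface a passage that neither vector search nor keyword search alone would put first.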
- Multi-format ingestion — PDF, DOCX, TXT, HTML, Markdown, JPG, PNG, URLs
- Multiple chunking strategies — Layout-based, page-based, paragraph-based, fixed-size-overlap
- Hybrid search — Vector + keyword search with optional semantic ranking via Azure AI Search
- Citation-grounded answers — Every response traces back to source documents with `[docN]` references
- Three orchestration strategies (pluggable via env var):
  - `openai_function` — OpenAI function calling
  - `semantic_kernel` — Microsoft Semantic Kernel
  - `langchain` — LangChain agent
- Two conversation flows:
  - `custom` — Full RAG pipeline (chunking → embedding → retrieval → generation)
  - `byod` — Azure OpenAI "on your data" API with Azure Search as data source
- Integrated vectorization — Optional Azure AI Search integrated vectorization (indexer + skillset)
- Content safety — Azure AI Content Safety for filtering harmful queries/responses
- Post-answer fact-check — Optional validation that the generated answer aligns with sources
- Speech-to-text — Azure Speech Services for voice input
- Conversation logging — Interactions and token usage logged to a dedicated search index
- Observability — Azure Application Insights + OpenTelemetry
- Config-driven — Prompts, chunking, processors, orchestration strategy all defined in JSON config
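Two of the features listed above are easy to picture in a few lines: the fixed-size-overlap chunking strategy and the `[docN]` citation markers. The following is a minimal sketch under stated assumptions, not the project's implementation. It chunks by characters (the real strategy may count tokens), treats `[docN]` as a 1-based index into the retrieved sources, and uses hypothetical function names and sample data throughout.

```python
import re


def chunk_fixed_size_overlap(text, chunk_size=500, overlap=100):
    """Split text into windows of chunk_size characters that share
    `overlap` characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Assumption: N in [docN] is a 1-based index into the retrieved sources.
DOC_REF = re.compile(r"\[doc(\d+)\]")


def cited_sources(answer, sources):
    """Return the sources referenced by [docN] markers, in first-use order."""
    cited = []
    for match in DOC_REF.finditer(answer):
        index = int(match.group(1))
        if 1 <= index <= len(sources) and sources[index - 1] not in cited:
            cited.append(sources[index - 1])
    return cited


chunks = chunk_fixed_size_overlap("abcdefghij", chunk_size=4, overlap=2)
cited = cited_sources(
    "Dental is covered [doc2]. See also [doc1][doc2] and [doc9].",
    ["benefits.pdf", "dental.pdf"],
)
print(chunks)
print(cited)
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of indexing some text twice. Markers that point outside the source list, such as `[doc9]` here, are ignored rather than raising an error.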
Backend: Python 3.10+, Flask, Streamlit, Azure OpenAI SDK, Azure AI Search, Azure AI Document Intelligence (formerly Form Recognizer), LangChain, Semantic Kernel, OpenTelemetry
Frontend: React 18, Vite, TypeScript, Fluent UI, React Markdown, Azure Speech SDK
Infrastructure: Azure Bicep, Azure Developer CLI (azd), Docker Compose, App Service, Azure Functions
Testing: pytest, Cypress (E2E), Vitest (frontend), pre-commit hooks (flake8 + black)
code/
├── app.py, create_app.py   # Flask web app (chat API + static files)
├── frontend/               # React chat UI (Vite + TypeScript)
├── backend/
│   ├── Admin.py            # Streamlit admin app
│   ├── pages/              # Streamlit pages (ingest, explore, delete, config)
│   └── batch/              # Azure Functions (document processing)
│       └── utilities/      # Core: chunking, loading, embedding, search, orchestration
└── tests/                  # Unit + functional tests
infra/                      # Azure Bicep IaC
docker/                     # Dockerfiles + docker-compose
data/                       # Sample documents for testing
docs/                       # Documentation and ADRs
cp .env.sample .env # Fill in Azure resource values
make docker-compose-up
Services: web on :8080, admin on :8081, backend on :8082.
poetry install
cd code/frontend && npm install
make build-frontend
make unittest
azd auth login
azd up
Provisions all Azure resources (App Service, Functions, AI Search, OpenAI, Key Vault, Storage, etc.) and deploys all three services.
The Streamlit admin app provides four pages:
- Ingest Data — Upload and process documents into the search index
- Explore Data — Inspect how documents were chunked and indexed
- Delete Data — Remove indexed documents
- Configuration — Adjust prompts, logging, and orchestration settings
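Whatever orchestration setting is chosen, it has to resolve to one of the three documented strategy names: `openai_function`, `semantic_kernel`, or `langchain`. The sketch below shows one way such a check could look. The JSON shape (`{"orchestrator": {"strategy": ...}}`), the environment variable name `ORCHESTRATION_STRATEGY`, and the rule that the environment overrides the config are all assumptions made for illustration; this README does not confirm any of them.

```python
import json

# The three strategy names documented in this README.
SUPPORTED_STRATEGIES = ("openai_function", "semantic_kernel", "langchain")


def resolve_strategy(config_text, env):
    """Pick the orchestration strategy from a JSON config and an env mapping.

    Hypothetical precedence: the env var wins over the JSON config.
    """
    config = json.loads(config_text)
    configured = config.get("orchestrator", {}).get("strategy")
    strategy = env.get("ORCHESTRATION_STRATEGY") or configured
    if strategy not in SUPPORTED_STRATEGIES:
        raise ValueError(
            f"Unsupported orchestration strategy {strategy!r}; "
            f"expected one of {', '.join(SUPPORTED_STRATEGIES)}"
        )
    return strategy


# Hypothetical config document.
sample_config = '{"orchestrator": {"strategy": "langchain"}}'

from_config = resolve_strategy(sample_config, env={})
from_env = resolve_strategy(
    sample_config, env={"ORCHESTRATION_STRATEGY": "semantic_kernel"}
)
print(from_config, from_env)
```

Passing the environment in as a plain mapping, rather than reading `os.environ` inside the function, keeps the check easy to unit test; a caller would pass `os.environ` in production.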
All rights reserved.
Cherif Benham