Regulation RAG Multi-Tenant Chatbot — AI-powered chatbot for corporate regulation documents with citation-based answers.
HK-chatbot2 is a Retrieval-Augmented Generation (RAG) chatbot designed for querying corporate regulation PDFs. It provides accurate, citation-backed answers by searching through indexed documents and generating responses grounded in the source material.
- Smart PDF Parsing — Multi-strategy parser: text extraction → OCR fallback → table detection → VLM image description
- RAG with Citations — Every answer includes source references in the form [REF: document p.page]
- Multi-Tenant — Isolated vector stores per tenant for different departments or organizations
- OCR Support — Scanned PDFs parsed via RapidOCR (Korean + English, ONNX-based, no GPU required)
- Table Extraction — Tables detected and preserved as structured markdown via pdfplumber
- VLM Integration — Charts and figures described by Vision-Language Model (Ollama cloud)
- Docker/Colima Ready — Single-command deployment, optimized for low-resource environments
- Korean IME Support — Proper hangul composition handling in chat input
- Heungkuk Life CI — Official corporate favicon applied
┌─────────────────────────────────────────────────────┐
│                  Docker Container                   │
│  ┌──────────────┐  ┌────────────────────────────┐   │
│  │  React SPA   │  │      FastAPI Backend       │   │
│  │  (Vite+TW)   │  │                            │   │
│  │              │  │ ┌─────────┐ ┌───────────┐  │   │
│  │  /chat       │◄─┤ │  RAG    │ │ Smart PDF │  │   │
│  │  /documents  │  │ │ Engine  │ │  Parser   │  │   │
│  │              │  │ │(LlamaIdx)│ │(OCR/VLM) │  │   │
│  └──────────────┘  │ └────┬────┘ └───────────┘  │   │
│                    │      │                     │   │
│                    │ ┌────▼────┐ ┌───────────┐  │   │
│                    │ │ChromaDB │ │HuggingFace│  │   │
│                    │ │(Vector) │ │ Embedding │  │   │
│                    │ └─────────┘ └───────────┘  │   │
│                    └────────────────────────────┘   │
└─────────────────────────────────────────────────────┘
                            │
                       ┌────▼─────┐
                       │  Ollama  │  (Host machine)
                       │ LLM/VLM  │
                       └──────────┘
| Layer | Technology |
|---|---|
| Frontend | React 18 + Vite + Tailwind CSS |
| Backend | FastAPI + LlamaIndex + CitationQueryEngine |
| Vector Store | ChromaDB (persistent, per-tenant collections) |
| Embeddings | HuggingFace intfloat/multilingual-e5-small (in-container) |
| LLM | Ollama (cloud or local models) |
| PDF Parsing | pypdf + pdfplumber + RapidOCR + Ollama VLM |
| Deployment | Docker Compose (Colima compatible) |
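The "per-tenant collections" entry can be pictured as a mapping from a tenant ID to its own collection name, so that one tenant's queries never touch another tenant's vectors. The sketch below is illustrative only: `collection_name_for`, the `tenant_` prefix, and the 63-character cap are assumptions, not the project's actual naming scheme.

```python
import re

# Hypothetical prefix; the real naming convention is not documented in this README.
COLLECTION_PREFIX = "tenant_"
# Assumed upper bound on name length; actual limits depend on the ChromaDB version.
MAX_NAME_LENGTH = 63


def collection_name_for(tenant_id: str) -> str:
    """Map a tenant ID to a dedicated collection name (illustrative only)."""
    # Lowercase and replace anything outside a conservative character set.
    slug = re.sub(r"[^a-z0-9_-]+", "-", tenant_id.strip().lower()).strip("-_")
    if not slug:
        raise ValueError("tenant_id must contain at least one ASCII letter or digit")
    return (COLLECTION_PREFIX + slug)[:MAX_NAME_LENGTH].rstrip("-_")


print(collection_name_for("default"))
print(collection_name_for("HR Department"))
```

A name produced this way could then be passed to the vector store when opening a collection. Note that this simple slug can map two different tenant IDs to the same name (for example `HR Department` and `hr-department`), so a real implementation would need a uniqueness check.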
- Docker (via Colima or Docker Desktop)
- Ollama running on host machine
ollama pull nemotron-3-super:cloud
# 1. Clone
git clone https://github.com/biztalk72/HK-chatbot2.git
cd HK-chatbot2
# 2. Place PDF files in data/
cp /path/to/your/regulations/*.pdf data/
# 3. Build & run
docker compose up --build -d
# 4. Open browser
open http://localhost:8000
After startup, index your PDF documents:
# Index all PDFs (OCR enabled by default)
curl -X POST http://localhost:8000/api/ingest \
-H 'Content-Type: application/json' \
-d '{"tenant_id": "default"}'
# Index with VLM for image/chart descriptions
curl -X POST http://localhost:8000/api/ingest \
-H 'Content-Type: application/json' \
-d '{"use_ocr": true, "use_vlm": true}'
# Index specific files only
curl -X POST http://localhost:8000/api/ingest \
-H 'Content-Type: application/json' \
-d '{"filenames": ["regulation_A.pdf", "regulation_B.pdf"]}'
Or use the web UI: navigate to /documents and click "전체 인덱싱 시작" ("Start full indexing").
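The same ingest calls can be issued from Python. The following is a minimal sketch using only the standard library. The helper name `build_ingest_request` is hypothetical, and its defaults mirror the request body documented for this endpoint. The response format is not described in this README, so the sketch only builds the request and leaves sending it as a commented-out step.

```python
import json
import urllib.request
from typing import List, Optional

# URL taken from the curl examples; adjust if the server runs elsewhere.
INGEST_URL = "http://localhost:8000/api/ingest"


def build_ingest_request(
    tenant_id: str = "default",
    filenames: Optional[List[str]] = None,  # None means "all PDFs in data/"
    use_ocr: bool = True,
    use_vlm: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/ingest."""
    payload = {
        "tenant_id": tenant_id,
        "filenames": filenames,
        "use_ocr": use_ocr,
        "use_vlm": use_vlm,
    }
    return urllib.request.Request(
        INGEST_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_ingest_request(filenames=["regulation_A.pdf", "regulation_B.pdf"])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually send it (requires the server to be running):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```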
- Documents (/documents) — View indexed documents (title, summary, page count), trigger indexing
- Chat (/chat) — Ask questions about regulations, get citation-backed answers
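Q: 개인정보 처리에 관한 규정은? (What are the regulations on personal information processing?)
A: 개인정보 처리에 관하여... [REF: 개인정보보호규정.pdf p.5] (Regarding personal information processing...)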
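Because citations follow the fixed `[REF: document p.page]` pattern, a client can pull them out of an answer with a regular expression. This is a sketch under that assumption. `extract_citations` is a hypothetical helper and is not part of the project's API.

```python
import re
from typing import List, Tuple

# Matches "[REF: <document> p.<page>]"; the document name may contain spaces.
CITATION_RE = re.compile(r"\[REF:\s*(?P<document>[^\]]+?)\s+p\.(?P<page>\d+)\]")


def extract_citations(answer: str) -> List[Tuple[str, int]]:
    """Return (document, page) pairs in the order they appear in the answer."""
    return [
        (m.group("document"), int(m.group("page")))
        for m in CITATION_RE.finditer(answer)
    ]


answer = "개인정보 처리에 관하여... [REF: 개인정보보호규정.pdf p.5]"
print(extract_citations(answer))  # [('개인정보보호규정.pdf', 5)]
```

The extracted pairs could be used, for example, to render clickable source links next to each answer.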
The parser automatically selects the best strategy per page:
| Strategy | Trigger | Tool |
|---|---|---|
| Text | Standard digital PDF | pypdf |
| OCR | Scanned pages (< 50 chars extracted) | RapidOCR (ONNX) |
| Table | Tables detected on page | pdfplumber → markdown |
| VLM | Images/charts detected + VLM enabled | Ollama vision model |
Each parsed page includes metadata: parse_method, parse_quality, has_tables, has_images.
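The triggers in the strategy table can be read as a small decision function. The sketch below is one possible interpretation, not the code in `parsers.py`. `PageSignals` and `select_strategies` are hypothetical names, and the assumption that table and VLM handling are applied on top of text or OCR extraction is inferred from the pipeline order listed in the features.

```python
from dataclasses import dataclass
from typing import List

# Threshold from the table: fewer extracted characters means the page is treated as scanned.
MIN_TEXT_CHARS = 50


@dataclass
class PageSignals:
    extracted_text: str  # text returned by plain text extraction
    has_tables: bool     # whether table detection found a table on the page
    has_images: bool     # whether the page contains images or charts


def select_strategies(
    page: PageSignals, ocr_enabled: bool = True, vlm_enabled: bool = False
) -> List[str]:
    """Return the parsing steps to apply to one page, in order."""
    strategies: List[str] = []
    # Base extraction: fall back to OCR when almost no text was extracted.
    if ocr_enabled and len(page.extracted_text.strip()) < MIN_TEXT_CHARS:
        strategies.append("ocr")
    else:
        strategies.append("text")
    # Tables are converted to markdown whenever they are detected.
    if page.has_tables:
        strategies.append("table")
    # Image description only runs when the VLM is explicitly enabled.
    if page.has_images and vlm_enabled:
        strategies.append("vlm")
    return strategies


scanned = PageSignals(extracted_text="", has_tables=False, has_images=True)
print(select_strategies(scanned, vlm_enabled=True))
```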
| Method | Path | Description |
|---|---|---|
| GET | /api/health | System status + Ollama connection |
| GET | /api/tenants | List all tenants |
| POST | /api/tenants | Create a new tenant |
| GET | /api/documents?tenant_id=X | List indexed docs (title, summary, pages) |
| GET | /api/ingest/files | List available PDFs in data/ |
| POST | /api/ingest | Ingest PDFs (chunking + vectorizing) |
| POST | /api/chat | RAG query with citation-backed answer |
Request body for POST /api/ingest:
{
  "tenant_id": "default",
  "filenames": null,
  "use_ocr": true,
  "use_vlm": false
}
Request body for POST /api/chat:
{
  "tenant_id": "default",
  "question": "연차휴가 규정은?",
  "history": []
}
HK-chatbot2/
├── data/  # PDF documents (mounted read-only)
├── server/
│   ├── main.py  # FastAPI app entry point
│   ├── config.py  # All configuration & env vars
│   ├── requirements.txt  # Python dependencies
│   ├── rag/
│   │   ├── parsers.py  # Smart PDF parser (text/OCR/table/VLM)
│   │   ├── ingestion.py  # PDF → chunk → embed → ChromaDB pipeline
│   │   └── engine.py  # RAG query engine with citations
│   ├── routers/
│   │   ├── chat.py  # POST /api/chat
│   │   ├── documents.py  # GET /api/documents
│   │   ├── ingest.py  # POST /api/ingest
│   │   ├── health.py  # GET /api/health
│   │   └── tenants.py  # Tenant CRUD
│   └── tenants/
│       └── manager.py  # JSON-based tenant management
├── frontend/
│   ├── src/
│   │   ├── App.tsx  # Router (Chat / Documents)
│   │   ├── screens/
│   │   │   ├── Chat.tsx  # Chat interface
│   │   │   └── Documents.tsx  # Document list + indexing
│   │   └── components/
│   │       └── CitationBubble.tsx
│   ├── package.json
│   └── vite.config.ts
├── Dockerfile  # Multi-stage build (Node + Python)
├── docker-compose.yml
├── CHANGELOG.md
├── README.md  # English documentation
└── README.ko.md  # Korean documentation
| Variable | Default | Description |
|---|---|---|
| OLLAMA_BASE_URL | http://host.docker.internal:11434 | Ollama server URL |
| LLM_MODEL | qwen2.5:7b | LLM for answer generation |
| EMBED_MODEL | intfloat/multilingual-e5-large | HuggingFace embedding model |
| VLM_MODEL | glm-5:cloud | Vision-Language model for images |
| OCR_ENABLED | true | Enable OCR fallback for scanned PDFs |
| VLM_ENABLED | false | Enable VLM for image/chart descriptions |
| CHUNK_SIZE | 512 | Text chunk size (tokens) |
| CHUNK_OVERLAP | 64 | Chunk overlap (tokens) |
| SIMILARITY_TOP_K | 5 | Top-K chunks retrieved per query |
| CITATION_CHUNK_SIZE | 256 | Citation chunk size |
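As a rough illustration of how these variables might be consumed, the sketch below reads a subset of them with the documented defaults and shows what `CHUNK_SIZE` and `CHUNK_OVERLAP` mean for chunk boundaries. `Settings`, `load_settings`, and `chunk_spans` are hypothetical names. The actual `config.py` is not shown here, and the real splitter may be sentence-aware rather than the fixed sliding window assumed below.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Tuple


def _as_bool(value: str) -> bool:
    # Accept common truthy spellings; anything else is treated as False.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    ollama_base_url: str
    llm_model: str
    ocr_enabled: bool
    vlm_enabled: bool
    chunk_size: int
    chunk_overlap: int
    similarity_top_k: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, falling back to the documented defaults."""
    return Settings(
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://host.docker.internal:11434"),
        llm_model=env.get("LLM_MODEL", "qwen2.5:7b"),
        ocr_enabled=_as_bool(env.get("OCR_ENABLED", "true")),
        vlm_enabled=_as_bool(env.get("VLM_ENABLED", "false")),
        chunk_size=int(env.get("CHUNK_SIZE", "512")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "64")),
        similarity_top_k=int(env.get("SIMILARITY_TOP_K", "5")),
    )


def chunk_spans(n_tokens: int, size: int, overlap: int) -> List[Tuple[int, int]]:
    """Return [start, end) token ranges for a fixed sliding window with overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    spans: List[Tuple[int, int]] = []
    start = 0
    while start < n_tokens:
        end = min(start + size, n_tokens)
        spans.append((start, end))
        if end == n_tokens:
            break
        start += size - overlap  # each new chunk re-reads the last `overlap` tokens
    return spans


settings = load_settings({})  # empty mapping, so every value is a default
print(chunk_spans(1000, settings.chunk_size, settings.chunk_overlap))
# [(0, 512), (448, 960), (896, 1000)]
```

With the defaults, a 1000-token document yields three chunks, and consecutive chunks share 64 tokens.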
# Backend (run from the repository root so that server.main resolves)
pip install -r server/requirements.txt
OLLAMA_BASE_URL=http://localhost:11434 \
uvicorn server.main:app --reload --port 8000
# Frontend (separate terminal)
cd frontend
npm install
npm run dev
# Colima
colima start --cpu 2 --memory 4
export DOCKER_HOST=unix://$HOME/.colima/default/docker.sock
docker compose up --build -d
Note: With 2GB RAM, index PDFs in small batches (2–5 files) to avoid OOM.
- Mount data/ as read-only (:ro) — already configured
- Named volumes for chroma_db and tenants_store persistence
- Configure specific CORS origins instead of *
- Consider adding authentication (Phase 3)
- Phase 1 — RAG pipeline with citation support
- Phase 1.5 — OCR + VLM smart PDF parsing
- Phase 1.6 — Heungkuk Life CI favicon, Korean IME fix, bilingual docs
- Phase 2 — QA log collection → LoRA fine-tuning
- Phase 3 — Auth/RBAC, per-tenant prompts, admin dashboard
MIT