AvinashBolleddula/mcp-document-intelligence
MCP-Based Document Intelligence Platform

A production-grade Model Context Protocol (MCP) powered document intelligence system that enables tool-aware, prompt-driven, retrieval-augmented question answering over PDFs using OpenAI, ChromaDB, and a custom MCP client–server architecture.

This project demonstrates how modern LLM applications can expose capabilities (tools, resources, prompts) via MCP and allow intelligent clients to reason, retrieve, and respond dynamically.

🚀 What This Project Does

  • 📄 Indexes PDF documents into a persistent Chroma vector store
  • 🧠 Exposes document search as an MCP tool (query_document)
  • 🔎 Performs semantic retrieval using OpenAI embeddings
  • 🤖 Lets an LLM decide when to call tools vs answer directly
  • 🧩 Supports MCP resources (readable PDFs)
  • 🧠 Supports MCP prompt templates (deep analysis, extraction, etc.)
  • 💬 Maintains multi-turn conversational memory
  • 🔁 Implements a full OpenAI tool-calling loop
  • 🖥 Runs fully locally via STDIO-based MCP transport

💡 Why This Matters

Traditional RAG systems tightly couple retrieval logic with the application.

This project demonstrates a protocol-first architecture where:

  • Capabilities are discoverable
  • Clients are model-agnostic
  • Tools, resources, and prompts are first-class primitives
  • LLMs can reason over what the system can do

This mirrors how enterprise agent platforms and multi-agent systems are being built today.


🏗️ High Level Architecture Diagram

```mermaid
flowchart LR
    User["👤 User<br/>(Terminal)"]

    subgraph Client["🧠 MCP Client"]
        CLI["client.py<br/>Chat Loop"]
        Memory["Conversation Memory<br/>(message_history)"]
        ToolLoop["OpenAI Tool Loop<br/>(function calling)"]
    end

    subgraph Server["🧩 MCP Server"]
        MCP["MCP Server<br/>(stdio)"]
        Tools["Tools<br/>query_document"]
        Resources["Resources<br/>PDFs"]
        Prompts["Prompt Templates"]
    end

    subgraph Vector["📦 Vector Store"]
        Chroma["ChromaDB<br/>(Persistent)"]
        Emb["OpenAI Embeddings"]
    end

    User --> CLI
    CLI --> ToolLoop
    ToolLoop --> MCP
    MCP --> Tools
    MCP --> Resources
    MCP --> Prompts
    Tools --> Chroma
    Chroma --> Emb
```

🏗️ Execution Sequence (End-to-End)

```mermaid
sequenceDiagram
    participant U as User
    participant C as MCP Client
    participant L as OpenAI LLM
    participant S as MCP Server
    participant V as ChromaDB

    U->>C: Ask a question
    C->>L: Send conversation + available tools
    L-->>C: Tool call decision (or direct answer)
    C->>S: Execute MCP tool (query_document)
    S->>V: Semantic search
    V-->>S: Top-K chunks
    S-->>C: Tool response
    C->>L: Send tool result
    L-->>C: Final grounded answer
    C-->>U: Display answer
```

📁 Project Structure

mcp-document-intelligence/
├── MCP_Setup.ipynb           # One-time ingestion: PDF → chunks → embeddings → Chroma
├── mcp_server.py             # MCP server exposing tools, resources, prompts
├── client.py                 # MCP client with OpenAI tool loop + chat UI
│
├── testing/
│   └── .gitkeep              # Placeholder (PDFs ignored by git)
├── .gitignore                # Ignores envs, chroma, PDFs, caches
│
├── pyproject.toml            # uv project config
├── uv.lock                   # Locked dependencies
└── README.md                 # Project documentation

🔄 End-to-End Pipeline

1️⃣ Document Ingestion (Offline)

Trigger

Triggered manually by running the MCP_Setup.ipynb notebook

Steps

  1. Load PDF documents
  2. Chunk documents into semantic segments
  3. Generate embeddings using OpenAI
  4. Persist vectors + metadata to ChromaDB
This step is decoupled from runtime querying.
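The chunking step above can be sketched in plain Python. This is an illustrative sliding-window splitter, not the notebook's exact code; the chunk size and overlap values are assumptions:

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows so that context
    straddling a chunk boundary survives into at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# In the pipeline, each chunk is then embedded with OpenAI and written,
# together with its metadata, to the persistent Chroma collection.
```

Storing source metadata alongside each vector is what lets answers be traced back to the originating PDF at query time.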

2️⃣ MCP Server Initialization

When the server starts, it:

  • Registers tools, resources, and prompts
  • Connects to the persistent Chroma collection
  • Exposes all capabilities via MCP descriptors, so clients can discover them dynamically
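Once capabilities are discovered, the client translates each MCP tool descriptor into the tool format that OpenAI's chat API expects. The helper below is a sketch, not the exact code in client.py; the JSON schema shown mirrors the server's query_document tool:

```python
def mcp_tool_to_openai(name: str, description: str, input_schema: dict) -> dict:
    """Wrap a discovered MCP tool descriptor in the OpenAI
    function-calling tool format used by chat.completions."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": input_schema,
        },
    }

# Example: the descriptor the client would build for query_document.
query_document_spec = mcp_tool_to_openai(
    "query_document",
    "Semantic search over the indexed PDF chunks.",
    {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
)
```

Because the specs are derived from discovery rather than hard-coded, the client stays model-agnostic and picks up new server tools automatically.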

3️⃣ Runtime Querying (Online)

  1. User asks a question
  2. Client builds conversation history
  3. Client sends:
    • Messages
    • Available Tools
  4. OpenAI decides:
    • Answer directly or
    • Call query_document
  5. Tool executes via MCP
  6. Results returned to LLM
  7. Final grounded answer generated
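Steps 3–7 form a loop that can be sketched with stand-ins for the model and the MCP call. Here `llm` and `execute_tool` are placeholders for the OpenAI chat call and the MCP tool invocation, and the message shapes are simplified:

```python
def tool_loop(question, llm, execute_tool, max_rounds=3):
    """Minimal tool-calling loop: ask the model, execute any tool it
    requests, feed the result back, repeat until a final answer arrives."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_rounds):
        # The model returns either a tool request or a final answer.
        reply = llm(messages)
        if "answer" in reply:
            return reply["answer"]
        result = execute_tool(reply["tool"], reply["args"])
        messages.append({"role": "tool", "name": reply["tool"], "content": result})
    raise RuntimeError("no final answer within max_rounds")
```

Capping the number of rounds guards against a model that keeps requesting tools without converging on an answer.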

✅ Tool-Aware Reasoning Example

Direct Answer (No Tool Call)

Query: What is the capital of Telangana?

→ LLM answers directly

Tool-Based Answer

Query: What is the main topic discussed in the document?

→ LLM requests query_document
→ MCP executes semantic search
→ LLM grounds the answer in retrieved chunks

🛠️ Prerequisites

Local Development

  • Python 3.11+
  • uv – fast Python package & environment manager
  • Git
  • OpenAI API key

⚙️ Setup Instructions

1️⃣ Clone the repository

git clone https://github.com/your-username/mcp-document-intelligence.git
cd mcp-document-intelligence

2️⃣ Create and activate a virtual environment

This project uses uv for fast and reproducible Python environments.

uv venv
source .venv/bin/activate

You should now see (.venv) in your terminal prompt.

3️⃣ Install dependencies

Install all required dependencies exactly as defined in pyproject.toml and uv.lock.

uv sync

4️⃣ Configure environment variables

Create a .env file in the project root:

OPENAI_API_KEY=your_openai_api_key

5️⃣ Run ingestion (one-time)

uv run jupyter notebook MCP_Setup.ipynb

6️⃣ Start the MCP server and client

uv run python client.py mcp_server.py

Example client commands:

/prompts
/prompt deep_analysis methodology
/resources
/resource document://pdf/ft_guide
/tools

🚀 Future Enhancements

  1. Multi-round tool execution loop
  2. Streaming responses
  3. Authenticated MCP endpoints
  4. Web-based client (FastAPI / WebSockets)
  5. Multi-agent orchestration

About

A protocol-driven RAG system that decouples LLM reasoning from tool execution using MCP, enabling scalable document intelligence with ChromaDB and OpenAI. It implements an MCP client–server architecture with explicit message orchestration, structured tool execution, and stateful multi-turn memory for document-grounded LLM applications.
