
RAG-to-MCP Workshop Submission: participant/sarvesh-pune#13

Open

SnowKingSandy wants to merge 3 commits into nasscomAI:master from SnowKingSandy:participant/sarvesh-pune

Conversation

@SnowKingSandy

RAG-to-MCP — Submission PR

Name: Sarvesh
City / Group: Pune
Date: 18 April 2026
AI tool(s) used: GitHub Copilot (Claude Haiku 4.5)


Submission Checklist

  • uc-0a/agents.md — present and updated
  • uc-0a/skills.md — present and updated
  • uc-0a/classifier.py — runs without crash
  • uc-0a/results_pune.csv — output present
  • uc-0a/results_hyderabad.csv — output present
  • uc-0a/results_kolkata.csv — output present
  • uc-0a/results_ahmedabad.csv — output present
  • uc-rag/agents.md — present and updated
  • uc-rag/skills.md — present and updated
  • uc-rag/rag_server.py — full implementation with RICE enforcement
  • uc-mcp/agents.md — present and updated
  • uc-mcp/skills.md — present and updated
  • uc-mcp/mcp_server.py — passes all test_client.py tests
  • 3 commits with meaningful CRAFT messages, one per UC
  • All sections below filled

UC-0A — Complaint Classifier

Which failure mode did you encounter first?

Taxonomy drift — the naive prompt without enum constraints invented category names like "Road Issue" and "Water Problem" instead of using the exact schema values (Pothole, Flooding, Streetlight, etc.).

Which enforcement rule fixed it?

"Category must be exactly one value from the allowed list: Pothole, Flooding, Streetlight, Waste, Noise, Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other. No variations or invented names."

Commit message:

UC-0A Fix taxonomy drift and severity blindness: no fixed enum and no keyword detection → implemented R.I.C.E framework with strict category enum, severity keyword enforcement, reason justification, ambiguity detection, and classifier results for all cities

UC-RAG — RAG Server

Verification checkpoints:

  • Chunking produces multiple chunks per document (6 chunks from 3 documents)
  • Sentence boundaries respected — no clauses split mid-sentence
  • Out-of-scope queries return refusal template (not hallucinated answers)
  • Retrieved chunks scored and cited with document name and chunk index
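The first two checkpoints can be sketched as a sentence-aware chunker. This is a simplified stand-in: the real uc-rag implementation uses NLTK's punkt_tab tokenizer, while here a naive regex split and a whitespace token count keep the example self-contained:

```python
import re

def split_sentences(text: str) -> list[str]:
    # Naive stand-in for nltk.sent_tokenize (which the real server uses).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def chunk_document(text: str, max_tokens: int = 400) -> list[str]:
    """Pack whole sentences into chunks; never split mid-sentence."""
    chunks, current, count = [], [], 0
    for sentence in split_sentences(text):
        n = len(sentence.split())  # crude whitespace token count
        if current and count + n > max_tokens:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because sentences are only ever appended whole, a chunk boundary can never fall mid-sentence, which is the property the second checkpoint verifies.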

UC-MCP — MCP Server

Paste your tool description from mcp_server.py TOOL_DEFINITION:

"Answers questions about City Municipal Corporation (CMC) policy documents: HR Leave Policy, IT Acceptable Use Policy, and Finance Reimbursement Policy. Returns answers grounded in retrieved document chunks with cited sources. Questions outside these three documents return a refusal message — this tool does not answer general knowledge questions, budget forecasts, or topics not covered by the indexed CMC policy documents."

Does it state the document scope explicitly?

Yes — names all three policy documents and explicitly states what the tool will not answer.
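For context, a description like the one above would typically sit inside an MCP tool definition of roughly this shape (the tool name and input schema here are assumptions, not copied from mcp_server.py, and the description string is abbreviated):

```python
# Hypothetical shape of TOOL_DEFINITION, following the MCP tools/list
# format (name / description / inputSchema).
TOOL_DEFINITION = {
    "name": "ask_cmc_policy",  # illustrative name, not from mcp_server.py
    "description": (
        "Answers questions about City Municipal Corporation (CMC) policy "
        "documents: HR Leave Policy, IT Acceptable Use Policy, and Finance "
        "Reimbursement Policy. Questions outside these documents return a "
        "refusal message."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {"question": {"type": "string"}},
        "required": ["question"],
    },
}
```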

Run result: python test_client.py --port 8765 --run-all

Did the budget forecast question return isError: true?

Yes — no chunk scored above 0.3 for this query. The refusal template was returned with isError: true and no LLM call was made.
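A minimal sketch of that refusal path, assuming the server filters scored chunks before any LLM call (function and field layout are illustrative, not taken from mcp_server.py, apart from the MCP-standard isError flag):

```python
REFUSAL = ("I can only answer questions about the indexed CMC policy "
           "documents. This question is out of scope.")
THRESHOLD = 0.3  # empirically calibrated (see CRAFT Reflection below)

def answer_or_refuse(scored_chunks: list[tuple[str, float]]) -> dict:
    """Return an MCP-style tools/call result; refuse with isError: true
    when no chunk scores above the threshold, without calling the LLM."""
    relevant = [(c, s) for c, s in scored_chunks if s > THRESHOLD]
    if not relevant:
        return {"content": [{"type": "text", "text": REFUSAL}],
                "isError": True}
    # Only here would the LLM be called, grounded on the retrieved context.
    context = "\n".join(c for c, _ in relevant)
    return {"content": [{"type": "text", "text": context}], "isError": False}
```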

In one sentence — why is the tool description the enforcement?

The agent reads the tool description to decide when to call the tool, so a precisely scoped description denies it implicit permission to invoke the tool for out-of-scope questions.

Commit message:

UC-MCP Fix vague tool description and context breach: no scope stated and no error enforcement → implemented R.I.C.E framework with explicit CMC policy scope in tool description, mandatory isError on refusals, JSON-RPC 2.0 compliant error handling, tools/list and tools/call implementation

Verification checkpoints:

  • Tool description explicitly states document scope (CMC policies)
  • Tool description states refusal behavior for out-of-scope queries
  • python test_client.py --run-all executes all tests successfully
  • Budget forecast question returns isError: true
  • JSON-RPC -32601 error for unknown methods
  • All responses use HTTP 200
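The last two checkpoints can be sketched as a small JSON-RPC dispatcher (a hypothetical helper, not the actual mcp_server.py code): transport always answers HTTP 200, and protocol failures travel in the JSON-RPC error object with the standard -32601 code for unknown methods.

```python
def handle_rpc(request: dict, methods: dict) -> tuple[int, dict]:
    """Dispatch a JSON-RPC 2.0 request. Always HTTP 200; protocol errors
    go in the JSON-RPC error object, never the HTTP status."""
    method = request.get("method")
    req_id = request.get("id")
    if method not in methods:
        return 200, {"jsonrpc": "2.0", "id": req_id,
                     "error": {"code": -32601, "message": "Method not found"}}
    result = methods[method](request.get("params", {}))
    return 200, {"jsonrpc": "2.0", "id": req_id, "result": result}
```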

CRAFT Reflection

Which step of the CRAFT loop was hardest across all three UCs?

Constrain — specifically, calibrating the similarity threshold in UC-RAG. The README specified 0.6, but empirical testing showed that SentenceTransformer (all-MiniLM-L6-v2) produces scores of 0.2–0.5 for semantically related policy text. Lowering the threshold to 0.3 while still refusing truly out-of-scope queries (budget forecasts) required end-to-end testing and observation of the actual distance values.

What did you add to agents.md manually that the AI did not generate?

In UC-RAG agents.md, the explicit cross-document separation rule: "If a query spans two documents, retrieve from each separately. Never merge retrieved chunks from different documents into a single blended answer." The AI generated a generic grounding rule but did not restrict per-document retrieval, which is the specific enforcement needed to prevent IT+HR policy blending.
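That separation rule could be enforced with a small grouping step before answer composition (a sketch; the "source" field name is an assumption, not taken from rag_server.py):

```python
from collections import defaultdict

def group_by_document(chunks: list[dict]) -> dict[str, list[dict]]:
    """Bucket retrieved chunks per source document, so answers are
    composed per document and never blended across documents."""
    by_doc: defaultdict = defaultdict(list)
    for chunk in chunks:
        by_doc[chunk["source"]].append(chunk)
    return dict(by_doc)
```

A query spanning the HR and IT policies then yields two separate buckets, each answered on its own, instead of one merged context that invites IT+HR blending.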

One specific task in your real work where you will use R.I.C.E in the next 7 days:

Building an internal complaint routing bot for a civic platform — complaints are currently routed manually to 12 different departments. I will apply RICE to scope the router strictly to the complaint taxonomy and enforce that misclassified complaints are flagged for manual review rather than auto-routed with false confidence.


Technical Notes

  • Dependencies installed: sentence-transformers, chromadb, nltk
  • NLTK punkt_tab downloaded: Required for sentence tokenization
  • RAG similarity threshold: 0.3 (empirically calibrated to balance recall vs. hallucination)
  • MCP JSON-RPC compliance: All responses HTTP 200, errors in JSON-RPC error object
  • All 4 cities tested: Pune, Hyderabad, Kolkata, Ahmedabad

UC-RAG commit message (truncated in the page capture): …ts and no enforcement → implemented sentence-aware chunking (max 400 tokens), 0.6 similarity threshold, mandatory citation, context grounding from retrieved chunks only, and cross-document separation
@github-actions

Hi there, participant! Thanks for joining our RAG-to-MCP Workshop!

We're reviewing your PR for the 3 Use Cases (UC-0A, UC-RAG, UC-MCP). Once your submission is validated and merged, you'll be awarded your completion badge!

Next Steps:

  • Make sure all 3 UCs are finished.
  • Ensure your commit messages match the required format.
  • Fill out every section of the PR template.
  • Good luck!

