Hi there, participant! Thanks for joining our RAG-to-MCP Workshop! We're reviewing your PR for the 3 Use Cases (UC-0A, UC-RAG, UC-MCP). Once your submission is validated and merged, you'll be awarded your completion badge! Next Steps:
Pull request overview
Implements the workshop deliverables across UC-0A (complaint classifier), UC-RAG (RAG server), and UC-MCP (MCP wrapper), replacing the starter placeholders in agents.md/skills.md and adding runnable Python implementations and outputs.
Changes:
- Filled in YAML-based `agents.md` and `skills.md` definitions across UC-0A / UC-RAG / UC-MCP.
- Implemented a heuristic complaint classifier and produced `results_pune.csv`.
- Implemented an initial ChromaDB + SentenceTransformer RAG server and a JSON-RPC MCP HTTP server.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| uc-rag/skills.md | Replaces placeholder skill definitions with concrete YAML for chunking and retrieval/answering. |
| uc-rag/rag_server.py | Adds a runnable RAG server (chunking, indexing, querying, Gemini call). |
| uc-rag/agents.md | Replaces placeholder agent YAML with RAG enforcement rules. |
| uc-mcp/skills.md | Replaces placeholder MCP skill definitions with concrete YAML. |
| uc-mcp/mcp_server.py | Implements JSON-RPC tools/list and tools/call HTTP handling. |
| uc-mcp/agents.md | Replaces placeholder MCP agent YAML with enforcement rules around scope and protocol. |
| uc-0a/skills.md | Replaces placeholder UC-0A skill definitions with concrete YAML. |
| uc-0a/results_pune.csv | Adds the expected classifier output CSV for Pune. |
| uc-0a/classifier.py | Implements heuristic complaint classification + batch CSV processing. |
| uc-0a/agents.md | Replaces placeholder UC-0A agent YAML with taxonomy/severity enforcement rules. |
```python
chunks = self.chunk_text(content)
for i, chunk_text in enumerate(chunks):
    chunk_id = f"{filename}_{i}"
    embedding = self.encoder.encode(chunk_text).tolist()
    self.collection.add(
        ids=[chunk_id],
        embeddings=[embedding],
        metadatas=[{"doc_name": filename, "chunk_index": i}],
        documents=[chunk_text]
    )
```
`build_index` reuses an existing collection and calls `collection.add(...)` with deterministic IDs (`{filename}_{i}`). Re-running `--build-index` will attempt to re-add existing IDs and can fail or create duplicates depending on Chroma settings. Consider deleting/recreating the collection (as uc-rag/stub_rag.py does) or using an upsert/update strategy before adding chunks.
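The difference can be sketched with a tiny in-memory stand-in (illustration only, not the real Chroma API): `add` with a fixed ID breaks on re-runs, while an upsert overwrites and stays idempotent.

```python
class TinyCollection:
    """In-memory stand-in for a vector-store collection (illustration only)."""
    def __init__(self):
        self.docs = {}

    def add(self, ids, documents):
        # add() rejects IDs that already exist, which is what breaks re-indexing.
        for id_, doc in zip(ids, documents):
            if id_ in self.docs:
                raise ValueError(f"duplicate id: {id_}")
            self.docs[id_] = doc

    def upsert(self, ids, documents):
        # upsert() overwrites existing IDs, so rebuilding the index is safe.
        for id_, doc in zip(ids, documents):
            self.docs[id_] = doc


col = TinyCollection()
col.upsert(ids=["doc_0"], documents=["first build"])
col.upsert(ids=["doc_0"], documents=["rebuilt chunk"])  # safe: overwrites
```

With `add`, the second call above would raise on the duplicate `doc_0` ID; deleting and recreating the collection before indexing achieves the same idempotence.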
```python
def query(self, user_query, threshold=0.6, naive=False):
    """Retrieve and answer."""
    if naive:
        print("Running NAIVE prompt (no RAG)...")
        return self.call_llm(user_query, context="")
```
--naive mode is supposed to load all documents into the LLM context (per uc-rag/README.md) to demonstrate the failure modes. Currently it calls the LLM with an empty context string, so it won’t reproduce the intended naive baseline. Consider reading and concatenating the policy docs when naive=True (or implement a dedicated naive_query).
Suggested change:

```python
def _load_all_policy_documents(self):
    """Load and concatenate all policy documents for naive mode."""
    candidate_dirs = [
        os.path.join(os.getcwd(), "data"),
        os.path.join(os.getcwd(), "policies"),
        os.path.join(os.path.dirname(__file__), "data"),
        os.path.join(os.path.dirname(__file__), "policies"),
    ]
    for data_dir in candidate_dirs:
        if not os.path.isdir(data_dir):
            continue
        documents = []
        for filename in sorted(os.listdir(data_dir)):
            if not filename.endswith(".txt"):
                continue
            path = os.path.join(data_dir, filename)
            with open(path, 'r', encoding='utf-8') as f:
                content = f.read().strip()
            if content:
                documents.append(f"Source: {filename}\n{content}")
        if documents:
            return "\n\n---\n\n".join(documents)
    return ""

def query(self, user_query, threshold=0.6, naive=False):
    """Retrieve and answer."""
    if naive:
        print("Running NAIVE prompt (all documents, no retrieval)...")
        naive_context = self._load_all_policy_documents()
        return self.call_llm(user_query, context=naive_context)
```
```yaml
- "Every answer must cite the source document name and chunk index (e.g., HR_Policy[2])."
- "If no retrieved chunk scores above a similarity threshold of 0.6, you must output the official refusal template."
- "Answer must use only information present in the retrieved chunks; never add external context."
- "If a query spans two documents, retrieve and cite from each separately without merging they into a single blended rule."
```
Grammar issue in this enforcement rule: “without merging they into…” should be “without merging them into…”.
Suggested change:

```yaml
- "If a query spans two documents, retrieve and cite from each separately without merging them into a single blended rule."
```
```python
result = rag_query(question, llm_call=call_llm)
```
rag_query(...) is invoked without any exception handling. If the underlying RAG code raises (e.g., index missing), the MCP server will error and likely close the connection instead of returning an MCP-formatted isError: true response. Wrap the call in try/except and return a tool-call result with isError: true and the exception message (or a generic failure message).
Suggested change:

```python
try:
    result = rag_query(question, llm_call=call_llm)
except Exception as exc:
    self.send_tool_call_error(req_id, f"RAG query failed: {exc}")
    return
```
```python
is_error = result.get("refused", False)
content_text = result.get("answer", "No answer provided.")

response = {
    "jsonrpc": "2.0",
    "id": req_id,
    "result": {
        "content": [{"type": "text", "text": content_text}],
        "isError": is_error
    }
}
```
The MCP tools/call response drops citations entirely: stub_rag.query returns cited_chunks, but the server only returns answer text. The UC-MCP README expects “answer + cited sources”, so consider appending a formatted citations list to content_text (or otherwise surfacing cited_chunks in the text) so clients can verify grounding.
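A minimal sketch of one way to surface the citations (assuming the `{answer, cited_chunks, refused}` result shape returned by stub_rag.query; the helper name is hypothetical):

```python
def format_tool_text(result):
    """Append cited sources to the answer text so MCP clients can verify grounding.

    `result` is assumed to look like
    {"answer": str, "cited_chunks": [str], "refused": bool}.
    """
    text = result.get("answer", "No answer provided.")
    citations = result.get("cited_chunks") or []
    if citations:
        # Surface sources in the text content of the tools/call response.
        text += "\n\nSources: " + ", ".join(citations)
    return text


print(format_tool_text({"answer": "Leave policy allows 20 days.",
                        "cited_chunks": ["HR_Policy[2]"],
                        "refused": False}))
```

The formatted string then slots directly into the existing `content_text` variable before the JSON-RPC response is built.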
```python
        chunks.append(". ".join(current_chunk) + ".")
        current_chunk = [sentence]
        current_length = sentence_len
    else:
        current_chunk.append(sentence)
        current_length += sentence_len

if current_chunk:
    chunks.append(". ".join(current_chunk) + ".")
return chunks
```
The chunk assembly always adds a trailing period (+ ".") even when the original sentence already ends with punctuation (e.g., text ending with "." and no ". " delimiter). This can produce doubled punctuation like "Hello.." and also converts !/? endings to .. Consider keeping the original sentence punctuation and joining with a space instead of forcing ..
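One possible join that preserves original punctuation (a sketch, assuming the splitter leaves each sentence's own terminator attached when present):

```python
def join_sentences(sentences):
    """Join sentence fragments, adding a period only where punctuation is missing."""
    out = []
    for s in sentences:
        s = s.strip()
        # Keep existing ".", "!", "?" endings; only bare fragments get a period.
        if s and s[-1] not in ".!?":
            s += "."
        out.append(s)
    return " ".join(out)


print(join_sentences(["Hello.", "Is the office open?", "Yes"]))
```

This avoids both the doubled `..` case and the silent conversion of `!`/`?` endings into `.`.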
```python
with open(path, 'r', encoding='utf-8') as f:
    content = f.read()
```
build_index does not implement the error handling described in uc-rag/skills.md/README (skip unreadable/missing files). A single unreadable .txt will currently raise and abort indexing. Wrap file I/O in try/except, log the skipped filename, and continue indexing remaining documents.
Suggested change:

```python
try:
    with open(path, 'r', encoding='utf-8') as f:
        content = f.read()
except OSError as e:
    print(f"Skipping unreadable or missing file {path}: {e}")
    continue
```
```python
if not relevant_chunks:
    return f"This question is not covered in the retrieved policy documents. Retrieved chunks: None. Please contact the relevant department for guidance."

context = "\n\n---\n\n".join(relevant_chunks)
answer = self.call_llm(user_query, context)
return f"{answer}\n\nCitations: {', '.join(citations)}"
```
On refusal you return a string with Retrieved chunks: None, but the UC-RAG refusal template requires listing the retrieved chunk sources (even when refusing). Also, on success query() returns a formatted string rather than a structured {answer, cited_chunks, refused} object like the stub and uc-rag/skills.md describe, which makes it harder to integrate with UC-MCP later. Consider returning a dict and including the candidate sources in the refusal path.
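A sketch of that structured shape (field names follow the `{answer, cited_chunks, refused}` contract the stub and uc-rag/skills.md describe; the helper itself is hypothetical):

```python
def make_result(answer=None, cited_chunks=None, refused=False):
    """Build a structured RAG result; on refusal, list candidate sources instead of 'None'."""
    cited = cited_chunks or []
    if refused:
        # The refusal template still names the retrieved chunk sources.
        listed = ", ".join(cited) if cited else "None"
        answer = ("This question is not covered in the retrieved policy documents. "
                  f"Retrieved chunks: {listed}. "
                  "Please contact the relevant department for guidance.")
    return {"answer": answer, "cited_chunks": cited, "refused": refused}


refusal = make_result(refused=True, cited_chunks=["HR_Policy[4]", "IT_Policy[1]"])
```

UC-MCP can then map `refused` straight onto `isError` instead of parsing a formatted string.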
```python
# 2. Filter & Format Context
relevant_chunks = []
citations = []
for i in range(len(results['ids'][0])):
    distance = results['distances'][0][i]
    # ChromaDB distances are L2; lower is better.
    # We approximate cosine similarity here (for all-MiniLM-L6-v2, it's roughly 1 - dist/2)
    similarity = 1 - (distance / 2)

    if similarity >= threshold:
        chunk_text = results['documents'][0][i]
        metadata = results['metadatas'][0][i]
        ref = f"{metadata['doc_name']}[{metadata['chunk_index']}]"
        relevant_chunks.append(f"Source: {ref}\n{chunk_text}")
        citations.append(ref)

if not relevant_chunks:
    return f"This question is not covered in the retrieved policy documents. Retrieved chunks: None. Please contact the relevant department for guidance."

context = "\n\n---\n\n".join(relevant_chunks)
answer = self.call_llm(user_query, context)
return f"{answer}\n\nCitations: {', '.join(citations)}"
```
The RAG prompt/context is built by concatenating chunks from potentially multiple documents into one context block. This conflicts with the enforcement rule in uc-rag/agents.md (“If a query spans two documents… never merge retrieved chunks from different documents into one answer”). If you keep multi-doc retrieval, you’ll need logic/prompting that answers per-document or otherwise prevents cross-document blending.
Suggested change:

```python
# 2. Filter & Format Context per document to avoid cross-document blending
chunks_by_doc = {}
citations_by_doc = {}
for i in range(len(results['ids'][0])):
    distance = results['distances'][0][i]
    # ChromaDB distances are L2; lower is better.
    # We approximate cosine similarity here (for all-MiniLM-L6-v2, it's roughly 1 - dist/2)
    similarity = 1 - (distance / 2)
    if similarity >= threshold:
        chunk_text = results['documents'][0][i]
        metadata = results['metadatas'][0][i]
        doc_name = metadata['doc_name']
        ref = f"{doc_name}[{metadata['chunk_index']}]"
        chunks_by_doc.setdefault(doc_name, []).append(f"Source: {ref}\n{chunk_text}")
        citations_by_doc.setdefault(doc_name, []).append(ref)

if not chunks_by_doc:
    return "This question is not covered in the retrieved policy documents. Retrieved chunks: None. Please contact the relevant department for guidance."

doc_answers = []
for doc_name, doc_chunks in chunks_by_doc.items():
    context = "\n\n---\n\n".join(doc_chunks)
    answer = self.call_llm(user_query, context)
    doc_answers.append(
        f"Document: {doc_name}\n"
        f"{answer}\n\n"
        f"Citations: {', '.join(citations_by_doc[doc_name])}"
    )
return "\n\n====\n\n".join(doc_answers)
```
```yaml
output: A dictionary with 'answer' and 'cited_chunks'.
error_handling: Detects RAG refusals and returns a formatted MCP error content with isError: true.
```
In UC-MCP, query_policy_documents is supposed to return MCP tool-call output ({"content": [...], "isError": bool}), not a raw {answer, cited_chunks} dict (see uc-mcp/README.md “Skills to Define”). Updating this keeps the skill definition aligned with the protocol your mcp_server.py implements.
Suggested change:

```yaml
output: MCP tool-call output in the form {"content": [...], "isError": bool}, where successful responses include the answer and citations in content.
error_handling: Detects RAG refusals and returns MCP tool-call output with formatted error content and isError: true.
```