31 changes: 23 additions & 8 deletions uc-0a/agents.md
@@ -11,17 +11,32 @@
# 4. Paste the output below

role: >
[FILL IN]
You are an AI classifier for the City Operations team.
You receive hundreds of citizen complaints per week and are responsible for
accurately categorizing them.

intent: >
[FILL IN]
Read each complaint and output a category, priority, reason, and flag.
The output feeds the Director's dashboard every Monday.
You must ensure reliable, standardized outputs without hallucinated values.

context: >
[FILL IN]
You must avoid these five failure modes:
1. Taxonomy drift (same complaint type getting different categories).
2. Severity blindness (e.g., missing urgent complaints involving children/injuries like "Child fell near school").
3. Missing justification (no reason field in output).
4. Hallucinated sub-categories (outputting items not in schema, e.g. "Pedestrian Safety Incident").
5. False confidence on ambiguity (classifying ambiguous inputs confidently without NEEDS_REVIEW).

Classification Schema Fields and Allowed Values:
- category: Pothole, Flooding, Streetlight, Waste, Noise, Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other
- priority: Urgent, Standard, Low
- reason: One sentence citing specific words from the description
- flag: NEEDS_REVIEW or blank

enforcement:
- "[FILL IN: category enum rule]"
- "[FILL IN: severity keyword rule — list the keywords]"
- "[FILL IN: reason field rule]"
- "[FILL IN: ambiguity refusal rule]"
- "[FILL IN: no invented categories rule]"
- "Category must be exactly one value from the allowed list: Pothole, Flooding, Streetlight, Waste, Noise, Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other. No variations."
- "Priority must be Urgent if description contains any severity keyword: injury, child, school, hospital, ambulance, fire, hazard, fell, collapse."
- "Every output row must include a reason field of one sentence citing specific words from the description."
- "If category cannot be determined confidently — output category: Other and flag: NEEDS_REVIEW."
- "Never invent category names outside the allowed list."
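The enforcement rules above can be checked mechanically on each output row. The following is a minimal sketch, not part of the PR: `validate_row` and its messages are hypothetical names, and the substring-based severity check is an assumption about how the keywords are meant to match.

```python
# Hypothetical validator for one classified row; names are illustrative only.
ALLOWED_CATEGORIES = {
    "Pothole", "Flooding", "Streetlight", "Waste", "Noise", "Road Damage",
    "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other",
}
ALLOWED_PRIORITIES = {"Urgent", "Standard", "Low"}
SEVERITY_KEYWORDS = ("injury", "child", "school", "hospital", "ambulance",
                     "fire", "hazard", "fell", "collapse")


def validate_row(description: str, row: dict) -> list[str]:
    """Return the rule violations for one classified row (empty list if clean)."""
    problems = []
    if row.get("category") not in ALLOWED_CATEGORIES:
        problems.append("category not in allowed list")
    if row.get("priority") not in ALLOWED_PRIORITIES:
        problems.append("priority not in allowed list")
    # Assumption: keywords match as substrings, so "child" also hits "children".
    text = description.lower()
    if any(kw in text for kw in SEVERITY_KEYWORDS) and row.get("priority") != "Urgent":
        problems.append("severity keyword present but priority is not Urgent")
    if not (row.get("reason") or "").strip():
        problems.append("reason is missing")
    if row.get("flag") not in ("", "NEEDS_REVIEW"):
        problems.append("flag must be NEEDS_REVIEW or blank")
    return problems
```

Running such a check over every row before the Monday dashboard load would surface taxonomy drift and severity blindness early, without relying on the classifier to police itself.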
84 changes: 82 additions & 2 deletions uc-0a/classifier.py
@@ -10,17 +10,97 @@
"""
import argparse
import csv
import re

SEVERITY_KEYWORDS = {"injury", "child", "school", "hospital", "ambulance", "fire", "hazard", "fell", "collapse"}
# Keyword aliases per category; "Other" is the fallback when none match.
ALLOWED_CATEGORIES = {
    "Pothole": ["pothole", "crater"],
    "Flooding": ["flood", "water", "inundated"],
    "Streetlight": ["light", "lamp", "dark"],
    "Waste": ["waste", "trash", "garbage", "rubbish"],
    "Noise": ["noise", "loud", "sound"],
    "Road Damage": ["road", "crack", "surface"],
    "Heritage Damage": ["heritage", "monument"],
    "Heat Hazard": ["heat", "sun"],
    "Drain Blockage": ["drain", "clog", "blockage", "sewer"],
}

def classify_complaint(row: dict) -> dict:
    """
    Classify a single complaint row.
    Returns dict with: complaint_id, category, priority, reason, flag
    """
    # csv.DictReader yields None for missing cells, so coerce to a string.
    raw_description = (row.get("description") or "").strip()
    description = raw_description.lower()

    # Check severity first so that short descriptions such as
    # "Child fell near school" are still escalated to Urgent.
    # Matching is by substring, so "child" also matches "children".
    is_urgent = any(kw in description for kw in SEVERITY_KEYWORDS)
    priority = "Urgent" if is_urgent else "Standard"

    # Error handling for short/vague descriptions
    if len(description.split()) < 5:
        return {
            "complaint_id": row.get("complaint_id"),
            "category": "Other",
            "priority": priority,
            "reason": "Description is too short or vague.",
            "flag": "NEEDS_REVIEW",
        }

    # Check category: the first category with a matching alias wins
    category = "Other"
    for cat, aliases in ALLOWED_CATEGORIES.items():
        if any(alias in description for alias in aliases):
            category = cat
            break

    flag = "NEEDS_REVIEW" if category == "Other" else ""

    # Reason: default to the first sentence, then prefer the sentence that
    # contains the keyword behind the priority or category decision.
    sentences = re.split(r'(?<=[.!?])\s+', raw_description)
    reason_sentence = sentences[0]
    if is_urgent:
        for s in sentences:
            if any(kw in s.lower() for kw in SEVERITY_KEYWORDS):
                reason_sentence = s.strip()
                break
    elif category != "Other":
        for s in sentences:
            if any(kw in s.lower() for kw in ALLOWED_CATEGORIES[category]):
                reason_sentence = s.strip()
                break

    return {
        "complaint_id": row.get("complaint_id"),
        "category": category,
        "priority": priority,
        "reason": reason_sentence,
        "flag": flag,
    }

def batch_classify(input_path: str, output_path: str):
    """Read input CSV, classify each row, write results CSV."""
    with open(input_path, 'r', encoding='utf-8') as f:
        reader = csv.DictReader(f)
        rows = list(reader)

    results = []
    fieldnames = ["complaint_id", "category", "priority", "reason", "flag"]
    for i, row in enumerate(rows):
        try:
            res = classify_complaint(row)
            results.append(res)
        except Exception as e:
            # One malformed row must not stop the rest of the batch.
            print(f"Skipping malformed row {i}: {e}")

    with open(output_path, 'w', encoding='utf-8', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(results)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="UC-0A Complaint Classifier")
16 changes: 16 additions & 0 deletions uc-0a/results_pune.csv
@@ -0,0 +1,16 @@
complaint_id,category,priority,reason,flag
PM-202401,Pothole,Standard,Large pothole 60cm wide causing tyre damage.,
PM-202402,Pothole,Urgent,School children at risk during morning hours.,
PM-202406,Flooding,Standard,Underpass flooded knee-deep after 2hrs rain.,
PM-202408,Flooding,Standard,Bus stand flooded.,
PM-202410,Streetlight,Standard,Three consecutive streetlights out for 10 days.,
PM-202411,Streetlight,Urgent,Electrical hazard reported.,
PM-202413,Waste,Standard,Overflowing garbage bins near vegetable market.,
PM-202418,Other,Standard,Wedding venue playing music past midnight on weeknights.,NEEDS_REVIEW
PM-202419,Road Damage,Standard,Road surface cracked and sinking near utility work done 1 month ago.,
PM-202420,Other,Urgent,Risk of serious injury to cyclists.,NEEDS_REVIEW
PM-202427,Flooding,Standard,Bridge approach floods in 30mins of rain.,
PM-202428,Other,Standard,Dead animal not removed for 36 hours.,NEEDS_REVIEW
PM-202430,Streetlight,Standard,"Heritage street, lights out.",
PM-202433,Waste,Standard,Bulk waste from apartment renovation dumped on public road.,
PM-202446,Other,Urgent,Elderly resident fell last week.,NEEDS_REVIEW
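A results file in this shape is easy to summarize before it reaches the dashboard. The sketch below is illustrative only: `summarize` is a hypothetical helper, and `SAMPLE` copies three rows from the results above rather than reading the real file.

```python
import csv
import io
from collections import Counter

# Three rows copied from the results above, used as inline sample data.
SAMPLE = """complaint_id,category,priority,reason,flag
PM-202401,Pothole,Standard,Large pothole 60cm wide causing tyre damage.,
PM-202420,Other,Urgent,Risk of serious injury to cyclists.,NEEDS_REVIEW
PM-202430,Streetlight,Standard,"Heritage street, lights out.",
"""


def summarize(csv_text: str) -> dict:
    """Count rows per category and list the urgent and flagged complaint IDs."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {
        "total": len(rows),
        "by_category": Counter(r["category"] for r in rows),
        "urgent": [r["complaint_id"] for r in rows if r["priority"] == "Urgent"],
        "needs_review": [r["complaint_id"] for r in rows if r["flag"] == "NEEDS_REVIEW"],
    }


summary = summarize(SAMPLE)
```

Listing the `needs_review` IDs separately gives a reviewer a short queue to work through instead of scanning the whole file.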
16 changes: 8 additions & 8 deletions uc-0a/skills.md
@@ -3,13 +3,13 @@

skills:
- name: classify_complaint
description: "[FILL IN]"
input: "[FILL IN]"
output: "[FILL IN]"
error_handling: "[FILL IN]"
description: "Reads a single citizen complaint and outputs a category, priority, reason, and flag according to the classification schema."
input: "One complaint row (dict with complaint_id, description, location fields)"
output: "Dictionary with complaint_id, category, priority, reason, flag"
error_handling: "If description is vague or short, output category as Other and flag as NEEDS_REVIEW"

- name: batch_classify
description: "[FILL IN]"
input: "[FILL IN]"
output: "[FILL IN]"
error_handling: "[FILL IN]"
description: "Reads a batch of complaints from an input CSV file, applies classification to each row, and writes the output to a results CSV file."
input: "Path to test CSV file"
output: "Path to results CSV file"
error_handling: "Malformed rows are logged and skipped, processing continues"
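The "logged and skipped, processing continues" contract for `batch_classify` can be sketched in isolation. This is a hedged illustration, not the PR's code: `classify_all` and `toy_classify` are hypothetical names, and the toy classifier exists only to trigger a failure on a row with no description.

```python
import logging

logger = logging.getLogger("batch_classify_sketch")


def classify_all(rows, classify):
    """Apply classify to each row; log and skip any row that raises."""
    results, skipped = [], []
    for index, row in enumerate(rows):
        try:
            results.append(classify(row))
        except Exception as exc:  # broad on purpose: one bad row must not stop the batch
            logger.warning("Skipping malformed row %d: %s", index, exc)
            skipped.append(index)
    return results, skipped


def toy_classify(row):
    # Hypothetical stand-in for classify_complaint; raises KeyError without a description.
    return {"complaint_id": row["complaint_id"], "length": len(row["description"])}


results, skipped = classify_all(
    [{"complaint_id": "A", "description": "abc"}, {"complaint_id": "B"}],
    toy_classify,
)
```

Returning the skipped indexes alongside the results keeps the failure visible to the caller, which a log line alone does not guarantee.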
33 changes: 8 additions & 25 deletions uc-mcp/agents.md
@@ -1,32 +1,15 @@
# agents.md — UC-MCP MCP Server
# INSTRUCTIONS:
# 1. Open your AI tool
# 2. Paste the full contents of uc-mcp/README.md
# 3. Use this prompt:
# "Read this UC README. Using the R.I.C.E framework, generate an
# agents.md YAML with four fields: role, intent, context, enforcement.
# The enforcement must include every rule listed under
# 'Enforcement Rules Your agents.md Must Include'.
# Output only valid YAML."
# 4. Paste the output below, replacing this placeholder
# 5. Pay special attention to enforcement rule 1 — the tool description
# must state exact document scope

role: >
[FILL IN: Who is this agent? What layer of the stack does it operate at?
Hint: an MCP server that exposes policy retrieval as a tool]
An MCP (Model Context Protocol) server operating at the integration layer that exposes policy retrieval as an external tool, allowing AI agents to discover and query the CMC policy documents securely and deterministically over a standard interface.

intent: >
[FILL IN: What does a correctly implemented MCP server produce?
Hint: JSON-RPC compliant responses, scoped tool description, correct refusals]
To expose a plain HTTP server that implements the `tools/list` and `tools/call` JSON-RPC methods and produces strictly JSON-RPC-compliant responses, a narrowly scoped tool description (`query_policy_documents`), and correct, deterministic refusals for out-of-scope questions.

context: >
[FILL IN: What does this server have access to?
Hint: RAG server results only — no direct LLM calls, no outside knowledge]
The server has access only to query results from the existing backend RAG server (covering the CMC HR, IT, and Finance policies). It cannot make direct LLM calls or rely on outside knowledge to answer questions.

enforcement:
- "[FILL IN: Tool description scope rule]"
- "[FILL IN: Refusal documentation rule]"
- "[FILL IN: inputSchema required field rule]"
- "[FILL IN: isError on failure rule]"
- "[FILL IN: HTTP 200 for all JSON-RPC responses rule]"
- "The tool description MUST explicitly state the exact document scope: CMC HR Leave Policy, IT Acceptable Use Policy, and Finance Reimbursement Policy."
- "The tool description MUST explicitly state what it cannot answer: any questions outside the stated three documents will return a refusal template."
- "The inputSchema MUST require `question` as a non-empty string property."
- "All error responses MUST use `isError: true` — never return an empty content array on failure."
- "The server MUST return HTTP 200 for every JSON-RPC response, including application-level errors, which are reported as JSON-RPC error objects. HTTP 4xx/5xx is reserved for transport-level failures."
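The five enforcement rules above translate fairly directly into response shapes. The sketch below is an assumption-laden illustration rather than the PR's server: `handle` is a hypothetical function that takes an already-parsed request, the tool description wording is invented to satisfy the scope rule, and the RAG call is replaced by a stub string.

```python
# Hypothetical tool definition; the description wording is illustrative.
TOOL = {
    "name": "query_policy_documents",
    "description": (
        "Answers questions about three CMC documents only: HR Leave Policy, "
        "IT Acceptable Use Policy, and Finance Reimbursement Policy. "
        "Questions outside these documents return a refusal."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {"question": {"type": "string", "minLength": 1}},
        "required": ["question"],
    },
}


def handle(request: dict) -> tuple[int, dict]:
    """Return (http_status, json_rpc_response) for one parsed JSON-RPC request."""
    req_id = request.get("id")
    method = request.get("method")
    if method == "tools/list":
        return 200, {"jsonrpc": "2.0", "id": req_id, "result": {"tools": [TOOL]}}
    if method == "tools/call":
        args = (request.get("params") or {}).get("arguments") or {}
        question = args.get("question")
        if not isinstance(question, str) or not question.strip():
            # Failure path: isError is true and content is never empty.
            result = {
                "content": [{"type": "text", "text": "question must be a non-empty string"}],
                "isError": True,
            }
        else:
            # Stub: a real server would forward the question to the RAG backend here.
            result = {
                "content": [{"type": "text", "text": f"(stub) received: {question}"}],
                "isError": False,
            }
        return 200, {"jsonrpc": "2.0", "id": req_id, "result": result}
    # Unknown method: still HTTP 200, with the standard JSON-RPC "Method not found" code.
    return 200, {
        "jsonrpc": "2.0",
        "id": req_id,
        "error": {"code": -32601, "message": "Method not found"},
    }
```

Keeping the HTTP status separate from the JSON-RPC payload in the return value makes the "HTTP 200 for all JSON-RPC responses" rule easy to assert in tests.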
Binary file not shown.
Empty file.
Binary file added uc-mcp/chroma_db/chroma.sqlite3
Binary file not shown.
15 changes: 15 additions & 0 deletions uc-mcp/debug_db.py
@@ -0,0 +1,15 @@
import sys
sys.path.append('../uc-rag')
from stub_rag import query, get_collection

q = "Who approves leave without pay?"
res = query(q)
print(f"Refused: {res['refused']}")
print(f"Answer: {res['answer']}")

collection = get_collection()
results = collection.query(query_embeddings=[[0]*384], n_results=10)  # dummy all-zero vector, returns up to 10 chunks
print("\nFirst 2 chunks in DB:")
for i in range(min(2, len(results['documents'][0]))):
    print(f"ID: {results['ids'][0][i]}")
    print(f"Doc: {results['documents'][0][i][:200]}...")