37 changes: 19 additions & 18 deletions uc-0a/agents.md
@@ -1,27 +1,28 @@
 # agents.md — UC-0A Complaint Classifier
-# INSTRUCTIONS:
-# 1. Open your AI tool
-# 2. Paste the full contents of uc-0a/README.md
-# 3. Use this prompt:
-# "Read this UC README. Using the R.I.C.E framework, generate an
-# agents.md YAML with four fields: role, intent, context, enforcement.
-# Enforcement must include every rule listed under
-# 'Enforcement Rules Your agents.md Must Include'.
-# Output only valid YAML."
-# 4. Paste the output below

 role: >
-  [FILL IN]
+  City Operations complaint classifier. Reads citizen complaint descriptions
+  and assigns a structured classification (category, priority, reason, flag)
+  according to a fixed schema. Operates within a closed taxonomy — no
+  creative interpretation, no invented labels.

 intent: >
-  [FILL IN]
+  For each complaint, produce a dict with exactly four fields:
+  category (from the allowed enum), priority (Urgent/Standard/Low),
+  reason (one sentence citing specific words from the description),
+  and flag (NEEDS_REVIEW or blank). The output feeds the Director's
+  weekly dashboard and must be deterministic and auditable.

 context: >
-  [FILL IN]
+  Input is one or more complaint rows from city-specific test CSVs
+  (data/city-test-files/test_[city].csv). Each row contains at minimum
+  a description and location. The classifier uses the policy documents
+  in data/policy-documents/ as reference for edge-case reasoning.
+  No external data sources or general knowledge may be used.

 enforcement:
-  - "[FILL IN: category enum rule]"
-  - "[FILL IN: severity keyword rule — list the keywords]"
-  - "[FILL IN: reason field rule]"
-  - "[FILL IN: ambiguity refusal rule]"
-  - "[FILL IN: no invented categories rule]"
+  - "Category must be exactly one value from the allowed list: Pothole, Flooding, Streetlight, Waste, Noise, Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other. No variations, synonyms, or invented sub-categories."
+  - "Priority must be Urgent if the description contains any severity keyword: injury, child, school, hospital, ambulance, fire, hazard, fell, collapse. Matching is case-insensitive."
+  - "Every output row must include a reason field containing one sentence that cites specific words from the complaint description."
+  - "If the category cannot be determined confidently, output category: Other and flag: NEEDS_REVIEW. Never guess on ambiguous input."
+  - "Never invent category names outside the allowed list. If the complaint does not fit any category, use Other."
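The enforcement rules above lend themselves to a mechanical check on each output row. The following is a minimal sketch, assuming a hypothetical helper `validate_row` that is not part of this PR. The allowed values are copied from the rules and the intent field; the helper only checks field values and does not try to verify that the reason is a single sentence.

```python
# Allowed values copied from the enforcement rules and the intent field.
ALLOWED_CATEGORIES = {
    "Pothole", "Flooding", "Streetlight", "Waste", "Noise",
    "Road Damage", "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other",
}
ALLOWED_PRIORITIES = {"Urgent", "Standard", "Low"}
ALLOWED_FLAGS = {"", "NEEDS_REVIEW"}


def validate_row(row: dict) -> list[str]:
    """Return a list of rule violations for one classified row (hypothetical helper)."""
    problems = []
    if row.get("category") not in ALLOWED_CATEGORIES:
        problems.append(f"category not in allowed list: {row.get('category')!r}")
    if row.get("priority") not in ALLOWED_PRIORITIES:
        problems.append(f"priority not recognised: {row.get('priority')!r}")
    if not (row.get("reason") or "").strip():
        problems.append("reason is empty")
    if (row.get("flag") or "") not in ALLOWED_FLAGS:
        problems.append(f"flag not recognised: {row.get('flag')!r}")
    return problems


# Illustrative rows; the values are invented for this example.
good = {"category": "Pothole", "priority": "Standard",
        "reason": "Description contains 'pothole'.", "flag": ""}
bad = {"category": "Road Crack", "priority": "High", "reason": "", "flag": "CHECK"}

print(validate_row(good))
print(len(validate_row(bad)))
```

A check like this could run over a results CSV after classification, so that a row with an invented category is caught before it reaches the dashboard.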
140 changes: 132 additions & 8 deletions uc-0a/classifier.py
@@ -1,26 +1,150 @@
 """
 UC-0A — Complaint Classifier
-classifier.py — Starter file
+classifier.py — Classifies city complaints using R.I.C.E enforcement rules.

-Build this using your AI coding tool:
-1. Share agents.md, skills.md, and uc-0a/README.md
-2. Ask the AI to implement this file
-3. Run: python3 classifier.py --input ../data/city-test-files/test_pune.csv \
-    --output results_pune.csv
+Run: python3 classifier.py --input ../data/city-test-files/test_pune.csv \
+    --output results_pune.csv
 """
 import argparse
 import csv
+import re

+ALLOWED_CATEGORIES = [
+    "Pothole", "Flooding", "Streetlight", "Waste", "Noise",
+    "Road Damage", "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other"
+]
+
+SEVERITY_KEYWORDS = [
+    "injur", "child", "school", "hospital", "ambulance",
+    "fire", "hazard", "fell", "collaps"
+]

+# Keyword patterns for category detection (order matters — first match wins)
+CATEGORY_PATTERNS = [
+    ("Pothole", [r"\bpotholes?\b"]),
+    ("Flooding", [r"\bfloods?\b", r"\bflooded\b", r"\bflooding\b", r"\bstranded\b", r"\bknee-deep\b",
+                  r"\brainwater\b", r"\bwaterlogg"]),
+    ("Drain Blockage", [r"\bdrains?\b", r"\bmanhole\b", r"\bsewer\b", r"\bblocked drain\b", r"\bdraining\b"]),
+    ("Streetlight", [r"\bstreetlights?\b", r"\bstreet lights?\b", r"\blights? out\b", r"\bflickering\b",
+                     r"\bunlit\b", r"\bdarkness\b", r"\bsubstation\b", r"\bwiring\b"]),
+    ("Waste", [r"\bgarbage\b", r"\bwaste\b", r"\brubbish\b", r"\boverflowing\b", r"\bdead animal\b",
+               r"\bnot removed\b", r"\bdumped\b"]),
+    ("Noise", [r"\bnoise\b", r"\bmusic\b", r"\bloud\b", r"\bmidnight\b", r"\bdrilling\b",
+               r"\bidling\b", r"\bband\b.*\bplaying\b", r"\bplaying\b.*\b(band|11\s*pm|midnight)"]),
+    ("Road Damage", [r"\broad.*crack\b", r"\bsinking\b", r"\bcracked\b", r"\bfootpath\b", r"\btiles? broken\b",
+                     r"\bupturned\b", r"\bcollapsed?\b", r"\bcrater\b", r"\bbuckled\b", r"\bsubsided\b"]),
+    ("Heritage Damage", [r"\bheritage\b", r"\bmonument\b", r"\bhistoric\b"]),
+    ("Heat Hazard", [r"\bheat\b", r"\bsunstroke\b", r"\bheatwave\b", r"\bmelting\b", r"\bbubbling\b",
+                     r"\b\d{2,}°?\s*c\b", r"\btemperatures?\b", r"\bfull sun\b"]),
+]


+def _detect_category(description: str) -> tuple[str, bool]:
+    """Return (category, confident) based on keyword matching."""
+    desc_lower = description.lower()
+    for category, patterns in CATEGORY_PATTERNS:
+        for pattern in patterns:
+            if re.search(pattern, desc_lower):
+                return category, True
+    return "Other", False


+def _detect_priority(description: str) -> str:
+    """Return Urgent if any severity keyword is found, else Standard."""
+    desc_lower = description.lower()
+    for keyword in SEVERITY_KEYWORDS:
+        if re.search(rf"\b{keyword}", desc_lower):
+            return "Urgent"
+    return "Standard"


+def _build_reason(description: str, category: str, priority: str) -> str:
+    """Build a one-sentence reason citing specific words from the description."""
+    desc_lower = description.lower()
+
+    # Find which severity keywords triggered Urgent
+    triggered_keywords = [
+        kw for kw in SEVERITY_KEYWORDS
+        if re.search(rf"\b{kw}", desc_lower)
+    ]
+
+    # Find which category pattern matched
+    matched_pattern_words = []
+    for cat, patterns in CATEGORY_PATTERNS:
+        if cat == category:
+            for pattern in patterns:
+                match = re.search(pattern, desc_lower)
+                if match:
+                    matched_pattern_words.append(match.group())
+                    break
+
+    parts = []
+    if matched_pattern_words:
+        parts.append(f"description contains '{matched_pattern_words[0]}' indicating {category}")
+    if triggered_keywords:
+        parts.append(f"severity keyword(s) [{', '.join(triggered_keywords)}] triggered Urgent priority")
+    if not parts:
+        parts.append("description did not match any known category pattern")
+
+    # Note: str.capitalize() also lowercases the rest of the string, so category
+    # names and "Urgent" appear in lowercase in the emitted reason text.
+    return ". ".join(parts).capitalize() + "."


 def classify_complaint(row: dict) -> dict:
     """
     Classify a single complaint row.
     Returns dict with: complaint_id, category, priority, reason, flag
     """
-    raise NotImplementedError("Build this using your AI tool + agents.md")
+    complaint_id = row.get("complaint_id", "UNKNOWN")
+    # csv.DictReader fills missing fields with None, so guard before .strip()
+    description = (row.get("description") or "").strip()
+
+    # Vague/short descriptions -> Other + NEEDS_REVIEW
+    if len(description) < 10:
+        return {
+            "complaint_id": complaint_id,
+            "category": "Other",
+            "priority": "Standard",
+            "reason": "Description too short or vague for confident classification.",
+            "flag": "NEEDS_REVIEW"
+        }
+
+    category, confident = _detect_category(description)
+    priority = _detect_priority(description)
+    reason = _build_reason(description, category, priority)
+    flag = "" if confident else "NEEDS_REVIEW"
+
+    return {
+        "complaint_id": complaint_id,
+        "category": category,
+        "priority": priority,
+        "reason": reason,
+        "flag": flag
+    }


 def batch_classify(input_path: str, output_path: str):
     """Read input CSV, classify each row, write results CSV."""
-    raise NotImplementedError("Build this using your AI tool + agents.md")
+    results = []
+    skipped = []
+
+    with open(input_path, newline="", encoding="utf-8") as f:
+        reader = csv.DictReader(f)
+        for i, row in enumerate(reader):
+            # Short rows yield None for missing columns; treat those as malformed too
+            if not (row.get("description") or "").strip():
+                skipped.append(i + 2)  # +2 for header + 0-index
+                continue
+            result = classify_complaint(row)
+            results.append(result)
+
+    fieldnames = ["complaint_id", "category", "priority", "reason", "flag"]
+    with open(output_path, "w", newline="", encoding="utf-8") as f:
+        writer = csv.DictWriter(f, fieldnames=fieldnames)
+        writer.writeheader()
+        writer.writerows(results)
+
+    print(f"Classified {len(results)} complaints.")
+    if skipped:
+        print(f"Skipped {len(skipped)} malformed rows (line numbers: {skipped})")


 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="UC-0A Complaint Classifier")
Expand Down
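Two matching behaviours in classifier.py are worth seeing in isolation: categories are tried in list order, so the first matching entry wins, and severity keywords are matched as word prefixes (`\b{keyword}` with no trailing boundary). The sketch below uses a subset of the patterns from the diff. The function names and sample descriptions are invented for illustration and are not taken from the test CSVs.

```python
import re

# Subset of CATEGORY_PATTERNS from classifier.py, in the same relative order.
CATEGORY_PATTERNS = [
    ("Pothole", [r"\bpotholes?\b"]),
    ("Flooding", [r"\bfloods?\b", r"\bflooded\b", r"\bflooding\b"]),
    ("Drain Blockage", [r"\bdrains?\b", r"\bmanhole\b"]),
]
# Subset of SEVERITY_KEYWORDS (stems, matched as prefixes).
SEVERITY_KEYWORDS = ["injur", "child", "school", "fell", "collaps"]


def detect_category(description: str) -> str:
    """Return the first category whose pattern matches, else Other."""
    desc = description.lower()
    for category, patterns in CATEGORY_PATTERNS:
        if any(re.search(p, desc) for p in patterns):
            return category
    return "Other"


def matched_keywords(description: str) -> list[str]:
    """Return every severity stem that matches at a word start."""
    desc = description.lower()
    return [kw for kw in SEVERITY_KEYWORDS if re.search(rf"\b{kw}", desc)]


# "drain" and "flooded" both appear; Flooding is listed first, so it wins.
print(detect_category("Blocked drain has flooded the lane"))

# Stems catch inflected forms such as "injured" and "collapsed".
print(matched_keywords("Two people were injured when the wall collapsed"))

# Prefix matching also catches unrelated words: "fellow" starts with "fell".
print(matched_keywords("A fellow resident reported the issue"))
```

The last case suggests that stems such as `fell` or `fire` may escalate some complaints to Urgent unintentionally. Whether that trade-off is acceptable depends on the README's rules, which are not shown in this diff.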
16 changes: 16 additions & 0 deletions uc-0a/results_ahmedabad.csv
@@ -0,0 +1,16 @@
+complaint_id,category,priority,reason,flag
+AM-202401,Heat Hazard,Standard,Description contains 'melting' indicating heat hazard.,
+AM-202402,Heat Hazard,Standard,Description contains 'temperatures' indicating heat hazard.,
+AM-202405,Other,Standard,Description did not match any known category pattern.,NEEDS_REVIEW
+AM-202406,Heat Hazard,Standard,Description contains 'heatwave' indicating heat hazard.,
+AM-202407,Road Damage,Urgent,"Description contains 'upturned' indicating road damage. severity keyword(s) [injur, child] triggered urgent priority.",
+AM-202410,Pothole,Standard,Description contains 'pothole' indicating pothole.,
+AM-202414,Streetlight,Standard,Description contains 'unlit' indicating streetlight.,
+AM-202417,Waste,Standard,Description contains 'waste' indicating waste.,
+AM-202421,Noise,Standard,Description contains 'music' indicating noise.,
+AM-202424,Heat Hazard,Standard,Description contains 'bubbling' indicating heat hazard.,
+AM-202429,Heat Hazard,Standard,Description contains '52°c' indicating heat hazard.,
+AM-202431,Heritage Damage,Standard,Description contains 'heritage' indicating heritage damage.,
+AM-202435,Heat Hazard,Standard,Description contains 'heat' indicating heat hazard.,
+AM-202444,Waste,Standard,Description contains 'waste' indicating waste.,
+AM-202445,Heat Hazard,Standard,Description contains 'full sun' indicating heat hazard.,
16 changes: 16 additions & 0 deletions uc-0a/results_hyderabad.csv
@@ -0,0 +1,16 @@
+complaint_id,category,priority,reason,flag
+GH-202401,Flooding,Urgent,Description contains 'flooded' indicating flooding. severity keyword(s) [ambulance] triggered urgent priority.,
+GH-202402,Flooding,Standard,Description contains 'flooded' indicating flooding.,
+GH-202406,Drain Blockage,Standard,Description contains 'drain' indicating drain blockage.,
+GH-202407,Drain Blockage,Standard,Description contains 'drain' indicating drain blockage.,
+GH-202410,Pothole,Standard,Description contains 'potholes' indicating pothole.,
+GH-202411,Pothole,Urgent,Description contains 'pothole' indicating pothole. severity keyword(s) [hospital] triggered urgent priority.,
+GH-202412,Pothole,Urgent,Description contains 'potholes' indicating pothole. severity keyword(s) [school] triggered urgent priority.,
+GH-202417,Waste,Standard,Description contains 'garbage' indicating waste.,
+GH-202420,Noise,Standard,Description contains 'drilling' indicating noise.,
+GH-202422,Road Damage,Urgent,Description contains 'collapsed' indicating road damage. severity keyword(s) [collaps] triggered urgent priority.,
+GH-202424,Flooding,Standard,Description contains 'floods' indicating flooding.,
+GH-202428,Waste,Standard,Description contains 'waste' indicating waste.,
+GH-202432,Noise,Standard,Description contains 'idling' indicating noise.,
+GH-202448,Flooding,Standard,Description contains 'flooding' indicating flooding.,
+GH-202438,Flooding,Standard,Description contains 'rainwater' indicating flooding.,
16 changes: 16 additions & 0 deletions uc-0a/results_kolkata.csv
@@ -0,0 +1,16 @@
+complaint_id,category,priority,reason,flag
+KM-202401,Heritage Damage,Standard,Description contains 'heritage' indicating heritage damage.,
+KM-202402,Heritage Damage,Standard,Description contains 'historic' indicating heritage damage.,
+KM-202405,Noise,Standard,Description contains 'band playing' indicating noise.,
+KM-202409,Pothole,Standard,Description contains 'potholes' indicating pothole.,
+KM-202410,Pothole,Standard,Description contains 'pothole' indicating pothole.,
+KM-202411,Pothole,Standard,Description contains 'pothole' indicating pothole.,
+KM-202415,Drain Blockage,Standard,Description contains 'draining' indicating drain blockage.,
+KM-202418,Waste,Standard,Description contains 'waste' indicating waste.,
+KM-202421,Road Damage,Urgent,"Description contains 'sinking' indicating road damage. severity keyword(s) [hospital, fell] triggered urgent priority.",
+KM-202422,Road Damage,Standard,Description contains 'buckled' indicating road damage.,
+KM-202426,Heritage Damage,Standard,Description contains 'heritage' indicating heritage damage.,
+KM-202430,Road Damage,Standard,Description contains 'subsided' indicating road damage.,
+KM-202434,Heritage Damage,Standard,Description contains 'heritage' indicating heritage damage.,
+KM-202436,Streetlight,Standard,Description contains 'darkness' indicating streetlight.,
+KM-202438,Heritage Damage,Standard,Description contains 'heritage' indicating heritage damage.,
16 changes: 16 additions & 0 deletions uc-0a/results_pune.csv
@@ -0,0 +1,16 @@
+complaint_id,category,priority,reason,flag
+PM-202401,Pothole,Standard,Description contains 'pothole' indicating pothole.,
+PM-202402,Pothole,Urgent,"Description contains 'pothole' indicating pothole. severity keyword(s) [child, school] triggered urgent priority.",
+PM-202406,Flooding,Standard,Description contains 'flooded' indicating flooding.,
+PM-202408,Flooding,Standard,Description contains 'flooded' indicating flooding.,
+PM-202410,Streetlight,Standard,Description contains 'streetlights' indicating streetlight.,
+PM-202411,Streetlight,Urgent,Description contains 'streetlight' indicating streetlight. severity keyword(s) [hazard] triggered urgent priority.,
+PM-202413,Waste,Standard,Description contains 'garbage' indicating waste.,
+PM-202418,Noise,Standard,Description contains 'music' indicating noise.,
+PM-202419,Road Damage,Standard,Description contains 'sinking' indicating road damage.,
+PM-202420,Drain Blockage,Urgent,Description contains 'manhole' indicating drain blockage. severity keyword(s) [injur] triggered urgent priority.,
+PM-202427,Flooding,Standard,Description contains 'floods' indicating flooding.,
+PM-202428,Waste,Standard,Description contains 'dead animal' indicating waste.,
+PM-202430,Streetlight,Standard,Description contains 'lights out' indicating streetlight.,
+PM-202433,Waste,Standard,Description contains 'waste' indicating waste.,
+PM-202446,Road Damage,Urgent,Description contains 'footpath' indicating road damage. severity keyword(s) [fell] triggered urgent priority.,
40 changes: 31 additions & 9 deletions uc-0a/skills.md
@@ -1,15 +1,37 @@
 # skills.md — UC-0A Complaint Classifier
-# INSTRUCTIONS: Same as agents.md — paste README into AI, ask for skills.md YAML

 skills:
   - name: classify_complaint
-    description: "[FILL IN]"
-    input: "[FILL IN]"
-    output: "[FILL IN]"
-    error_handling: "[FILL IN]"
+    description: >
+      Classifies a single citizen complaint into a structured output
+      using the fixed classification schema. Applies severity keyword
+      detection for priority escalation and taxonomy enforcement for
+      category assignment.
+    input: >
+      One complaint row as a dict with at minimum a description field
+      and a location field.
+    output: >
+      A dict with four fields:
+      category — one of: Pothole, Flooding, Streetlight, Waste, Noise,
+      Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other
+      priority — Urgent, Standard, or Low
+      reason — one sentence citing specific words from the description
+      flag — NEEDS_REVIEW or blank
+    error_handling: >
+      Vague, extremely short, or unintelligible descriptions produce
+      category: Other, flag: NEEDS_REVIEW, and a reason stating the
+      description was insufficient for confident classification.

   - name: batch_classify
-    description: "[FILL IN]"
-    input: "[FILL IN]"
-    output: "[FILL IN]"
-    error_handling: "[FILL IN]"
+    description: >
+      Processes an entire test CSV file of complaints, applying
+      classify_complaint to each row and writing results to an output CSV.
+    input: >
+      Path to a test CSV file (e.g., data/city-test-files/test_pune.csv).
+    output: >
+      Path to a results CSV file (e.g., uc-0a/results_pune.csv) containing
+      complaint_id, category, priority, reason, and flag.
+    error_handling: >
+      Malformed or unreadable rows are logged with their row index and
+      skipped. Processing continues for remaining rows. A summary of
+      skipped rows is printed at the end.
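The skip-and-continue behaviour described for batch_classify depends on how `csv.DictReader` represents short rows: missing columns are filled with `None`, so calling `.strip()` on the raw value would raise. A minimal sketch of a guard, assuming a hypothetical helper `read_valid_rows` and invented sample rows (neither appears in the PR):

```python
import csv
import io


def read_valid_rows(f):
    """Split rows into (valid_rows, skipped_line_numbers). Hypothetical helper."""
    reader = csv.DictReader(f)
    valid, skipped = [], []
    for row in reader:
        # Short rows get None for missing columns, so guard before .strip().
        description = (row.get("description") or "").strip()
        if not description:
            # line_num is the source line of the row just read.
            skipped.append(reader.line_num)
            continue
        valid.append(row)
    return valid, skipped


# Invented sample: one usable row, one short row, one whitespace-only description.
sample = io.StringIO(
    "complaint_id,description\n"
    "X-1,Large pothole near the market\n"
    "X-2\n"
    "X-3,   \n"
)
rows, skipped = read_valid_rows(sample)
print(len(rows), skipped)
```

Using `reader.line_num` rather than an enumerate offset keeps the reported line numbers aligned with the file even when the reader silently drops blank lines.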
43 changes: 20 additions & 23 deletions uc-mcp/agents.md
@@ -1,32 +1,29 @@
 # agents.md — UC-MCP MCP Server
-# INSTRUCTIONS:
-# 1. Open your AI tool
-# 2. Paste the full contents of uc-mcp/README.md
-# 3. Use this prompt:
-# "Read this UC README. Using the R.I.C.E framework, generate an
-# agents.md YAML with four fields: role, intent, context, enforcement.
-# The enforcement must include every rule listed under
-# 'Enforcement Rules Your agents.md Must Include'.
-# Output only valid YAML."
-# 4. Paste the output below, replacing this placeholder
-# 5. Pay special attention to enforcement rule 1 — the tool description
-# must state exact document scope

 role: >
-  [FILL IN: Who is this agent? What layer of the stack does it operate at?
-  Hint: an MCP server that exposes policy retrieval as a tool]
+  MCP (Model Context Protocol) server that exposes the RAG policy retrieval
+  system as a discoverable tool over plain HTTP using JSON-RPC 2.0. Operates
+  at the tool-serving layer — it does not answer questions itself but
+  delegates to the RAG server and returns structured, protocol-compliant
+  responses.

 intent: >
-  [FILL IN: What does a correctly implemented MCP server produce?
-  Hint: JSON-RPC compliant responses, scoped tool description, correct refusals]
+  Serve a single MCP tool (query_policy_documents) that any AI agent can
+  discover via tools/list and invoke via tools/call. Responses must be
+  JSON-RPC 2.0 compliant. The tool description must clearly communicate
+  scope so agents only call it for relevant queries. Refusals and errors
+  must be signalled correctly through the protocol.

 context: >
-  [FILL IN: What does this server have access to?
-  Hint: RAG server results only — no direct LLM calls, no outside knowledge]
+  The server wraps the RAG server (rag_server.py or stub_rag.py) from
+  UC-RAG. It has access to RAG query results only — it does not make
+  direct LLM calls and has no knowledge outside of what the RAG server
+  returns. The three policy documents in scope are: CMC HR Leave Policy,
+  IT Acceptable Use Policy, and Finance Reimbursement Policy.

 enforcement:
-  - "[FILL IN: Tool description scope rule]"
-  - "[FILL IN: Refusal documentation rule]"
-  - "[FILL IN: inputSchema required field rule]"
-  - "[FILL IN: isError on failure rule]"
-  - "[FILL IN: HTTP 200 for all JSON-RPC responses rule]"
+  - "Tool description must state the exact document scope: CMC HR Leave Policy, IT Acceptable Use Policy, and Finance Reimbursement Policy. A vague description causes agents to call the tool for out-of-scope questions."
+  - "Tool description must state what it cannot answer: questions outside these three policy documents will return a refusal. This prevents wasted tool calls from agents."
+  - "inputSchema must require 'question' as a non-empty string. The question field is the only accepted input."
+  - "Error responses must use isError: true in the result. Never return an empty content array on failure — always include a descriptive error message."
+  - "The server must return HTTP 200 for all JSON-RPC responses, including application-level errors. Transport errors (malformed HTTP, etc.) use HTTP 4xx/5xx. Application errors use JSON-RPC error objects within a 200 response."
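The enforcement rules above map onto two JSON-RPC 2.0 payloads: a `tools/list` result carrying the scoped description and `inputSchema`, and a `tools/call` result that signals failure with `isError`. The sketch below builds both as plain dicts. The function names, the description wording, and the error text are assumptions for illustration; only the tool name `query_policy_documents` and the three document titles come from the diff.

```python
import json

TOOL_NAME = "query_policy_documents"

# Hypothetical wording; the rules only require that scope and refusals are stated.
TOOL_DESCRIPTION = (
    "Answers questions about three documents only: CMC HR Leave Policy, "
    "IT Acceptable Use Policy, and Finance Reimbursement Policy. "
    "Questions outside these documents return a refusal."
)


def tools_list_response(request_id):
    """Build a JSON-RPC 2.0 response for tools/list (illustrative helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "tools": [{
                "name": TOOL_NAME,
                "description": TOOL_DESCRIPTION,
                "inputSchema": {
                    "type": "object",
                    "properties": {"question": {"type": "string", "minLength": 1}},
                    "required": ["question"],
                },
            }]
        },
    }


def tool_call_response(request_id, text, is_error=False):
    """Build a JSON-RPC 2.0 response for tools/call; content is never empty."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "content": [{"type": "text", "text": text}],
            "isError": is_error,
        },
    }


# Invented failure message, shown only to illustrate the isError shape.
failure = tool_call_response(2, "RAG server unavailable", is_error=True)
print(json.dumps(failure, indent=2))
```

Under the last rule, both payloads would be sent with HTTP 200, with non-200 status codes reserved for transport-level problems.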