65 changes: 47 additions & 18 deletions uc-0a/agents.md
@@ -1,27 +1,56 @@
# agents.md — UC-0A Complaint Classifier
# INSTRUCTIONS:
# 1. Open your AI tool
# 2. Paste the full contents of uc-0a/README.md
# 3. Use this prompt:
# "Read this UC README. Using the R.I.C.E framework, generate an
# agents.md YAML with four fields: role, intent, context, enforcement.
# Enforcement must include every rule listed under
# 'Enforcement Rules Your agents.md Must Include'.
# Output only valid YAML."
# 4. Paste the output below
# R.I.C.E Framework: Role, Intent, Context, Enforcement

role: >
[FILL IN]
You are a City Operations Complaint Classifier. You read citizen complaint
descriptions and produce structured classification output containing a
category, priority, reason, and flag. Your output feeds the Director's
weekly dashboard and must be deterministic, schema-compliant, and
traceable back to the complaint text.

intent: >
[FILL IN]
For every complaint row, produce exactly four fields — category, priority,
reason, flag — that conform to a fixed schema. Prevent taxonomy drift,
severity blindness, missing justifications, hallucinated sub-categories,
and false confidence on ambiguous inputs. When in doubt, refuse to guess:
output category "Other" with flag "NEEDS_REVIEW".

context: >
[FILL IN]
The City Operations team receives hundreds of complaints per week across
categories such as potholes, flooding, streetlights, waste, noise, road
damage, heritage damage, heat hazards, and drain blockages. Each complaint
has fields: complaint_id, date_raised, city, ward, location, description,
reported_by, days_open. The classifier must process these rows, applying a
fixed taxonomy and severity keyword list, and output a results CSV with
complaint_id, category, priority, reason, and flag columns.

enforcement:
- "[FILL IN: category enum rule]"
- "[FILL IN: severity keyword rule — list the keywords]"
- "[FILL IN: reason field rule]"
- "[FILL IN: ambiguity refusal rule]"
- "[FILL IN: no invented categories rule]"
- >
CATEGORY ENUM RULE: category must be exactly one value from the allowed
list: Pothole, Flooding, Streetlight, Waste, Noise, Road Damage,
Heritage Damage, Heat Hazard, Drain Blockage, Other. No synonyms, no
variations, no invented sub-categories. If the complaint does not clearly
match any category, use "Other".
- >
SEVERITY KEYWORD RULE: priority must be "Urgent" if the complaint
description (case-insensitive) contains ANY of these keywords: injury,
child, school, hospital, ambulance, fire, hazard, fell, collapse.
Otherwise, assign "Standard" for actionable complaints or "Low" for
informational-only complaints. Priority values are restricted to:
Urgent, Standard, Low.
- >
REASON FIELD RULE: every output row must include a reason field
containing exactly one sentence that cites specific words or phrases
directly from the complaint description. Never leave the reason blank.
Never fabricate details not present in the original text.
- >
AMBIGUITY REFUSAL RULE: if the complaint description is vague, too
short (fewer than 5 meaningful words), or does not clearly map to a
single category, output category as "Other" and flag as "NEEDS_REVIEW".
Never classify ambiguous complaints confidently.
- >
NO INVENTED CATEGORIES RULE: never create, infer, or output category
names outside the allowed list. Examples of banned outputs include
"Pedestrian Safety Incident", "Traffic Issue", "Water Logging", or any
other label not in the enumerated set. If tempted to invent a category,
use "Other" instead.
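The enforcement rules above are concrete enough to check mechanically on each output row. A minimal sketch, assuming a hypothetical helper `validate_row` that is not part of this PR; the category and priority values are copied from the rules, while the set of allowed flag values (empty or "NEEDS_REVIEW") is an assumption, since agents.md only names "NEEDS_REVIEW".

```python
# Hypothetical validator; allowed values are copied from the enforcement rules.
ALLOWED_CATEGORIES = {
    "Pothole", "Flooding", "Streetlight", "Waste", "Noise",
    "Road Damage", "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other",
}
ALLOWED_PRIORITIES = {"Urgent", "Standard", "Low"}
# Assumption: a flag is either empty or "NEEDS_REVIEW".
ALLOWED_FLAGS = {"", "NEEDS_REVIEW"}


def validate_row(row):
    """Return a list of rule violations for one output row (empty if none)."""
    problems = []
    if row.get("category") not in ALLOWED_CATEGORIES:
        problems.append("category not in allowed list: %r" % row.get("category"))
    if row.get("priority") not in ALLOWED_PRIORITIES:
        problems.append("priority not in allowed list: %r" % row.get("priority"))
    if not (row.get("reason") or "").strip():
        problems.append("reason is blank")
    if (row.get("flag") or "") not in ALLOWED_FLAGS:
        problems.append("unexpected flag: %r" % row.get("flag"))
    return problems
```

A check like this could run over every row before the results CSV is written, so that an invented category such as "Water Logging" is reported instead of reaching the dashboard.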
173 changes: 162 additions & 11 deletions uc-0a/classifier.py
@@ -1,31 +1,182 @@
"""
UC-0A — Complaint Classifier
classifier.py — Starter file
classifier.py — Implementation based on agents.md and skills.md

Build this using your AI coding tool:
1. Share agents.md, skills.md, and uc-0a/README.md
2. Ask the AI to implement this file
3. Run: python3 classifier.py --input ../data/city-test-files/test_pune.csv \
--output results_pune.csv
Enforcement rules applied:
1. CATEGORY ENUM RULE — fixed allowed list, no invented categories
2. SEVERITY KEYWORD RULE — keyword scan triggers Urgent priority
3. REASON FIELD RULE — every row gets a one-sentence reason citing the text
4. AMBIGUITY REFUSAL RULE — vague/short → Other + NEEDS_REVIEW
5. NO INVENTED CATEGORIES RULE — only allowed enum values emitted

Run:
python classifier.py --input ../data/city-test-files/test_pune.csv --output results_pune.csv
"""

import argparse
import csv
import sys
import re

# ── Fixed taxonomy from agents.md enforcement ────────────────────────────────

ALLOWED_CATEGORIES = [
"Pothole", "Flooding", "Streetlight", "Waste", "Noise",
"Road Damage", "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other"
]

SEVERITY_KEYWORDS = [
"injury", "child", "school", "hospital", "ambulance",
"fire", "hazard", "fell", "collapse"
]

# ── Keyword-to-category mapping ──────────────────────────────────────────────
# Each tuple: (list of keywords, category)
# Order matters — first match wins; more specific patterns come first.

CATEGORY_RULES = [
(["pothole", "tyre damage", "crater"], "Pothole"),
(["flood", "flooded", "waterlog", "water-log", "submerged",
"knee-deep", "stranded"], "Flooding"),
(["streetlight", "street light", "lights out", "dark at night",
"flickering", "sparking", "lamp post"], "Streetlight"),
(["garbage", "waste", "rubbish", "trash", "overflowing bin",
"dumped", "dead animal", "bulk waste", "litter"], "Waste"),
(["noise", "loud music", "music past midnight", "decibel",
"honking", "sound pollution"], "Noise"),
(["road surface", "crack", "sinking", "broken road",
"road damage", "footpath", "tiles broken", "upturned",
"manhole", "missing cover"], "Road Damage"),
(["heritage", "monument", "historical", "old city",
"heritage street"], "Heritage Damage"),
(["heat", "heatwave", "sunstroke", "temperature",
"hot surface"], "Heat Hazard"),
(["drain", "blocked drain", "clogged", "sewer",
"drain block", "nullah"], "Drain Blockage"),
]

MIN_MEANINGFUL_WORDS = 5


# ── Skill: classify_complaint ────────────────────────────────────────────────

def classify_complaint(row: dict) -> dict:
"""
Classify a single complaint row.
Returns dict with: complaint_id, category, priority, reason, flag

Applies all five enforcement rules from agents.md.
"""
raise NotImplementedError("Build this using your AI tool + agents.md")
complaint_id = row.get("complaint_id", "UNKNOWN")
description = row.get("description", "").strip()
desc_lower = description.lower()

# ── AMBIGUITY REFUSAL RULE ───────────────────────────────────────────
meaningful_words = [w for w in re.findall(r"[a-zA-Z]+", description) if len(w) > 1]
if not description or len(meaningful_words) < MIN_MEANINGFUL_WORDS:
return {
"complaint_id": complaint_id,
"category": "Other",
"priority": "Low",
"reason": "Description too vague or short for confident classification.",
"flag": "NEEDS_REVIEW",
}

# ── SEVERITY KEYWORD RULE ────────────────────────────────────────────
matched_severity = [kw for kw in SEVERITY_KEYWORDS if kw in desc_lower]
priority = "Urgent" if matched_severity else "Standard"

# ── CATEGORY ENUM RULE + NO INVENTED CATEGORIES RULE ─────────────────
category = None
matched_cat_keywords = []

for keywords, cat in CATEGORY_RULES:
hits = [kw for kw in keywords if kw in desc_lower]
if hits:
category = cat
matched_cat_keywords = hits
break # first match wins

flag = ""
if category is None:
category = "Other"
flag = "NEEDS_REVIEW"

# ── REASON FIELD RULE ────────────────────────────────────────────────
if matched_cat_keywords and matched_severity:
reason = (
f"Classified as {category} due to mention of "
f"{', '.join(repr(k) for k in matched_cat_keywords)}; "
f"priority Urgent due to severity keyword "
f"{', '.join(repr(k) for k in matched_severity)}."
)
elif matched_cat_keywords:
reason = (
f"Classified as {category} due to mention of "
f"{', '.join(repr(k) for k in matched_cat_keywords)} "
f"in the description."
)
elif flag == "NEEDS_REVIEW":
reason = (
"Description does not clearly map to a single category."
)
else:
reason = f"Classified as {category} based on overall description."

return {
"complaint_id": complaint_id,
"category": category,
"priority": priority,
"reason": reason,
"flag": flag,
}


# ── Skill: batch_classify ───────────────────────────────────────────────────

def batch_classify(input_path: str, output_path: str):
"""Read input CSV, classify each row, write results CSV."""
raise NotImplementedError("Build this using your AI tool + agents.md")
"""
Read input CSV, classify each row, write results CSV.

Malformed rows are logged to stderr and skipped.
Processing continues for all remaining rows.
"""
results = []

try:
with open(input_path, newline="", encoding="utf-8") as f:
reader = csv.DictReader(f)
for idx, row in enumerate(reader, start=2): # row 1 = header
try:
result = classify_complaint(row)
results.append(result)
except Exception as e:
print(
f"WARNING: Skipping malformed row {idx}: {e}",
file=sys.stderr,
)
except FileNotFoundError:
raise FileNotFoundError(
f"Input file not found: {input_path}. "
f"Check the path and try again."
)

# Write results CSV
fieldnames = ["complaint_id", "category", "priority", "reason", "flag"]
with open(output_path, "w", newline="", encoding="utf-8") as f:
writer = csv.DictWriter(f, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(results)

print(f"Processed {len(results)} complaints.")


# ── CLI entry point ──────────────────────────────────────────────────────────

if __name__ == "__main__":
parser = argparse.ArgumentParser(description="UC-0A Complaint Classifier")
parser.add_argument("--input", required=True)
parser.add_argument("--output", required=True)
parser.add_argument("--input", required=True, help="Path to input CSV file")
parser.add_argument("--output", required=True, help="Path to output results CSV")
args = parser.parse_args()
batch_classify(args.input, args.output)
print(f"Done. Results written to {args.output}")
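Both the severity scan and the category rules in classifier.py test keywords with plain substring containment (`kw in desc_lower`), so a keyword can match in the middle of an unrelated word, for example "heat" inside "theatre". A minimal sketch of word-start matching, assuming a hypothetical helper `find_keywords` that is not part of this PR. Anchoring only the start of the word keeps inflected forms such as "children" and "collapsed" matching, but it does not stop "fell" from matching "fellow". The results files in this PR were produced with substring matching, so swapping this in could change some rows.

```python
import re


def find_keywords(text, keywords):
    """Return the keywords that occur at the start of a word in text.

    Matching is case-insensitive. Multi-word keywords such as "street light"
    are matched literally after escaping.
    """
    lowered = text.lower()
    return [kw for kw in keywords if re.search(r"\b" + re.escape(kw), lowered)]


# Example: "heat" no longer matches inside "theatre".
print(find_keywords("Loud show at the theatre", ["heat", "noise"]))
print(find_keywords("Two children collapsed near the school", ["child", "collapse", "school"]))
```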
16 changes: 16 additions & 0 deletions uc-0a/results_ahmedabad.csv
@@ -0,0 +1,16 @@
complaint_id,category,priority,reason,flag
AM-202401,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
AM-202402,Heat Hazard,Standard,Classified as Heat Hazard due to mention of 'temperature' in the description.,
AM-202405,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
AM-202406,Heat Hazard,Standard,"Classified as Heat Hazard due to mention of 'heat', 'heatwave' in the description.",
AM-202407,Road Damage,Urgent,Classified as Road Damage due to mention of 'upturned'; priority Urgent due to severity keyword 'child'.,
AM-202410,Pothole,Standard,Classified as Pothole due to mention of 'pothole' in the description.,
AM-202414,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
AM-202417,Waste,Standard,Classified as Waste due to mention of 'waste' in the description.,
AM-202421,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
AM-202424,Road Damage,Standard,Classified as Road Damage due to mention of 'road surface' in the description.,
AM-202429,Heat Hazard,Standard,Classified as Heat Hazard due to mention of 'temperature' in the description.,
AM-202431,Heritage Damage,Standard,"Classified as Heritage Damage due to mention of 'heritage', 'old city' in the description.",
AM-202435,Heat Hazard,Standard,Classified as Heat Hazard due to mention of 'heat' in the description.,
AM-202444,Waste,Standard,Classified as Waste due to mention of 'waste' in the description.,
AM-202445,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
16 changes: 16 additions & 0 deletions uc-0a/results_hyderabad.csv
@@ -0,0 +1,16 @@
complaint_id,category,priority,reason,flag
GH-202401,Flooding,Urgent,"Classified as Flooding due to mention of 'flood', 'flooded'; priority Urgent due to severity keyword 'ambulance'.",
GH-202402,Flooding,Standard,"Classified as Flooding due to mention of 'flood', 'flooded' in the description.",
GH-202406,Drain Blockage,Standard,Classified as Drain Blockage due to mention of 'drain' in the description.,
GH-202407,Drain Blockage,Standard,"Classified as Drain Blockage due to mention of 'drain', 'drain block' in the description.",
GH-202410,Pothole,Standard,Classified as Pothole due to mention of 'pothole' in the description.,
GH-202411,Pothole,Urgent,Classified as Pothole due to mention of 'pothole'; priority Urgent due to severity keyword 'hospital'.,
GH-202412,Pothole,Urgent,Classified as Pothole due to mention of 'pothole'; priority Urgent due to severity keyword 'school'.,
GH-202417,Waste,Standard,"Classified as Waste due to mention of 'garbage', 'waste' in the description.",
GH-202420,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
GH-202422,Pothole,Urgent,Classified as Pothole due to mention of 'crater'; priority Urgent due to severity keyword 'collapse'.,
GH-202424,Flooding,Standard,Classified as Flooding due to mention of 'flood' in the description.,
GH-202428,Waste,Standard,Classified as Waste due to mention of 'waste' in the description.,
GH-202432,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
GH-202448,Flooding,Standard,Classified as Flooding due to mention of 'flood' in the description.,
GH-202438,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
16 changes: 16 additions & 0 deletions uc-0a/results_kolkata.csv
@@ -0,0 +1,16 @@
complaint_id,category,priority,reason,flag
KM-202401,Streetlight,Standard,Classified as Streetlight due to mention of 'lamp post' in the description.,
KM-202402,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
KM-202405,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
KM-202409,Pothole,Standard,Classified as Pothole due to mention of 'pothole' in the description.,
KM-202410,Pothole,Standard,Classified as Pothole due to mention of 'pothole' in the description.,
KM-202411,Pothole,Standard,Classified as Pothole due to mention of 'pothole' in the description.,
KM-202415,Drain Blockage,Standard,Classified as Drain Blockage due to mention of 'drain' in the description.,
KM-202418,Waste,Standard,Classified as Waste due to mention of 'waste' in the description.,
KM-202421,Road Damage,Urgent,"Classified as Road Damage due to mention of 'sinking', 'footpath'; priority Urgent due to severity keyword 'hospital', 'fell'.",
KM-202422,Road Damage,Standard,Classified as Road Damage due to mention of 'road surface' in the description.,
KM-202426,Heritage Damage,Standard,Classified as Heritage Damage due to mention of 'heritage' in the description.,
KM-202430,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
KM-202434,Heritage Damage,Standard,Classified as Heritage Damage due to mention of 'heritage' in the description.,
KM-202436,Other,Standard,Description does not clearly map to a single category.,NEEDS_REVIEW
KM-202438,Heritage Damage,Standard,Classified as Heritage Damage due to mention of 'heritage' in the description.,
16 changes: 16 additions & 0 deletions uc-0a/results_pune.csv
@@ -0,0 +1,16 @@
complaint_id,category,priority,reason,flag
PM-202401,Pothole,Standard,"Classified as Pothole due to mention of 'pothole', 'tyre damage' in the description.",
PM-202402,Pothole,Urgent,"Classified as Pothole due to mention of 'pothole'; priority Urgent due to severity keyword 'child', 'school'.",
PM-202406,Flooding,Standard,"Classified as Flooding due to mention of 'flood', 'flooded', 'knee-deep', 'stranded' in the description.",
PM-202408,Flooding,Standard,"Classified as Flooding due to mention of 'flood', 'flooded' in the description.",
PM-202410,Streetlight,Standard,"Classified as Streetlight due to mention of 'streetlight', 'lights out', 'dark at night' in the description.",
PM-202411,Streetlight,Urgent,"Classified as Streetlight due to mention of 'streetlight', 'flickering', 'sparking'; priority Urgent due to severity keyword 'hazard'.",
PM-202413,Waste,Standard,Classified as Waste due to mention of 'garbage' in the description.,
PM-202418,Noise,Standard,Classified as Noise due to mention of 'music past midnight' in the description.,
PM-202419,Road Damage,Standard,"Classified as Road Damage due to mention of 'road surface', 'crack', 'sinking' in the description.",
PM-202420,Road Damage,Urgent,Classified as Road Damage due to mention of 'manhole'; priority Urgent due to severity keyword 'injury'.,
PM-202427,Flooding,Standard,Classified as Flooding due to mention of 'flood' in the description.,
PM-202428,Waste,Standard,Classified as Waste due to mention of 'dead animal' in the description.,
PM-202430,Streetlight,Standard,Classified as Streetlight due to mention of 'lights out' in the description.,
PM-202433,Waste,Standard,"Classified as Waste due to mention of 'waste', 'dumped', 'bulk waste' in the description.",
PM-202446,Road Damage,Urgent,"Classified as Road Damage due to mention of 'footpath', 'tiles broken', 'upturned'; priority Urgent due to severity keyword 'fell'.",
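The results files above share one schema (complaint_id, category, priority, reason, flag), so they can be tallied with a few lines of standard-library code. A minimal sketch, assuming a hypothetical helper `summarize_results` that is not part of this PR; it takes the CSV text directly so it can be tried without any files on disk.

```python
import csv
import io
from collections import Counter


def summarize_results(csv_text):
    """Count rows per category and collect the IDs flagged NEEDS_REVIEW."""
    reader = csv.DictReader(io.StringIO(csv_text))
    categories = Counter()
    needs_review = []
    for row in reader:
        categories[row["category"]] += 1
        if row["flag"] == "NEEDS_REVIEW":
            needs_review.append(row["complaint_id"])
    return categories, needs_review
```

Running it on each city's file gives a quick view of how many complaints fell through to "Other" and which IDs a reviewer should look at first.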