48 changes: 29 additions & 19 deletions uc-0a/agents.md
Original file line number Diff line number Diff line change
@@ -1,27 +1,37 @@
# agents.md — UC-0A Complaint Classifier
# INSTRUCTIONS:
# 1. Open your AI tool
# 2. Paste the full contents of uc-0a/README.md
# 3. Use this prompt:
# "Read this UC README. Using the R.I.C.E framework, generate an
# agents.md YAML with four fields: role, intent, context, enforcement.
# Enforcement must include every rule listed under
# 'Enforcement Rules Your agents.md Must Include'.
# Output only valid YAML."
# 4. Paste the output below
# agents.md — UC-0A Complaint Classifier (RICE Framework)

role: >
[FILL IN]
City Operations Complaint Classifier Agent

intent: >
[FILL IN]
Classify incoming municipal complaints by category, priority, and severity
to route them to appropriate departments and flag urgent safety issues for
immediate intervention.

context: >
[FILL IN]
The City Operations team receives hundreds of complaints weekly covering:
potholes, flooding, streetlight failures, waste, noise, road/heritage damage,
heat hazards, drain blockages. Each complaint has a description and location.
Some involve urgent safety situations (injuries, children, schools, hospitals).
Staff depend on accurate, consistent classifications for dashboard reporting.

enforcement:
- "[FILL IN: category enum rule]"
- "[FILL IN: severity keyword rule — list the keywords]"
- "[FILL IN: reason field rule]"
- "[FILL IN: ambiguity refusal rule]"
- "[FILL IN: no invented categories rule]"
- rule: "Taxonomy Constraint (Fixed Enum)"
description: "Category MUST be exactly one value from: Pothole, Flooding, Streetlight, Waste, Noise, Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other. NEVER invent category names."
keywords: "enum, fixed categories, no variations"

- rule: "Severity Keyword Detection"
description: "Priority MUST be Urgent if description contains ANY of: injury, child, school, hospital, ambulance, fire, hazard, fell, collapse (case-insensitive). Otherwise Standard."
keywords: "injury, child, school, hospital, ambulance, fire, hazard, fell, collapse"

- rule: "Justification Requirement"
description: "Every output row MUST include a reason field with exactly one sentence citing specific words from the original description. Extract the key complaint element; do not repeat the description verbatim."
keywords: "reason, citation, specific words"

- rule: "Ambiguity Handling"
description: "If category cannot be determined with confidence (vague/short/contradictory description), output category: Other and flag: NEEDS_REVIEW. Better to flag for review than confidently misclassify."
keywords: "ambiguity, NEEDS_REVIEW, confidence threshold"

- rule: "No Hallucinated Sub-Categories"
description: "Never output sub-values like Pothole-Minor, Flooding-Severe, or Water-Related Damage. The category list is exhaustive."
keywords: "no sub-categories, exhaustive list"
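The enforcement rules above lend themselves to a mechanical check on each output row. The following is a minimal sketch and not part of this PR: `validate_row`, `ALLOWED_PRIORITIES`, and `ALLOWED_FLAGS` are hypothetical names, and the permitted priority and flag values are assumed from the rule descriptions (Urgent or Standard; empty or NEEDS_REVIEW).

```python
from typing import Dict, List

# Category enum copied from the "Taxonomy Constraint" rule.
ALLOWED_CATEGORIES = frozenset({
    "Pothole", "Flooding", "Streetlight", "Waste", "Noise",
    "Road Damage", "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other",
})
# Assumed from the rule text; not defined anywhere in the PR itself.
ALLOWED_PRIORITIES = frozenset({"Urgent", "Standard"})
ALLOWED_FLAGS = frozenset({"", "NEEDS_REVIEW"})


def validate_row(row: Dict[str, str]) -> List[str]:
    """Return a list of rule violations for one output row (empty means valid)."""
    problems = []
    if row.get("category") not in ALLOWED_CATEGORIES:
        problems.append(f"category not in enum: {row.get('category')!r}")
    if row.get("priority") not in ALLOWED_PRIORITIES:
        problems.append(f"unknown priority: {row.get('priority')!r}")
    if not (row.get("reason") or "").strip():
        problems.append("reason is empty")
    if (row.get("flag") or "") not in ALLOWED_FLAGS:
        problems.append(f"unknown flag: {row.get('flag')!r}")
    return problems


if __name__ == "__main__":
    good = {"category": "Pothole", "priority": "Standard",
            "reason": "Large pothole 60cm wide causing tyre damage", "flag": ""}
    bad = dict(good, category="Pothole-Minor")  # invented sub-category
    print(validate_row(good))
    print(validate_row(bad))
```

Running a check like this over each results file would catch an invented sub-category or a missing reason before the CSV reaches the dashboard.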
214 changes: 202 additions & 12 deletions uc-0a/classifier.py
@@ -1,31 +1,221 @@
"""
UC-0A — Complaint Classifier
classifier.py — Starter file
classifier.py — RICE-constrained complaint classification

Build this using your AI coding tool:
1. Share agents.md, skills.md, and uc-0a/README.md
2. Ask the AI to implement this file
3. Run: python3 classifier.py --input ../data/city-test-files/test_pune.csv \
--output results_pune.csv
Enforcement Rules:
1. Taxonomy Constraint: Category must be exactly one value from the allowed list
2. Severity Keyword Detection: Priority=Urgent if ANY severity keyword present
3. Justification Requirement: Every row must have a reason citing specific words
4. Ambiguity Handling: Unclear cases → Other + NEEDS_REVIEW
5. No Hallucinated Sub-Categories: Only use enum values, never invent
"""
import argparse
import csv
import sys
import re

# Enforcement: Fixed enum for categories (no variations allowed)
ALLOWED_CATEGORIES = {
"Pothole", "Flooding", "Streetlight", "Waste", "Noise",
"Road Damage", "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other"
}

# Enforcement: Severity keywords that trigger Urgent priority
SEVERITY_KEYWORDS = {
"injury", "child", "school", "hospital", "ambulance", "fire", "hazard", "fell", "collapse"
}

def extract_reason(description: str) -> str:
"""
Extract a one-sentence reason citing specific words from the description.
Returns the first sentence or a summary of key complaint element.
"""
if not description or len(description.strip()) < 3:
return "Vague or empty description"

# Try to extract first sentence (up to period, question mark, or exclamation)
sentences = re.split(r'[.!?]', description.strip())
first_sentence = sentences[0].strip()

if len(first_sentence) < 5:
return "Short/unclear complaint"

# Extract key words (first 15 words or full first sentence)
words = first_sentence.split()[:15]
return " ".join(words)

def has_severity_keywords(description: str) -> bool:
"""
Check if description contains ANY severity keyword.
Enforcement: any single match → Urgent (no minimum match count).
"""
if not description:
return False

desc_lower = description.lower()
for keyword in SEVERITY_KEYWORDS:
if keyword in desc_lower:
return True
return False

def classify_complaint(row: dict) -> dict:
"""
Classify a single complaint row.
Classify a single complaint row using RICE enforcement rules.

Enforcement:
- Category must be from ALLOWED_CATEGORIES enum (no invented values)
- Priority=Urgent if ANY severity keyword present
- Reason must cite specific words from description
- Ambiguous cases → Other + NEEDS_REVIEW

Returns dict with: complaint_id, category, priority, reason, flag
"""
raise NotImplementedError("Build this using your AI tool + agents.md")
complaint_id = row.get("complaint_id", "")
description = row.get("description", "").strip()
location = row.get("location", "").strip()

# Extract reason (always include, never empty)
reason = extract_reason(description)

# Check for severity keywords (Urgent detection)
has_severity = has_severity_keywords(description)

# Determine category based on description keywords
# If no strong match found, classify as Other
category = determine_category(description)

# Determine priority
if has_severity:
priority = "Urgent"
else:
priority = "Standard" # Default per agents.md: no severity keyword means Standard

# Flag ambiguous cases
flag = ""
confidence = calculate_confidence(description, category)
if confidence < 0.6: # Low confidence threshold
flag = "NEEDS_REVIEW"

return {
"complaint_id": complaint_id,
"category": category,
"priority": priority,
"reason": reason,
"flag": flag
}

def determine_category(description: str) -> str:
"""
Determine the category based on description content.
Enforcement: Only return values from ALLOWED_CATEGORIES.
If no clear match, return 'Other' (never invent categories).
"""
if not description:
return "Other"

desc_lower = description.lower()

# Define keyword patterns for each category (not exhaustive, just patterns)
category_patterns = {
"Pothole": ["pothole", "hole", "pit", "crater"],
"Flooding": ["flood", "water", "wet", "inundation"],
"Streetlight": ["light", "street light", "lamp", "dark", "blackout"],
"Waste": ["garbage", "waste", "trash", "litter", "debris"],
"Noise": ["noise", "sound", "loud", "loudspeaker", "music"],
"Road Damage": ["road damage", "road", "asphalt", "pavement", "cracked"],
"Heritage Damage": ["heritage", "monument", "historical", "ancient"],
"Heat Hazard": ["heat", "hot", "temperature"],
"Drain Blockage": ["drain", "drainage", "sewage", "blockage", "clogged"]
}

# Score each category by keyword matches
best_category = "Other"
best_score = 0

for category, keywords in category_patterns.items():
score = sum(1 for kw in keywords if kw in desc_lower)
if score > best_score:
best_score = score
best_category = category

# Enforcement: Always return a value from ALLOWED_CATEGORIES
assert best_category in ALLOWED_CATEGORIES, f"Invalid category: {best_category}"
return best_category

def calculate_confidence(description: str, category: str) -> float:
"""
Calculate confidence score for the classification.
Returns 0.0–1.0. Below 0.6 triggers NEEDS_REVIEW flag.
"""
if not description or len(description.strip()) < 5:
return 0.3 # Very short/vague

if len(description.strip()) < 10:
return 0.5 # Too short

if category == "Other":
return 0.4 # Default category (uncertain)

return 0.8 # Reasonable confidence for matched categories

def batch_classify(input_path: str, output_path: str):
"""Read input CSV, classify each row, write results CSV."""
raise NotImplementedError("Build this using your AI tool + agents.md")
"""
Read input CSV, classify each row, write results CSV.

Enforcement:
- Malformed rows (missing complaint_id or description) are logged and skipped; processing continues
- Every well-formed row receives an output row
- All required fields present in output
"""
rows_processed = 0
rows_skipped = 0
results = []

try:
with open(input_path, "r", encoding="utf-8") as infile:
reader = csv.DictReader(infile)

for row_num, row in enumerate(reader, start=2): # start=2 (header is 1)
try:
# Validate required fields
if not row.get("complaint_id") or not row.get("description"):
print(f"Warning: Row {row_num} missing complaint_id or description. Skipping.", file=sys.stderr)
rows_skipped += 1
continue

# Classify the complaint
result = classify_complaint(row)
results.append(result)
rows_processed += 1

except Exception as e:
print(f"Error processing row {row_num}: {e}", file=sys.stderr)
rows_skipped += 1
continue

# Write results to output CSV
if results:
with open(output_path, "w", newline="", encoding="utf-8") as outfile:
fieldnames = ["complaint_id", "category", "priority", "reason", "flag"]
writer = csv.DictWriter(outfile, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(results)

print(f"Processed: {rows_processed}, Skipped: {rows_skipped}")
return output_path

except FileNotFoundError:
print(f"Error: Input file not found: {input_path}", file=sys.stderr)
raise
except Exception as e:
print(f"Error during batch classification: {e}", file=sys.stderr)
raise

if __name__ == "__main__":
parser = argparse.ArgumentParser(description="UC-0A Complaint Classifier")
parser.add_argument("--input", required=True)
parser.add_argument("--output", required=True)
parser.add_argument("--input", required=True, help="Path to input CSV (test_[city].csv)")
parser.add_argument("--output", required=True, help="Path to output CSV (results_[city].csv)")
args = parser.parse_args()

batch_classify(args.input, args.output)
print(f"Done. Results written to {args.output}")
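Both `has_severity_keywords` and `determine_category` use plain substring tests (`kw in desc_lower`), so a keyword also matches inside a longer word: "hole" matches "manhole" and "pit" matches "hospital". The `results_pune.csv` row "Manhole cover missing" classified as Pothole would be consistent with that, although the full description is not shown here. A minimal sketch of whole-word matching follows; `contains_whole_word` is a hypothetical helper, not part of this PR.

```python
import re
from typing import Iterable


def contains_whole_word(text: str, keywords: Iterable[str]) -> bool:
    """Return True if any keyword occurs in text as a whole word or phrase.

    Matching is case-insensitive and uses regex word boundaries, so
    "hole" does not match inside "manhole".
    """
    kws = [k for k in keywords if k]
    if not text or not kws:
        return False
    # re.escape guards against keywords that contain regex metacharacters.
    pattern = r"\b(?:" + "|".join(re.escape(k) for k in kws) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    pothole_keywords = ["pothole", "hole", "pit", "crater"]
    print(contains_whole_word("Manhole cover missing", pothole_keywords))
    print(contains_whole_word("Deep hole near the hospital", pothole_keywords))
```

The trade-off is that inflected forms stop matching: "children" no longer triggers "child". If that behaviour matters, the inflected forms would need to be listed explicitly in the keyword set.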
16 changes: 16 additions & 0 deletions uc-0a/results_ahmedabad.csv
@@ -0,0 +1,16 @@
complaint_id,category,priority,reason,flag
AM-202401,Other,Standard,Tarmac surface melting at 44°C,NEEDS_REVIEW
AM-202402,Heat Hazard,Standard,Metal bus shelter reaching dangerous temperatures,
AM-202405,Other,Standard,Dead trees with split branches,NEEDS_REVIEW
AM-202406,Heat Hazard,Standard,Irrigation system broken,
AM-202407,Other,Urgent,Broken bench and upturned paving,NEEDS_REVIEW
AM-202410,Pothole,Standard,Pothole on main highway causing morning rush lane closure,
AM-202414,Other,Standard,Residential colony unlit after 9pm,NEEDS_REVIEW
AM-202417,Waste,Standard,Night market waste not cleared before morning,
AM-202421,Noise,Standard,Club music audible at residential buildings at 2am,
AM-202424,Road Damage,Standard,Zoo approach road surface bubbling at 45°C,
AM-202429,Heat Hazard,Standard,River walk surface temperature unbearable,
AM-202431,Heritage Damage,Standard,Old city road subsidence near ancient step well,
AM-202435,Road Damage,Standard,Black metal road dividers storing heat,
AM-202444,Waste,Standard,Restaurant waste bins overflowing on Sunday night,
AM-202445,Other,Standard,BRT shelter roof glass broken,NEEDS_REVIEW
16 changes: 16 additions & 0 deletions uc-0a/results_hyderabad.csv
@@ -0,0 +1,16 @@
complaint_id,category,priority,reason,flag
GH-202401,Flooding,Urgent,Underpass flooded after 1hr rain,
GH-202402,Flooding,Standard,Market area flooded,
GH-202406,Flooding,Standard,Main stormwater drain 100% blocked with construction debris,
GH-202407,Drain Blockage,Standard,Drain blocked and mosquito breeding,
GH-202410,Pothole,Standard,Potholes causing vehicles to slow to 20kmph on fast road,
GH-202411,Pothole,Urgent,Pothole swallowed entire motorcycle wheel,
GH-202412,Pothole,Urgent,School bus struggling to navigate 6 potholes in 200m stretch,
GH-202417,Waste,Standard,Heritage zone garbage overflow,
GH-202420,Other,Standard,Construction drilling from 5am daily near residential towers,NEEDS_REVIEW
GH-202422,Pothole,Urgent,Road collapsed partially,
GH-202424,Flooding,Standard,Underpass floods in light rain,
GH-202428,Waste,Standard,Post-market waste not cleared,
GH-202432,Other,Standard,24hr supermarket delivery trucks idling with engines on,NEEDS_REVIEW
GH-202448,Flooding,Standard,Main drain blocked — entire locality at flooding risk this week,
GH-202438,Flooding,Standard,Colony surrounded by fields that channel rainwater through main road,
16 changes: 16 additions & 0 deletions uc-0a/results_kolkata.csv
@@ -0,0 +1,16 @@
complaint_id,category,priority,reason,flag
KM-202401,Streetlight,Standard,Heritage lamp post knocked over by delivery vehicle,
KM-202402,Road Damage,Standard,Historic tram road cobblestones broken up by cable laying work,
KM-202405,Other,Standard,Wedding band playing near Tagore Museum at 11pm,NEEDS_REVIEW
KM-202409,Pothole,Standard,Airport access road full of potholes,
KM-202410,Pothole,Standard,Pothole causing tyre blowouts,
KM-202411,Pothole,Standard,Deep pothole filling with rainwater,
KM-202415,Road Damage,Standard,New residential complex draining directly onto public road,
KM-202418,Waste,Standard,Tourist zone waste overflowing,
KM-202421,Pothole,Urgent,Footpath broken and sinking,
KM-202422,Road Damage,Standard,Road surface buckled near bridge,
KM-202426,Heritage Damage,Standard,Heritage residential building exterior defaced by billboard installation,
KM-202430,Road Damage,Standard,Road subsided near gas pipeline,
KM-202434,Heritage Damage,Standard,Street paving removed for utility work — heritage stone not replaced,
KM-202436,Streetlight,Standard,Entire colony substation tripped,
KM-202438,Heritage Damage,Standard,Street vendors using amplifiers illegally in heritage precinct,
16 changes: 16 additions & 0 deletions uc-0a/results_pune.csv
@@ -0,0 +1,16 @@
complaint_id,category,priority,reason,flag
PM-202401,Pothole,Standard,Large pothole 60cm wide causing tyre damage,
PM-202402,Pothole,Urgent,Deep pothole near bus stop,
PM-202406,Flooding,Standard,Underpass flooded knee-deep after 2hrs rain,
PM-202408,Flooding,Standard,Bus stand flooded,
PM-202410,Streetlight,Standard,Three consecutive streetlights out for 10 days,
PM-202411,Streetlight,Urgent,Streetlight flickering and sparking,
PM-202413,Waste,Standard,Overflowing garbage bins near vegetable market,
PM-202418,Noise,Standard,Wedding venue playing music past midnight on weeknights,
PM-202419,Road Damage,Standard,Road surface cracked and sinking near utility work done 1 month ago,
PM-202420,Pothole,Urgent,Manhole cover missing,
PM-202427,Flooding,Standard,Bridge approach floods in 30mins of rain,
PM-202428,Other,Standard,Dead animal not removed for 36 hours,NEEDS_REVIEW
PM-202430,Streetlight,Standard,"Heritage street, lights out",
PM-202433,Waste,Standard,Bulk waste from apartment renovation dumped on public road,
PM-202446,Other,Urgent,Footpath tiles broken and upturned,NEEDS_REVIEW