38 changes: 27 additions & 11 deletions uc-0a/agents.md
Original file line number Diff line number Diff line change
@@ -1,18 +1,34 @@
# agents.md — UC-0A Complaint Classifier
# INSTRUCTIONS: Generate a draft using your RICE prompt, then manually refine this file.
# Delete these comments before committing.

role: >
  [FILL IN: Who is this agent? What is its operational boundary?]
  You are a municipal complaint classifier. Your sole responsibility is to read
  citizen complaint descriptions and produce structured classification output.
  You operate strictly within a fixed taxonomy and severity ruleset. You do not
  infer intent beyond what is stated in the complaint text, and you do not
  invent or extend the allowed category list.

intent: >
  [FILL IN: What does a correct output look like — make it verifiable]
  For each complaint row, produce exactly four fields: category (one value from
  the allowed taxonomy), priority (Urgent, Standard, or Low), reason (one
  sentence that quotes or directly references specific words from the complaint
  description), and flag (NEEDS_REVIEW if the category is genuinely ambiguous,
  blank otherwise). A correct output is verifiable: category matches the allowed
  list exactly, priority reflects the presence or absence of severity keywords,
  reason traces back to the source text, and flag is set whenever reasonable
  doubt exists about the classification.

context: >
  [FILL IN: What information is the agent allowed to use? State exclusions explicitly.]
  The agent may use only the complaint description text provided in the input
  row and the fixed classification schema defined in this configuration. It must
  not use external knowledge to infer categories, must not hallucinate
  sub-categories not present in the allowed list, and must not fabricate
  severity cues that are absent from the description. No information outside the
  input CSV row and this schema is permitted.

enforcement:
- "[FILL IN: Specific testable rule 1 — e.g. Category must be exactly one of: Pothole, Flooding, ...]"
- "[FILL IN: Specific testable rule 2 — e.g. Priority must be Urgent if description contains: injury, child, school, ...]"
- "[FILL IN: Specific testable rule 3 — e.g. Every output row must include a reason field citing specific words from the description]"
- "[FILL IN: Refusal condition — e.g. If category cannot be determined from description alone, output category: Other and flag: NEEDS_REVIEW]"
- "category must be one of exactly: Pothole, Flooding, Streetlight, Waste, Noise, Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other — no spelling variations, abbreviations, or invented sub-categories are allowed"
- "priority must be one of exactly: Urgent, Standard, Low"
- "priority must be set to Urgent if any of the following keywords appear anywhere in the complaint description: injury, child, school, hospital, ambulance, fire, hazard, fell, collapse"
- "reason must be a single sentence and must cite specific words or phrases taken directly from the complaint description — generic justifications not grounded in the source text are invalid"
- "flag must be set to NEEDS_REVIEW when the complaint description is genuinely ambiguous and could reasonably map to more than one category; flag must be blank when the classification is clear"
- "flag must never be left blank solely to appear confident — ambiguity must be surfaced"
- "no extra fields, sub-categories, or schema extensions may be added to the output"
- "category values must be reproduced exactly as specified, including capitalisation and spacing — no case or punctuation variations are permitted"
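Review note: the enforcement rules above are specific enough to check mechanically. The following is a minimal sketch of such a check, not part of this PR. The helper name `validate_output` and the `EXPECTED_FIELDS` set are hypothetical, and "appear anywhere" is read here as a case-insensitive substring test, which is one possible interpretation of the keyword rule.

```python
ALLOWED_CATEGORIES = {
    "Pothole", "Flooding", "Streetlight", "Waste", "Noise",
    "Road Damage", "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other",
}
ALLOWED_PRIORITIES = {"Urgent", "Standard", "Low"}
SEVERITY_KEYWORDS = (
    "injury", "child", "school", "hospital", "ambulance",
    "fire", "hazard", "fell", "collapse",
)
EXPECTED_FIELDS = {"category", "priority", "reason", "flag"}  # hypothetical name


def validate_output(description: str, out: dict) -> list:
    """Hypothetical helper: return rule violations; an empty list means the row passes."""
    problems = []
    if set(out) != EXPECTED_FIELDS:
        problems.append("output must contain exactly: category, priority, reason, flag")
    if out.get("category") not in ALLOWED_CATEGORIES:
        problems.append("category is not an exact match for the allowed list")
    if out.get("priority") not in ALLOWED_PRIORITIES:
        problems.append("priority must be Urgent, Standard, or Low")
    # Assumption: "appear anywhere" means a case-insensitive substring match.
    lowered = description.lower()
    if any(kw in lowered for kw in SEVERITY_KEYWORDS) and out.get("priority") != "Urgent":
        problems.append("severity keyword present but priority is not Urgent")
    if out.get("flag") not in ("", "NEEDS_REVIEW"):
        problems.append("flag must be blank or NEEDS_REVIEW")
    if not (out.get("reason") or "").strip():
        problems.append("reason must not be empty")
    return problems
```

The check does not attempt to judge whether a flag is warranted or whether the reason is a single sentence; those rules still need human or model review.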
212 changes: 201 additions & 11 deletions uc-0a/classifier.py
@@ -1,29 +1,219 @@
"""
UC-0A — Complaint Classifier
Starter file. Build this using the RICE → agents.md → skills.md → CRAFT workflow.
Pure-Python regex classifier enforcing the RICE rules from agents.md and skills.md.
No external dependencies beyond the standard library.
"""
import argparse
import csv
import re
import sys

ALLOWED_CATEGORIES = [
    "Pothole", "Flooding", "Streetlight", "Waste", "Noise",
    "Road Damage", "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other",
]

SEVERITY_KEYWORDS = [
    "injury", "child", "school", "hospital", "ambulance",
    "fire", "hazard", "fell", "collapse",
]

# Per-category regex patterns. Each entry: (pattern_string, display_label).
# Patterns are matched case-insensitively against the full description.
CATEGORY_PATTERNS = {
    "Pothole": [
        (r"pot[\s-]?hole", "pothole"),
    ],
    "Flooding": [
        (r"\bflood", "flood"),
        (r"water[\s-]?log", "waterlogged"),
        (r"\binundat", "inundated"),
        (r"\bsubmerg", "submerged"),
        (r"knee[\s-]?deep", "knee-deep"),
        (r"standing\s+water", "standing water"),
    ],
    "Streetlight": [
        (r"street[\s-]?light", "streetlight"),
        (r"\blamp\s*post\b", "lamp post"),
        (r"\blights?\s+(out|off|flicker|spark)", "lights out/flickering"),
    ],
    "Waste": [
        (r"\bgarbage\b", "garbage"),
        (r"\bwaste\b", "waste"),
        (r"\brubbish\b", "rubbish"),
        (r"\btrash\b", "trash"),
        (r"\blitter\b", "litter"),
        (r"\bbins?\b", "bin"),
        (r"dead\s+animal", "dead animal"),
        (r"\brefuse\b", "refuse"),
        (r"\bdumped\b", "dumped"),
        (r"\boverflowing\b", "overflowing"),
    ],
    "Noise": [
        (r"\bnoise\b", "noise"),
        (r"\bnoisy\b", "noisy"),
        (r"\bloud\b", "loud"),
        (r"\bmusic\b", "music"),
        (r"\bdisturbance\b", "disturbance"),
    ],
    "Road Damage": [
        (r"road\s+surface", "road surface"),
        (r"\bcracked\b", "cracked"),
        (r"\bsinking\b", "sinking"),
        (r"\bfootpath\b", "footpath"),
        (r"\btiles?\s+(broken|upturned|cracked)", "broken/upturned tiles"),
        (r"broken\s+road", "broken road"),
    ],
    "Heritage Damage": [
        (r"\bheritage\b", "heritage"),
        (r"\bhistoric", "historic"),
        (r"\bmonument\b", "monument"),
        (r"\bancient\b", "ancient"),
    ],
    "Heat Hazard": [
        (r"\bheatwave\b", "heatwave"),
        (r"extreme\s+(heat|temperature)", "extreme heat/temperature"),
        (r"\bscorch", "scorching"),
    ],
    "Drain Blockage": [
        (r"\bdrains?\b", "drain"),
        (r"\bmanhole\b", "manhole"),
        (r"\bsewer\b", "sewer"),
        (r"drain\s+block", "drain block"),
    ],
}


def _match_categories(description: str) -> dict:
    """
    Return {category: [matched_text, ...]} for every category with at least one hit.
    Matched text is the actual substring found in the description.
    """
    hits = {}
    for cat, patterns in CATEGORY_PATTERNS.items():
        matched = []
        for pattern, _ in patterns:
            m = re.search(pattern, description, re.I)
            if m:
                matched.append(m.group(0))
        if matched:
            hits[cat] = matched
    return hits


def classify_complaint(row: dict) -> dict:
    """
    Classify a single complaint row.
    Returns: dict with keys: complaint_id, category, priority, reason, flag

    TODO: Build this using your AI tool guided by your agents.md and skills.md.
    Your RICE enforcement rules must be reflected in this function's behaviour.
    Classify a single complaint row using regex keyword matching.
    Returns dict with keys: category, priority, reason, flag.
    """
    raise NotImplementedError("Build this using your AI tool + RICE prompt")
    # csv.DictReader fills missing fields with None, so guard before strip().
    description = (row.get("description") or "").strip()

    if not description:
        return {
            "category": "Other",
            "priority": "Low",
            "reason": "No parseable description provided",
            "flag": "NEEDS_REVIEW",
        }

    # Priority: Urgent if any severity keyword appears as a whole word
    # (case-insensitive), Standard otherwise. Inflected forms such as
    # "children" or "collapsed" do not match.
    severity_hits = [
        kw for kw in SEVERITY_KEYWORDS
        if re.search(r"\b" + re.escape(kw) + r"\b", description, re.I)
    ]
    priority = "Urgent" if severity_hits else "Standard"

    cat_hits = _match_categories(description)

    if not cat_hits:
        return {
            "category": "Other",
            "priority": priority,
            "reason": "No category keywords matched in the complaint description.",
            "flag": "",
        }

    if len(cat_hits) == 1:
        category = next(iter(cat_hits))
        words = cat_hits[category]
        reason = f'Description contains "{", ".join(words)}", indicating {category}.'
        flag = ""
    else:
        # Multiple categories match: pick the one with the most keyword hits
        # (ties go to the category listed first in CATEGORY_PATTERNS) and
        # flag as ambiguous since the description maps to more than one category.
        category = max(cat_hits, key=lambda c: len(cat_hits[c]))
        all_words = [w for hits in cat_hits.values() for w in hits]
        matched_cats = " or ".join(cat_hits.keys())
        reason = (
            f'Description contains "{", ".join(all_words)}" which could indicate '
            f'{matched_cats}.'
        )
        flag = "NEEDS_REVIEW"

    return {
        "category": category,
        "priority": priority,
        "reason": reason,
        "flag": flag,
    }


def batch_classify(input_path: str, output_path: str):
    """
    Read input CSV, classify each row, write results CSV.

    TODO: Build this using your AI tool.
    Must: flag nulls, not crash on bad rows, produce output even if some rows fail.
    Aborts on missing file or missing description column.
    Applies error-handling defaults for individual malformed rows.
    """
    raise NotImplementedError("Build this using your AI tool + RICE prompt")
    try:
        with open(input_path, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            fieldnames = reader.fieldnames

            if not fieldnames or "description" not in fieldnames:
                print(
                    "Error: input CSV is missing the required 'description' column.",
                    file=sys.stderr,
                )
                sys.exit(1)

            rows = list(reader)

    except FileNotFoundError:
        print(f"Error: input file '{input_path}' not found.", file=sys.stderr)
        sys.exit(1)
    except OSError as e:
        print(f"Error: unable to read input file '{input_path}': {e}", file=sys.stderr)
        sys.exit(1)

    output_fields = list(fieldnames) + ["category", "priority", "reason", "flag"]

    with open(output_path, "w", newline="", encoding="utf-8") as out_f:
        # extrasaction="ignore": rows with surplus columns (which DictReader
        # stores under the key None) would otherwise raise ValueError in writerow().
        writer = csv.DictWriter(out_f, fieldnames=output_fields, extrasaction="ignore")
        writer.writeheader()

        total = len(rows)
        for i, row in enumerate(rows, start=1):
            complaint_id = row.get("complaint_id", f"row-{i}")
            try:
                classification = classify_complaint(row)
            except Exception as e:
                print(
                    f"Warning: unexpected error on row {i} ({complaint_id}): {e}",
                    file=sys.stderr,
                )
                classification = {
                    "category": "Other",
                    "priority": "Low",
                    "reason": "No parseable description provided",
                    "flag": "NEEDS_REVIEW",
                }

            writer.writerow({**row, **classification})
            print(
                f"  [{i}/{total}] {complaint_id} → {classification['category']} / "
                f"{classification['priority']}"
                + (f" [{classification['flag']}]" if classification["flag"] else "")
            )


if __name__ == "__main__":
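Review note: `classify_complaint` wraps each severity keyword in `\b...\b`, so only whole words count, while agents.md says the keywords may "appear anywhere" in the description. The sketch below contrasts the two readings. Both function names are hypothetical, and the prefix variant is shown only as one possible alternative, not as the intended behaviour.

```python
import re

SEVERITY_KEYWORDS = [
    "injury", "child", "school", "hospital", "ambulance",
    "fire", "hazard", "fell", "collapse",
]


def whole_word_hits(description: str) -> list:
    """Keywords that occur as complete words (the style used in classifier.py)."""
    return [
        kw for kw in SEVERITY_KEYWORDS
        if re.search(r"\b" + re.escape(kw) + r"\b", description, re.I)
    ]


def prefix_hits(description: str) -> list:
    """Hypothetical alternative: keywords that start any word in the description."""
    return [
        kw for kw in SEVERITY_KEYWORDS
        if re.search(r"\b" + re.escape(kw), description, re.I)
    ]


text = "Wall collapsed near playground; two children had injuries."
print(whole_word_hits(text))  # []
print(prefix_hits(text))      # ['child', 'collapse']
```

Neither reading is free of trade-offs: whole-word matching misses "children" and "collapsed", prefix matching still misses "injuries" and also fires on unrelated words such as "fellow" (via `fell`). Listing inflected forms explicitly in `SEVERITY_KEYWORDS` may be the more predictable option.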
16 changes: 16 additions & 0 deletions uc-0a/results_pune.csv
@@ -0,0 +1,16 @@
complaint_id,date_raised,city,ward,location,description,reported_by,days_open,category,priority,reason,flag
PM-202401,2024-06-21,Pune,Ward 4 – Warje,Karve Road near Deccan Gymkhana,Large pothole 60cm wide causing tyre damage. Three vehicles affected this week.,Citizen Portal,4,Pothole,Standard,"Description contains ""pothole"", indicating Pothole.",
PM-202402,2024-06-04,Pune,Ward 3 – Kothrud,FC Road near MIT College junction,Deep pothole near bus stop. School children at risk during morning hours.,Councillor Referral,12,Pothole,Urgent,"Description contains ""pothole"", indicating Pothole.",
PM-202406,2024-06-08,Pune,Ward 1 – Kasba,Ambedkar Road underpass,Underpass flooded knee-deep after 2hrs rain. Commuters stranded.,Councillor Referral,8,Flooding,Standard,"Description contains ""flood, knee-deep"", indicating Flooding.",
PM-202408,2024-06-05,Pune,Ward 3 – Kothrud,Kothrud Depot bus stand,Bus stand flooded. Passengers standing in water. Drain blocked.,Ward Office Walk-in,12,Drain Blockage,Standard,"Description contains ""flood, Drain, Drain block"" which could indicate Flooding or Drain Blockage.",NEEDS_REVIEW
PM-202410,2024-06-04,Pune,Ward 2 – Shivajinagar,JM Road near Goodluck Chowk,Three consecutive streetlights out for 10 days. Area very dark at night.,Email,18,Streetlight,Standard,"Description contains ""streetlight"", indicating Streetlight.",
PM-202411,2024-06-22,Pune,Ward 2 – Shivajinagar,"Viman Nagar, Vadgaon Sheri Road",Streetlight flickering and sparking. Electrical hazard reported.,Social Media,9,Streetlight,Urgent,"Description contains ""Streetlight"", indicating Streetlight.",
PM-202413,2024-06-29,Pune,Ward 1 – Kasba,Shivajinagar market area,Overflowing garbage bins near vegetable market. Smell affecting shoppers.,WhatsApp Helpline,13,Waste,Standard,"Description contains ""garbage, bins, Overflowing"", indicating Waste.",
PM-202418,2024-06-02,Pune,Ward 4 – Warje,Kalyani Nagar residential zone,Wedding venue playing music past midnight on weeknights.,Citizen Portal,5,Noise,Standard,"Description contains ""music"", indicating Noise.",
PM-202419,2024-06-01,Pune,Ward 2 – Shivajinagar,Senapati Bapat Road,Road surface cracked and sinking near utility work done 1 month ago.,Social Media,2,Road Damage,Standard,"Description contains ""Road surface, cracked, sinking"", indicating Road Damage.",
PM-202420,2024-06-03,Pune,Ward 1 – Kasba,Paud Road near Kothrud,Manhole cover missing. Risk of serious injury to cyclists.,Email,14,Drain Blockage,Urgent,"Description contains ""Manhole"", indicating Drain Blockage.",
PM-202427,2024-06-07,Pune,Ward 5 – Hadapsar,Erandwane depression near bridge,Bridge approach floods in 30mins of rain. Bridge becomes inaccessible.,Councillor Referral,12,Flooding,Standard,"Description contains ""flood"", indicating Flooding.",
PM-202428,2024-06-23,Pune,Ward 3 – Kothrud,Pune University Road,Dead animal not removed for 36 hours. Health concern.,Email,1,Waste,Standard,"Description contains ""Dead animal"", indicating Waste.",
PM-202430,2024-06-23,Pune,Ward 1 – Kasba,"Rasta Peth, old city area","Heritage street, lights out. Safety concern for pedestrians after dark.",Social Media,19,Streetlight,Standard,"Description contains ""lights out, Heritage"" which could indicate Streetlight or Heritage Damage.",NEEDS_REVIEW
PM-202433,2024-06-08,Pune,Ward 2 – Shivajinagar,Dhole Patil Road,Bulk waste from apartment renovation dumped on public road.,Phone Helpline,20,Waste,Standard,"Description contains ""waste, dumped"", indicating Waste.",
PM-202446,2024-06-09,Pune,Ward 2 – Shivajinagar,"Swargate, Tilak Road",Footpath tiles broken and upturned. Elderly resident fell last week.,Email,18,Road Damage,Urgent,"Description contains ""Footpath, tiles broken"", indicating Road Damage.",
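Review note: the enforcement rule that `reason` must cite words taken from the description can be spot-checked against the rows above. This is a rough sketch under the assumption that the cited words always sit inside the first pair of double quotes and are joined with ", ", as in the reasons this classifier emits. `cited_phrases` and `reason_is_grounded` are hypothetical names.

```python
import re


def cited_phrases(reason: str) -> list:
    """Extract the phrases inside the first pair of double quotes in a reason."""
    m = re.search(r'"([^"]*)"', reason)
    return m.group(1).split(", ") if m else []


def reason_is_grounded(description: str, reason: str) -> bool:
    """True if the reason quotes at least one phrase and every quoted phrase
    occurs (case-insensitively) in the description."""
    phrases = cited_phrases(reason)
    lowered = description.lower()
    return bool(phrases) and all(p.lower() in lowered for p in phrases)


# Row PM-202408 from results_pune.csv (description and reason only).
description = "Bus stand flooded. Passengers standing in water. Drain blocked."
reason = 'Description contains "flood, Drain, Drain block" which could indicate Flooding or Drain Blockage.'
print(reason_is_grounded(description, reason))  # True
```

Reasons without a quoted phrase, such as the fallback text for unmatched descriptions, are reported as not grounded, which is a deliberate simplification of this sketch.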
60 changes: 46 additions & 14 deletions uc-0a/skills.md
@@ -1,16 +1,48 @@
# skills.md
# INSTRUCTIONS: Generate a draft by prompting AI, then manually refine this file.
# Delete these comments before committing.

skills:
  - name: [skill_name]
    description: [One sentence — what does this skill do?]
    input: [What does it receive? Type and format.]
    output: [What does it return? Type and format.]
    error_handling: [What does it do when input is invalid or ambiguous?]
  - name: classify_complaint
    description: Classifies a single citizen complaint row into a category, priority, reason, and optional review flag using a fixed taxonomy and severity keyword rules.
    input:
      type: object
      format: >
        A single complaint record with at least a free-text description field;
        may include an optional complaint_id for traceability.
    output:
      type: object
      format: >
        category (string, one of: Pothole, Flooding, Streetlight, Waste, Noise,
        Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other),
        priority (string, one of: Urgent, Standard, Low),
        reason (string, one sentence citing specific words from the description),
        flag (string, NEEDS_REVIEW if category is genuinely ambiguous, else blank).
    error_handling: >
      If the description is empty or unparseable, output category=Other,
      priority=Low, reason="No parseable description provided", flag=NEEDS_REVIEW.
      If severity keywords (injury, child, school, hospital, ambulance, fire,
      hazard, fell, collapse) are present, priority must be Urgent regardless of
      other signals; never downgrade to Standard or Low.
      If the complaint could legitimately belong to more than one allowed category,
      set flag=NEEDS_REVIEW and choose the most specific matching category.
      Never invent category names outside the allowed list.

  - name: [second_skill_name]
    description: [One sentence]
    input: [Type and format]
    output: [Type and format]
    error_handling: [What does it do when input is invalid or ambiguous?]
  - name: batch_classify
    description: Reads an input CSV of complaint rows, applies classify_complaint to each row, and writes the results to an output CSV with the four classification columns appended.
    input:
      type: file
      format: >
        CSV file path; rows must contain at least a description column;
        category and priority_flag columns are absent (stripped from source data).
    output:
      type: file
      format: >
        CSV file at the specified output path containing all original columns plus
        category, priority, reason, and flag columns produced by classify_complaint
        for every row.
    error_handling: >
      If the input file is missing or unreadable, abort with a clear error message
      and do not create a partial output file.
      If an individual row is malformed or has an empty description, apply the
      classify_complaint error handling for that row (category=Other, priority=Low,
      reason="No parseable description provided", flag=NEEDS_REVIEW) and continue
      processing remaining rows.
      If the input CSV lacks the expected description column, abort with an error
      identifying the missing column.
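Review note: the "malformed row" case in the error handling above is worth pinning down, because `csv.DictReader` represents a row with too few fields by setting the missing values to `None` rather than to an empty string. The sketch below shows that behaviour on made-up input. The sample IDs and the helper `safe_description` are hypothetical and only illustrate one way to normalise such rows before classification.

```python
import csv
import io

# Made-up sample: the second data row is cut short, the third has an empty description.
SAMPLE = "complaint_id,description\nPM-1,Large pothole on road\nPM-2\nPM-3,\n"


def safe_description(row: dict) -> str:
    """Hypothetical helper: treat a missing (None) or blank description as ''."""
    return (row.get("description") or "").strip()


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(rows[1]["description"])               # None
print([safe_description(r) for r in rows])  # ['Large pothole on road', '', '']
```

With this normalisation, both the short row and the blank row fall into the empty-description branch and receive the documented defaults directly, instead of relying on an exception handler to catch a failed `.strip()` on `None`.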