31 changes: 23 additions & 8 deletions uc-0a/agents.md
@@ -11,17 +11,32 @@
 # 4. Paste the output below
 
 role: >
-  [FILL IN]
+  You are an AI classifier for the City Operations team.
+  You receive hundreds of citizen complaints per week and are responsible for
+  accurately categorizing them.
 
 intent: >
-  [FILL IN]
+  Read each complaint and output a category, priority, reason, and flag.
+  The output feeds the Director's dashboard every Monday.
+  You must ensure reliable, standardized outputs without hallucinated values.
 
 context: >
-  [FILL IN]
+  You must avoid these five failure modes:
+  1. Taxonomy drift (same complaint type getting different categories).
+  2. Severity blindness (e.g., missing urgent complaints involving children/injuries like "Child fell near school").
+  3. Missing justification (no reason field in output).
+  4. Hallucinated sub-categories (outputting items not in schema, e.g. "Pedestrian Safety Incident").
+  5. False confidence on ambiguity (classifying ambiguous inputs confidently without NEEDS_REVIEW).
+
+  Classification Schema Fields and Allowed Values:
+  - category: Pothole, Flooding, Streetlight, Waste, Noise, Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other
+  - priority: Urgent, Standard, Low
+  - reason: One sentence citing specific words from the description
+  - flag: NEEDS_REVIEW or blank
 
 enforcement:
-  - "[FILL IN: category enum rule]"
-  - "[FILL IN: severity keyword rule - list the keywords]"
-  - "[FILL IN: reason field rule]"
-  - "[FILL IN: ambiguity refusal rule]"
-  - "[FILL IN: no invented categories rule]"
+  - "Category must be exactly one value from the allowed list: Pothole, Flooding, Streetlight, Waste, Noise, Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other. No variations."
+  - "Priority must be Urgent if description contains any severity keyword: injury, child, school, hospital, ambulance, fire, hazard, fell, collapse."
+  - "Every output row must include a reason field of one sentence citing specific words from the description."
+  - "If category cannot be determined confidently, output category: Other and flag: NEEDS_REVIEW."
+  - "Never invent category names outside the allowed list."
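Reviewer note: the enforcement rules above are mechanical enough to check in code. The following is a minimal sketch, not part of this PR, assuming a hypothetical helper `validate_row` that returns the schema violations found in one output dict. The allowed values are copied from the schema listed under `context`.

```python
# Allowed values copied from the classification schema in agents.md.
ALLOWED_CATEGORIES = {
    "Pothole", "Flooding", "Streetlight", "Waste", "Noise", "Road Damage",
    "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other",
}
ALLOWED_PRIORITIES = {"Urgent", "Standard", "Low"}
ALLOWED_FLAGS = {"", "NEEDS_REVIEW"}


def validate_row(row: dict) -> list[str]:
    """Hypothetical helper: list the schema violations in one output row."""
    problems = []
    if row.get("category") not in ALLOWED_CATEGORIES:
        problems.append(f"category not in allowed list: {row.get('category')!r}")
    if row.get("priority") not in ALLOWED_PRIORITIES:
        problems.append(f"priority not in allowed list: {row.get('priority')!r}")
    if not (row.get("reason") or "").strip():
        problems.append("reason is missing or empty")
    if (row.get("flag") or "") not in ALLOWED_FLAGS:
        problems.append(f"flag must be NEEDS_REVIEW or blank: {row.get('flag')!r}")
    return problems


# The invented category from failure mode 4 is reported as a violation.
print(validate_row({
    "category": "Pedestrian Safety Incident",
    "priority": "Urgent",
    "reason": "Child fell near school.",
    "flag": "",
}))
```

A check like this only covers the enum and presence rules; whether the reason actually cites words from the description still needs the original complaint text.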
84 changes: 82 additions & 2 deletions uc-0a/classifier.py
@@ -10,17 +10,97 @@
 """
 import argparse
 import csv
+import re
+
+SEVERITY_KEYWORDS = {"injury", "child", "school", "hospital", "ambulance", "fire", "hazard", "fell", "collapse"}
+ALLOWED_CATEGORIES = {
+    "Pothole": ["pothole", "crater"],
+    "Flooding": ["flood", "water", "inundated"],
+    "Streetlight": ["light", "lamp", "dark"],
+    "Waste": ["waste", "trash", "garbage", "rubbish"],
+    "Noise": ["noise", "loud", "sound"],
+    "Road Damage": ["road", "crack", "surface"],
+    "Heritage Damage": ["heritage", "monument"],
+    "Heat Hazard": ["heat", "sun"],
+    "Drain Blockage": ["drain", "clog", "blockage", "sewer"],
+}
 
 def classify_complaint(row: dict) -> dict:
     """
     Classify a single complaint row.
     Returns dict with: complaint_id, category, priority, reason, flag
     """
-    raise NotImplementedError("Build this using your AI tool + agents.md")
+    description = row.get("description", "").lower()
+
+    # Error handling for short/vague
+    if len(description.split()) < 5:
+        return {
+            "complaint_id": row.get("complaint_id"),
+            "category": "Other",
+            "priority": "Standard",
+            "reason": "Description is too short or vague.",
+            "flag": "NEEDS_REVIEW"
+        }
+
+    # Check Severity
+    is_urgent = False
+    priority = "Standard"
+    for kw in SEVERITY_KEYWORDS:
+        if kw in description:
+            is_urgent = True
+            priority = "Urgent"
+            break
+
+    # Check Category
+    category = "Other"
+    for cat, aliases in ALLOWED_CATEGORIES.items():
+        if any(alias in description for alias in aliases):
+            category = cat
+            break
+
+    flag = "NEEDS_REVIEW" if category == "Other" else ""
+
+    # Reason
+    sentences = re.split(r'(?<=[.!?])\s+', row.get("description", "").strip())
+    reason_sentence = sentences[0] if sentences else row.get("description")
+    if is_urgent:
+        for s in sentences:
+            if any(kw in s.lower() for kw in SEVERITY_KEYWORDS):
+                reason_sentence = s.strip()
+                break
+    elif category != "Other":
+        for s in sentences:
+            if any(kw in s.lower() for kw in ALLOWED_CATEGORIES[category]):
+                reason_sentence = s.strip()
+                break
+
+    return {
+        "complaint_id": row.get("complaint_id"),
+        "category": category,
+        "priority": priority,
+        "reason": reason_sentence,
+        "flag": flag
+    }
 
 def batch_classify(input_path: str, output_path: str):
     """Read input CSV, classify each row, write results CSV."""
-    raise NotImplementedError("Build this using your AI tool + agents.md")
+    with open(input_path, 'r', encoding='utf-8') as f:
+        reader = csv.DictReader(f)
+        rows = list(reader)
+
+    results = []
+    fieldnames = ["complaint_id", "category", "priority", "reason", "flag"]
+    for i, row in enumerate(rows):
+        try:
+            res = classify_complaint(row)
+            results.append(res)
+        except Exception as e:
+            print(f"Skipping malformed row {i}: {e}")
+
+    with open(output_path, 'w', encoding='utf-8', newline='') as f:
+        writer = csv.DictWriter(f, fieldnames=fieldnames)
+        writer.writeheader()
+        writer.writerows(results)
 
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="UC-0A Complaint Classifier")
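Reviewer note: `kw in description` is a substring test, so an alias such as "sun" also matches "Sunday" and "fell" also matches "fellow". One possible alternative is sketched below, assuming whole-word matching is what the rules intend; `contains_keyword` is a hypothetical helper and is not part of this PR.

```python
import re


def contains_keyword(text: str, keywords) -> bool:
    """Return True if any keyword occurs in text as a whole word (case-insensitive)."""
    if not keywords:
        return False
    # \b anchors each alternative at word boundaries; re.escape guards special characters.
    pattern = r"\b(?:" + "|".join(re.escape(k) for k in sorted(keywords)) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(contains_keyword("Child fell near school", {"fell"}))  # whole word present
print(contains_keyword("Closed every Sunday", {"sun"}))      # only a substring
```

Whole-word matching is stricter in both directions: "children" would no longer match "child", so plural and inflected forms would have to be listed explicitly if this approach were adopted.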
16 changes: 16 additions & 0 deletions uc-0a/results_pune.csv
@@ -0,0 +1,16 @@
+complaint_id,category,priority,reason,flag
+PM-202401,Pothole,Standard,Large pothole 60cm wide causing tyre damage.,
+PM-202402,Pothole,Urgent,School children at risk during morning hours.,
+PM-202406,Flooding,Standard,Underpass flooded knee-deep after 2hrs rain.,
+PM-202408,Flooding,Standard,Bus stand flooded.,
+PM-202410,Streetlight,Standard,Three consecutive streetlights out for 10 days.,
+PM-202411,Streetlight,Urgent,Electrical hazard reported.,
+PM-202413,Waste,Standard,Overflowing garbage bins near vegetable market.,
+PM-202418,Other,Standard,Wedding venue playing music past midnight on weeknights.,NEEDS_REVIEW
+PM-202419,Road Damage,Standard,Road surface cracked and sinking near utility work done 1 month ago.,
+PM-202420,Other,Urgent,Risk of serious injury to cyclists.,NEEDS_REVIEW
+PM-202427,Flooding,Standard,Bridge approach floods in 30mins of rain.,
+PM-202428,Other,Standard,Dead animal not removed for 36 hours.,NEEDS_REVIEW
+PM-202430,Streetlight,Standard,"Heritage street, lights out.",
+PM-202433,Waste,Standard,Bulk waste from apartment renovation dumped on public road.,
+PM-202446,Other,Urgent,Elderly resident fell last week.,NEEDS_REVIEW
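Reviewer note: the rows above can be spot-checked against the severity rule and the `Other`/`NEEDS_REVIEW` pairing. This is a rough sketch, assuming the `reason` sentence is the one that triggered the priority (which is what the classifier's reason-selection loop suggests). `SAMPLE` holds four rows copied from the results above, and `audit` is a hypothetical helper, not part of this PR.

```python
import csv
import io

SEVERITY_KEYWORDS = {"injury", "child", "school", "hospital", "ambulance",
                     "fire", "hazard", "fell", "collapse"}

# Four rows copied from results_pune.csv in this diff.
SAMPLE = """complaint_id,category,priority,reason,flag
PM-202401,Pothole,Standard,Large pothole 60cm wide causing tyre damage.,
PM-202402,Pothole,Urgent,School children at risk during morning hours.,
PM-202418,Other,Standard,Wedding venue playing music past midnight on weeknights.,NEEDS_REVIEW
PM-202446,Other,Urgent,Elderly resident fell last week.,NEEDS_REVIEW
"""


def audit(csv_text: str) -> list[str]:
    """Hypothetical helper: report rows that look inconsistent with the rules."""
    issues = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        reason = row["reason"].lower()
        has_keyword = any(kw in reason for kw in SEVERITY_KEYWORDS)
        # Heuristic: Urgent rows should quote a severity keyword, Standard rows should not.
        if (row["priority"] == "Urgent") != has_keyword:
            issues.append(f"{row['complaint_id']}: priority and reason keywords disagree")
        # The classifier sets NEEDS_REVIEW exactly when the category is Other.
        if (row["category"] == "Other") != (row["flag"] == "NEEDS_REVIEW"):
            issues.append(f"{row['complaint_id']}: Other and NEEDS_REVIEW are out of step")
    return issues


print(audit(SAMPLE))
```

The keyword check is only a heuristic: the reason field holds one sentence, so a Standard row passing this check does not prove the full description was free of severity keywords.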
16 changes: 8 additions & 8 deletions uc-0a/skills.md
@@ -3,13 +3,13 @@
 
 skills:
   - name: classify_complaint
-    description: "[FILL IN]"
-    input: "[FILL IN]"
-    output: "[FILL IN]"
-    error_handling: "[FILL IN]"
+    description: "Reads a single citizen complaint and outputs a category, priority, reason, and flag according to the classification schema."
+    input: "One complaint row (dict with description, location fields)"
+    output: "Dictionary with category, priority, reason, flag"
+    error_handling: "If description is vague or short, output category as Other and flag as NEEDS_REVIEW"
 
   - name: batch_classify
-    description: "[FILL IN]"
-    input: "[FILL IN]"
-    output: "[FILL IN]"
-    error_handling: "[FILL IN]"
+    description: "Reads a batch of complaints from an input CSV file, applies classification to each row, and writes the output to a results CSV file."
+    input: "Path to test CSV file"
+    output: "Path to results CSV file"
+    error_handling: "Malformed rows are logged and skipped, processing continues"
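Reviewer note: `csv.DictReader` fills missing trailing fields with `None`, so a short row reaches the classifier with `description` set to `None` rather than absent, and a `.get("description", "")` default does not apply. The sketch below shows one way the "logged and skipped" behavior could look with the standard `logging` module. `classify_all` and `stub_classify` are hypothetical names used only for illustration.

```python
import csv
import io
import logging

logger = logging.getLogger("uc0a")


def classify_all(rows, classify):
    """Apply classify to each row; log and skip rows that raise."""
    results, skipped = [], 0
    for i, row in enumerate(rows):
        try:
            results.append(classify(row))
        except Exception as exc:  # broad on purpose: one bad row must not stop the batch
            logger.warning("Skipping malformed row %d: %s", i, exc)
            skipped += 1
    return results, skipped


def stub_classify(row):
    """Stand-in for the real classifier; fails when description is None."""
    return {"complaint_id": row["complaint_id"],
            "words": len(row["description"].split())}


# The second data row has no description field, so DictReader stores None for it.
data = "complaint_id,description\nA1,Large pothole on the main road\nA2\n"
rows = list(csv.DictReader(io.StringIO(data)))
results, skipped = classify_all(rows, stub_classify)
print(len(results), skipped)
```

Returning the skipped count alongside the results lets the caller report how many rows were dropped instead of relying on console output alone.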
32 changes: 8 additions & 24 deletions uc-rag/agents.md
@@ -1,31 +1,15 @@
-# agents.md - UC-RAG RAG Server
-# INSTRUCTIONS:
-# 1. Open your AI tool
-# 2. Paste the full contents of uc-rag/README.md
-# 3. Use this prompt:
-# "Read this UC README. Using the R.I.C.E framework, generate an
-# agents.md YAML with four fields: role, intent, context, enforcement.
-# Enforcement must include every rule listed under
-# 'Enforcement Rules Your agents.md Must Include'.
-# Output only valid YAML."
-# 4. Paste the output below, replacing this placeholder
-# 5. Check every enforcement rule against the README before saving
-
 role: >
-  [FILL IN: Who is this agent? What is its operational boundary?
-  Hint: a retrieval-augmented policy assistant for city staff]
+  A retrieval-augmented policy assistant for city staff, designed to answer internal policy questions within strict operational boundaries.
 
 intent: >
-  [FILL IN: What does a correct output look like?
-  Hint: answer + cited chunks + refusal when not covered]
+  To deliver accurate answers supported by cited chunks, or to output a specific refusal template when the policy query is not covered by the retrieved documents.
 
 context: >
-  [FILL IN: What sources may the agent use?
-  Hint: retrieved chunks only - no general knowledge]
+  Answers must be generated using only the retrieved policy document chunks. No general knowledge may be used.
 
 enforcement:
-  - "[FILL IN: Chunk size rule]"
-  - "[FILL IN: Citation rule]"
-  - "[FILL IN: Similarity threshold + refusal rule]"
-  - "[FILL IN: Context grounding rule]"
-  - "[FILL IN: Cross-document rule]"
+  - "Chunk size must not exceed 400 tokens. Never split mid-sentence."
+  - "Every answer must cite the source document name and chunk index."
+  - "If no retrieved chunk scores above similarity threshold 0.6, output the refusal template: 'This question is not covered in the retrieved policy documents. Retrieved chunks: [list chunk sources]. Please contact the relevant department for guidance.' Never generate an answer from general knowledge."
+  - "Answer must use only information present in the retrieved chunks. Never add context from outside the retrieved set."
+  - "If the query spans two documents, retrieve from each separately. Never merge retrieved chunks from different documents into one answer."
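Reviewer note: the threshold and citation rules above could be enforced before any model call. This is a minimal sketch under stated assumptions, not the server's actual code: the `Chunk` dataclass, the `respond` function, the `document#index` source format, and the file name `hr_policy.txt` are all hypothetical. The cross-document rule is out of scope here, so the example uses chunks from a single document.

```python
from dataclasses import dataclass

SIMILARITY_THRESHOLD = 0.6  # value taken from the enforcement rule
REFUSAL_TEMPLATE = (
    "This question is not covered in the retrieved policy documents. "
    "Retrieved chunks: {sources}. "
    "Please contact the relevant department for guidance."
)


@dataclass
class Chunk:
    """Hypothetical container for one retrieved chunk."""
    document: str
    index: int
    text: str
    score: float


def respond(chunks: list[Chunk]) -> str:
    """Refuse when no chunk scores above the threshold; otherwise cite each chunk used."""
    usable = [c for c in chunks if c.score > SIMILARITY_THRESHOLD]
    if not usable:
        sources = ", ".join(f"{c.document}#{c.index}" for c in chunks) or "none"
        return REFUSAL_TEMPLATE.format(sources=sources)
    # A real server would pass only `usable` to the model; this sketch echoes
    # the chunk text so the citation format is visible.
    return "\n".join(f"{c.text} [{c.document}, chunk {c.index}]" for c in usable)


print(respond([Chunk("hr_policy.txt", 3, "Example policy sentence.", 0.41)]))
print(respond([Chunk("hr_policy.txt", 0, "Example policy sentence.", 0.82)]))
```

Filtering on score before generation keeps the refusal path deterministic: the template is emitted by code rather than left to the model to reproduce.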