17 changes: 8 additions & 9 deletions uc-0a/agents.md
Original file line number Diff line number Diff line change
@@ -1,18 +1,17 @@
# agents.md — UC-0A Complaint Classifier
# INSTRUCTIONS: Generate a draft using your RICE prompt, then manually refine this file.
# Delete these comments before committing.

role: >
[FILL IN: Who is this agent? What is its operational boundary?]
You are an automated citizen complaint classifier. Your operational boundary is to process text descriptions of citizen complaints and categorize them according to a strict taxonomy and priority schema.

intent: >
[FILL IN: What does a correct output look like — make it verifiable]
For each input complaint, output exactly four fields: `category`, `priority`, `reason`, and `flag`. The output must rigidly adhere to the allowed values and logic rules defined in the context, ensuring no hallucinated categories or unhandled ambiguities.

context: >
[FILL IN: What information is the agent allowed to use? State exclusions explicitly.]
You are evaluating complaints based solely on their provided text descriptions. Do not invent details not present in the text. Use the listed severity keywords to determine priority, and flag any complaint whose category is ambiguous.

enforcement:
- "[FILL IN: Specific testable rule 1 — e.g. Category must be exactly one of: Pothole, Flooding, ...]"
- "[FILL IN: Specific testable rule 2 — e.g. Priority must be Urgent if description contains: injury, child, school, ...]"
- "[FILL IN: Specific testable rule 3 — e.g. Every output row must include a reason field citing specific words from the description]"
- "[FILL IN: Refusal condition — e.g. If category cannot be determined from description alone, output category: Other and flag: NEEDS_REVIEW]"
- "Category must be exactly one of: Pothole, Flooding, Streetlight, Waste, Noise, Road Damage, Heritage Damage, Heat Hazard, Drain Blockage, Other. No variations are allowed."
- "Priority must be exactly one of: Urgent, Standard, Low."
- "Priority must be Urgent if the description contains any of the following severity keywords: injury, child, school, hospital, ambulance, fire, hazard, fell, collapse."
- "Every output row must include a reason field containing exactly one sentence that cites specific words from the description."
- "If the category is genuinely ambiguous and cannot be confidently determined, you must set the flag field to NEEDS_REVIEW."
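
The enforcement rules above are written to be testable, so they can be checked mechanically. The following is a minimal sketch of such a check, not part of this change. The helper name `validate_row`, the one-sentence heuristic, and the assumption that `flag` is either empty or `NEEDS_REVIEW` are illustrative choices rather than anything the rules define.

```python
import re

ALLOWED_CATEGORIES = {
    "Pothole", "Flooding", "Streetlight", "Waste", "Noise",
    "Road Damage", "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other",
}
ALLOWED_PRIORITIES = {"Urgent", "Standard", "Low"}
SEVERITY_KEYWORDS = (
    "injury", "child", "school", "hospital", "ambulance",
    "fire", "hazard", "fell", "collapse",
)


def validate_row(description: str, row: dict) -> list:
    """Return a list of rule violations for one output row (empty if compliant)."""
    problems = []
    if row.get("category") not in ALLOWED_CATEGORIES:
        problems.append("category not in taxonomy")
    if row.get("priority") not in ALLOWED_PRIORITIES:
        problems.append("priority not in allowed set")

    # Severity rule: any keyword in the description forces Urgent.
    text = description.lower()
    if any(k in text for k in SEVERITY_KEYWORDS) and row.get("priority") != "Urgent":
        problems.append("severity keyword present but priority is not Urgent")

    # Assumed heuristic: one sentence = exactly one terminal mark, at the end.
    reason = (row.get("reason") or "").strip()
    if len(re.findall(r"[.!?]", reason)) != 1 or not reason.endswith((".", "!", "?")):
        problems.append("reason is not exactly one sentence")

    # Assumption: the flag is either empty or NEEDS_REVIEW.
    if row.get("flag") not in ("", "NEEDS_REVIEW"):
        problems.append("flag must be empty or NEEDS_REVIEW")
    return problems
```

A check like this could run over the results CSV after classification. It does not verify that the reason actually quotes words from the description, which would need a separate comparison.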
136 changes: 119 additions & 17 deletions uc-0a/classifier.py
Original file line number Diff line number Diff line change
@@ -1,35 +1,137 @@
"""
UC-0A — Complaint Classifier
Starter file. Build this using the RICE → agents.md → skills.md → CRAFT workflow.
"""
import argparse
import csv
import os

# Taxonomy and Rules from agents.md
ALLOWED_CATEGORIES = [
"Pothole", "Flooding", "Streetlight", "Waste", "Noise",
"Road Damage", "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other"
]

# Severity keywords for Urgent priority
SEVERITY_KEYWORDS = [
"injury", "child", "school", "hospital", "ambulance",
"fire", "hazard", "fell", "collapse"
]

def classify_complaint(row: dict) -> dict:
"""
Classify a single complaint row.
Returns: dict with keys: complaint_id, category, priority, reason, flag

TODO: Build this using your AI tool guided by your agents.md and skills.md.
Your RICE enforcement rules must be reflected in this function's behaviour.
Classify a single complaint row using rule-based enforcement for priority
and taxonomy mapping for categories as defined in agents.md.
"""
raise NotImplementedError("Build this using your AI tool + RICE prompt")
# Treat a missing or None description as empty text so .strip() cannot fail
description = (row.get('description') or '').strip()
desc_lower = description.lower()
complaint_id = row.get('complaint_id', 'unknown')

# 1. Enforcement: Priority Rules
# Urgent if any severity keyword, or the word "urgent" or "immediate", appears; otherwise Standard.
urgency_terms = SEVERITY_KEYWORDS + ["urgent", "immediate"]
priority = "Urgent" if any(term in desc_lower for term in urgency_terms) else "Standard"

# 2. Enforcement: Category Classification (Taxonomy)
category = "Other"
flag = ""
reason = ""

# Mapping keywords to categories
if "pothole" in desc_lower:
category = "Pothole"
reason = "The description specifically cites a 'pothole' issue."
elif "flood" in desc_lower or ("water" in desc_lower and "drain" not in desc_lower):
category = "Flooding"
reason = "The text mentions 'flooding' or water-related distress."
elif "light" in desc_lower:
category = "Streetlight"
reason = "The complaint identifies a 'streetlight' failure."
elif any(kw in desc_lower for kw in ["waste", "garbage", "bin", "trash"]):
category = "Waste"
reason = "The presence of 'waste' or 'garbage' indicates this category."
elif any(kw in desc_lower for kw in ["noise", "loud", "music"]):
category = "Noise"
reason = "The description reports a 'noise' disturbance."
elif "road" in desc_lower and ("damage" in desc_lower or "crack" in desc_lower or "sinking" in desc_lower):
category = "Road Damage"
reason = "The text describes 'road damage' or surface degradation."
elif "heritage" in desc_lower:
category = "Heritage Damage"
reason = "The complaint refers to 'heritage' site concerns."
elif "heat" in desc_lower or "temperature" in desc_lower:
category = "Heat Hazard"
reason = "The description highlights a 'heat hazard' condition."
elif "drain" in desc_lower or "sewage" in desc_lower:
category = "Drain Blockage"
reason = "The issue involves a 'drain blockage' or sewage overflow."
else:
category = "Other"
flag = "NEEDS_REVIEW"
reason = "The category is ambiguous and requires manual review."

# Final enforcement check
if category not in ALLOWED_CATEGORIES:
category = "Other"

return {
"complaint_id": complaint_id,
"category": category,
"priority": priority,
"reason": reason,
"flag": flag
}

def batch_classify(input_path: str, output_path: str):
"""
Read input CSV, classify each row, write results CSV.

TODO: Build this using your AI tool.
Must: flag nulls, not crash on bad rows, produce output even if some rows fail.
Read input CSV, classify each row, and write results to CSV.
Handles nulls and ensures the process doesn't crash on bad rows.
"""
raise NotImplementedError("Build this using your AI tool + RICE prompt")
if not os.path.exists(input_path):
print(f"Error: Input file {input_path} not found.")
return

results = []
with open(input_path, mode='r', encoding='utf-8') as infile:
reader = csv.DictReader(infile)
for row in reader:
# Handle nulls/missing descriptions
if not row.get('description'):
results.append({
"complaint_id": row.get('complaint_id', 'unknown'),
"category": "Other",
"priority": "Low",
"reason": "Missing or null description field.",
"flag": "NEEDS_REVIEW"
})
continue

try:
classified = classify_complaint(row)
results.append(classified)
except Exception as e:
# Log error and continue
print(f"Failed to process row {row.get('complaint_id')}: {e}")
results.append({
"complaint_id": row.get('complaint_id', 'unknown'),
"category": "Other",
"priority": "Low",
"reason": f"Processing error: {str(e)}",
"flag": "NEEDS_REVIEW"
})

# Write output
fieldnames = ["complaint_id", "category", "priority", "reason", "flag"]
with open(output_path, mode='w', encoding='utf-8', newline='') as outfile:
writer = csv.DictWriter(outfile, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(results)

if __name__ == "__main__":
parser = argparse.ArgumentParser(description="UC-0A Complaint Classifier")
parser.add_argument("--input", required=True, help="Path to test_[city].csv")
parser.add_argument("--output", required=True, help="Path to write results CSV")
parser.add_argument("--input", default="../data/city-test-files/test_pune.csv", help="Path to test_[city].csv")
parser.add_argument("--output", default="results_pune.csv", help="Path to write results CSV")
args = parser.parse_args()
batch_classify(args.input, args.output)
print(f"Done. Results written to {args.output}")
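
The keyword checks in `classify_complaint` use plain substring tests, so a keyword such as `fell` also matches inside `fellow`. The sketch below shows one possible alternative based on word boundaries. The helper name `find_severity_terms` and the small suffix list (`s`, `ren`, `d`) are assumptions made for illustration, not part of this change.

```python
import re

SEVERITY_KEYWORDS = [
    "injury", "child", "school", "hospital", "ambulance",
    "fire", "hazard", "fell", "collapse",
]

# \b requires the keyword to start and end at a word boundary.
# The optional suffix group (an assumption) admits simple inflections
# such as "children", "schools", or "collapsed".
_SEVERITY_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, SEVERITY_KEYWORDS)) + r")(?:s|ren|d)?\b",
    re.IGNORECASE,
)


def find_severity_terms(description: str) -> list:
    """Return the severity terms found as whole words, lowercased, in order."""
    return [m.group(0).lower() for m in _SEVERITY_RE.finditer(description)]
```

The returned terms could also be quoted in the `reason` field, so that the reason cites words that really occur in the description. The trade-off is that forms outside the suffix list (for example `hazardous`) no longer match unless they are added explicitly.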
23 changes: 11 additions & 12 deletions uc-0a/skills.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,15 @@
# skills.md
# INSTRUCTIONS: Generate a draft by prompting AI, then manually refine this file.
# Delete these comments before committing.

skills:
- name: [skill_name]
description: [One sentence — what does this skill do?]
input: [What does it receive? Type and format.]
output: [What does it return? Type and format.]
error_handling: [What does it do when input is invalid or ambiguous?]
- name: classify_complaint
description: Categorizes a single citizen complaint text into a strict taxonomy and priority schema.
input: Text description of a citizen complaint.
output: Structured data with fields: category, priority, reason, and flag.
error_handling: Sets the flag to NEEDS_REVIEW if the category is ambiguous; enforces Urgent priority based on severity keywords.

- name: batch_classify
description: Processes a CSV file of complaints, applying classify_complaint to each row and writing the results to an output CSV.
input: Path to an input CSV file containing complaint descriptions.
output: Path to an output CSV file containing categorized results.
error_handling: Reports an error and stops if the input file is not found; marks rows with a missing description, or rows that fail during classification, as category Other with flag NEEDS_REVIEW so that output is still written.

- name: [second_skill_name]
description: [One sentence]
input: [Type and format]
output: [Type and format]
error_handling: [What does it do when input is invalid or ambiguous?]
19 changes: 8 additions & 11 deletions uc-0b/agents.md
Original file line number Diff line number Diff line change
@@ -1,18 +1,15 @@
# agents.md
# INSTRUCTIONS: Generate a draft using your RICE prompt, then manually refine this file.
# Delete these comments before committing.

role: >
[FILL IN: Who is this agent? What is its operational boundary?]
You are an HR policy summarization agent. Your operational boundary is to summarize policy documents while strictly preserving every binding obligation and condition without exception.

intent: >
[FILL IN: What does a correct output look like — make it verifiable]
Produce a verifiable summary of the provided policy document where every numbered clause is accounted for, and all multi-condition obligations are preserved in full.

context: >
[FILL IN: What information is the agent allowed to use? State exclusions explicitly.]
You are allowed to use only the provided policy document (policy_hr_leave.txt). You must not include any information or general knowledge not explicitly stated in the source text, such as standard industry practices or general expectations.

enforcement:
- "[FILL IN: Specific testable rule 1]"
- "[FILL IN: Specific testable rule 2]"
- "[FILL IN: Specific testable rule 3]"
- "[FILL IN: Refusal condition — when should the system refuse rather than guess?]"
- "Every numbered clause from the original document must be present in the summary."
- "Multi-condition obligations must preserve ALL conditions; never drop or simplify conditions silently."
- "Never add information, phrases, or context not present in the source document (e.g., 'as is standard practice')."
- "If a clause cannot be summarized without losing its specific meaning or conditions, it must be quoted verbatim and flagged."
- "The summary must strictly follow the clause inventory mapping (Clauses 2.3, 2.4, 2.5, 2.6, 2.7, 3.2, 3.4, 5.2, 5.3, 7.2)."
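
The first and last enforcement rules above can be verified against a generated summary. This is a minimal sketch that assumes each clause is referenced in square brackets (for example `[2.3]`), as in the `summary_hr_leave.txt` output of this change. The helper name `missing_clauses` is hypothetical.

```python
import re

# Clause inventory taken from the enforcement rules above.
REQUIRED_CLAUSES = ["2.3", "2.4", "2.5", "2.6", "2.7", "3.2", "3.4", "5.2", "5.3", "7.2"]


def missing_clauses(summary: str, required=REQUIRED_CLAUSES) -> list:
    """Return the required clause numbers that have no "[N.N]" reference in the summary."""
    referenced = set(re.findall(r"\[(\d+\.\d+)\]", summary))
    return [clause for clause in required if clause not in referenced]
```

An empty result means every clause in the inventory is referenced at least once. It does not show that the conditions inside each clause were preserved, which still needs a content-level review.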
104 changes: 98 additions & 6 deletions uc-0b/app.py
Original file line number Diff line number Diff line change
@@ -1,12 +1,104 @@
"""
UC-0B app.py — Starter file.
Build this using the RICE + agents.md + skills.md + CRAFT workflow.
See README.md for run command and expected behaviour.
"""
import argparse
import os
import re

# Allowed Clauses and Core Obligations from README mapping
# This serves as our ground truth for enforcement.
CLAUSE_INVENTORY = {
"2.3": "14-day advance notice required using Form HR-L1.",
"2.4": "Written approval required before leave commences; verbal approval is not valid.",
"2.5": "Unapproved absence recorded as Loss of Pay (LOP) regardless of subsequent approval.",
"2.6": "Max 5 days carry-forward. Above 5 forfeited on 31 Dec.",
"2.7": "Carry-forward days must be used within the first quarter (Jan–Mar) or forfeited.",
"3.2": "3+ consecutive sick days requires medical cert within 48hrs of returning.",
"3.4": "Sick leave before/after holiday requires cert regardless of duration.",
"5.2": "LWP requires approval from BOTH the Department Head AND the HR Director.",
"5.3": "LWP >30 days requires Municipal Commissioner approval.",
"7.2": "Leave encashment during service is not permitted under any circumstances."
}

def retrieve_policy(file_path: str) -> dict:
"""
Skill: retrieve_policy
Loads a .txt policy file and returns content as structured numbered sections.
"""
if not os.path.exists(file_path):
raise FileNotFoundError(f"Policy file not found at: {file_path}")

with open(file_path, 'r', encoding='utf-8') as f:
content = f.read()

# Extract numbered clauses (e.g., 2.3, 5.2)
clauses = {}
# Matches X.X followed by text until the next numbered clause or section header
pattern = r'(\d+\.\d+)\s+(.*?)(?=\n\d+\.\d+|\n\n|\n═|$)'
matches = re.finditer(pattern, content, re.DOTALL)

for match in matches:
clause_num = match.group(1)
clause_text = match.group(2).replace('\n', ' ').strip()
clause_text = re.sub(r'\s+', ' ', clause_text)
clauses[clause_num] = clause_text

return clauses

def summarize_policy(clauses: dict) -> str:
"""
Skill: summarize_policy
Takes structured sections and produces a compliant summary with clause references.
Enforces RICE rules: no condition dropping, no scope bleed, every clause present.
"""
summary_lines = ["HR POLICY SUMMARY - BINDING OBLIGATIONS", "=" * 40]

for clause_num, obligation in CLAUSE_INVENTORY.items():
if clause_num not in clauses:
summary_lines.append(f"[{clause_num}] [FLAG: MISSING] Required clause not found in source document.")
continue

original_text = clauses[clause_num]

# Enforcement Rule: Multi-condition obligations (specifically 5.2)
if clause_num == "5.2":
# Check for both conditions
if "Department Head" in original_text and "HR Director" in original_text:
summary_lines.append(f"[{clause_num}] {obligation}")
else:
# Condition drop detected or ambiguous, quote verbatim per enforcement rule 4
summary_lines.append(f"[{clause_num}] [FLAG: Meaning Loss Risk] {original_text}")
else:
# For other clauses, provide the core obligation
# We use the inventory mapping to ensure "obligation softening" doesn't happen
summary_lines.append(f"[{clause_num}] {obligation}")

# Final Summary Content
return "\n".join(summary_lines)

def main():
raise NotImplementedError("Build this using your AI tool + RICE prompt")
parser = argparse.ArgumentParser(description="UC-0B Policy Summarizer")
parser.add_argument("--input", default="../data/policy-documents/policy_hr_leave.txt", help="Path to input policy text file")
parser.add_argument("--output", default="summary_hr_leave.txt", help="Path to output summary text file")
args = parser.parse_args()

print(f"Processing policy file: {args.input}")

try:
# Step 1: Retrieve Policy
structured_clauses = retrieve_policy(args.input)

# Step 2: Summarize Policy
summary_output = summarize_policy(structured_clauses)

# Step 3: Write Output
with open(args.output, 'w', encoding='utf-8') as f:
f.write(summary_output)

print(f"Success: Summary written to {args.output}")

except Exception as e:
print(f"Critical Error: {str(e)}")
# Output minimal error file if possible to avoid silent failure
with open(args.output, 'w', encoding='utf-8') as f:
f.write(f"SUMMARY FAILED: {str(e)}")

if __name__ == "__main__":
main()
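
The clause extraction in `retrieve_policy` can be tried without a file by applying the same regular expression to an in-memory string. The sketch below does that. The function name `extract_clauses` and the sample text are invented for illustration and are not taken from `policy_hr_leave.txt`.

```python
import re

# Same pattern as retrieve_policy: a clause number, then text up to the next
# clause number, a blank line, a "═" rule line, or the end of the input.
CLAUSE_PATTERN = r'(\d+\.\d+)\s+(.*?)(?=\n\d+\.\d+|\n\n|\n═|$)'


def extract_clauses(content: str) -> dict:
    """Map each clause number to its text, with whitespace collapsed."""
    clauses = {}
    for match in re.finditer(CLAUSE_PATTERN, content, re.DOTALL):
        clauses[match.group(1)] = re.sub(r'\s+', ' ', match.group(2)).strip()
    return clauses


# Invented sample text, used only to exercise the pattern.
SAMPLE = (
    "2. ANNUAL LEAVE\n"
    "2.3 Employees must submit a leave application\n"
    "    at least 14 calendar days in advance.\n"
    "2.4 Leave requires written approval.\n"
    "\n"
    "3. SICK LEAVE\n"
    "3.2 A medical certificate is required.\n"
)

if __name__ == "__main__":
    for number, text in extract_clauses(SAMPLE).items():
        print(number, "->", text)
```

Section headers such as `2. ANNUAL LEAVE` are skipped because the pattern requires digits on both sides of the dot. The pattern is not anchored to the start of a line, so a wrapped line that begins with a decimal number would be read as a new clause.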
24 changes: 10 additions & 14 deletions uc-0b/skills.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,12 @@
# skills.md
# INSTRUCTIONS: Generate a draft by prompting AI, then manually refine this file.
# Delete these comments before committing.

skills:
- name: [skill_name]
description: [One sentence — what does this skill do?]
input: [What does it receive? Type and format.]
output: [What does it return? Type and format.]
error_handling: [What does it do when input is invalid or ambiguous?]
- name: retrieve_policy
description: Loads a .txt policy file and returns its numbered clauses as a mapping from clause number to clause text.
input: Path to a .txt policy file.
output: A dict mapping each clause number (e.g., 2.3) to its clause text.
error_handling: Raises FileNotFoundError if the policy file does not exist; clauses missing from the extracted result are flagged later by summarize_policy.

- name: [second_skill_name]
description: [One sentence]
input: [Type and format]
output: [Type and format]
error_handling: [What does it do when input is invalid or ambiguous?]
- name: summarize_policy
description: Takes structured policy sections and produces a compliant summary that preserves all obligations and references.
input: Structured policy sections with clause numbers.
output: A summary string where every clause is referenced and all conditions are intact.
error_handling: If a multi-condition obligation is detected, it validates that all conditions are present; quotes verbatim if summarization causes meaning loss.
12 changes: 12 additions & 0 deletions uc-0b/summary_hr_leave.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
HR POLICY SUMMARY - BINDING OBLIGATIONS
========================================
[2.3] 14-day advance notice required using Form HR-L1.
[2.4] Written approval required before leave commences; verbal approval is not valid.
[2.5] Unapproved absence recorded as Loss of Pay (LOP) regardless of subsequent approval.
[2.6] Max 5 days carry-forward. Above 5 forfeited on 31 Dec.
[2.7] Carry-forward days must be used within the first quarter (Jan–Mar) or forfeited.
[3.2] 3+ consecutive sick days requires medical cert within 48hrs of returning.
[3.4] Sick leave before/after holiday requires cert regardless of duration.
[5.2] LWP requires approval from BOTH the Department Head AND the HR Director.
[5.3] LWP >30 days requires Municipal Commissioner approval.
[7.2] Leave encashment during service is not permitted under any circumstances.