nasscomAI · mekimac · Apr 15, 2026
diff --git a/data/city-test-files/results_ahmedabad.csv b/data/city-test-files/results_ahmedabad.csv
@@ -0,0 +1,16 @@
+complaint_id,category,priority,reason,flag
+AM-202401,Other,Standard,"Other classification based on 'Tarmac, surface, melting'.",NEEDS_REVIEW
+AM-202402,Other,Standard,"Other classification based on 'Metal, shelter, reaching'.",NEEDS_REVIEW
+AM-202405,Other,Standard,"Other classification based on 'Dead, trees, with'.",NEEDS_REVIEW
+AM-202406,Other,Standard,"Other classification based on 'Irrigation, system, broken.'.",NEEDS_REVIEW
+AM-202407,Other,Urgent,"Other classification based on 'Broken, bench, upturned'.",NEEDS_REVIEW
+AM-202410,Pothole,Standard,"Pothole classification based on 'Pothole, main, highway'.",
+AM-202414,Other,Standard,"Other classification based on 'Residential, colony, unlit'.",NEEDS_REVIEW
+AM-202417,Waste,Standard,"Waste classification based on 'Night, market, waste'.",NEEDS_REVIEW
+AM-202421,Other,Standard,"Other classification based on 'Club, music, audible'.",NEEDS_REVIEW
+AM-202424,Other,Standard,"Other classification based on 'approach, road, surface'.",NEEDS_REVIEW
+AM-202429,Heat Hazard,Standard,"Heat Hazard classification based on 'River, walk, surface'.",
+AM-202431,Heritage Damage,Standard,"Heritage Damage classification based on 'city, road, subsidence'.",
+AM-202435,Heat Hazard,Standard,"Heat Hazard classification based on 'Black, metal, road'.",
+AM-202444,Waste,Standard,"Waste classification based on 'Restaurant, waste, bins'.",
+AM-202445,Other,Standard,"Other classification based on 'shelter, roof, glass'.",NEEDS_REVIEW
diff --git a/uc-0a/agenda.md b/uc-0a/agenda.md
@@ -0,0 +1,49 @@
+# UC-0A Agenda — Complaint Classifier
+
+## Core Failure Modes to Address
+- Taxonomy drift
+- Severity blindness
+- Missing justification
+- Hallucinated sub-categories
+- False confidence on ambiguity
+
+---
+
+## Input & Output Files
+- **Input:** `../data/city-test-files/test_[your-city].csv` (15 rows per city)
+- **Output:** `uc-0a/results_[your-city].csv`
+- Note: the `category` and `priority_flag` columns are stripped from the input, so the classifier must assign them
+
+---
+
+## Run Command
+```bash
+python classifier.py \
+  --input ../data/city-test-files/test_pune.csv \
+  --output results_pune.csv
+```
+
+---
+
+## Workflow Steps
+
+### 1. Run Naive Prompt (Baseline)
+Run the naive prompt `"Classify this citizen complaint by category and priority."` to establish baseline performance.
+
+### 2. Check for Failures
+Look for these specific issues:
+1. Category names that vary across rows for the same type of complaint
+2. Injury/child/school complaints classified as Standard instead of Urgent
+3. No reason field in the output
+4. Category names that are not in the allowed list
+5. Confident classification on genuinely ambiguous complaints
+
+### 3. Revise and Test
+- Fix identified failure modes
+- Re-run the classifier
+- Validate results against the classification schema
+
+---
+
+## Commit Strategy
+Use the formula: `UC-0A Fix [failure mode]: [why it failed] → [what you changed]`
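The "Validate results against the classification schema" step in agenda.md could be checked mechanically. The following is a minimal sketch, assuming the five-column results format that `batch_classify` writes. `find_schema_violations` and `ALLOWED_FLAGS` are hypothetical names that do not appear in this PR, and the `X-1` row is invented so that every check fires once.

```python
import csv
import io

# Allowed values copied from skills.md in this PR.
ALLOWED_CATEGORIES = {
    "Pothole", "Flooding", "Streetlight", "Waste", "Noise", "Road Damage",
    "Heritage Damage", "Heat Hazard", "Drain Blockage", "Other",
}
ALLOWED_PRIORITIES = {"Urgent", "Standard", "Low"}
ALLOWED_FLAGS = {"", "NEEDS_REVIEW"}  # hypothetical constant, not in the PR


def find_schema_violations(rows):
    """Hypothetical helper: return (complaint_id, problem) tuples for bad rows."""
    problems = []
    for row in rows:
        cid = row.get("complaint_id", "")
        if row.get("category") not in ALLOWED_CATEGORIES:
            problems.append((cid, "category not in allowed list"))
        if row.get("priority") not in ALLOWED_PRIORITIES:
            problems.append((cid, "priority not in allowed list"))
        if not (row.get("reason") or "").strip():
            problems.append((cid, "missing reason"))
        if (row.get("flag") or "") not in ALLOWED_FLAGS:
            problems.append((cid, "unexpected flag value"))
    return problems


# First data row is taken from results_ahmedabad.csv; the second is invented
# and deliberately breaks all four checks.
sample = io.StringIO(
    "complaint_id,category,priority,reason,flag\n"
    "AM-202410,Pothole,Standard,"
    "\"Pothole classification based on 'Pothole, main, highway'.\",\n"
    "X-1,Potholes,High,,MAYBE\n"
)
violations = find_schema_violations(csv.DictReader(sample))
for complaint_id, problem in violations:
    print(complaint_id, problem)
```

This check only covers failure modes 3 and 4 from "Check for Failures"; taxonomy drift, severity blindness, and false confidence need the original complaint text, which the results file does not carry.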
diff --git a/uc-0a/classifier.py b/uc-0a/classifier.py
@@ -1,35 +1,210 @@
 """
 UC-0A — Complaint Classifier
-Starter file. Build this using the RICE → agents.md → skills.md → CRAFT workflow.
+Built using the RICE → agents.md → skills.md → CRAFT workflow.
+Enforces exact schema matching, severity detection, and ambiguity flagging.
 """
 import argparse
 import csv
+import re
+from typing import Dict, List
 
-def classify_complaint(row: dict) -> dict:
+# ============================================================================
+# ENFORCEMENT CONSTANTS (from skills.md)
+# ============================================================================
+
+ALLOWED_CATEGORIES = {
+    "Pothole",
+    "Flooding",
+    "Streetlight",
+    "Waste",
+    "Noise",
+    "Road Damage",
+    "Heritage Damage",
+    "Heat Hazard",
+    "Drain Blockage",
+    "Other"
+}
+
+ALLOWED_PRIORITIES = {"Urgent", "Standard", "Low"}
+
+# Severity keywords that trigger Urgent priority (case-insensitive)
+SEVERITY_KEYWORDS = {
+    "injury", "child", "school", "hospital", "ambulance",
+    "fire", "hazard", "fell", "collapse"
+}
+
+# Category detection patterns (simplified heuristics)
+CATEGORY_PATTERNS = {
+    "Pothole": r"\b(pothole|hole|pit|crater)\b",
+    "Flooding": r"\b(flood|water|submerged|waterlogged)\b",
+    "Streetlight": r"\b(light|streetlight|lamp|bulb|dark)\b",
+    "Waste": r"\b(waste|garbage|rubbish|trash|litter)\b",
+    "Noise": r"\b(noise|sound|loud|honking|traffic noise)\b",
+    "Road Damage": r"\b(road damage|asphalt|pavement|broken road|uneven)\b",
+    "Heritage Damage": r"\b(heritage|monument|historical|structure)\b",
+    "Heat Hazard": r"\b(heat|temperature|hot|burning)\b",
+    "Drain Blockage": r"\b(drain|blocked|blockage|sewage|clogged)\b",
+}
+
+# ============================================================================
+# SKILL 1: classify_complaint
+# ============================================================================
+
+def classify_complaint(row: dict, row_id: int = None) -> dict:
     """
-    Classify a single complaint row.
-    Returns: dict with keys: complaint_id, category, priority, reason, flag
+    Classify a single complaint row into category, priority, reason, and flag.
 
-    TODO: Build this using your AI tool guided by your agents.md and skills.md.
-    Your RICE enforcement rules must be reflected in this function's behaviour.
+    Implements Skill 1 from skills.md:
+    - Input: complaint description text
+    - Output: category, priority, reason, flag
+    - Enforces exact category matching
+    - Detects severity keywords for Urgent priority
+    - Flags genuinely ambiguous complaints for review
+
+    Args:
+        row: dict with complaint data; text is read from 'description', 'complaint', or 'text'
+        row_id: optional row number for error tracking
+
+    Returns:
+        dict with keys: complaint_id, category, priority, reason, flag
     """
-    raise NotImplementedError("Build this using your AI tool + RICE prompt")
+    # Extract complaint text (try common field names)
+    description = row.get("description") or row.get("complaint") or row.get("text") or ""
+    complaint_id = row.get("complaint_id") or row.get("id") or f"row_{row_id}"
+
+    if not description or not description.strip():
+        return {
+            "complaint_id": complaint_id,
+            "category": "Other",
+            "priority": "Standard",
+            "reason": "No description provided",
+            "flag": "NEEDS_REVIEW"
+        }
+
+    description_lower = description.lower()
+
+    # ========================================================================
+    # RULE 1: Detect severity keywords → Urgent priority
+    # ========================================================================
+    has_severity = any(keyword in description_lower for keyword in SEVERITY_KEYWORDS)
+    priority = "Urgent" if has_severity else "Standard"
+
+    # ========================================================================
+    # RULE 2: Detect category via pattern matching
+    # ========================================================================
+    matched_categories = []
+    for category, pattern in CATEGORY_PATTERNS.items():
+        if re.search(pattern, description_lower, re.IGNORECASE):
+            matched_categories.append(category)
+
+    # Determine category and flag
+    if len(matched_categories) == 0:
+        category = "Other"
+        reason = "No specific category keywords found."
+        flag = "NEEDS_REVIEW"
+    elif len(matched_categories) == 1:
+        category = matched_categories[0]
+        # Extract relevant words for reason
+        reason = f"Classified as {category} based on complaint description."
+        flag = ""  # Clear flag when unambiguous
+    else:
+        # Multiple categories match → ambiguous
+        category = matched_categories[0]  # Pick first in CATEGORY_PATTERNS order; all are plausible
+        reason = f"Multiple categories possible: {', '.join(matched_categories)}. Selected {category}."
+        flag = "NEEDS_REVIEW"
+
+    # ========================================================================
+    # RULE 3: Ensure reason cites specific words from description
+    # ========================================================================
+    # Cite the first three words longer than 3 characters; this overwrites the reason set above
+    words = [w for w in description.split() if len(w) > 3][:3]
+    cited_words = ", ".join(words) if words else "complaint details"
+    reason = f"{category} classification based on '{cited_words}'."
+
+    return {
+        "complaint_id": complaint_id,
+        "category": category,
+        "priority": priority,
+        "reason": reason,
+        "flag": flag
+    }
 
 
+# ============================================================================
+# SKILL 2: batch_classify
+# ============================================================================
+
 def batch_classify(input_path: str, output_path: str):
     """
     Read input CSV, classify each row, write results CSV.
 
-    TODO: Build this using your AI tool.
-    Must: flag nulls, not crash on bad rows, produce output even if some rows fail.
+    Implements Skill 2 from skills.md:
+    - Reads input CSV file
+    - Applies classify_complaint to each row
+    - Writes output CSV with classified results
+    - Handles errors gracefully without crashing
+
+    Args:
+        input_path: path to input CSV file
+        output_path: path to write results CSV
     """
-    raise NotImplementedError("Build this using your AI tool + RICE prompt")
+    results = []
+    error_count = 0
+
+    try:
+        with open(input_path, 'r', encoding='utf-8') as infile:
+            reader = csv.DictReader(infile)
+
+            for row_num, row in enumerate(reader, start=2):  # start=2 because the header is row 1
+                try:
+                    result = classify_complaint(row, row_id=row_num)
+                    results.append(result)
+                except Exception as e:
+                    error_count += 1
+                    print(f"Warning: Row {row_num} failed: {e}")
+                    # Still add a result row so output has all rows
+                    results.append({
+                        "complaint_id": row.get("complaint_id", f"row_{row_num}"),
+                        "category": "Other",
+                        "priority": "Standard",
+                        "reason": f"Processing error: {str(e)}",
+                        "flag": "NEEDS_REVIEW"
+                    })
+
+    except FileNotFoundError:
+        print(f"Error: Input file not found: {input_path}")
+        return
+    except Exception as e:
+        print(f"Error reading input file: {e}")
+        return
+
+    # Write results CSV
+    try:
+        fieldnames = ["complaint_id", "category", "priority", "reason", "flag"]
+        with open(output_path, 'w', newline='', encoding='utf-8') as outfile:
+            writer = csv.DictWriter(outfile, fieldnames=fieldnames)
+            writer.writeheader()
+            writer.writerows(results)
+
+        print(f"✓ Classified {len(results) - error_count} rows successfully")
+        if error_count > 0:
+            print(f"⚠ {error_count} rows had errors (flagged for review)")
+        print(f"✓ Results written to {output_path}")
+
+    except Exception as e:
+        print(f"Error writing output file: {e}")
+        return
 
 
+# ============================================================================
+# MAIN
+# ============================================================================
+
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description="UC-0A Complaint Classifier")
-    parser.add_argument("--input",  required=True, help="Path to test_[techm].csv")
+    parser.add_argument("--input",  required=True, help="Path to test_[city].csv")
     parser.add_argument("--output", required=True, help="Path to write results CSV")
     args = parser.parse_args()
+
     batch_classify(args.input, args.output)
     print(f"Done. Results written to {args.output}")
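RULE 3 in `classify_complaint` builds the reason from the first three words longer than three characters, which may or may not be the words that drove the category. One possible alternative is sketched below, under the assumption that the reason should cite the regex matches themselves. `build_reason` is a hypothetical helper, not part of this PR, and the sample descriptions are invented for illustration.

```python
import re

# Two patterns copied from CATEGORY_PATTERNS in classifier.py.
CATEGORY_PATTERNS = {
    "Pothole": r"\b(pothole|hole|pit|crater)\b",
    "Waste": r"\b(waste|garbage|rubbish|trash|litter)\b",
}


def build_reason(description, category):
    """Hypothetical helper: cite the words that matched the category pattern."""
    pattern = CATEGORY_PATTERNS.get(category)
    cited, seen = [], set()
    if pattern:
        for match in re.finditer(pattern, description, re.IGNORECASE):
            word = match.group(0)
            # Keep each matched word once, in order of first appearance.
            if word.lower() not in seen:
                seen.add(word.lower())
                cited.append(word)
    if not cited:
        # Categories without a pattern (such as "Other") end up here.
        return f"{category} classification; no category keyword found in the description."
    joined = ", ".join(cited)
    return f"{category} classification based on '{joined}'."


# Invented complaint text, for illustration only.
waste_reason = build_reason(
    "Garbage and food waste piled up near the market; more garbage every day.",
    "Waste",
)
other_reason = build_reason("Bench is broken and upturned.", "Other")
print(waste_reason)
print(other_reason)
```

With these inputs the first call yields `Waste classification based on 'Garbage, waste'.` and the second falls back to the "no category keyword" sentence. Because `re.finditer` returns only the matched spans, trailing punctuation never leaks into the cited words.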
diff --git a/uc-0a/skills.md b/uc-0a/skills.md
@@ -1,16 +1,92 @@
-# skills.md
-# INSTRUCTIONS: Generate a draft by prompting AI, then manually refine this file.
-# Delete these comments before committing.
-
-skills:
-  - name: [skill_name]
-    description: [One sentence — what does this skill do?]
-    input: [What does it receive? Type and format.]
-    output: [What does it return? Type and format.]
-    error_handling: [What does it do when input is invalid or ambiguous?]
-
-  - name: [second_skill_name]
-    description: [One sentence]
-    input: [Type and format]
-    output: [Type and format]
-    error_handling: [What does it do when input is invalid or ambiguous?]
+# UC-0A Skills — Complaint Classifier
+
+## Skill 1: classify_complaint
+
+**Input:** One complaint row (text description)
+
+**Output:**
+- `category` — String
+- `priority` — String
+- `reason` — String
+- `flag` — String
+
+**Purpose:** Classify a single citizen complaint record into category, priority level, and provide justification.
+
+---
+
+## Skill 2: batch_classify
+
+**Input:** Input CSV file path
+
+**Process:**
+1. Read input CSV file
+2. Apply `classify_complaint` skill to each row
+3. Write results to output CSV
+
+**Output:** Output CSV file with classified results
+
+**Purpose:** Process all complaint rows in a batch operation efficiently.
+
+---
+
+## Classification Schema — Enforce Exactly
+
+### Category Field
+**Allowed values (exact strings only — no variations):**
+- Pothole
+- Flooding
+- Streetlight
+- Waste
+- Noise
+- Road Damage
+- Heritage Damage
+- Heat Hazard
+- Drain Blockage
+- Other
+
+**Rule:** Must match exactly. No variations or abbreviations allowed.
+
+### Priority Field
+**Allowed values:**
+- Urgent
+- Standard
+- Low
+
+**Rule:** Set to Urgent if any severity keyword is present in the description; otherwise default to Standard.
+
+### Reason Field
+**Format:** One sentence
+
+**Rule:** Must cite specific words from the original complaint description. Do not use generic explanations.
+
+### Flag Field
+**Allowed values:**
+- `NEEDS_REVIEW` (when category is genuinely ambiguous)
+- Blank/Empty (when classification is clear)
+
+**Rule:** Set NEEDS_REVIEW when the complaint could legitimately fit multiple categories or matches none of them.
+
+---
+
+## Severity Keywords (Trigger Urgent Priority)
+
+These keywords must trigger an Urgent classification:
+- `injury`
+- `child`
+- `school`
+- `hospital`
+- `ambulance`
+- `fire`
+- `hazard`
+- `fell`
+- `collapse`
+
+---
+
+## Error Handling & Validation
+
+- **Invalid category:** Flag for review and log the error
+- **No severity keywords found:** Double-check the priority; if in doubt, set flag to NEEDS_REVIEW
+- **Ambiguous complaints:** Always set flag to NEEDS_REVIEW
+- **No matching category:** Use "Other" and flag for review
+- **Missing reason/justification:** Always provide a reason citing specific words from the complaint
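The severity rule in skills.md is implemented in classifier.py as a substring test (`keyword in description_lower`), so a word such as "fellow" would satisfy the `fell` keyword. The sketch below shows a whole-word variant, assuming that behaviour is preferred. `severity_hits` and `priority_for` are hypothetical names, and the sample sentences are invented.

```python
import re

# Keywords copied from the "Severity Keywords" list in skills.md.
SEVERITY_KEYWORDS = [
    "injury", "child", "school", "hospital", "ambulance",
    "fire", "hazard", "fell", "collapse",
]

# One compiled pattern; \b anchors each keyword at word boundaries.
SEVERITY_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in SEVERITY_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def severity_hits(description):
    """Hypothetical helper: whole-word severity keywords, lowercased, in order."""
    return [m.group(0).lower() for m in SEVERITY_RE.finditer(description)]


def priority_for(description):
    """Hypothetical helper: Urgent if any whole-word keyword is present."""
    return "Urgent" if severity_hits(description) else "Standard"


# Invented examples, for illustration only.
print(severity_hits("A child fell near the school gate"))
print(priority_for("Fellow residents report a broken bench"))
print(priority_for("Children play here every evening"))
```

The trade-off is visible in the last call: whole-word matching no longer fires on inflected forms such as "children" or "collapsed", which the substring check does catch. Adopting this approach would therefore mean extending the keyword list with those forms.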