Northwoods

Northwoods processes handwritten intake documents into structured, confidence-scored fields that route through human-in-the-loop review before acceptance. It was built for the Banyan Software CTO assignment.

Live instance: northwoods.muness.com

For the reviewer

This section is for you. It maps the assignment requirements to where they live in the codebase.

Try the live system

Role	Email	Password	What you'll see
Intake Worker	worker@sunrise.example	password	Upload dashboard — select template, attach PDF, watch status
Reviewer	reviewer@sunrise.example	password	Review queue — confidence indicators, field corrections, similar cases, finalize

Credentials for tenant-b (Lakewood) also exist (worker@lakewood.example, reviewer@lakewood.example, same password). Use them to verify tenant isolation.

The developer scaffold (preset login buttons, raw API explorer) is at northwoods.muness.com/#dev. It is not linked from the main UI.

Evaluate the architecture

Document	What it covers
Architecture Rationale	System diagram, component responsibilities, extraction model, RAG strategy, tenancy model, trade-offs
ADR 001: Postgres hybrid retrieval	Why Postgres for vector search instead of a dedicated vector DB
ADR 002: Temporal for workflows	Considered and deferred — worker polling used instead
ADR 003: MinIO for storage	S3-compatible object store for documents
ADR 004: Shared tenancy with RLS	Row-level security as defense-in-depth for tenant isolation
ADR 005: Consensus extraction pipeline	Multi-provider extraction with append-only attempt history

Evaluate AI tool usage

Document	What it covers
AI Development Tooling	What tools were used, how, and what remained human-owned
Self-Assessment	What's complete, what's missing, 5 concrete prompts with outputs

The agentic development pipeline is in .claude/agents/dev-pipeline.md and .claude/agents/dev-pipeline-oversight.md. Each GitHub issue was worked end-to-end by an agent: branch, implement, /review, /dissent, CodeRabbit, merge. The human role was scoping, reviewing output, catching drift, and making architectural calls.

RAG pipeline report

Log in as a reviewer and navigate to /#rag-report to see the RAG pipeline results: expected vs. actual similar case retrieval for known narrative arcs across 40 fictional people. This page runs live queries — it is not canned data.

What to look for

Confidence drives behavior. Low-confidence fields route to review. High-confidence fields can auto-accept. The review UI shows why the system is uncertain (per-provider breakdown).
Tenant isolation is provable. RLS is enabled on all data tables, scripts/ci/check-rls-compliance.py verifies it, and integration tests confirm that cross-tenant queries return empty results.
Audit trail is append-only. extraction_completed → field_corrected × N → finalized — each event with actor, timestamp, correlation ID, before/after values.
The corpus tells a story. 40 people across two tenants, two visual form styles, longitudinal narrative arcs. Frequent flyers (P019 Raymond Castillo, P039 Gloria Navarro) show progression from crisis to stability across months. The RAG report page shows whether the system found these relationships.
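
The confidence-driven routing described in the first point can be illustrated with a minimal sketch. The threshold value, the ExtractedField type, and the route labels are all hypothetical; the README does not state the real cut-off or data model, so treat this as an illustration of the idea rather than the worker's actual logic.

```python
from dataclasses import dataclass

# Hypothetical threshold: the real cut-off is not documented here.
AUTO_ACCEPT_THRESHOLD = 0.90


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 (no confidence) to 1.0 (certain)


def route_field(field: ExtractedField, threshold: float = AUTO_ACCEPT_THRESHOLD) -> str:
    """Return "auto_accept" for high-confidence fields, "needs_review" otherwise."""
    return "auto_accept" if field.confidence >= threshold else "needs_review"


def split_for_review(fields):
    """Partition fields into (auto_accepted, needs_review), preserving order."""
    accepted, review = [], []
    for f in fields:
        (accepted if route_field(f) == "auto_accept" else review).append(f)
    return accepted, review
```

A single global threshold is the simplest possible policy. A real pipeline might instead use per-field or per-template thresholds.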

Run locally

Prerequisites

Docker and Docker Compose
.NET SDK 10
Node 22+ and pnpm

Start everything

docker compose up -d

Starts Postgres (with pgvector), MinIO, the API, the extraction worker, and the web frontend. Schema and seed data load automatically.

Service	URL
Web UI	http://localhost:5173
API	http://localhost:5100
OpenAPI spec	http://localhost:5100/openapi/v1.json
MinIO console	http://localhost:9001 (northwoods/northwoods)
Health check	http://localhost:5100/healthz
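
Because the containers take a moment to start, it can help to wait for the health check before running tests. The helper below is a small sketch using only the Python standard library. It assumes that /healthz answers HTTP 200 once the API is ready; the function names are made up for this example.

```python
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:5100/healthz"  # from the service table


def is_healthy(url: str = HEALTH_URL, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 (assumed to mean ready)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False


def wait_until_healthy(check=is_healthy, attempts: int = 30, delay: float = 1.0) -> bool:
    """Call `check` up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False
```

Calling wait_until_healthy() with the defaults polls the local API for roughly 30 seconds before giving up.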

Run tests

# All backend tests
dotnet test src/Northwoods.slnx

# Frontend type check + lint
pnpm --filter web check

# Playwright e2e (requires running stack + pnpm dev)
pnpm --filter web test:e2e

# RAG pipeline test (requires running stack + OPENAI_API_KEY)
dotnet test tests/Northwoods.Api.IntegrationTests/ --filter Category=RAG

Enable OpenAI Vision extraction

The extraction worker defaults to UseOpenAiVision=true. Without an API key, the worker throws an exception at startup and no documents are processed.

Set OPENAI_API_KEY before starting:

export OPENAI_API_KEY=sk-...
docker compose up -d

Or create a .env file at the repo root (docker compose picks this up automatically):

cp .env.example .env
# Edit .env and set OPENAI_API_KEY=sk-...

To run without extraction (UI/API development only), disable the provider explicitly:

# Add to .env:
Extraction__UseOpenAiVision=false

In this mode, uploads and the review UI work, but no documents are extracted.
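
The three configurations described in this section can be summarized as a small preflight check. This is a sketch of the documented behavior, not code from the repository. The function name and the returned labels are hypothetical.

```python
def extraction_mode(env: dict) -> str:
    """Classify the worker's expected behavior from environment values.

    Mirrors the three cases described in this section. The returned labels
    are hypothetical names for this sketch, not values used by Northwoods.
    """
    # UseOpenAiVision defaults to true when the variable is unset.
    flag = env.get("Extraction__UseOpenAiVision", "true").strip().lower()
    has_key = bool(env.get("OPENAI_API_KEY", "").strip())

    if flag == "false":
        return "extraction_disabled"  # uploads and review UI only
    if has_key:
        return "extraction_enabled"   # documents will be processed
    return "startup_failure"          # worker throws at startup
```

Pass dict(os.environ) to check the current shell before running docker compose up -d. The double underscore in Extraction__UseOpenAiVision is the .NET convention for a nested configuration key (Extraction:UseOpenAiVision) supplied through an environment variable.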

System overview

Intake Worker → [Upload PDF] → API → MinIO (store) + Postgres (metadata)
                                        ↓
                              Extraction Worker polls
                                        ↓
                              OCR/AI extraction → confidence scoring
                                        ↓
                              review_ready → Reviewer sees queue
                                        ↓
                              Reviewer: correct fields, finalize
                                        ↓
                              Embeddings regenerated → case_profiles updated
                                        ↓
                              RAG: similar cases surfaced for next review
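
The review steps in this flow, together with the append-only audit events listed under "What to look for" (extraction_completed, field_corrected, finalized), can be sketched in memory as follows. The ReviewSession class and its method names are hypothetical stand-ins; the real system persists these events in Postgres.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    event_type: str       # "extraction_completed", "field_corrected", "finalized"
    actor: str
    timestamp: datetime
    correlation_id: str
    before: object = None
    after: object = None


@dataclass
class ReviewSession:
    """Hypothetical in-memory stand-in for one document moving through review."""
    correlation_id: str
    fields: dict
    status: str = "review_ready"
    _events: list = field(default_factory=list)

    def _append(self, event_type, actor, before=None, after=None):
        # Events are only ever appended, never edited or removed.
        self._events.append(AuditEvent(
            event_type, actor, datetime.now(timezone.utc),
            self.correlation_id, before, after))

    def record_extraction(self, actor):
        self._append("extraction_completed", actor)

    def correct(self, actor, name, new_value):
        if self.status == "finalized":
            raise ValueError("document is already finalized")
        before = self.fields.get(name)
        self.fields[name] = new_value
        self._append("field_corrected", actor, before, new_value)

    def finalize(self, actor):
        self.status = "finalized"
        self._append("finalized", actor)

    @property
    def events(self):
        # Expose a read-only copy so callers cannot rewrite history.
        return tuple(self._events)
```

Returning a tuple from events keeps the history read-only for callers, which mirrors the append-only guarantee at the API surface.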

Stack

Component	Technology	Responsibility
API	.NET 10 minimal API	Auth, upload, review, search, audit, tenant scoping
Extraction Worker	.NET BackgroundService	Polling, staged provider extraction, confidence gating, append-only attempts
Frontend	React + TypeScript + Tailwind	Role-based dashboards, confidence visualization, similar case panel
Database	Postgres 18 + pgvector + pg_trgm	System of record, RLS, hybrid retrieval, FTS
Object Storage	MinIO (S3-compatible)	Document blobs

Seed data

2 tenants: Sunrise (tenant-a) and Lakewood (tenant-b)
6 users in total: a worker, a reviewer, and an admin per tenant (password: password)
4 templates per tenant: General Assistance, Housing Stability, Behavioral Health, SOAP Progress Note
8 seeded documents for RAG demo (P017, P019, P037, P039 across both tenants) — loaded by DatabaseInitializer.SeedCorpusAsync on every API startup
Corpus generators in scripts/corpus/ — regenerate with python3 scripts/corpus/generate_seed_sql.py

RAG cold-start

Vector similarity search requires embeddings generated via the OpenAI API. On a fresh database:

With OPENAI_API_KEY set: The extraction worker generates embeddings for seeded case profiles automatically. All three retrieval strategies (FTS, trigram, vector cosine similarity) work.
Without OPENAI_API_KEY: FTS and trigram search still work, but vector similarity results will be absent. The similar-cases panel will show fewer or no results for queries that depend on semantic matching.

To regenerate embeddings after resetting the database, run scripts/reset_demo.py which re-uploads seed documents and triggers the extraction/embedding pipeline.
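
To illustrate why retrieval degrades gracefully without embeddings, here is a sketch of one common way to merge several ranked result lists: reciprocal rank fusion (RRF). This README does not say how Northwoods combines FTS, trigram, and vector results, so RRF is an assumption chosen for illustration. The constant k=60 is the conventional default for RRF, not a project setting.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked ID lists into one list, best first.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. An empty list contributes nothing.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score descending; break ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Illustrative rankings only; these are not real query results.
fts = ["P019", "P017"]
trigram = ["P019", "P039"]
vector = ["P039", "P019"]

with_vectors = reciprocal_rank_fusion([fts, trigram, vector])
without_vectors = reciprocal_rank_fusion([fts, trigram, []])
```

With an empty vector list the fusion still returns results from the two lexical strategies, but matches that only semantic similarity would surface are ranked lower or are missing.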

Deployment

Deployment is release-driven. Create and push a v* tag (for example v0.3.3) and GitHub Actions will build images and publish a GitHub Release for that tag.

The workflow then updates all three Render services (api, worker, web) and triggers their deploys.

Blueprint: render.yaml. Custom domain via Cloudflare CNAME.

API reference

Method	Path	Auth	Description
POST	`/auth/login`	None	Authenticate, receive JWT
GET	`/templates`	JWT	List tenant-scoped templates
GET	`/templates/{id}/blank`	JWT	Download printable blank template
POST	`/templates`	JWT (Admin)	Create a new template
PUT	`/templates/{id}`	JWT (Admin)	Update an existing template
DELETE	`/templates/{id}`	JWT (Admin)	Archive a template
POST	`/templates/{id}/blank-pdf`	JWT (Admin)	Upload blank PDF for a template
POST	`/intakes`	JWT (Worker)	Upload intake document
POST	`/intakes/{id}/retry`	JWT (Reviewer)	Retry a failed intake
GET	`/intakes/{id}`	JWT	Check processing status
GET	`/documents`	JWT	List tenant-scoped documents
GET/HEAD	`/documents/{id}/source`	JWT	Retrieve or check source document
GET	`/review-queue`	JWT (Reviewer)	Documents awaiting review
GET	`/reviews/{id}`	JWT (Reviewer)	Fields, confidence, similar cases, audit trail
POST	`/reviews/{id}/finalize`	JWT (Reviewer)	Finalize with corrections
GET	`/search?q=`	JWT	Full-text search
GET	`/cases/{personKey}`	JWT	Case aggregate across documents
DELETE	`/admin/documents`	JWT (Admin)	Wipe all tenant documents
POST	`/admin/reprocess`	JWT (Admin)	Reprocess all tenant documents
GET	`/healthz`	None	Service health
GET	`/metrics`	JWT	Tenant-scoped counters

Full OpenAPI spec: http://localhost:5100/openapi/v1.json
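
As a rough illustration of calling the API, the sketch below builds a login request and an authenticated request with the Python standard library. The JSON field names (email, password), the Bearer authorization scheme, and the token key mentioned in the comments are assumptions; check the OpenAPI spec for the actual request and response shapes.

```python
import json
import urllib.request

API_BASE = "http://localhost:5100"


def build_login_request(email, password, base=API_BASE):
    """Build POST /auth/login. The JSON field names are assumed."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_authed_get(path, token, base=API_BASE):
    """Build a GET request carrying the JWT as a Bearer token (assumed scheme)."""
    return urllib.request.Request(
        f"{base}{path}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def send_json(request, timeout=10.0):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage against a running stack (the "token" key is an assumption):
#   login = send_json(build_login_request("reviewer@sunrise.example", "password"))
#   queue = send_json(build_authed_get("/review-queue", login["token"]))
```

Keeping request construction separate from sending makes the request shapes easy to inspect or unit-test without a running stack.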
