A self-hosted control plane that turns GitHub issues into validated pull requests using AI agents — with human approval at every critical step.
Status: Active development. The full 11-phase pipeline is implemented and tested end-to-end. Not yet used in production.
Software Factory sits between your GitHub repo and an LLM. When you assign it a task (via GitHub issue or API), it:
- Reads your codebase — indexes symbols with tree-sitter, maps dependencies, scans GitHub capabilities (rulesets, CODEOWNERS, branch protections)
- Plans the work — generates an implementation plan grounded in repo context
- Writes code in a sandbox — executes in an isolated Docker container with no network access
- Validates the result — runs your tests, linter, and security checks using configs from the base branch (tamper-proof)
- Shows you the evidence — assembles a structured 13-field report (annotated diffs, blast radius, test results, security findings, owners impacted)
- Waits for your approval — only creates a PR after you explicitly approve
- Tracks the PR through merge — monitors required checks, reviews, merge queue, and addresses feedback automatically
The key idea: the AI does the work, but you stay in control. Every file access is governed by policy. Every mutation is auditable. The agent can't merge, can't skip checks, and can't modify its own rules.
- Solo developers or small teams who want AI help on real tasks (not just autocomplete) but don't trust black-box automation
- Teams with compliance requirements who need audit trails and structured evidence for every code change
- Anyone who wants to delegate implementation to an AI while keeping approval authority over what ships
Most AI coding tools either give you autocomplete (low leverage) or full autonomy (low trust). Software Factory targets the middle ground:
| Gap | What Software Factory Does |
|---|---|
| Customer-owned state | All orchestration state, audit logs, and policy config live in your Postgres — not a vendor's cloud |
| Model portability | Uses OpenRouter — swap models without changing anything else |
| Path-level governance | Read/write/index policies per file path, enforced at every layer including code indexing |
| Evaluator integrity | Validation configs load from the base branch, not the agent's working branch — no self-tampering |
| Structured evidence | You review a 13-field evidence packet, not just a diff |
| Append-only audit | Every state transition, side effect, and cost is recorded with RLS-protected immutability |
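The path-level governance row can be pictured as a small rule table. The sketch below is illustrative only: the `PathRule` shape, the prefix matching, and the deny-by-default fallback are assumptions made for this example, not Software Factory's actual policy format.

```typescript
// Hypothetical rule shape; the real policy schema is not shown in this README.
type Action = "read" | "write" | "index";

interface PathRule {
  prefix: string;           // path prefix the rule applies to
  allow: readonly Action[]; // actions permitted under that prefix
}

// Longest matching prefix wins; no matching rule means deny (an assumed default).
function isAllowed(rules: readonly PathRule[], path: string, action: Action): boolean {
  let best: PathRule | undefined;
  for (const rule of rules) {
    if (path.startsWith(rule.prefix) && (!best || rule.prefix.length > best.prefix.length)) {
      best = rule;
    }
  }
  return best !== undefined && best.allow.includes(action);
}

// Example rules (made up for illustration).
const rules: PathRule[] = [
  { prefix: "src/", allow: ["read", "write", "index"] },
  { prefix: "src/secrets/", allow: [] },
  { prefix: ".github/", allow: ["read"] },
];

console.log(isAllowed(rules, "src/secrets/key.pem", "read")); // false
```

Because the same check covers `index` as well as `read` and `write`, a file the agent may not read would also stay out of the symbol index under this model.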
GitHub Issue / API Call
│
▼
┌─────────────────────────────────────────────────────────────┐
│ API Server (Fastify) Dashboard (SvelteKit) │
│ │ │ │
│ ▼ │ │
│ Temporal Orchestrator ◄─── signals ─────┘ │
│ │ │
│ ├─→ intake ──→ understand ──→ plan ──→ setup │
│ │ │
│ ├─→ implement ──→ validate ──→ evidence ──→ review │
│ │ ▲ │ │ │
│ │ └── feedback ──┘ │ │
│ │ │ │
│ └─→ pr-creation ──→ pr-tracking ──→ merge │ │
│ ▲ │ │
│ └──────── changes requested ─────────┘ │
│ │
│ ┌─────────────┐ ┌──────────┐ ┌──────────────────────┐ │
│ │ LLM Agent │ │ Docker │ │ Policy Engine │ │
│ │ (OpenRouter) │ │ Sandbox │ │ (path-level govnce) │ │
│ └─────────────┘ └──────────┘ └──────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│ │ │
PostgreSQL 16 Redis 7 MinIO (S3)
(state + audit) (safety/pubsub) (artifacts)
See Architecture Guide for the full breakdown.
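As a rough reading of the diagram, the eleven phases can be written as an ordered list with a loop-back edge. The phase names come from the diagram; the `nextPhases` helper and the transition table are assumptions for illustration, and the "changes requested" loop is left out to keep the sketch short.

```typescript
// Phase names as they appear in the diagram, in forward order.
const PHASES = [
  "intake", "understand", "plan", "setup",
  "implement", "validate", "evidence", "review",
  "pr-creation", "pr-tracking", "merge",
] as const;

type Phase = (typeof PHASES)[number];

// Assumed loop-back edge: validation feedback returns to implement.
const LOOP_BACK: Partial<Record<Phase, Phase>> = {
  validate: "implement",
};

// Returns the phases reachable from `current` in this simplified model.
function nextPhases(current: Phase): Phase[] {
  const i = PHASES.indexOf(current);
  const forward: Phase[] = i < PHASES.length - 1 ? [PHASES[i + 1] as Phase] : [];
  const back = LOOP_BACK[current];
  return back ? [...forward, back] : forward;
}

console.log(nextPhases("validate")); // [ 'evidence', 'implement' ]
```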
| Package | Purpose | Key Constraint |
|---|---|---|
| core | Domain types, policy engine, state machine | Pure TS, no Node.js APIs (V8 isolate safe) |
| db | Schema, migrations, encrypted storage | Drizzle ORM, 18 tables, RLS-protected audit |
| temporal-workflows | 11 pipeline phases + orchestrator | V8 isolate, no Node.js imports |
| temporal-activities | Side effects: GitHub, LLM, Docker, indexing, safety | 8 domain modules, ~12K lines |
| api | HTTP server, webhooks, auth, SSE | Fastify + Zod validation, RBAC |
| worker | Temporal worker process | Wires all activities with DI |
| cli | Terminal interface | Commander.js + chalk + ora, 11 commands |
| e2e | End-to-end workflow tests | Temporal test environment, time-skipping |

Plus apps/dashboard — a SvelteKit app for visual task management.
TypeScript (strict) · Node.js 22 · pnpm workspaces · Temporal · PostgreSQL 16 · Redis 7 · Docker · Drizzle ORM · Fastify · Vercel AI SDK · OpenRouter · Octokit · tree-sitter · Zod · neverthrow · Vitest · Biome · SvelteKit
- Node.js 22+
- pnpm 10+
- Docker & Docker Compose
# Clone and install
git clone <repo-url> && cd software-factory
pnpm install
# Generate local secrets
./scripts/generate-secrets.sh
# Start infrastructure
docker compose up -d
# Run database migrations
pnpm --filter @software-factory/db run db:migrate
# Verify
pnpm run typecheck
pnpm run test

# Terminal 1: Temporal worker
pnpm run worker:dev
# Terminal 2: API server
pnpm --filter @software-factory/api run dev
# Terminal 3: Dashboard (optional)
cd apps/dashboard && pnpm dev

| Service | Port |
|---|---|
| API Server | 3000 |
| Dashboard | 5173 |
| PostgreSQL | 5433 |
| Redis | 6380 |
| Temporal gRPC | 7233 |
| Temporal UI | 8080 |
| MinIO API | 9000 |
| MinIO Console | 9001 |
Create a .env file in the project root (or edit the one generated by generate-secrets.sh):
# Required — infrastructure (Docker Compose provides defaults)
DATABASE_URL=postgresql://factory:factory@localhost:5433/factory
REDIS_URL=redis://localhost:6380
TEMPORAL_ADDRESS=localhost:7233
WEBHOOK_SECRET=<from generate-secrets.sh>
# GitHub App (required for repo cloning and PR operations)
GITHUB_APP_ID=<your-app-id>
GITHUB_PRIVATE_KEY=<base64-encoded-key>
GITHUB_INSTALLATION_ID=<installation-id>
# LLM (required for plan and implement phases)
OPENROUTER_API_KEY=<your-key>
# MinIO (required for evidence artifacts — Docker Compose provides defaults)
MINIO_ENDPOINT=http://localhost:9000
MINIO_ACCESS_KEY=factory
MINIO_SECRET_KEY=<from generate-secrets.sh>

The GitHub App can be created via the API's manifest flow (GET /api/setup/github returns the manifest URL) or manually in GitHub Settings.
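A missing variable is easier to diagnose at startup than in the middle of a pipeline run. The sketch below lists the variables from the .env example and reports any that are unset or blank. The `missingEnv` helper is hypothetical and is not part of Software Factory.

```typescript
// Variable names are copied from the .env example; the helper itself is hypothetical.
const REQUIRED_ENV = [
  "DATABASE_URL",
  "REDIS_URL",
  "TEMPORAL_ADDRESS",
  "WEBHOOK_SECRET",
  "GITHUB_APP_ID",
  "GITHUB_PRIVATE_KEY",
  "GITHUB_INSTALLATION_ID",
  "OPENROUTER_API_KEY",
  "MINIO_ENDPOINT",
  "MINIO_ACCESS_KEY",
  "MINIO_SECRET_KEY",
] as const;

// Returns the names of required variables that are unset or contain only whitespace.
function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js process this could be called as `missingEnv(process.env)` before the API server or worker starts, failing fast if the result is non-empty.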
On first boot, the API server generates an admin API key and prints it to stdout:
Admin API key created: sf_abc123...
Use this as a Bearer token for all API calls.
curl -X POST http://localhost:3000/api/tasks \
-H "Authorization: Bearer sf_<your-key>" \
-H "Content-Type: application/json" \
-d '{
"objective": "Add input validation to the user registration endpoint",
"repoOwner": "your-org",
"repoName": "your-repo",
"autonomyLevel": "L1"
}'

Response:
{
"taskId": "uuid",
"repoId": "uuid",
"workflowId": "task-uuid",
"status": "created"
}

The repo is automatically created or resolved on first submission, so no separate registration step is needed.
Autonomy levels:
| Level | Behavior |
|---|---|
| L0 | Maximum human oversight — approval required at every gate |
| L1 | Balanced — human approval at review, setup approval if no .factory/setup.yml |
| L2 | Highest autonomy: setup is auto-approved, but human review is still required for merge |
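The same task submission can be made from TypeScript. This is a minimal sketch: the field names and the three autonomy levels mirror the curl example and the table, while `buildCreateTaskRequest` and the `HttpRequest` type are hypothetical helpers introduced here.

```typescript
type AutonomyLevel = "L0" | "L1" | "L2";

// Request body fields as shown in the curl example.
interface CreateTaskInput {
  objective: string;
  repoOwner: string;
  repoName: string;
  autonomyLevel: AutonomyLevel;
}

// Hypothetical plain-data description of the HTTP call, so it can be inspected before sending.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildCreateTaskRequest(baseUrl: string, apiKey: string, input: CreateTaskInput): HttpRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/tasks`, // tolerate a trailing slash
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(input),
  };
}

const request = buildCreateTaskRequest("http://localhost:3000", "sf_<your-key>", {
  objective: "Add input validation to the user registration endpoint",
  repoOwner: "your-org",
  repoName: "your-repo",
  autonomyLevel: "L1",
});
```

On Node.js 22 the result can be sent with the built-in `fetch`, for example `fetch(request.url, { method: request.method, headers: request.headers, body: request.body })`.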
- Dashboard: http://localhost:5173/tasks/<id>
- API: GET /api/tasks/<id>
- SSE: GET /api/events?token=<api-key> for real-time updates
- CLI: factory status <id>
- Temporal UI: http://localhost:8080 for workflow internals
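For clients that read the SSE endpoint without a browser `EventSource`, the stream has to be parsed by hand. The sketch below parses a complete chunk of standard SSE text into events. The event name `task.state` and the JSON payload in the example are made up, since the README does not document the event schema, and the parser assumes whole events rather than partial network chunks.

```typescript
interface SseEvent {
  event: string; // event type; "message" when no event field is sent
  data: string;  // data lines joined with "\n"
}

// Parses a complete SSE text chunk; events are separated by blank lines.
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message";
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // skip blanks and comments
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // one optional leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}

// Hypothetical stream content, for illustration only.
const sample = ': keep-alive\n\nevent: task.state\ndata: {"state":"evidence_ready"}\n\n';
console.log(parseSse(sample).length); // 1
```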
When the pipeline reaches evidence_ready, review the structured evidence packet (annotated diffs, blast radius, test results, security findings). Then:
# If setup phase pauses waiting for approval (L0/L1 without .factory/setup.yml):
curl -X POST http://localhost:3000/api/tasks/<id>/approve-setup \
-H "Authorization: Bearer sf_<your-key>"
# When evidence is ready — approve to trigger PR creation:
curl -X POST http://localhost:3000/api/tasks/<id>/approve \
-H "Authorization: Bearer sf_<your-key>"
# Via CLI
factory approve <id>
# Via Dashboard
Click "Approve" on the evidence review page

After approval, Software Factory creates the PR, monitors required checks and reviews, and addresses feedback automatically. It never merges without all GitHub-required approvals passing.

| Command | Description |
|---|---|
| factory status <id> | Show task state and progress |
| factory evidence <id> | Display the evidence packet |
| factory approve <id> | Approve evidence → trigger PR creation |
| factory reject <id> | Reject with reason → send back for rework |
| factory changes <id> | Request specific changes |
| factory review <id> | Interactive review flow |
| factory kill [id] | Kill a task (or all tasks globally) |
| factory budget <id> | View/update cost budget |
| factory safety | View safety dashboard (kill switch, circuits, budgets) |
| factory health | Check infrastructure health |
| factory config | View/edit configuration |

TOML-based with 5-tier precedence:
CLI flags > Environment variables > Project config > User config > Defaults
Config files follow XDG conventions:
- User: ~/.config/software-factory/config.toml
- Project: .factory/config.toml

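The five-tier precedence amounts to merging layers from lowest to highest priority. This is a sketch under assumptions: the layer names follow the precedence line, while `resolveConfig`, the flat `Config` type, and the example keys are hypothetical. A real TOML config with nested tables would need a deep merge rather than this shallow one.

```typescript
type Config = Record<string, string | number | boolean>;

interface ConfigLayers {
  defaults: Config;
  user?: Config;    // ~/.config/software-factory/config.toml
  project?: Config; // .factory/config.toml
  env?: Config;     // environment variables
  cli?: Config;     // CLI flags
}

// Later spreads override earlier ones, so layers are listed lowest priority first.
function resolveConfig(layers: ConfigLayers): Config {
  return {
    ...layers.defaults,
    ...layers.user,
    ...layers.project,
    ...layers.env,
    ...layers.cli,
  };
}

// Example keys are made up for illustration.
const resolved = resolveConfig({
  defaults: { model: "default-model", budgetUsd: 5 },
  user: { model: "user-model" },
  project: { budgetUsd: 10 },
  cli: { model: "cli-model" },
});
console.log(resolved); // { model: 'cli-model', budgetUsd: 10 }
```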
- Kill switch — instantly halt all agent work (Redis-backed, checked at every activity boundary)
- Cost budgets — per-task spending limits with override signals
- Circuit breakers — automatic halt on repeated failures
- Branch leases — prevent concurrent work on the same branch
- Credential rotation — 50-minute phase-separated token scoping
- Append-only audit — RLS-enforced immutability with SHA-256 content hashes
- Trusted base context — behavioral files pinned to base SHA at intake
See Security Model for the full threat model and controls.
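To make the circuit-breaker control concrete, here is a minimal sketch of "automatic halt on repeated failures". The threshold, the class, and the `runGuarded` helper are all hypothetical; the Security Model document describes the actual controls.

```typescript
// Opens after `threshold` consecutive failures; any success resets the count.
class CircuitBreaker {
  private consecutiveFailures = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
  }

  isOpen(): boolean {
    return this.consecutiveFailures >= this.threshold;
  }
}

// Runs `step` only while the breaker is closed; returns false when work is halted.
function runGuarded(breaker: CircuitBreaker, step: () => void): boolean {
  if (breaker.isOpen()) return false;
  try {
    step();
    breaker.recordSuccess();
  } catch {
    breaker.recordFailure();
  }
  return true;
}
```

A check like `isOpen()` would sit at the same kind of boundary the kill switch uses, so a failing task stops before it spends more budget.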
| Document | Description |
|---|---|
| Architecture Guide | Package structure, data flow, infrastructure |
| Task Lifecycle | The 11-phase pipeline in detail |
| Security Model | Threat model, controls, and trust boundaries |
| Product Requirements | Full PRD with requirements and design rationale |
| Architecture Decisions | Append-only ADR log |
pnpm run typecheck # Type-check all packages
pnpm run test # Run unit + integration tests
pnpm run test:e2e # Run end-to-end workflow tests
pnpm run lint # Lint with Biome
pnpm run lint:fix # Auto-fix lint issues

Tests use real databases via Testcontainers (Postgres, Redis) and Temporal's time-skipping test environment. No mocks for infrastructure.
Open source (license TBD).