Architecture

sqlforge is a Go-native SQL transformation engine for a data project (CONTEXT.md):

- Pure SQL models: no Jinja; structural references via Polyglot WASM (ADR 0001)
- Environments: named plan/apply targets with isolated warehouse schemas and zero-copy isolation on supported warehouses
- Fingerprints: stable AST + model config hashes drive plan / apply
- CGO-free core: all warehouse drivers live in standalone gRPC plugin binaries; the sqlforge binary has zero CGO dependencies
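The fingerprint idea above can be illustrated with a minimal sketch. It assumes a fingerprint is a SHA-256 over normalized SQL plus the sorted model config; the names `normalizeSQL` and `fingerprint` are hypothetical, and whitespace collapsing merely stands in for the AST-based canonicalisation the real engine performs.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// normalizeSQL is a stand-in for AST canonicalisation (assumption):
// it only collapses runs of whitespace into single spaces.
func normalizeSQL(sql string) string {
	return strings.Join(strings.Fields(sql), " ")
}

// fingerprint hashes the normalized SQL together with the model config.
// Config keys are sorted so map iteration order cannot change the hash.
func fingerprint(sql string, config map[string]string) string {
	h := sha256.New()
	h.Write([]byte(normalizeSQL(sql)))

	keys := make([]string, 0, len(config))
	for k := range config {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		// NUL separators keep adjacent keys and values from running together.
		fmt.Fprintf(h, "\x00%s\x00%s", k, config[k])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := fingerprint("SELECT id\nFROM   orders", map[string]string{"materialized": "table"})
	b := fingerprint("SELECT id FROM orders", map[string]string{"materialized": "table"})
	c := fingerprint("SELECT id FROM orders", map[string]string{"materialized": "view"})
	fmt.Println(a == b, a == c) // prints "true false"
}
```

Formatting-only edits leave the hash unchanged, while a config change produces a new fingerprint; that is the property plan and apply depend on.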

Components

| Package | Role |
| --- | --- |
| `cmd/sqlforge` | Cobra CLI and `sqlforge mcp` entrypoint |
| `cmd/plugins/sqlforge-plugin-*` | Standalone warehouse plugin binaries (gRPC server per dialect) |
| `internal/parser` | Polyglot WASM + tokenizer for apply-time SQL |
| `internal/model` | Load `.sql` models and `-- @` model config |
| `internal/graph` | Model DAG, topological sort, fingerprints |
| `internal/plan` | Execution plan, apply, data quality assertions, static catalog compiler |
| `internal/virtual` | Runner interface, gRPC plugin client, DDL helpers, incremental merge strategies |
| `internal/state` | Local state store (`.sqlforge/state.db`) |
| `internal/semantic` | Semantic layer / metric compiler |
| `internal/mcp` | SQLForge MCP server (JSON-RPC) |
| `internal/config` | Project manifest (`sqlforge.yml`) and warehouse connection |
| `ui/` | Optional React DAG viewer (npm build-time only; static `dist/` embedded into binary) |

Plugin Architecture

The core sqlforge binary has no warehouse drivers. Each dialect is a standalone gRPC plugin binary spawned on demand:

sqlforge plan prod
   └── loads runner_factory.go
       └── exec: sqlforge-plugin-clickhouse  (unix socket, gRPC)
           └── virtual.Runner interface over protobuf

The factory (internal/virtual/runner_factory.go) resolves the plugin binary by name (sqlforge-plugin-<dialect>), looking on $PATH or in the directory that contains the sqlforge binary. The DSN is passed to the plugin through the SQLFORGE_PLUGIN_DSN environment variable.
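The resolution step can be sketched roughly as follows. This is not the contents of runner_factory.go: the function names `resolvePlugin` and `pluginCommand`, the search order ($PATH first, then the executable's directory), and the example DSN are assumptions made for illustration. Only the binary naming scheme and the SQLFORGE_PLUGIN_DSN variable come from the text above.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// resolvePlugin looks for sqlforge-plugin-<dialect> on $PATH, then next to
// the running executable. The search order is an assumption.
func resolvePlugin(dialect string) (string, error) {
	name := "sqlforge-plugin-" + dialect
	if p, err := exec.LookPath(name); err == nil {
		return p, nil
	}
	self, err := os.Executable()
	if err != nil {
		return "", err
	}
	candidate := filepath.Join(filepath.Dir(self), name)
	if info, err := os.Stat(candidate); err == nil && !info.IsDir() {
		return candidate, nil
	}
	return "", fmt.Errorf("plugin %q not found on $PATH or beside the sqlforge binary", name)
}

// pluginCommand prepares (but does not start) the plugin process and hands
// it the DSN through the environment rather than through argv.
func pluginCommand(path, dsn string) *exec.Cmd {
	cmd := exec.Command(path)
	cmd.Env = append(os.Environ(), "SQLFORGE_PLUGIN_DSN="+dsn)
	return cmd
}

func main() {
	path, err := resolvePlugin("clickhouse")
	if err != nil {
		fmt.Println("resolve failed:", err)
		return
	}
	// Hypothetical DSN, for illustration only.
	cmd := pluginCommand(path, "clickhouse://localhost:9000/default")
	fmt.Println("would start:", cmd.Path)
}
```

Passing the DSN through the environment keeps credentials out of the process argument list, which is typically visible to other users on the same host.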

See docs/explanation/04-grpc-plugin-architecture.md for the full design rationale.

Plan and apply

1. Load models and optional metrics (materialize: true injects derived models).
2. Plan: diff fingerprints against the local state store to classify each model as changed, impacted, or unchanged.
3. Apply: pre-create schemas sequentially, materialise models concurrently in dependency order, then update state.
4. Test: run column, relationship, and singular SQL assertions against the applied models.

Static Data Catalog

sqlforge docs generate [env] compiles the full model DAG, column-level lineage, semantic metrics, data quality tests, and model configs into a single self-contained HTML file (target/index.html). The catalog embeds all data as an inline JSON payload (window.SQLFORGE_CATALOG) so it works correctly under the file:// protocol without any CORS errors.

Agents and CI

- Drover Code: primary integration is CLI invocation in the repo (ADR 0002).
- MCP server: sqlforge mcp [env] exposes read and mutation tools for agent flows.
- CI: preview environments (pr_*) via GitHub Action (ADR 0003).
