pgmold is a PostgreSQL schema-as-code tool built in Rust. It follows a pipeline architecture where schemas flow through parsing, normalization, diffing, planning, and execution stages.
- Canonical Model is Truth: All operations use the normalized
model::SchemaIR. No module compares SQL to DB directly. - Deterministic Output: BTreeMap everywhere. Sorted collections. Predictable diffs.
- Strict Module Boundaries: No SQL outside
pg/sqlgen.rs. No DB access outsidepg/. - Fail Fast: No panics. Clear errors via
anyhow::Result.
pgmold/
├── src/
│ ├── cli/ # CLI argument parsing, command routing
│ ├── parser/ # PostgreSQL DDL parser → canonical model
│ │ ├── mod.rs # SQL parsing with sqlparser
│ │ └── loader.rs # Multi-file schema loading
│ ├── model/ # Canonical schema IR (the core)
│ ├── pg/
│ │ ├── connection.rs # Database connection pool
│ │ ├── introspect.rs # DB → canonical model
│ │ └── sqlgen.rs # Migration ops → SQL
│ ├── diff/
│ │ ├── mod.rs # Schema comparison
│ │ └── planner.rs # Operation ordering
│ ├── filter/ # Object filtering by name patterns and types
│ ├── lint/ # Safety rules
│ │ ├── mod.rs # Lint rules and severity
│ │ └── locks.rs # Lock hazard detection
│ ├── drift/ # Drift detection via fingerprinting
│ ├── baseline/ # Schema export with round-trip validation
│ ├── dump.rs # Schema → SQL DDL generation
│ ├── migrate.rs # Migration file numbering utilities
│ ├── apply/ # Transactional execution
│ ├── util/ # Shared types, errors
│ └── main.rs
└── tests/
├── integration.rs # testcontainers tests
├── baseline.rs # Baseline command tests
└── semantic_equivalence.rs # Normalization tests
┌─────────────┐ ┌─────────────┐
│ SQL File │ │ PostgreSQL │
└──────┬──────┘ └──────┬──────┘
│ │
▼ ▼
┌─────────────┐ ┌─────────────┐
│parser::parse│ │pg::introspect│
└──────┬──────┘ └──────┬──────┘
│ │
└────────┬──────────┘
│
▼
┌───────────────┐
│ model::Schema │ ← Canonical IR
└───────┬───────┘
│
▼
┌───────────────┐
│filter::filter │ ← Apply include/exclude patterns
└───────┬───────┘
│
▼
┌───────────────┐
│ diff::compute │
└───────┬───────┘
│
▼
┌───────────────┐
│ MigrationOp │ ← Operations list
└───────┬───────┘
│
▼
┌───────────────┐
│ diff::planner │ ← Order operations
└───────┬───────┘
│
▼
┌───────────────┐
│ lint::check │ ← Safety validation
└───────┬───────┘
│
▼
┌───────────────┐
│ pg::sqlgen │ ← Generate SQL
└───────┬───────┘
│
▼
┌───────────────┐
│ apply::exec │ ← Execute in transaction
└───────────────┘
The canonical IR represents all schema objects in a normalized form:
pub struct Schema {
pub tables: BTreeMap<String, Table>,
pub enums: BTreeMap<String, EnumType>,
pub domains: BTreeMap<String, Domain>,
pub extensions: BTreeMap<String, Extension>,
pub functions: BTreeMap<String, Function>,
pub views: BTreeMap<String, View>,
pub triggers: BTreeMap<String, Trigger>,
pub sequences: BTreeMap<String, Sequence>,
pub partitions: BTreeMap<String, Partition>,
}
pub struct Table {
pub name: String,
pub schema: String,
pub columns: BTreeMap<String, Column>,
pub indexes: BTreeMap<String, Index>,
pub primary_key: Option<PrimaryKey>,
pub foreign_keys: BTreeMap<String, ForeignKey>,
pub check_constraints: BTreeMap<String, CheckConstraint>,
pub policies: BTreeMap<String, Policy>,
pub rls_enabled: bool,
pub rls_force: bool,
pub partition_key: Option<PartitionKey>,
}
pub struct Column {
pub name: String,
pub data_type: PgType,
pub nullable: bool,
pub default: Option<String>,
pub identity: Option<String>,
}Key Design Decisions:
BTreeMapfor deterministic iteration order- Map keys use qualified names:
schema.name - All objects have a
schemafield (default: "public") - Fingerprinting via SHA256 of JSON serialization
Operations represent atomic schema changes:
pub enum MigrationOp {
CreateExtension(Extension),
DropExtension(String),
CreateEnum(EnumType),
DropEnum(String, String),
AddEnumValue { ... },
CreateDomain(Domain),
DropDomain(String, String),
AlterDomain { ... },
CreateTable(Table),
DropTable(String, String),
CreatePartition(Partition),
DropPartition(String, String),
AddColumn { ... },
DropColumn { ... },
AlterColumn { ... },
AddPrimaryKey { ... },
DropPrimaryKey { ... },
AddIndex { ... },
DropIndex { ... },
AddForeignKey { ... },
DropForeignKey { ... },
AddCheckConstraint { ... },
DropCheckConstraint { ... },
EnableRls { ... },
DisableRls { ... },
ForceRls { ... },
NoForceRls { ... },
CreatePolicy(Policy),
AlterPolicy { ... },
DropPolicy { ... },
CreateFunction(Function),
DropFunction { ... },
ReplaceFunction(Function),
CreateView(View),
DropView { ... },
ReplaceView(View),
CreateTrigger(Trigger),
DropTrigger { ... },
AlterTriggerEnabled { ... },
CreateSequence(Sequence),
DropSequence { ... },
AlterSequence { ... },
}The planner orders operations to satisfy dependencies:
-
Create phase (safe to add):
- CreateExtension
- CreateEnum, AddEnumValue
- CreateDomain
- CreateSequence
- CreateTable (topologically sorted by FK dependencies)
- CreatePartition
- AddColumn, AlterColumn
- AddPrimaryKey
- AddIndex
- AddForeignKey
- AddCheckConstraint
- EnableRls, ForceRls
- CreatePolicy, AlterPolicy
- CreateFunction, ReplaceFunction
- CreateView, ReplaceView
- CreateTrigger
-
Drop phase (reverse order):
- DropTrigger
- DropView
- DropFunction
- DropPolicy
- DisableRls, NoForceRls
- DropCheckConstraint
- DropForeignKey
- DropIndex
- DropPrimaryKey
- DropColumn
- DropPartition
- DropTable
- DropSequence
- DropDomain
- DropEnum
- DropExtension
The filter module supports filtering by:
- Name patterns (glob syntax:
*,?) - Object types (tables, indexes, policies, etc.)
Filters apply to both source and target schemas before diffing.
| Rule | Severity | Condition |
|---|---|---|
deny_drop_column |
Error | Without --allow-destructive |
deny_drop_table |
Error | Without --allow-destructive |
deny_drop_enum |
Error | Without --allow-destructive |
deny_drop_table_in_prod |
Error | When PGMOLD_PROD=1 |
warn_type_narrowing |
Warning | Type change may lose data |
warn_set_not_null |
Warning | May fail on existing NULLs |
Lock hazard detection warns about operations that acquire exclusive locks.
cli → parser, pg, diff, filter, lint, drift, baseline, dump, migrate, apply
parser → model
pg/introspect → model
pg/sqlgen → model, diff
diff → model
filter → model
lint → diff
drift → model
baseline → parser, pg, diff, dump
dump → model, pg/sqlgen
apply → pg
No circular dependencies. model is the leaf dependency.
- Unit tests: Each module has inline
#[cfg(test)]modules - Integration tests: Full pipeline with testcontainers PostgreSQL
- Semantic equivalence tests: Verify normalization produces identical results
- Tables, columns, partitioned tables
- Primary keys, foreign keys, check constraints
- Indexes (btree, hash, gin, gist, brin)
- Enums, domains
- Functions (with volatility, security, SET parameters)
- Views
- Triggers (with WHEN clauses, transition tables)
- Sequences (with SERIAL/BIGSERIAL support)
- Row-Level Security (RLS) policies
- Extensions
- Multi-schema support