pgmold Architecture

Overview

pgmold is a PostgreSQL schema-as-code tool built in Rust. It follows a pipeline architecture where schemas flow through parsing, normalization, diffing, planning, and execution stages.

Core Principles

Canonical Model is Truth: All operations use the normalized model::Schema IR. No module compares SQL to DB directly.
Deterministic Output: BTreeMap everywhere. Sorted collections. Predictable diffs.
Strict Module Boundaries: No SQL outside pg/sqlgen.rs. No DB access outside pg/.
Fail Fast: No panics. Clear errors via anyhow::Result.

Module Structure

pgmold/
├── src/
│   ├── cli/           # CLI argument parsing, command routing
│   ├── parser/        # PostgreSQL DDL parser → canonical model
│   │   ├── mod.rs         # SQL parsing with sqlparser
│   │   └── loader.rs      # Multi-file schema loading
│   ├── model/         # Canonical schema IR (the core)
│   ├── pg/
│   │   ├── connection.rs   # Database connection pool
│   │   ├── introspect.rs   # DB → canonical model
│   │   └── sqlgen.rs       # Migration ops → SQL
│   ├── diff/
│   │   ├── mod.rs          # Schema comparison
│   │   └── planner.rs      # Operation ordering
│   ├── filter/        # Object filtering by name patterns and types
│   ├── lint/          # Safety rules
│   │   ├── mod.rs          # Lint rules and severity
│   │   └── locks.rs        # Lock hazard detection
│   ├── drift/         # Drift detection via fingerprinting
│   ├── baseline/      # Schema export with round-trip validation
│   ├── dump.rs        # Schema → SQL DDL generation
│   ├── migrate.rs     # Migration file numbering utilities
│   ├── apply/         # Transactional execution
│   ├── util/          # Shared types, errors
│   └── main.rs
└── tests/
    ├── integration.rs      # testcontainers tests
    ├── baseline.rs         # Baseline command tests
    └── semantic_equivalence.rs  # Normalization tests

Data Flow

┌─────────────┐     ┌─────────────┐
│  SQL File   │     │  PostgreSQL │
└──────┬──────┘     └──────┬──────┘
       │                   │
       ▼                   ▼
┌─────────────┐     ┌─────────────┐
│parser::parse│     │pg::introspect│
└──────┬──────┘     └──────┬──────┘
       │                   │
       └────────┬──────────┘
                │
                ▼
        ┌───────────────┐
        │ model::Schema │  ← Canonical IR
        └───────┬───────┘
                │
                ▼
        ┌───────────────┐
        │filter::filter │  ← Apply include/exclude patterns
        └───────┬───────┘
                │
                ▼
        ┌───────────────┐
        │ diff::compute │
        └───────┬───────┘
                │
                ▼
        ┌───────────────┐
        │  MigrationOp  │  ← Operations list
        └───────┬───────┘
                │
                ▼
        ┌───────────────┐
        │ diff::planner │  ← Order operations
        └───────┬───────┘
                │
                ▼
        ┌───────────────┐
        │  lint::check  │  ← Safety validation
        └───────┬───────┘
                │
                ▼
        ┌───────────────┐
        │ pg::sqlgen    │  ← Generate SQL
        └───────┬───────┘
                │
                ▼
        ┌───────────────┐
        │ apply::exec   │  ← Execute in transaction
        └───────────────┘

Canonical Model (`model/`)

The canonical IR represents all schema objects in a normalized form:

pub struct Schema {
    pub tables: BTreeMap<String, Table>,
    pub enums: BTreeMap<String, EnumType>,
    pub domains: BTreeMap<String, Domain>,
    pub extensions: BTreeMap<String, Extension>,
    pub functions: BTreeMap<String, Function>,
    pub views: BTreeMap<String, View>,
    pub triggers: BTreeMap<String, Trigger>,
    pub sequences: BTreeMap<String, Sequence>,
    pub partitions: BTreeMap<String, Partition>,
}

pub struct Table {
    pub name: String,
    pub schema: String,
    pub columns: BTreeMap<String, Column>,
    pub indexes: BTreeMap<String, Index>,
    pub primary_key: Option<PrimaryKey>,
    pub foreign_keys: BTreeMap<String, ForeignKey>,
    pub check_constraints: BTreeMap<String, CheckConstraint>,
    pub policies: BTreeMap<String, Policy>,
    pub rls_enabled: bool,
    pub rls_force: bool,
    pub partition_key: Option<PartitionKey>,
}

pub struct Column {
    pub name: String,
    pub data_type: PgType,
    pub nullable: bool,
    pub default: Option<String>,
    pub identity: Option<String>,
}

Key Design Decisions:

BTreeMap for deterministic iteration order
Map keys use qualified names: schema.name
All objects have a schema field (default: "public")
Fingerprinting via SHA256 of JSON serialization

Migration Operations

Operations represent atomic schema changes:

pub enum MigrationOp {
    CreateExtension(Extension),
    DropExtension(String),
    CreateEnum(EnumType),
    DropEnum(String, String),
    AddEnumValue { ... },
    CreateDomain(Domain),
    DropDomain(String, String),
    AlterDomain { ... },
    CreateTable(Table),
    DropTable(String, String),
    CreatePartition(Partition),
    DropPartition(String, String),
    AddColumn { ... },
    DropColumn { ... },
    AlterColumn { ... },
    AddPrimaryKey { ... },
    DropPrimaryKey { ... },
    AddIndex { ... },
    DropIndex { ... },
    AddForeignKey { ... },
    DropForeignKey { ... },
    AddCheckConstraint { ... },
    DropCheckConstraint { ... },
    EnableRls { ... },
    DisableRls { ... },
    ForceRls { ... },
    NoForceRls { ... },
    CreatePolicy(Policy),
    AlterPolicy { ... },
    DropPolicy { ... },
    CreateFunction(Function),
    DropFunction { ... },
    ReplaceFunction(Function),
    CreateView(View),
    DropView { ... },
    ReplaceView(View),
    CreateTrigger(Trigger),
    DropTrigger { ... },
    AlterTriggerEnabled { ... },
    CreateSequence(Sequence),
    DropSequence { ... },
    AlterSequence { ... },
}

Operation Ordering

The planner orders operations to satisfy dependencies:

Create phase (safe to add):
- CreateExtension
- CreateEnum, AddEnumValue
- CreateDomain
- CreateSequence
- CreateTable (topologically sorted by FK dependencies)
- CreatePartition
- AddColumn, AlterColumn
- AddPrimaryKey
- AddIndex
- AddForeignKey
- AddCheckConstraint
- EnableRls, ForceRls
- CreatePolicy, AlterPolicy
- CreateFunction, ReplaceFunction
- CreateView, ReplaceView
- CreateTrigger
Drop phase (reverse order):
- DropTrigger
- DropView
- DropFunction
- DropPolicy
- DisableRls, NoForceRls
- DropCheckConstraint
- DropForeignKey
- DropIndex
- DropPrimaryKey
- DropColumn
- DropPartition
- DropTable
- DropSequence
- DropDomain
- DropEnum
- DropExtension

Object Filtering

The filter module supports filtering by:

Name patterns (glob syntax: *, ?)
Object types (tables, indexes, policies, etc.)

Filters apply to both source and target schemas before diffing.

Lint Rules

Rule	Severity	Condition
`deny_drop_column`	Error	Without `--allow-destructive`
`deny_drop_table`	Error	Without `--allow-destructive`
`deny_drop_enum`	Error	Without `--allow-destructive`
`deny_drop_table_in_prod`	Error	When `PGMOLD_PROD=1`
`warn_type_narrowing`	Warning	Type change may lose data
`warn_set_not_null`	Warning	May fail on existing NULLs

Lock hazard detection warns about operations that acquire exclusive locks.

Module Dependencies

cli → parser, pg, diff, filter, lint, drift, baseline, dump, migrate, apply
parser → model
pg/introspect → model
pg/sqlgen → model, diff
diff → model
filter → model
lint → diff
drift → model
baseline → parser, pg, diff, dump
dump → model, pg/sqlgen
apply → pg

No circular dependencies. model is the leaf dependency.

Testing Strategy

Unit tests: Each module has inline #[cfg(test)] modules
Integration tests: Full pipeline with testcontainers PostgreSQL
Semantic equivalence tests: Verify normalization produces identical results

Supported PostgreSQL Features

Tables, columns, partitioned tables
Primary keys, foreign keys, check constraints
Indexes (btree, hash, gin, gist, brin)
Enums, domains
Functions (with volatility, security, SET parameters)
Views
Triggers (with WHEN clauses, transition tables)
Sequences (with SERIAL/BIGSERIAL support)
Row-Level Security (RLS) policies
Extensions
Multi-schema support

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pgmold Architecture

Overview

Core Principles

Module Structure

Data Flow

Canonical Model (`model/`)

Migration Operations

Operation Ordering

Object Filtering

Lint Rules

Module Dependencies

Testing Strategy

Supported PostgreSQL Features

FilesExpand file tree

ARCHITECTURE.md

Latest commit

History

ARCHITECTURE.md

File metadata and controls

pgmold Architecture

Overview

Core Principles

Module Structure

Data Flow

Canonical Model (model/)

Migration Operations

Operation Ordering

Object Filtering

Lint Rules

Module Dependencies

Testing Strategy

Supported PostgreSQL Features

Canonical Model (`model/`)