dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Please see the dbt documentation on Exasol setup for more information on how to start using the Exasol adapter.
dbt-exasol is cleared for production use.
| dbt-exasol | dbt-core | Python | Exasol |
|---|---|---|---|
| 1.11.x | 1.11.x | 3.10-3.13 | 7.x, 8.x, ≥2025.x |
| 1.10.x | 1.10.x | 3.10-3.13 | 7.x, 8.x, ≥2025.x |
| 1.8.x | 1.8.x | 3.9-3.12 | 7.x, 8.x |
| 1.7.x | 1.7.x | 3.8-3.11 | 7.x, 8.x |
Parity claim against dbt-core 1.11 (reference adapter: dbt-snowflake). Each
supported feature is proven by an upstream dbt-tests-adapter subclass in CI; each
unsupported feature carries a reason. Legend: ✅ Supported ·
| Feature | Status | Notes |
|---|---|---|
| Microbatch incremental strategy | ✅ | incremental_strategy='microbatch' |
| Microbatch concurrency | ❌ (platform) | Batches run sequentially — see footnote 1 |
Sample mode (--sample) |
✅ | Requires DBT_EXPERIMENTAL_SAMPLE_MODE |
Empty model (--empty) |
✅ | |
| UDFs (SQL) | ✅ | See "User-Defined Functions" below |
| UDAFs (Python) | ✅ | Python SET SCRIPT aggregates |
| Python models | ❌ (platform) | See footnote 2 |
| Materialized views | ❌ (platform) | See footnote 3 |
Snapshots — hard_deletes |
✅ | Upstream + Exasol-specific tests |
Snapshots — dbt_valid_to_current |
✅ | Upstream + Exasol-specific tests |
dbt clone |
Clone-as-view (no native zero-copy clone) — footnote 4 | |
| Catalog integrations / Iceberg | ❌ (platform) | Parses if unused; errors if a model sets catalog — footnote 5 |
| Unit testing | ✅ | |
| Single-relation catalog | ✅ | get_catalog_for_single_relation |
| Batched last-modified metadata | ✅ | EXA_ALL_OBJECTS, cross-owner sources |
get_columns_in_relation |
✅ | |
persist_docs |
✅ | Table/column comments |
| Grants | ✅ |
Why (platform-blocked features):
- Microbatch concurrency — Exasol uses optimistic transaction-conflict detection at table granularity, so concurrent DELETE+INSERT batches against the same target relation abort each other; batches therefore run sequentially.
- Python models — Exasol has no general Python execution sandbox outside UDF SCRIPTs, so arbitrary dbt Python models cannot run on the database.
- Materialized views — Exasol has no materialized-view primitive.
dbt clone— Exasol has no native zero-copy clone, so clones are materialised as views (the dbt-core default whencan_clone_tableisFalse). Cross-target clones (dbt clone --target otherschema) work: the connection manager lazily acquires a pooled connection for threads that have no bound connection, so post-clone metadata calls succeed. Same-source-and-target zero-copy semantics are N/A:BaseCloneSameSourceAndTargetasserts the "skipping clone" log line emitted only on thecan_clone_table=Truepath, which Exasol never takes.- Catalog integrations / Iceberg — Exasol has no external table-format /
catalog integration; a
catalogs.ymlparses fine, but a model settingconfig(catalog=...)fails with a clear error.
This project uses mise-en-place for managing development tools and environment.
-
Install mise: mise.jdx.dev/installing-mise
-
Add shell activation to your rc file:
# For bash (~/.bashrc) eval "$(mise activate bash)" # For zsh (~/.zshrc) eval "$(mise activate zsh)" # For fish (~/.config/fish/config.fish) mise activate fish | source
# Trust the project configuration (one-time)
mise trust
# Install development tools (uv, gh, bun, usage)
mise install
# Sync Python dependencies
mise run sync| Command | Description |
|---|---|
mise run test |
Run all tests with coverage (nox -s test:coverage) |
mise run test:unit |
Run unit tests only |
mise run test:integration |
Run integration tests only |
mise run format |
Auto-format code (nox -s format:fix) |
mise run format-check |
Check code formatting without changes |
mise run lint |
Run all linters (code + security) |
mise run check |
Run all checks (format, lint, type) |
mise run sync |
Sync dependencies using uv |
mise run nox |
Run nox sessions directly |
mise run tunnel-start |
Start SSH tunnel to remote Docker host |
mise run tunnel-stop |
Stop SSH tunnel |
mise run tunnel-status |
Check SSH tunnel status |
mise run tunnel-restart |
Restart SSH tunnel |
Arguments can be passed to tasks: mise run nox -- -s test:unit
See @mise.toml [env] section for environment variables with default values.
.env- Local overrides (gitignored)- Required environment variables (
DBT_DSN,DBT_USER,DBT_PASS, etc. as described in @mise.toml) mise.local.toml- Developer-specific mise overrides (gitignored)
| Variable | Default | Description |
|---|---|---|
DBT_CONN_POOL_SIZE |
5 | Number of connections to pre-initialize in the pool for improved test performance |
To use a remote Docker host via SSH:
-
Configure the connection in
.env:DOCKER_HOST=ssh://user@remote-host
-
Manage the SSH tunnel using mise tasks:
# Start the SSH tunnel mise run tunnel-start # Check tunnel status mise run tunnel-status # Stop the tunnel mise run tunnel-stop # Restart the tunnel mise run tunnel-restart
The tunnel manager creates a persistent SSH connection that Docker can use for remote operations. It handles:
- Background SSH master connection with control sockets
- Automatic PID tracking
- Graceful shutdown and cleanup
- Connection keepalive (60s intervals)
Requirements:
- SSH access to the remote host with key-based authentication
- SSH keys available in
~/.ssh/or SSH agent - Docker installed on the remote host
Troubleshooting:
# Check detailed status
mise run tunnel-status
# View tunnel process
ps aux | grep ssh
# Test Docker connection
docker -H ssh://user@remote-host psdbt-exasol:
target: dev
outputs:
dev:
type: exasol
threads: 1
dsn: HOST:PORT
user: USERNAME
password: PASSWORD
dbname: db
schema: SCHEMAOpenID login through access_token or refresh_token instead of user+password
- connection_timeout: defaults to pyexasol default
- socket_timeout: defaults to pyexasol default
- query_timeout: defaults to pyexasol default
- compression: default: False
- encryption: default: True
- validate_server_certificate: default: True (requires valid SSL certificate when encryption=True)
- protocol_version: default: v3
- row_separator: default: CRLF for windows - LF otherwise
- timestamp_format: default: YYYY-MM-DDTHH:MI:SS.FF6
- pool_size: default: None (resolved from dbt
threadssetting). Maximum number of pooled connections per credentials key. When omitted, the pool size equals thethreadsvalue so every thread can reuse a cached connection without creating a new one on each model run.
As of dbt-exasol 1.8.1 it is possible to add new model config parameters for models materialized as table or incremental.
- partition_by_config
- distribute_by_config
- primary_key_config
- Example table materialization config
{{
config(
materialized='table',
primary_key_config=['<column>','<column2>'],
partition_by_config='<column>',
distribute_by_config='<column>'
)
}}NOTE In case more than one column is used, put them in a list.
As of dbt-exasol version 1.8 we have decided to switch to Apache License from GPLv3 - to be equal to dbt-core licensing.
Due to a breaking change in setuptools and a infected dependency from dbt-core, we need to use the following workaround for poetry install.
Starting from Exasol 8, encryption is enforced by default. If you are still using Exasol 7 and have trouble connecting, you can disable encryption in profiles.yaml (see optional parameters).
By default, dbt-exasol validates SSL/TLS certificates when encryption=True (which is the default). This provides secure connections and suppresses PyExasol warnings about certificate validation behavior.
Default behavior (recommended for production):
outputs:
prod:
type: exasol
encryption: true # default
validate_server_certificate: true # default
# ... other settingsFor development/testing with self-signed certificates:
outputs:
dev:
type: exasol
encryption: true
validate_server_certificate: false # Skip cert validation (not recommended for production)
# ... other settingsAlternative for self-signed certificates: Use the nocertcheck fingerprint in the DSN:
outputs:
dev:
type: exasol
dsn: myhost/nocertcheck:8563
# ... other settingsFor more information about SSL configuration, see the PyExasol security documentation.
Materialized views are not supported in Exasol (no materialized-view primitive); the default dbt-core behaviour will fail accordingly.
dbt clone is supported as clone-as-view: Exasol has no native zero-copy clone,
so clones are materialised as views (the dbt-core default when can_clone_table is
False). See the parity matrix entry for dbt clone and footnote 4 for details.
In Exasol empty string are NULL. Due to this behaviour and as of this pull request 7776 published in dbt-core 1.6, seeds in tests that use EMPTY literal to simulate empty string have to be handled with special behaviour in exasol. See fixture for csv in exasolseedsdata_hash_csv for tests/functional/adapter/utils/test_utils.py::TestHashExasol.
The following database constraints are implemented for Exasol:
| Constraint Type | Status |
|---|---|
| check | NOT supported |
| not null | enforced |
| unique | NOT supported |
| primary key | enforced |
| foreign key | enforced |
Supported since dbt-exasol 1.11.x (requires dbt-core 1.11.x)
dbt-exasol supports dbt-core's UDF feature for defining and registering custom functions in Exasol that can be reused outside dbt (BI tools, notebooks).
| Type | Language | Status | Exasol Mechanism |
|---|---|---|---|
| Scalar | SQL | Supported | CREATE OR REPLACE FUNCTION |
| Scalar | Python | Supported | CREATE OR REPLACE PYTHON3 SCALAR SCRIPT |
| Aggregate | SQL | Not supported | Exasol has no SQL aggregate function mechanism |
| Aggregate | Python | Supported | CREATE OR REPLACE PYTHON3 SET SCRIPT |
| Table-returning | — | Not supported | Not yet supported in dbt framework |
- No volatility support: Exasol does not support IMMUTABLE/STABLE/VOLATILE on any UDF type (SQL scalar, Python scalar, or Python aggregate). A warning is emitted if
volatilityis configured on any function. - No default argument values: Exasol does not support DEFAULT clauses for function arguments.
- PYTHON3 only: Exasol uses a fixed PYTHON3 runtime. The runtime_version config is ignored with a warning.
- No PACKAGES clause: Exasol uses BucketFS for Python libraries, not inline PACKAGES.
- Reserved-word argument names in SQL scalar UDFs: SQL scalar
CREATE FUNCTIONargument identifiers are emitted unquoted (so the function body can reference them unquoted, as Exasol requires). Consequently an argument named after an Exasol reserved word (e.g.value) will fail to compile. Rename the argument (e.g.val) to work around this. Python scalar/aggregate UDFs are unaffected: their arguments are quoted and accessed viactx.<name>.
functions/double_price.sql:
SELECT price * 2
functions/double_price.yml:
functions:
- name: double_price
arguments:
- name: price
data_type: DOUBLE
returns:
data_type: DOUBLE
The adapter automatically:
- Strips the leading SELECT keyword (dbt convention)
- Wraps the expression in BEGIN RETURN expr; END name;
- Detects procedural bodies containing BEGIN and inserts them directly
functions/double_price.py:
def double_price(price: float) -> float:
return price * 2
functions/double_price.yml:
functions:
- name: double_price
config:
language: python
entry_point: double_price
runtime_version: "3.12" # Ignored by Exasol; emits warning
arguments:
- name: price
data_type: DOUBLE
returns:
data_type: DOUBLE
The adapter generates a run(ctx) bridge that maps Exasol's ctx.price API to dbt's direct-argument convention.
functions/sum_squared.py:
class SumSquared:
def __init__(self):
self._partial_sum = 0
def accumulate(self, input_value):
self._partial_sum += input_value
def finish(self):
return self._partial_sum ** 2
functions/sum_squared.yml:
functions:
- name: sum_squared
config:
type: aggregate
language: python
entry_point: SumSquared
runtime_version: "3.11" # Ignored by Exasol; emits warning
arguments:
- name: value
data_type: DOUBLE
returns:
data_type: DOUBLE
The adapter generates a ctx.next() iteration bridge. merge() from the dbt convention is never called because Exasol handles distributed aggregation transparently. The aggregate_state config is likewise ignored; a warning is emitted if it is set.
Fallback to dbt-core implementation and supporting strategies:
append- Insert new rowsmerge- Update existing rows, insert new rowsdelete+insert- Delete matching rows, insert all rowsmicrobatch(new in 1.10) - Process data in time-based batches
The microbatch strategy processes data in time-based batches, enabling:
- Efficient processing of large datasets
- Support for late-arriving data via
lookback - Sample mode (
--sample) for development
Example configuration:
{{ config(
materialized='incremental',
incremental_strategy='microbatch',
event_time='created_at',
begin='2024-01-01',
batch_size='day',
lookback=2
) }}
select * from {{ ref('source_table') }}Configuration options:
| Option | Required | Description |
|---|---|---|
event_time |
Yes | Column used for time-based filtering |
begin |
Yes | Start date for initial backfill (YYYY-MM-DD) |
batch_size |
Yes | Size of each batch: hour, day, month, year |
lookback |
No | Number of previous batches to reprocess |
See dbt Microbatch Documentation for more details.
Sample mode (--sample flag) runs dbt in "small-data" mode, building only the N most recent time-based slices of microbatch models. This is useful for:
- Development and testing with representative data
- Quick iteration without processing full history
Example usage:
# Process only 2 most recent days
dbt run --sample="2 days"
# Process most recent week
dbt run --sample="1 week"Requirements:
- Models using
incremental_strategy='microbatch' - dbt-core 1.10 or later
See Sample Mode Documentation for more details.
Timestamp Format: Exasol requires timestamps without timezone suffix in model definitions:
-- Correct (Exasol compatible)
TIMESTAMP '2024-01-01 10:00:00'
-- Incorrect (will cause parse errors)
TIMESTAMP '2024-01-01 10:00:00-0'The dbt-exasol adapter automatically handles timestamp formatting for microbatch boundaries.
Batch Processing:
- Microbatch uses DELETE + INSERT pattern for batch replacement
- Each batch window is processed as a separate transaction
- For large datasets, consider
batch_size='day'overbatch_size='hour'
- Please follow this pull request
- Timestamp format defaults to YYYY-MM-DDTHH:MI:SS.FF6
There is no equivalent SQL function in Exasol for split_part.
The SQL function listagg in Exasol does not support the num_part parameter.
In order to support packages like dbt-utils and dbt-audit-helper, we needed to create the shim package exasol-utils.
This project uses GitHub Actions for continuous integration and deployment:
-
CI Workflow: Runs on pull requests, pushes to main/master, scheduled nightly, and manual dispatch
- Smart Integration Testing: Only runs integration tests when relevant files change (on PRs)
- Python Matrix: Tests across Python 3.10, 3.11, 3.12, and 3.13
- Checks Job (runs for all Python versions):
- Format checking (
nox -s format:check) - Linting (
nox -s lint:code) - Security checks (
nox -s lint:security) - Type checking (
nox -s lint:typing) - Unit tests with coverage reporting (
nox -s test:unit)
- Format checking (
- Integration Job (parallel execution with 8 workers):
- Functional tests against Exasol database (
nox -s test:integration) - Conditional execution based on file changes
- Functional tests against Exasol database (
- Report Job:
- Combines coverage from all jobs
- SonarCloud integration for quality gates and coverage reporting
- Concurrency Control: Cancels redundant runs on the same branch
-
Release Workflow: Triggered by version tags
- Builds package using
uv build - Publishes to PyPI
- Creates GitHub Release
- Builds package using
The following commands are available via mise:
# Run format check
mise run format-check
# Auto-format code
mise run format
# Run all linters
mise run lint
# Run unit tests
mise run test:unit
# Run integration tests
mise run test:integration
# Run all tests with coverage
mise run test
# Run all checks (format, lint, type)
mise run check
# Run specific nox sessions
mise run nox -- -s test:unit
mise run nox -- -s lint:securityMaintainers should configure the following branch protection rules on the main branch:
- Go to Settings > Branches > Add rule
- Branch name pattern:
main - Enable:
- Require a pull request before merging
- Require status checks to pass before merging
- Select "test" as required status check
- Require branches to be up to date before merging
To create a new release:
-
Update version in
pyproject.toml -
Commit the change
-
Create and push a version tag with
vprefix:git tag v1.10.2 git push origin v1.10.2
-
GitHub Actions will automatically:
- Build the package
- Publish to PyPI
- Create a GitHub Release
Note: Only semantic version tags with v prefix (e.g., v1.10.2) trigger releases.
- All code must pass format checks (
ruff check) - All code must pass linting (
nox -s lint:code) - Unit test coverage must be >= 80%
- All tests must pass before merging
- Please report bugs using the issues
- All changes to main must go through pull requests with CI checks passing