Snowflake -> ClickHouse Equivalent Concepts #6244
…into snowflake-equivalent-concepts
| Snowflake | ClickHouse | Notes |
|---|---|---|
| Organization | [Organization](/cloud/security/console-roles#organization-roles) | Root node of the hierarchy in both. |
This is not correct. A Snowflake account is the equivalent of what we call an org in ClickHouse.
| Snowflake | ClickHouse | Notes |
|---|---|---|
| Organization | [Organization](/cloud/security/console-roles#organization-roles) | Root node of the hierarchy in both. |
| Account | [Warehouse](/cloud/reference/warehouses) | Each service scales compute independently; storage is shared at the warehouse level. Tier and billing are set at the organization level, not per warehouse. |
Per above, this is also not correct.
For us, a service in CH is the equivalent of a database plus a warehouse in Snowflake.
Warehouses (CH) are how you get multiple warehouses that can read from and write to one database (Snow). Note that in Snowflake it could be multiple databases, since they have truly decoupled storage (the data) and compute.
|---|---|---|
| Organization | [Organization](/cloud/security/console-roles#organization-roles) | Root node of the hierarchy in both. |
| Account | [Warehouse](/cloud/reference/warehouses) | Each service scales compute independently; storage is shared at the warehouse level. Tier and billing are set at the organization level, not per warehouse. |
| Database | [Database](/sql-reference/statements/create/database) | Logical container for tables. Snowflake uses a Database → Schema → Table hierarchy; ClickHouse flattens this to Database → Table. See [Schemas](#schemas) below. |
It's not just tables; it's also views.
| Account | [Warehouse](/cloud/reference/warehouses) | Each service scales compute independently; storage is shared at the warehouse level. Tier and billing are set at the organization level, not per warehouse. |
| Database | [Database](/sql-reference/statements/create/database) | Logical container for tables. Snowflake uses a Database → Schema → Table hierarchy; ClickHouse flattens this to Database → Table. See [Schemas](#schemas) below. |
:::note[Warehouse terminology]
Definitely let's keep this and link to the warehouses page.
|---|---|---|
| Namespace partitioning — letting objects with the same name coexist (`analytics.users` vs `marketing.users`) | One [database](/sql-reference/statements/create/database) per Snowflake schema, or fold the schema name into the database (`analytics.public.events` → `analytics_public.events`) | Object references move from three-level (`DB.SCHEMA.TABLE`) to two-level (`DB.TABLE`). |
| Logical grouping by domain or processing stage (`analytics.raw`, `analytics.staging`, `analytics.marts`) | Separate databases or a consistent naming convention | — |
| Permission boundary | [SQL grants](/sql-reference/statements/grant) at the database, table, or column level | Database-wide grants cover the schema-level grant footprint; per-table grants are also available for finer-grained control. |
What does "permission boundary" mean?
| Namespace partitioning — letting objects with the same name coexist (`analytics.users` vs `marketing.users`) | One [database](/sql-reference/statements/create/database) per Snowflake schema, or fold the schema name into the database (`analytics.public.events` → `analytics_public.events`) | Object references move from three-level (`DB.SCHEMA.TABLE`) to two-level (`DB.TABLE`). |
| Logical grouping by domain or processing stage (`analytics.raw`, `analytics.staging`, `analytics.marts`) | Separate databases or a consistent naming convention | — |
| Permission boundary | [SQL grants](/sql-reference/statements/grant) at the database, table, or column level | Database-wide grants cover the schema-level grant footprint; per-table grants are also available for finer-grained control. |
| Future grants | Database wildcards (`GRANT … ON db.* TO role`) apply to current and future tables | Can't scope future grants to a subset of tables within a database. |
Funnily enough, I'm not quite sure what this note is about: "Can't scope future grants to a subset of tables within a database."
Is this a CH limitation?
Good catch, this was a miss on my part. Wildcard grants (`GRANT … ON db.* TO role`) automatically cover tables created later. I'll update.
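For reference, a minimal sketch of the wildcard behavior described above. The database, role, and table names are hypothetical, and it assumes a user allowed to create roles and grant privileges:

```sql
-- Hypothetical names throughout: analytics_public, analyst, events, sessions.
CREATE DATABASE IF NOT EXISTS analytics_public;
CREATE ROLE IF NOT EXISTS analyst;

CREATE TABLE analytics_public.events
(
    id UInt64,
    ts DateTime
)
ENGINE = MergeTree
ORDER BY id;

-- Wildcard grant: applies to every table in the database.
GRANT SELECT ON analytics_public.* TO analyst;

-- Created after the grant; the wildcard is expected to cover it as well,
-- with no additional GRANT statement.
CREATE TABLE analytics_public.sessions
(
    id UInt64,
    started_at DateTime
)
ENGINE = MergeTree
ORDER BY id;

-- Inspect what the role holds.
SHOW GRANTS FOR analyst;
```

Something along these lines might be worth adding to the page next to the "Future grants" row.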
| Row access policy | [Row policy](/sql-reference/statements/create/row-policy) — a `WHERE`-style expression evaluated per user | Row policies apply transparently to every query against the table. |
| Sequence | [`generateSerialID`](/sql-reference/functions/other-functions#generateSerialID) for a Keeper-backed sequential counter; [`generateSnowflakeID`](/sql-reference/functions/uuid-functions#generateSnowflakeID) or [`generateUUIDv7`](/sql-reference/functions/uuid-functions#generateUUIDv7) for distributed unique IDs | `generateSerialID` is the closest match to an auto-incrementing sequence: a named, monotonic counter coordinated through ClickHouse Keeper. The UUID functions suit high-throughput unique IDs that don't need a shared counter. |
:::note[Time Travel and backups]
Have we thought about having just this note and not including these features in the table, to cut down on the mentions?
| Snowflake | ClickHouse | Notes |
|---|---|---|
| Primary key (advisory) | Primary key — drives the on-disk sort order and the [sparse primary index](/guides/best-practices/sparse-primary-indexes) | Where Snowflake's PK is advisory only, ClickHouse's PK is load-bearing — it determines physical layout and is used to prune granules, avoid re-sorts, and short-circuit `LIMIT`. Neither system enforces uniqueness. |
We should explicitly call out the fact that our primary key does not have to be unique. Unique PKs are pretty much an industry standard, so readers will assume otherwise.
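A short illustration could make the point concrete. This is only a sketch with a hypothetical table name, assuming a plain `MergeTree` table:

```sql
-- Hypothetical table; names are for illustration only.
CREATE TABLE pk_demo
(
    id UInt64,
    payload String
)
ENGINE = MergeTree
PRIMARY KEY id
ORDER BY id;

-- Two rows share the same primary key value; the insert is accepted.
INSERT INTO pk_demo VALUES (1, 'first'), (1, 'second');

-- Both rows are stored: the key orders the data, it does not deduplicate it.
SELECT id, payload FROM pk_demo WHERE id = 1;
```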
| Snowflake | ClickHouse | Notes |
|---|---|---|
| Primary key (advisory) | Primary key — drives the on-disk sort order and the [sparse primary index](/guides/best-practices/sparse-primary-indexes) | Where Snowflake's PK is advisory only, ClickHouse's PK is load-bearing — it determines physical layout and is used to prune granules, avoid re-sorts, and short-circuit `LIMIT`. Neither system enforces uniqueness. |
| Foreign key (advisory) | Wide tables or [dictionaries](/dictionary) for lookups | ClickHouse doesn't accept foreign-key declarations even as advisory hints. |
Are we talking about foreign key constraints, or something else? I'm confused by this because, to me, a foreign key is just the join key.
| Search Optimization Service | Secondary indexes — [bloom-filter](/engines/table-engines/mergetree-family/mergetree#bloom-filter), token-bloom, [minmax](/engines/table-engines/mergetree-family/mergetree#minmax) | ClickHouse asks you to pick the index type per column and tune its parameters; there's no automatic equivalent. |
| Cortex Search / Snowflake Cortex Search | [Full-text index](/engines/table-engines/mergetree-family/textindexes) | Token index over string columns for in-database search. |
| `VECTOR` data type and vector search | [`Array(Float32)`](/sql-reference/data-types/array) or [`Array(BFloat16)`](/sql-reference/data-types/float#bfloat16) with a [vector ANN index](/engines/table-engines/mergetree-family/annindexes); or [`QBit`](/sql-reference/data-types/qbit) for tunable-precision search | ClickHouse has no dedicated `VECTOR` type. Embeddings store as `Array(Float32)`, or `Array(BFloat16)` to halve storage, with an ANN index accelerating approximate nearest-neighbor lookups. `QBit` keeps full precision while letting you trade bits for speed at query time. |
| Materialized view | [Incremental MV](/materialized-view/incremental-materialized-view) — updates on each insert into a base table | Source-shape rules differ; review both before porting an existing MV. Cost is paid at insert time in ClickHouse. |
Fun fact: Snowflake materialized views are extremely limited and don't even support joins :)
| Network policies (IP allowlist) | IP allowlists and [private connectivity](/cloud/security/connectivity/private-networking) — PrivateLink (AWS, Azure) and Private Service Connect (GCP) for ingress restriction | Private connectivity is available across the three major clouds. |
| Tri-Secret Secure (customer-managed keys) | [CMEK](/cloud/security/cmek) on the service | Supports key rotation and revocation. See the CMEK page for the current list of supported cloud providers. |
| Object tagging (governance metadata) | — | ClickHouse exposes metadata via `system.*` tables rather than user-defined tags. |
| Data classification (sensitive-data detection) | — | Not a managed feature; external tools (e.g. DataHub) cover this layer. |
We do support tagging, but it's definitely not at the level of Snowflake's.
…into snowflake-equivalent-concepts
…into snowflake-equivalent-concepts
…/04_equivalent-concepts.md Co-authored-by: Amy Chen <46451573+amychen1776@users.noreply.github.com>
…/04_equivalent-concepts.md Co-authored-by: Amy Chen <46451573+amychen1776@users.noreply.github.com>
…ickHouse/clickhouse-docs into snowflake-equivalent-concepts
Summary
Adds a Snowflake -> ClickHouse equivalent concepts page to strengthen our migration story.
Checklist