Ma/convert builtins/mz roles#36912
Conversation
bc83f39 to
c42b2e9
Compare
|
Failures are also present against main, so ignoring for now |
ggevay
left a comment
There was a problem hiding this comment.
LGTM, just minor comments
| // The two built-in super-roles (mz_system, mz_support) are identified by | ||
| // name; they correspond to `MZ_SYSTEM_ROLE_ID` and `MZ_SUPPORT_ROLE_ID` in | ||
| // `mz_sql::session::user`. Their names are hard-coded constants there too, | ||
| // so filtering by name is stable. |
There was a problem hiding this comment.
Couldn't we grab these names from some compile-time constants, and interpolate into this SQL query?
| WHEN entry->'val' ? 'Flat' THEN entry->'val'->>'Flat' | ||
| ELSE ( | ||
| SELECT pg_catalog.string_agg(t.elem, ', ' ORDER BY t.ord) | ||
| FROM jsonb_array_elements_text(entry->'val'->'SqlSet') |
There was a problem hiding this comment.
Flat and SqlSet here are from VarInput/OwnedVarInput, right? It would be good to put comments on those enums pointing here, so that if somebody adds a new variant later, they'll know to extend this SQL query.
| CatalogItemType::Table, | ||
| MZ_CATALOG_SCHEMA, | ||
| "mz_roles", | ||
| ), |
There was a problem hiding this comment.
This confused me for a moment, because above we say "Smallest supported version: 0.147.0", which I understood naively to mean that we can't remove migrations at versions not older than that. But Claude tells me it's fine, and suggests extending the doc comment with something like this:
/// Exception: when a builtin's `SystemObjectDescription` changes — e.g. a builtin table is
/// converted to a materialized view (see `migrate_builtin_tables_to_mvs`), or a builtin is
/// renamed or removed — existing steps naming the old description must be removed regardless
/// of version, because `validate_migration_steps` panics on steps that don't resolve to a
/// current builtin. This is safe only if a `Replacement` step for the new description is added
/// at the conversion version: every environment that needed the removed steps upgrades from an
/// even older version, so the new replacement subsumes them.
| # Every assertion must pass against the current BuiltinTable | ||
| # implementation. After the conversion, this file must pass unchanged | ||
| # — any diff is a behavioural regression that must be either fixed | ||
| # in the new view SQL or explicitly justified. |
There was a problem hiding this comment.
Is this comment still needed?
And the same question in mz_role_parameters.slt.
| # reformatted a single quoted-string input like 'abc,def' by parsing and | ||
| # re-joining it with ", ". Now the catalog stores 'abc,def' as written. | ||
| # Multi-arg list inputs (ALTER ROLE ... SET search_path TO a, b, c) still | ||
| # round-trip through Vec<Ident>::format() which joins with ", ". |
There was a problem hiding this comment.
Could mention that the new behavior is closer to the Postgres behavior. I asked Claude to verify this on an actual Postgres:
● Verified on your Postgres 18.4 — yes, the PR's new raw-display behavior is exactly the Postgres convention, with one small quoting nuance. (I created and dropped a throwaway canon_test role for this.)
What Postgres does:
┌─────────────────────────────┬───────────────────────────┬──────────────────────────────┐
│ Input │ setconfig/useconfig shows │ Materialize after this PR │
├─────────────────────────────┼───────────────────────────┼──────────────────────────────┤
│ statement_timeout = '60000' │ statement_timeout=60000 │ 60000 — matches │
├─────────────────────────────┼───────────────────────────┼──────────────────────────────┤
│ enable_seqscan = 'yes' │ enable_seqscan=yes │ yes — matches │
├─────────────────────────────┼───────────────────────────┼──────────────────────────────┤
│ search_path = aa, bb, cc │ search_path=aa, bb, cc │ aa, bb, cc — matches │
├─────────────────────────────┼───────────────────────────┼──────────────────────────────┤
│ search_path = 'abc,def' │ search_path="abc,def" │ abc,def — differs in quoting │
└─────────────────────────────┴───────────────────────────┴──────────────────────────────┘
So Postgres stores and displays the raw input in pg_db_role_setting.setconfig/pg_user.useconfig — no canonicalization. Meanwhile SHOW statement_timeout on an active session returns 1min: Postgres canonicalizes
at SHOW-time, not at storage. That means the old Materialize behavior (canonicalizing inside the catalog table) was the un-Postgres-like one, and this PR moves toward Postgres. The pg_catalog_user.slt
expectation change is a compatibility improvement, not a regression — worth saying exactly that in the PR description.
The one nuance: for the comma-containing string, Postgres shows "abc,def" with quotes, because for Postgres SET search_path = 'abc,def' is a single schema whose name contains a comma (confirmed: SHOW
search_path → "abc,def"). Materialize parses that same input as two schemas (the old canonical display abc, def shows this) — that's a pre-existing semantic difference in how Materialize interprets Flat strings
for list vars, not something this PR introduces or could fix in display. The MV showing unquoted abc,def is faithful to Materialize's own semantics.
| pub const VIEW_MZ_BUILTIN_SOURCES_OID: u32 = 17087; | ||
| pub const MV_MZ_INDEXES_OID: u32 = 17088; | ||
| pub const MV_MZ_ROLES_OID: u32 = 17089; | ||
| pub const MV_MZ_ROLE_PARAMETERS_OID: u32 = 17090; |
There was a problem hiding this comment.
Suggestion: reuse the old OIDs instead of allocating new ones.
Seven of the eight prior table→MV conversions (mz_databases, mz_schemas, mz_role_members, mz_connections, mz_secrets, mz_materialized_views, and your mz_sources) kept the OID and just renamed the constant, e.g. TABLE_MZ_SOURCES_OID → MV_MZ_SOURCES_OID (= 16710). Only mz_indexes (#36771) allocated a new one.
Reuse seems preferable here:
- It matches the durable identity:
migrate_builtin_tables_to_mvsdeliberately preserves theCatalogItemId/GlobalId, so the system treats this as the same object changing representation. A new OID makes the pg-facing identity churn whilemz_objects.idstays stable. - Builtin item OIDs are compiled-in statics (not durably persisted), so reuse is mechanically safe — the reused-OID conversions have been deployed since ~26.18 without issue.
- It avoids leaving
TABLE_MZ_ROLES_OID/TABLE_MZ_ROLE_PARAMETERS_OIDas dead constants, and keeps theoid.sltdiff minimal (mz_roles stays at 16720 instead of disappearing and reappearing at 17089).
Concretely: rename TABLE_MZ_ROLES_OID → MV_MZ_ROLES_OID (= 16720) and TABLE_MZ_ROLE_PARAMETERS_OID → MV_MZ_ROLE_PARAMETERS_OID (= 16945), and drop the new 17089/17090 constants.
There was a problem hiding this comment.
Relatedly: since 26.29 is still -dev, do you want to realign mz_indexes (17088 → 16705) in a follow-up before the release cut, for consistency across the series?
There was a problem hiding this comment.
I'll get it in this PR and merge today
c42b2e9 to
f60af94
Compare
…zedView Problem: mz_catalog.mz_roles and mz_catalog.mz_role_parameters are BuiltinTables whose rows are written by pack_role_update. This requires coordinated writes from a single environmentd process, conflicting with the multi-envd goal. Solution: Convert mz_roles and mz_role_parameters to BuiltinMaterializedViews backed by queries over mz_internal.mz_catalog_raw. The Public role is filtered out of mz_roles to match the prior pack_role_update behavior. For mz_role_parameters, the OwnedVarInput::SqlSet branch unpacks the JSON array via jsonb_array_elements_text WITH ORDINALITY and reassembles it with string_agg(..., ', ' ORDER BY ord) so that list elements preserve their original order (matching Vec<Ident>::format()). Behavior change: The previous BuiltinTable populator called var.check() on the user input before storing it. For list-typed session vars (e.g. search_path), var.check() reparsed and reformatted a single quoted-string input like 'abc,def' into 'abc, def'. The new materialized view stores the raw user input verbatim, so 'abc,def' is preserved as-is. Multi-argument inputs (ALTER ROLE ... SET search_path TO a, b, c) still round-trip through Vec<Ident>::format() and are joined with ", ". Testing: New behavioural lockdown test for mz_role_parameters. Updated tests to reflect that mz_roles and mz_role_parameters are now MATERIALIZED VIEWs, and to reflect the search_path normalization behavior change.
f60af94 to
7e7d531
Compare
draft pr because it is stacked on: #36771