4 changes: 2 additions & 2 deletions doc/developer/design/20260508_restrict_to_user_objects.md
@@ -84,9 +84,9 @@ Fix: `UnmaterializableFunc` has an `allowed_in_restricted_session()` method with

### MCP data product discovery

-`mz_mcp_data_products` and `mz_mcp_data_product_details` are system views used by the MCP agent endpoint to discover data products the session has access to. They are exempted from the system object block via an OID allowlist in `check_restrict_to_user_objects`.
+`mz_mcp_data_products`, `mz_mcp_data_product_details`, and `mz_show_my_cluster_privileges` are system views used by the MCP agent endpoint to discover data products the session has access to and to decide whether their catalog cluster is usable. They are exempted from the system object block via an OID allowlist in `check_restrict_to_user_objects`. `mz_show_my_cluster_privileges` is in the allowlist because the `read_data_product` lookup query LEFT JOINs it instead of calling `has_cluster_privilege` directly; that function's `sql_impl` body references `mz_catalog.mz_roles`, and the resolved IDs propagate into the restriction check, so the function itself fails to plan under restriction (deferred bug; the call site is rewritten around it).

-These views depend on `mz_show_my_object_privileges`, which originally used `pg_has_role(grantee, 'USAGE')`. That function calls `mz_role_oid_memberships()`, which is correctly blocked because it exposes the full system role graph. `mz_show_my_object_privileges` now uses `mz_session_role_memberships()` instead, which returns only the current session's transitive role chain as role names. This is semantically equivalent for the filter purpose and safe to allow in restricted sessions.
+These views depend on the `mz_show_my_*_privileges` family, which originally used `pg_has_role(grantee, 'USAGE')`. That function calls `mz_role_oid_memberships()`, which is correctly blocked because it exposes the full system role graph. The `mz_show_my_*_privileges` views (`system`, `cluster`, `database`, `schema`, `object`, `default`, `all_my`) now use `mz_session_role_memberships()` instead, which returns only the current session's transitive role chain as role names. This is semantically equivalent for the filter purpose and safe to allow in restricted sessions.

### `pg_catalog` and `information_schema`

42 changes: 36 additions & 6 deletions src/catalog/src/builtin/mz_internal.rs
@@ -5599,7 +5599,12 @@ FROM mz_internal.mz_show_system_privileges
WHERE
CASE
WHEN grantee = 'PUBLIC' THEN true
-ELSE pg_has_role(grantee, 'USAGE')
+-- Semantically equivalent to pg_has_role(grantee, 'USAGE'), which checks
+-- whether the current user holds role `grantee`. For a nonexistent grantee
+-- name, both return false. We use mz_session_role_memberships() instead
+-- because pg_has_role internally calls mz_role_oid_memberships(), which
+-- loads the full system role graph and is blocked in restricted sessions.
+ELSE grantee = ANY(mz_internal.mz_session_role_memberships())
END"#,
access: vec![PUBLIC_SELECT],
ontology: None,
@@ -5661,7 +5666,12 @@ FROM mz_internal.mz_show_cluster_privileges
WHERE
CASE
WHEN grantee = 'PUBLIC' THEN true
-ELSE pg_has_role(grantee, 'USAGE')
+-- Semantically equivalent to pg_has_role(grantee, 'USAGE'), which checks
+-- whether the current user holds role `grantee`. For a nonexistent grantee
+-- name, both return false. We use mz_session_role_memberships() instead
+-- because pg_has_role internally calls mz_role_oid_memberships(), which
+-- loads the full system role graph and is blocked in restricted sessions.
+ELSE grantee = ANY(mz_internal.mz_session_role_memberships())
END"#,
access: vec![PUBLIC_SELECT],
ontology: None,
@@ -5723,7 +5733,12 @@ FROM mz_internal.mz_show_database_privileges
WHERE
CASE
WHEN grantee = 'PUBLIC' THEN true
-ELSE pg_has_role(grantee, 'USAGE')
+-- Semantically equivalent to pg_has_role(grantee, 'USAGE'), which checks
+-- whether the current user holds role `grantee`. For a nonexistent grantee
+-- name, both return false. We use mz_session_role_memberships() instead
+-- because pg_has_role internally calls mz_role_oid_memberships(), which
+-- loads the full system role graph and is blocked in restricted sessions.
+ELSE grantee = ANY(mz_internal.mz_session_role_memberships())
END"#,
access: vec![PUBLIC_SELECT],
ontology: None,
@@ -5797,7 +5812,12 @@ FROM mz_internal.mz_show_schema_privileges
WHERE
CASE
WHEN grantee = 'PUBLIC' THEN true
-ELSE pg_has_role(grantee, 'USAGE')
+-- Semantically equivalent to pg_has_role(grantee, 'USAGE'), which checks
+-- whether the current user holds role `grantee`. For a nonexistent grantee
+-- name, both return false. We use mz_session_role_memberships() instead
+-- because pg_has_role internally calls mz_role_oid_memberships(), which
+-- loads the full system role graph and is blocked in restricted sessions.
+ELSE grantee = ANY(mz_internal.mz_session_role_memberships())
END"#,
access: vec![PUBLIC_SELECT],
ontology: None,
@@ -5978,7 +5998,12 @@ FROM mz_internal.mz_show_all_privileges
WHERE
CASE
WHEN grantee = 'PUBLIC' THEN true
-ELSE pg_has_role(grantee, 'USAGE')
+-- Semantically equivalent to pg_has_role(grantee, 'USAGE'), which checks
+-- whether the current user holds role `grantee`. For a nonexistent grantee
+-- name, both return false. We use mz_session_role_memberships() instead
+-- because pg_has_role internally calls mz_role_oid_memberships(), which
+-- loads the full system role graph and is blocked in restricted sessions.
+ELSE grantee = ANY(mz_internal.mz_session_role_memberships())
END"#,
access: vec![PUBLIC_SELECT],
ontology: None,
@@ -6084,7 +6109,12 @@ FROM mz_internal.mz_show_default_privileges
WHERE
CASE
WHEN grantee = 'PUBLIC' THEN true
-ELSE pg_has_role(grantee, 'USAGE')
+-- Semantically equivalent to pg_has_role(grantee, 'USAGE'), which checks
+-- whether the current user holds role `grantee`. For a nonexistent grantee
+-- name, both return false. We use mz_session_role_memberships() instead
+-- because pg_has_role internally calls mz_role_oid_memberships(), which
+-- loads the full system role graph and is blocked in restricted sessions.
+ELSE grantee = ANY(mz_internal.mz_session_role_memberships())
END"#,
access: vec![PUBLIC_SELECT],
ontology: None,
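
The equivalence claimed in the SQL comment repeated across the six hunks above can be modeled outside the database. The following is a minimal Rust sketch, not Materialize code: `session_role_memberships` and `grantee_visible` are hypothetical names, the membership map is invented for illustration, and the sketch assumes the session role counts as a member of itself.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for `mz_session_role_memberships()`: walks the
/// membership edges upward from the session role and returns every role name
/// reached. Including the session role itself is an assumption of this sketch.
fn session_role_memberships(
    session_role: &str,
    member_of: &BTreeMap<&str, Vec<&str>>,
) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![session_role.to_string()];
    while let Some(role) = stack.pop() {
        // `insert` returns false for roles already visited, which also
        // guards against cycles in the (invented) membership map.
        if seen.insert(role.clone()) {
            if let Some(parents) = member_of.get(role.as_str()) {
                stack.extend(parents.iter().map(|p| p.to_string()));
            }
        }
    }
    seen
}

/// Mirrors the rewritten WHERE clause:
/// `grantee = 'PUBLIC'` or `grantee = ANY(<session memberships>)`.
fn grantee_visible(grantee: &str, memberships: &BTreeSet<String>) -> bool {
    grantee == "PUBLIC" || memberships.contains(grantee)
}
```

In this model a nonexistent grantee is simply absent from the set, so the predicate returns false without consulting anything beyond the session's own role chain.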
21 changes: 12 additions & 9 deletions src/environmentd/src/http/mcp.rs
@@ -1023,18 +1023,21 @@ async fn read_data_product(
// Existence check + recover the cluster for auto-routing. The view
// filters by SELECT on the object but not by cluster privileges, so we
// also fetch USAGE on the cluster and prefer a usable one in ORDER BY
-// (an MV indexed on multiple clusters can appear more than once).
+// (an MV indexed on multiple clusters can appear more than once). Uses
+// `mz_show_my_cluster_privileges` instead of `has_cluster_privilege`
+// because the latter's body references `mz_roles` and trips
+// `restrict_to_user_objects`.
let lookup_query = format!(
"SELECT \
-cluster, \
-cluster IS NULL OR has_cluster_privilege(cluster, 'USAGE') \
-AS has_cluster_usage \
-FROM mz_internal.mz_mcp_data_products \
-WHERE object_name = {} \
+dp.cluster, \
+dp.cluster IS NULL OR cp.name IS NOT NULL AS has_cluster_usage \
+FROM mz_internal.mz_mcp_data_products dp \
+LEFT JOIN mz_internal.mz_show_my_cluster_privileges cp \
+ON cp.name = dp.cluster AND cp.privilege_type = 'USAGE' \
+WHERE dp.object_name = {} \
ORDER BY \
-(cluster IS NOT NULL \
-AND has_cluster_privilege(cluster, 'USAGE')) DESC, \
-cluster NULLS LAST \
+(dp.cluster IS NOT NULL AND cp.name IS NOT NULL) DESC, \
+dp.cluster NULLS LAST \
LIMIT 1",
escaped_string_literal(name)
);
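
The row-selection logic of the rewritten lookup query can be read as plain data manipulation. The sketch below is a hedged Rust model of the `has_cluster_usage` expression and the `ORDER BY ... LIMIT 1` preference; `LookupRow`, `usage_row_matched`, and `pick_row` are hypothetical names introduced here, and the real query runs in SQL, not in this code.

```rust
use std::cmp::Ordering;

/// One row after the LEFT JOIN: the data product's cluster (NULL for an
/// unindexed product) and whether a USAGE privilege row matched it,
/// i.e. whether `cp.name IS NOT NULL`.
#[derive(Debug, Clone, PartialEq)]
struct LookupRow {
    cluster: Option<String>,
    usage_row_matched: bool,
}

impl LookupRow {
    /// Models `dp.cluster IS NULL OR cp.name IS NOT NULL`.
    fn has_cluster_usage(&self) -> bool {
        self.cluster.is_none() || self.usage_row_matched
    }
}

/// Models the ORDER BY and LIMIT 1: usable clusters first, then cluster
/// name ascending with NULLs last.
fn pick_row(mut rows: Vec<LookupRow>) -> Option<LookupRow> {
    rows.sort_by(|a, b| {
        let usable = |r: &LookupRow| r.cluster.is_some() && r.usage_row_matched;
        // Comparing b to a gives DESC order on the boolean (true first).
        usable(b)
            .cmp(&usable(a))
            .then_with(|| match (&a.cluster, &b.cluster) {
                (Some(x), Some(y)) => x.cmp(y),
                (Some(_), None) => Ordering::Less, // NULLS LAST
                (None, Some(_)) => Ordering::Greater,
                (None, None) => Ordering::Equal,
            })
    });
    rows.into_iter().next()
}
```

An empty input corresponds to the "no rows" branch that the handler reports as `DataProductNotFound`; a row with no cluster is treated as usable because there is no cluster privilege to check.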
44 changes: 44 additions & 0 deletions src/environmentd/tests/server.rs
@@ -5065,6 +5065,23 @@ fn test_webhook_request_compression() {
/// Helper to set up an MCP test server and run datadriven tests.
#[allow(clippy::disallowed_methods)]
fn run_mcp_datadriven(testdata_path: &str, harness: test_util::TestHarness) {
+run_mcp_datadriven_inner(testdata_path, harness, false)
+}
+
+/// Same as [`run_mcp_datadriven`], but additionally sets
+/// `restrict_to_user_objects = true` as a default for the HTTP user, so
+/// every MCP HTTP request session starts with the restriction active.
+#[allow(clippy::disallowed_methods)]
+fn run_mcp_datadriven_restricted(testdata_path: &str, harness: test_util::TestHarness) {
+run_mcp_datadriven_inner(testdata_path, harness, true)
+}
+
+#[allow(clippy::disallowed_methods)]
+fn run_mcp_datadriven_inner(
+testdata_path: &str,
+harness: test_util::TestHarness,
+restrict_to_user_objects: bool,
+) {
let version_re = Regex::new(r#"\d+\.\d+\.\d+(\.\d+)?(-(dev|rc)(\.\d+)?)?"#).unwrap();

datadriven::walk(testdata_path, |f| {
@@ -5104,6 +5121,21 @@ fn run_mcp_datadriven(testdata_path: &str, harness: test_util::TestHarness) {
&HTTP_DEFAULT_USER.name
))
.unwrap();
+
+if restrict_to_user_objects {
+// restrict_to_user_objects only takes effect on session start,
+// so it must be set as a role default; the per-request MCP
+// sessions pick it up automatically.
+super_user
+.batch_execute("ALTER SYSTEM SET enable_rbac_checks TO true")
+.unwrap();
+super_user
+.batch_execute(&format!(
+"ALTER ROLE {} SET restrict_to_user_objects = true",
+&HTTP_DEFAULT_USER.name
+))
+.unwrap();
+}
}

let agents_url = format!("http://{}/api/mcp/agent", server.http_local_addr());
@@ -5156,6 +5188,18 @@ fn test_mcp_agent_query_tool() {
run_mcp_datadriven("tests/testdata/mcp/agent_query_tool", harness);
}

+/// Tests the MCP agent endpoint under `restrict_to_user_objects = true`,
+/// the setting clients use to scope sessions down to user objects only.
+/// Covers a small, focused set of cases; broader agent coverage lives in
+/// `test_mcp_agent` / `test_mcp_agent_query_tool`.
+#[mz_ore::test]
+fn test_mcp_agent_restricted() {
+let harness = test_util::TestHarness::default()
+.with_mcp_routes(true, false)
+.with_system_parameter_default("enable_mcp_agent".to_string(), "true".to_string());
+run_mcp_datadriven_restricted("tests/testdata/mcp/agent_restricted", harness);
+}
+
/// Tests that the MCP agent endpoint returns 503 when the feature flag is disabled.
#[mz_ore::test]
fn test_mcp_agent_disabled() {
50 changes: 50 additions & 0 deletions src/environmentd/tests/testdata/mcp/agent_restricted
@@ -0,0 +1,50 @@
# Copyright Materialize, Inc. and contributors. All rights reserved.
#
# Use of this software is governed by the Business Source License
# included in the LICENSE file at the root of this repository.
#
# As of the Change Date specified in that file, in accordance with
# the Business Source License, use of this software will be governed
# by the Apache License, Version 2.0.

# A small slice of the MCP agent surface, exercised with the HTTP role
# defaulted to `restrict_to_user_objects = true`. Broader agent coverage
# lives in `agent` / `agent_query_tool`; this file pins a few cases that
# prove the restricted session boots, lists tools, and reaches the
# data-product code paths.

# A restricted session can still complete the MCP handshake.
mcp-agent
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-11-25","capabilities":{},"clientInfo":{"name":"test-client","version":"1.0.0"}}}
----
200 OK
{"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2025-11-25","capabilities":{"tools":{}},"serverInfo":{"name":"materialize-mcp-agent","version":"<VERSION>"},"instructions":"You have access to Materialize data products via MCP. Prefer indexed objects (served from memory) over unindexed materialized views (read from persistent storage). `read_data_product` automatically routes the read to the cluster recorded in the data product catalog so indexes are used; you only need to set the `cluster` parameter if you intentionally want the read to run on a different cluster (e.g. one with larger or more replicas). `get_data_product_details` returns a `hydration` object with `hydrated`, `replica_count`, and `hydrated_replica_count` fields. Reads never return partial data: a read against a not-yet-hydrated product blocks until the dataflow catches up, and may hit the request timeout. Check `hydrated` before reading: if it is false and `replica_count` is greater than 0, the dataflow is still warming up, so wait and retry; if `replica_count` is 0 the cluster has no replicas and the read cannot make progress until one is added."}}

# With no data products configured, the list is empty even under
# restriction; the catalog views read by this tool are explicitly carved
# out of `restrict_to_user_objects`.
mcp-agent
{"jsonrpc":"2.0","id":10,"method":"tools/call","params":{"name":"get_data_products","arguments":{}}}
----
200 OK
{"jsonrpc":"2.0","id":10,"result":{"content":[{"type":"text","text":"[]"}],"isError":false}}

# Looking up a nonexistent data product returns DataProductNotFound, not
# a restriction error; the lookup path stays inside the user-visible
# catalog views.
mcp-agent
{"jsonrpc":"2.0","id":11,"method":"tools/call","params":{"name":"get_data_product_details","arguments":{"name":"nonexistent_product"}}}
----
200 OK
{"jsonrpc":"2.0","id":11,"error":{"code":-32602,"message":"Data product not found: nonexistent_product","data":{"error_type":"DataProductNotFound"}}}

# Before the lookup query was rewritten to LEFT JOIN
# `mz_show_my_cluster_privileges`, this call errored out with a
# `mz_roles is restricted` planning failure inside `has_cluster_privilege`.
# Now it correctly reaches the "no rows" branch and returns
# DataProductNotFound, same shape as the unrestricted variant.
mcp-agent
{"jsonrpc":"2.0","id":12,"method":"tools/call","params":{"name":"read_data_product","arguments":{"name":"nonexistent_product"}}}
----
200 OK
{"jsonrpc":"2.0","id":12,"error":{"code":-32602,"message":"Data product not found: nonexistent_product","data":{"error_type":"DataProductNotFound"}}}
10 changes: 7 additions & 3 deletions src/sql/src/rbac.rs
@@ -109,14 +109,18 @@ pub static EMPTY_ITEM_USAGE: LazyLock<BTreeSet<CatalogItemType>> = LazyLock::new

/// System catalog objects exempted from `check_restrict_to_user_objects`.
///
-/// These views are the mechanism by which the MCP agent endpoint discovers
-/// data products the user has access to. Blocking them defeats the isolation
-/// model, so they are explicitly allowed even in restricted sessions.
+/// The `mz_mcp_data_product*` views are how the MCP agent endpoint
+/// discovers data products; blocking them defeats the isolation model.
+/// `mz_show_my_cluster_privileges` is joined by `read_data_product` to
+/// check cluster USAGE (it replaces a `has_cluster_privilege` call whose
+/// body referenced `mz_roles`) and only exposes the session role's own
+/// privileges.
static RESTRICT_TO_USER_OBJECTS_ALLOWED_OIDS: LazyLock<BTreeSet<u32>> = LazyLock::new(|| {
use mz_pgrepr::oid;
btreeset! {
oid::VIEW_MZ_MCP_DATA_PRODUCTS_OID,
oid::VIEW_MZ_MCP_DATA_PRODUCT_DETAILS_OID,
+oid::VIEW_MZ_SHOW_MY_CLUSTER_PRIVILEGES_OID,
}
});
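
For readers unfamiliar with the allowlist pattern used here, the following is a minimal Rust sketch of how an OID allowlist can carve exceptions out of a blanket system-object block. It is an illustration under stated assumptions: `check_reference`, the `is_system_object` flag, and the numeric OIDs are hypothetical, and the actual signature and logic of `check_restrict_to_user_objects` are not shown in this diff.

```rust
use std::collections::BTreeSet;

// Placeholder OIDs for illustration only; they are not real catalog OIDs.
const HYPOTHETICAL_ALLOWED_VIEW_OID: u32 = 90_001;
const HYPOTHETICAL_BLOCKED_TABLE_OID: u32 = 90_002;

/// Hypothetical check: in a restricted session, a referenced object passes
/// if it is a user object, or if it is a system object on the allowlist.
fn check_reference(
    oid: u32,
    is_system_object: bool,
    allowed_oids: &BTreeSet<u32>,
) -> Result<(), String> {
    if !is_system_object || allowed_oids.contains(&oid) {
        Ok(())
    } else {
        Err(format!("object with OID {oid} is restricted"))
    }
}

/// Builds the illustrative allowlist used by the sketch.
fn example_allowlist() -> BTreeSet<u32> {
    BTreeSet::from([HYPOTHETICAL_ALLOWED_VIEW_OID])
}
```

Keying the exemption on OIDs rather than names means an allow-listed view stays exempt regardless of how a query spells or qualifies it, at the cost of having to extend the set whenever a call site starts depending on another system view.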

20 changes: 20 additions & 0 deletions test/sqllogictest/rbac_mcp_agent.slt
@@ -438,6 +438,26 @@ SELECT object_name FROM mz_internal.mz_mcp_data_product_details ORDER BY object_
"materialize"."agent_objects"."transfer_windows"
COMPLETE 2

+# Regression pin for the lookup query that
+# `src/environmentd/src/http/mcp.rs:read_data_product` issues under
+# restriction. Joins against `mz_show_my_cluster_privileges` (allow-listed,
+# `pg_has_role` swapped for `mz_session_role_memberships()`) instead of
+# calling `has_cluster_privilege` directly, since that function's
+# `sql_impl` body references `mz_roles` and is still blocked.
+simple conn=agent_restricted,user=agent
+SELECT dp.cluster,
+dp.cluster IS NULL OR cp.name IS NOT NULL AS has_cluster_usage
+FROM mz_internal.mz_mcp_data_products dp
+LEFT JOIN mz_internal.mz_show_my_cluster_privileges cp
+ON cp.name = dp.cluster AND cp.privilege_type = 'USAGE'
+WHERE dp.object_name = '"materialize"."agent_objects"."next_arrivals"'
+ORDER BY (dp.cluster IS NOT NULL AND cp.name IS NOT NULL) DESC,
+dp.cluster NULLS LAST
+LIMIT 1;
+----
+quickstart,t
+COMPLETE 1
+
# mz_session_role_memberships() is callable in a restricted session (it only
# exposes the current session's role chain, not the full system graph).
# Verify it walks the full transitive chain: grant agent membership in a parent