Skip to content

[SS-69] Add a GCP Connection to hold Google service account creds#36694

Merged
ublubu merged 3 commits into
mainfrom
kynan/iceberg-gcp-connection
Jun 10, 2026
Merged

[SS-69] Add a GCP Connection to hold Google service account creds#36694
ublubu merged 3 commits into
mainfrom
kynan/iceberg-gcp-connection

Conversation

@ublubu

@ublubu ublubu commented May 22, 2026

Copy link
Copy Markdown
Contributor

This PR brings on gcp_auth as a dependency.

stacked on #36693

@ublubu ublubu requested a review from patrickwwbutler May 22, 2026 19:32
@ublubu ublubu marked this pull request as ready for review June 2, 2026 14:12
@ublubu ublubu requested review from a team as code owners June 2, 2026 14:12

@patrickwwbutler patrickwwbutler left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one teeny tiny non blocking nit

_id: CatalogItemId,
storage_configuration: &StorageConfiguration,
) -> Result<(), GcpConnectionValidationError> {
let json = storage_configuration

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this could prob have a more descriptive name than json but honestly the context is right there so it's not hard to read anyways

@def- def- left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With CREATE CONNECTION … TO GCP (SERVICE ACCOUNT KEY = SECRET …) the oauth token request target is taken directly from the json. Even with storage_enforce_external_addresses = true this allows accessing any http/https, even internal ones, which should be blocked! Small test to verify:

diff --git a/test/testdrive/connection-create-external.td b/test/testdrive/connection-create-external.td
index cbe04d2bea..7dc357bd38 100644
--- a/test/testdrive/connection-create-external.td
+++ b/test/testdrive/connection-create-external.td
@@ -139,6 +139,37 @@ contains:Address resolved to a private IP
   ) WITH (VALIDATE = true);
 contains:Address resolved to a private IP

+# A GCP connection validates by POSTing to the `token_uri` from the
+# service-account-key JSON. gcp_auth issues that request verbatim, bypassing the
+# resolve_address / ENFORCE_EXTERNAL_ADDRESSES path the connections above use, so
+# validation must reject any token_uri that is not an HTTPS *.googleapis.com host
+# -- otherwise it is an SSRF primitive hidden inside an opaque secret.
+#
+# The private_key is a throwaway RSA key reused from the repo (src/ccsr/tests and
+# the trufflehog allowlist) -- never a real credential. It must parse so gcp_auth
+# reaches the token-fetch step that consumes token_uri. Backslashes are doubled:
+# standard_conforming_strings is on, so the text->bytea cast turns each "\\n"
+# into the literal "\n" the JSON needs.
+$ set gcp-ssrf-key="-----BEGIN PRIVATE KEY-----\\nMIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQDDC5MP3v1BHOgI\\n5SsmrW8mjxzQGOz0IlC5jp1muW/kpEoE9TG317TEnO5Uye6zZudkFCP8YGEiN3Mc\\nFbTM7eX6PjAPdnGU7khuUt/20ZM+NX5kWZPrmPTh4WQaDCL7ah1LqzBaUAMaSXq8\\niuy7LGJNF8wdx8L5BjDiGTTxZXOg0Haxknc7Mbiwc9z8eb7omvzQzsOwyqocrF2u\\nz86TzX1jtHP48i5CxoRHKxE94De3tNxjT/Y3OZlS4QS7iekAOQ04DVV3GIHvRUXN\\n2H8ayy4+yOdhHn6ER5Jn3lti1Q5XSrxkrYn7L1Vcj6IwZQhhF5vc+ovxOYb+8ert\\nEo97tIkLAgMBAAECggEAQteHHRPKz9Mzs8Sxvo4GPv0hnzFDl0DhUE4PJCKdtYoV\\n8dADq2DJiu3LAZS4cJPt7Y63bGitMRg2oyPPM8G9pD5Goy3wq9zjRqexKDlXUCTt\\n/T7zofRny7c94m1RWb7ablGq/vBXt90BqnajvVtvDsN+iKAqccQM4ZdI3QdrEmt1\\ncHex924itzG/mqbFTAfAmVj1ZsRnJp55Txy2gqq7jX00xDM8+H49SRvUu49N64LQ\\n6BUWCgWCJePRtgjSHjboAzPqSkMdaTE/WDY2zgGF3Qfq4f6JCHKfm4QylCH4gYUU\\n1Kf7ttmhu9NoZO+hczobKkxP9RtXfyTRH2bsJXy2HQKBgQDhHgavxk/ln5mdMGGw\\nrQud2vF9n7UwFiysYxocIC5/CWD0GAhnawchjPypbW/7vKM5Z9zhW3eH1U9P13sa\\n2xHfrU5BZ16rxoBbKNpcr7VeEbUBAsDoGV24xjoecp7rB2hZ+mGik5/5Ig1Rk1KH\\ndcvYy2KSi1h4Sm+mXwimmA4VDQKBgQDdzW+5FPbdM2sUB2gLMQtn3ICjDSu6IQ+k\\nd0p3WlTIT51RUsPXXKkk96O5anUbeB3syY8tSKPGggsaXaeL3o09yIamtERgCnn3\\nd9IS+4VKPWQlFUICU1KrD+TO7IYIX04iXBuVE5ihv0q3mslhDotmX4kS38NtKEFF\\njLjA2RvAdwKBgAFkIxxw+Ett+hALnX7vAtRd5wIku4TpjisejanA1Si50RyRDXQ+\\nKBQf/+u4HmoK12Nibe4Cl7GCMvRGW59l3S1pr8MdtWsQVfi6Puc1usQzDdBMyQ5m\\nIbsjlnZbtPm02QM9Vd8gVGvAtx5a77aglrrnPtuy+r/7jccUbURCSkv9AoGAH9m3\\nWGmVRZBzqO2jWDATxjdY1ZE3nUPQHjrvG5KCKD2ehqYO72cj9uYEwcRyyp4GFhGf\\nmM4cjo3wEDowrBoqSBv6kgfC5dO7TfkL1qP9sPp93gFeeD0E2wGuRrSaTqt46eA2\\nKcMloNx6W0FD98cB55KCeY5eXtdwAA/EHBVRMeMCgYAd3n6PcL6rVXyE3+wRTKK4\\n+zvx5sjTAnljr5ttbEnpZafzrYIfDpB8NNjexy83AeC0O13LvSHIFoTwP8sywJRO\\nRxbPMjhEBdVZ5NxlxYer7yKN+h5OBJfrLswPku7y4vdFYK3x/lMuNQO61hb1VFHc\\nT2BDTbF0QSlPxFsv18B9zg==\\n-----END PRIVATE KEY-----\\n"
+
+# Private address (loopback): reaching it at all is the SSRF. Listed first -- it
+# fails fast on vulnerable code instead of hanging like a black-holed metadata
+# IP, and testdrive stops at the first failure.
+> CREATE SECRET gcp_ssrf_private AS '{"type":"service_account","project_id":"x","client_email":"a@b.com","private_key":"${gcp-ssrf-key}","token_uri":"http://127.0.0.1:1/token"}'
+! CREATE CONNECTION gcp_ssrf_private_conn TO GCP (SERVICE ACCOUNT KEY = SECRET gcp_ssrf_private) WITH (VALIDATE = true);
+contains:disallowed token_uri
+
+# Correct host, but plaintext HTTP -- the scheme must be HTTPS.
+> CREATE SECRET gcp_ssrf_scheme AS '{"type":"service_account","project_id":"x","client_email":"a@b.com","private_key":"${gcp-ssrf-key}","token_uri":"http://oauth2.googleapis.com/token"}'
+! CREATE CONNECTION gcp_ssrf_scheme_conn TO GCP (SERVICE ACCOUNT KEY = SECRET gcp_ssrf_scheme) WITH (VALIDATE = true);
+contains:disallowed token_uri
+
+# Public host that merely ends in the allowlisted suffix -- only a host allowlist,
+# not a private-IP check, rejects this.
+> CREATE SECRET gcp_ssrf_host AS '{"type":"service_account","project_id":"x","client_email":"a@b.com","private_key":"${gcp-ssrf-key}","token_uri":"https://oauth2.googleapis.com.evil.com/token"}'
+! CREATE CONNECTION gcp_ssrf_host_conn TO GCP (SERVICE ACCOUNT KEY = SECRET gcp_ssrf_host) WITH (VALIDATE = true);
+contains:disallowed token_uri
+
 $ postgres-execute connection=postgres://mz_system:materialize@${testdrive.materialize-internal-sql-addr}
 ALTER SYSTEM SET storage_enforce_external_addresses = false

@@ -148,3 +179,9 @@ ALTER SYSTEM SET storage_enforce_external_addresses = false
 # check.
 ! COPY INTO copy_ssrf_target FROM 'http://127.0.0.1:1/file.csv' (FORMAT CSV);
 contains:Connection refused
+
+# Unlike the resolve_address checks above, the GCP allowlist is unconditional --
+# it does not consult storage_enforce_external_addresses. With enforcement now
+# off, the malicious token_uri is still rejected.
+! CREATE CONNECTION gcp_ssrf_private_conn TO GCP (SERVICE ACCOUNT KEY = SECRET gcp_ssrf_private) WITH (VALIDATE = true);
+contains:disallowed token_uri

@ublubu ublubu force-pushed the kynan/iceberg-unused-storage branch from 08ff859 to 41cc423 Compare June 4, 2026 15:32
@ublubu ublubu force-pushed the kynan/iceberg-gcp-connection branch 2 times, most recently from 0a4293d to fb3cc5b Compare June 4, 2026 16:33
@ublubu ublubu force-pushed the kynan/iceberg-unused-storage branch from 41cc423 to a20f683 Compare June 4, 2026 16:33
@ublubu ublubu force-pushed the kynan/iceberg-gcp-connection branch 2 times, most recently from c92b606 to 8eee70d Compare June 4, 2026 18:35
@ublubu ublubu force-pushed the kynan/iceberg-unused-storage branch from a92b75b to 825ccff Compare June 4, 2026 18:35
@ublubu

ublubu commented Jun 4, 2026

Copy link
Copy Markdown
Contributor Author

@def- I made it a direct string equality check because Google says that token_uri should always be https://oauth2.googleapis.com/token

/// `gcp_auth` parses the Service Account Key JSON into a private type,
/// so we have to write our own deserializer to check whether it's safe.
#[derive(Deserialize)]
struct GcpStorageAccountKeyTokenUri {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be GcpServiceAccountKeyTokenUri?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol yes

}

impl GcpStorageAccountKeyTokenUri {
fn validate_json(json: &str) -> Result<(), anyhow::Error> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the intention is that we only ever allow https://oauth2.googleapis.com/token, should validate_json be run regardless of validate to avoid persisting a bad token URI?

@ublubu ublubu force-pushed the kynan/iceberg-gcp-connection branch from 240661f to 4bf1a89 Compare June 8, 2026 16:30

@martykulma martykulma left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦈

@ublubu ublubu force-pushed the kynan/iceberg-unused-storage branch from 825ccff to 3b52de4 Compare June 8, 2026 18:48
@ublubu ublubu force-pushed the kynan/iceberg-gcp-connection branch 2 times, most recently from 3e15e10 to a45821e Compare June 8, 2026 19:28
@ublubu ublubu force-pushed the kynan/iceberg-unused-storage branch from 3b52de4 to 046b00a Compare June 8, 2026 19:28

@def- def- left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SSRF guard is only for CREATE, not ALTER CONNECTION and ALTER SECRET

@ublubu ublubu force-pushed the kynan/iceberg-unused-storage branch 2 times, most recently from 5c4b86d to 67073cc Compare June 10, 2026 15:03
Base automatically changed from kynan/iceberg-unused-storage to main June 10, 2026 15:29
@ublubu ublubu force-pushed the kynan/iceberg-gcp-connection branch from a45821e to f9afd3e Compare June 10, 2026 15:44
@ublubu

ublubu commented Jun 10, 2026

Copy link
Copy Markdown
Contributor Author

I'm going to deal with the SSRF guard for ALTER CONNECTION and ALTER SECRET as a followup PR.

https://linear.app/materializeinc/issue/SS-188/guard-against-gcp-connection-ssrf-via-alter-secret-and-alter

resolve conflicts in ru (GCP Connection): GlueSchemaRegistry+Gcp, gcp_auth dep
ublubu added 2 commits June 10, 2026 12:03
resolve conflicts in nk (Dennis SSRF repro): keep AWS IP-encoding + GCP token_uri tests
@ublubu ublubu force-pushed the kynan/iceberg-gcp-connection branch from f9afd3e to bdcdec3 Compare June 10, 2026 16:03
@ublubu ublubu enabled auto-merge (squash) June 10, 2026 16:07
@ublubu ublubu merged commit b5d02e9 into main Jun 10, 2026
117 of 120 checks passed
@ublubu ublubu deleted the kynan/iceberg-gcp-connection branch June 10, 2026 16:36
ublubu added a commit that referenced this pull request Jun 11, 2026
…36997)

This PR adds a simple framework for checking the Secrets used by a
Connection when either the Secret or the Connection changes.

So far, it's just a hook to check GCP secrets for a bad `token_uri`
field.

This is a followup to #36694.

_Just for fun: [Opus vs
Fable](https://github.com/ublubu/materialize/compare/f1cc6f85bd0e70bfafc6aec843e54b64087fc762..ublubu:materialize:3eb3a022489401038eab4f84cbdff0a184ff2063?diff=split&w)
code comparison_
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants