[SS-69] Add a GCP Connection to hold Google service account creds#36694
Conversation
patrickwwbutler
left a comment
There was a problem hiding this comment.
LGTM with one teeny tiny non blocking nit
| _id: CatalogItemId, | ||
| storage_configuration: &StorageConfiguration, | ||
| ) -> Result<(), GcpConnectionValidationError> { | ||
| let json = storage_configuration |
There was a problem hiding this comment.
nit: this could prob have a more descriptive name than json but honestly the context is right there so it's not hard to read anyways
def-
left a comment
There was a problem hiding this comment.
With CREATE CONNECTION … TO GCP (SERVICE ACCOUNT KEY = SECRET …) the oauth token request target is taken directly from the json. Even with storage_enforce_external_addresses = true this allows accessing any http/https, even internal ones, which should be blocked! Small test to verify:
diff --git a/test/testdrive/connection-create-external.td b/test/testdrive/connection-create-external.td
index cbe04d2bea..7dc357bd38 100644
--- a/test/testdrive/connection-create-external.td
+++ b/test/testdrive/connection-create-external.td
@@ -139,6 +139,37 @@ contains:Address resolved to a private IP
) WITH (VALIDATE = true);
contains:Address resolved to a private IP
+# A GCP connection validates by POSTing to the `token_uri` from the
+# service-account-key JSON. gcp_auth issues that request verbatim, bypassing the
+# resolve_address / ENFORCE_EXTERNAL_ADDRESSES path the connections above use, so
+# validation must reject any token_uri that is not an HTTPS *.googleapis.com host
+# -- otherwise it is an SSRF primitive hidden inside an opaque secret.
+#
+# The private_key is a throwaway RSA key reused from the repo (src/ccsr/tests and
+# the trufflehog allowlist) -- never a real credential. It must parse so gcp_auth
+# reaches the token-fetch step that consumes token_uri. Backslashes are doubled:
+# standard_conforming_strings is on, so the text->bytea cast turns each "\\n"
+# into the literal "\n" the JSON needs.
+$ set gcp-ssrf-key="-----BEGIN PRIVATE KEY-----\\nMIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQDDC5MP3v1BHOgI\\n5SsmrW8mjxzQGOz0IlC5jp1muW/kpEoE9TG317TEnO5Uye6zZudkFCP8YGEiN3Mc\\nFbTM7eX6PjAPdnGU7khuUt/20ZM+NX5kWZPrmPTh4WQaDCL7ah1LqzBaUAMaSXq8\\niuy7LGJNF8wdx8L5BjDiGTTxZXOg0Haxknc7Mbiwc9z8eb7omvzQzsOwyqocrF2u\\nz86TzX1jtHP48i5CxoRHKxE94De3tNxjT/Y3OZlS4QS7iekAOQ04DVV3GIHvRUXN\\n2H8ayy4+yOdhHn6ER5Jn3lti1Q5XSrxkrYn7L1Vcj6IwZQhhF5vc+ovxOYb+8ert\\nEo97tIkLAgMBAAECggEAQteHHRPKz9Mzs8Sxvo4GPv0hnzFDl0DhUE4PJCKdtYoV\\n8dADq2DJiu3LAZS4cJPt7Y63bGitMRg2oyPPM8G9pD5Goy3wq9zjRqexKDlXUCTt\\n/T7zofRny7c94m1RWb7ablGq/vBXt90BqnajvVtvDsN+iKAqccQM4ZdI3QdrEmt1\\ncHex924itzG/mqbFTAfAmVj1ZsRnJp55Txy2gqq7jX00xDM8+H49SRvUu49N64LQ\\n6BUWCgWCJePRtgjSHjboAzPqSkMdaTE/WDY2zgGF3Qfq4f6JCHKfm4QylCH4gYUU\\n1Kf7ttmhu9NoZO+hczobKkxP9RtXfyTRH2bsJXy2HQKBgQDhHgavxk/ln5mdMGGw\\nrQud2vF9n7UwFiysYxocIC5/CWD0GAhnawchjPypbW/7vKM5Z9zhW3eH1U9P13sa\\n2xHfrU5BZ16rxoBbKNpcr7VeEbUBAsDoGV24xjoecp7rB2hZ+mGik5/5Ig1Rk1KH\\ndcvYy2KSi1h4Sm+mXwimmA4VDQKBgQDdzW+5FPbdM2sUB2gLMQtn3ICjDSu6IQ+k\\nd0p3WlTIT51RUsPXXKkk96O5anUbeB3syY8tSKPGggsaXaeL3o09yIamtERgCnn3\\nd9IS+4VKPWQlFUICU1KrD+TO7IYIX04iXBuVE5ihv0q3mslhDotmX4kS38NtKEFF\\njLjA2RvAdwKBgAFkIxxw+Ett+hALnX7vAtRd5wIku4TpjisejanA1Si50RyRDXQ+\\nKBQf/+u4HmoK12Nibe4Cl7GCMvRGW59l3S1pr8MdtWsQVfi6Puc1usQzDdBMyQ5m\\nIbsjlnZbtPm02QM9Vd8gVGvAtx5a77aglrrnPtuy+r/7jccUbURCSkv9AoGAH9m3\\nWGmVRZBzqO2jWDATxjdY1ZE3nUPQHjrvG5KCKD2ehqYO72cj9uYEwcRyyp4GFhGf\\nmM4cjo3wEDowrBoqSBv6kgfC5dO7TfkL1qP9sPp93gFeeD0E2wGuRrSaTqt46eA2\\nKcMloNx6W0FD98cB55KCeY5eXtdwAA/EHBVRMeMCgYAd3n6PcL6rVXyE3+wRTKK4\\n+zvx5sjTAnljr5ttbEnpZafzrYIfDpB8NNjexy83AeC0O13LvSHIFoTwP8sywJRO\\nRxbPMjhEBdVZ5NxlxYer7yKN+h5OBJfrLswPku7y4vdFYK3x/lMuNQO61hb1VFHc\\nT2BDTbF0QSlPxFsv18B9zg==\\n-----END PRIVATE KEY-----\\n"
+
+# Private address (loopback): reaching it at all is the SSRF. Listed first -- it
+# fails fast on vulnerable code instead of hanging like a black-holed metadata
+# IP, and testdrive stops at the first failure.
+> CREATE SECRET gcp_ssrf_private AS '{"type":"service_account","project_id":"x","client_email":"a@b.com","private_key":"${gcp-ssrf-key}","token_uri":"http://127.0.0.1:1/token"}'
+! CREATE CONNECTION gcp_ssrf_private_conn TO GCP (SERVICE ACCOUNT KEY = SECRET gcp_ssrf_private) WITH (VALIDATE = true);
+contains:disallowed token_uri
+
+# Correct host, but plaintext HTTP -- the scheme must be HTTPS.
+> CREATE SECRET gcp_ssrf_scheme AS '{"type":"service_account","project_id":"x","client_email":"a@b.com","private_key":"${gcp-ssrf-key}","token_uri":"http://oauth2.googleapis.com/token"}'
+! CREATE CONNECTION gcp_ssrf_scheme_conn TO GCP (SERVICE ACCOUNT KEY = SECRET gcp_ssrf_scheme) WITH (VALIDATE = true);
+contains:disallowed token_uri
+
+# Public host that merely ends in the allowlisted suffix -- only a host allowlist,
+# not a private-IP check, rejects this.
+> CREATE SECRET gcp_ssrf_host AS '{"type":"service_account","project_id":"x","client_email":"a@b.com","private_key":"${gcp-ssrf-key}","token_uri":"https://oauth2.googleapis.com.evil.com/token"}'
+! CREATE CONNECTION gcp_ssrf_host_conn TO GCP (SERVICE ACCOUNT KEY = SECRET gcp_ssrf_host) WITH (VALIDATE = true);
+contains:disallowed token_uri
+
$ postgres-execute connection=postgres://mz_system:materialize@${testdrive.materialize-internal-sql-addr}
ALTER SYSTEM SET storage_enforce_external_addresses = false
@@ -148,3 +179,9 @@ ALTER SYSTEM SET storage_enforce_external_addresses = false
# check.
! COPY INTO copy_ssrf_target FROM 'http://127.0.0.1:1/file.csv' (FORMAT CSV);
contains:Connection refused
+
+# Unlike the resolve_address checks above, the GCP allowlist is unconditional --
+# it does not consult storage_enforce_external_addresses. With enforcement now
+# off, the malicious token_uri is still rejected.
+! CREATE CONNECTION gcp_ssrf_private_conn TO GCP (SERVICE ACCOUNT KEY = SECRET gcp_ssrf_private) WITH (VALIDATE = true);
+contains:disallowed token_uri08ff859 to
41cc423
Compare
0a4293d to
fb3cc5b
Compare
41cc423 to
a20f683
Compare
c92b606 to
8eee70d
Compare
a92b75b to
825ccff
Compare
|
@def- I made it a direct string equality check because Google says that |
| /// `gcp_auth` parses the Service Account Key JSON into a private type, | ||
| /// so we have to write our own deserializer to check whether it's safe. | ||
| #[derive(Deserialize)] | ||
| struct GcpStorageAccountKeyTokenUri { |
There was a problem hiding this comment.
Should this be GcpServiceAccountKeyTokenUri?
| } | ||
|
|
||
| impl GcpStorageAccountKeyTokenUri { | ||
| fn validate_json(json: &str) -> Result<(), anyhow::Error> { |
There was a problem hiding this comment.
If the intention is that we only ever allow https://oauth2.googleapis.com/token, should validate_json be run regardless of validate to avoid persisting a bad token URI?
240661f to
4bf1a89
Compare
825ccff to
3b52de4
Compare
3e15e10 to
a45821e
Compare
3b52de4 to
046b00a
Compare
def-
left a comment
There was a problem hiding this comment.
The SSRF guard is only for CREATE, not ALTER CONNECTION and ALTER SECRET
5c4b86d to
67073cc
Compare
a45821e to
f9afd3e
Compare
|
I'm going to deal with the SSRF guard for |
resolve conflicts in ru (GCP Connection): GlueSchemaRegistry+Gcp, gcp_auth dep
resolve conflicts in nk (Dennis SSRF repro): keep AWS IP-encoding + GCP token_uri tests
f9afd3e to
bdcdec3
Compare
…36997) This PR adds a simple framework for checking the Secrets used by a Connection when either the Secret or the Connection changes. So far, it's just a hook to check GCP secrets for a bad `token_uri` field. This is a followup to #36694. _Just for fun: [Opus vs Fable](https://github.com/ublubu/materialize/compare/f1cc6f85bd0e70bfafc6aec843e54b64087fc762..ublubu:materialize:3eb3a022489401038eab4f84cbdff0a184ff2063?diff=split&w) code comparison_
This PR brings on
gcp_authas a dependency.stacked on #36693