feat(google_cloud_storage): support bucket-level IAM service accounts #2835
Kota-Maeda wants to merge 1 commit into langgenius:main
Add optional `bucket_names` credential field to allow service accounts with only bucket-level IAM permissions (e.g. Storage Object Viewer on specific buckets) to work without requiring project-level `storage.buckets.list` permission.

Changes:
- Add `bucket_names` text-input field to provider credential schema
- Validate credentials using `list_blobs` instead of `list_buckets` when `bucket_names` is specified, with per-bucket error reporting
- Use configured bucket names for browse file listing instead of calling `list_buckets` when `bucket_names` is set
- Maintain full backward compatibility when `bucket_names` is not set
- Bump version to 0.2.9
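As a rough illustration of the field handling and browse fallback described in this commit message, the sketch below parses the comma-separated value and only falls back to a project-wide listing when nothing is configured. The helper names `parse_bucket_names` and `buckets_to_browse` are hypothetical, and the de-duplication step is an assumption; neither is taken from the PR's actual code.

```python
from typing import Callable, Iterable, List, Optional


def parse_bucket_names(raw: Optional[str]) -> List[str]:
    """Split a comma-separated credential value into clean bucket names."""
    if not raw:
        return []
    names: List[str] = []
    for part in raw.split(","):
        name = part.strip()
        # Assumption: blanks and duplicates are dropped, input order is kept.
        if name and name not in names:
            names.append(name)
    return names


def buckets_to_browse(
    raw: Optional[str], list_all: Callable[[], Iterable[str]]
) -> List[str]:
    """Return the configured buckets if any, else list the whole project."""
    configured = parse_bucket_names(raw)
    if configured:
        # Bucket-level path: storage.buckets.list is never needed.
        return configured
    # Backward-compatible path: requires project-level permission.
    return list(list_all())
```

Passing the project-wide listing in as a callable keeps the fallback lazy, so a service account without `storage.buckets.list` never triggers it when `bucket_names` is set.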
Code Review
This pull request adds a bucket_names configuration option to the Google Cloud Storage datasource, enabling access to specific buckets when project-level permissions are restricted. The implementation includes updates to file browsing, credential validation, and the provider schema. Feedback indicates that Chinese translations should be added for the new configuration fields to maintain consistency with the rest of the project. Additionally, the credential validation logic needs adjustment because list_buckets() is a lazy iterator and does not verify connectivity unless an item is explicitly fetched.
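The validation behavior summarized above could be sketched along the following lines. This is not the PR's code: `validate_access` and `FakeClient` are hypothetical names, the fake client stands in for `google.cloud.storage.Client` so the example runs offline, and catching a bare `Exception` is a simplification of whatever error types the plugin actually handles.

```python
from typing import Any, List


def validate_access(client: Any, bucket_names: List[str]) -> None:
    """Raise ValueError if the credentials cannot reach the expected buckets."""
    if not bucket_names:
        # Project-level path: pull one item so the lazy iterator issues a request.
        next(iter(client.list_buckets(max_results=1)), None)
        return

    failed: List[str] = []
    for name in bucket_names:
        try:
            # Fetching one object is enough to exercise storage.objects.list.
            next(iter(client.list_blobs(name, max_results=1)), None)
        except Exception as exc:  # simplification: real code would be narrower
            failed.append(f"  - {name}: {exc}")
    if failed:
        raise ValueError(
            "Failed to access the following bucket(s):\n" + "\n".join(failed)
        )


class FakeClient:
    """Hypothetical in-memory stand-in so the sketch runs without network access."""

    def __init__(self, readable: set):
        self.readable = readable

    def list_blobs(self, bucket_name, max_results=None):
        # Written as a generator so the error only surfaces on iteration.
        if bucket_name not in self.readable:
            raise PermissionError("403 forbidden")
        yield "object-1"

    def list_buckets(self, max_results=None):
        yield from sorted(self.readable)
```

Collecting failures instead of stopping at the first one is what allows every inaccessible bucket to be reported in a single error message.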
    - name: bucket_names
      type: text-input
      required: false
      label:
        en_US: Bucket Names (optional, comma-separated)
        ja_JP: バケット名（任意、カンマ区切りで複数指定可）
      description:
        en_US: "Restrict access to specific GCS buckets. Enter one or more bucket names separated by commas. This setting is REQUIRED when the service account only has bucket-level IAM permissions (e.g. 'Storage Object Viewer' granted on individual buckets). Without this, the plugin tries to list all project buckets, which fails with 403 if the service account lacks the project-level 'storage.buckets.list' permission. Leave empty only if the service account has project-level storage permissions."
        ja_JP: "アクセス対象のGCSバケットを指定します。複数のバケットはカンマ（,）で区切って入力してください。サービスアカウントに特定バケットへのバケットレベルIAM権限（例: 「Storage Object Viewer」）のみ付与されている場合、この設定は必須です。未設定の場合、プロジェクト内の全バケット一覧を取得しようとするため、プロジェクトレベルの「storage.buckets.list」権限がないと403エラーになります。プロジェクトレベルのストレージ権限がある場合のみ空欄にしてください。"
      placeholder:
        en_US: "my-bucket-1, my-bucket-2, my-bucket-3"
        ja_JP: "バケット名1, バケット名2, バケット名3"
The new bucket_names field is missing zh_Hans translations, which are consistently provided for all other fields in this file (e.g., lines 6, 9, 15, 24, 27). To maintain consistency with the rest of the plugin and support the project's primary user base, please add zh_Hans translations for the label, description, and placeholder keys.
                    "Failed to access the following bucket(s):\n" + "\n".join(failed_buckets)
                )
            else:
                google_client.list_buckets()
google_client.list_buckets() returns a lazy iterator and does not perform a network request upon invocation. Consequently, this line does not actually validate the credentials when bucket_names is not provided. To ensure validation occurs, you should attempt to fetch at least one item from the iterator.
Suggested change:
    - google_client.list_buckets()
    + next(google_client.list_buckets(max_results=1), None)
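The lazy-iterator point raised in this review comment can be illustrated without any network access. The stub below is a plain Python generator, not the real client iterator, and `list_buckets_stub` is a hypothetical name. It only mimics the relevant behavior: creating the iterator does no work, and the first `next()` call is what triggers the "request".

```python
from typing import Iterator, List

calls: List[str] = []


def list_buckets_stub(max_results: int = 1) -> Iterator[str]:
    """Hypothetical stand-in: the 'request' happens only once iteration starts."""
    calls.append("request sent")
    yield "bucket-a"


lazy = list_buckets_stub()    # creating the iterator runs none of the body
calls_before = list(calls)    # still empty at this point
first = next(lazy, None)      # the first fetch triggers the "request"
calls_after = list(calls)     # now records exactly one request
```

With the real client, invalid credentials would therefore be expected to surface at the `next(...)` call, not when `list_buckets()` is called.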
Related Issues or Context
Fixes #2834
The Google Cloud Storage datasource plugin currently requires the project-level `storage.buckets.list` permission during credential validation and file browsing. This makes it impossible to use service accounts with only bucket-level IAM permissions (e.g. `roles/storage.objectViewer` on specific buckets), which is a common setup following the principle of least privilege.

This PR contains Changes to Non-LLM Models Plugin

What changed
- `provider/google_cloud_storage.yaml`: Added optional `bucket_names` text-input credential field (comma-separated) that allows users to specify which buckets the service account can access.
- `provider/google_cloud_storage.py`: When `bucket_names` is set, credential validation uses `list_blobs(bucket, max_results=1)` per bucket instead of `list_buckets()`. This only requires the `storage.objects.list` permission. Failed buckets are reported individually with clear error messages.
- `datasources/google_cloud_storage.py`: When `bucket_names` is set and no bucket is selected, returns the configured buckets as the bucket list instead of calling `list_buckets()`.

Backward compatibility
When `bucket_names` is left empty, the plugin behaves exactly as before (calls `list_buckets()` for validation and browsing). No existing functionality is affected.

Test cases verified
- `bucket_names` with single bucket + bucket-level IAM SA
- `bucket_names` with multiple comma-separated buckets
- `bucket_names` with unauthorized bucket
- `bucket_names` empty + project-level IAM SA

Version Control (Any Changes to the Plugin Will Require Bumping the Version)
- `Version` field (not in Meta section): bumped from 0.2.8 to 0.2.9 (PATCH: backward-compatible feature addition)

Dify Plugin SDK Version
- `dify_plugin>=0.3.0,<0.6.0` is in requirements.txt (SDK docs)

Environment Verification (If Any Code Changes)
- Local Deployment Environment