feat(google_cloud_storage): support bucket-level IAM service accounts #2835

Open
Kota-Maeda wants to merge 1 commit into langgenius:main from Kota-Maeda:feat/gcs-bucket-level-iam

Conversation

@Kota-Maeda
Contributor

Related Issues or Context

Fixes #2834

The Google Cloud Storage datasource plugin currently requires project-level storage.buckets.list permission during credential validation and file browsing. This makes it impossible to use service accounts with only bucket-level IAM permissions (e.g. roles/storage.objectViewer on specific buckets), which is a common setup following the principle of least privilege.

This PR contains Changes to Non-LLM Models Plugin

  • I have Run Comprehensive Tests Relevant to My Changes

What changed

  1. provider/google_cloud_storage.yaml — Added optional bucket_names text-input credential field (comma-separated) that allows users to specify which buckets the service account can access.

  2. provider/google_cloud_storage.py — When bucket_names is set, credential validation uses list_blobs(bucket, max_results=1) per bucket instead of list_buckets(). This only requires storage.objects.list permission. Failed buckets are reported individually with clear error messages.

  3. datasources/google_cloud_storage.py — When bucket_names is set and no bucket is selected, returns the configured buckets as the bucket list instead of calling list_buckets().
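
As a rough illustration of the validation path described in items 1 and 2, the sketch below parses the comma-separated field and probes each bucket. It is not the code from this PR: `parse_bucket_names` and `find_inaccessible_buckets` are hypothetical helper names, and `client` is assumed to be any object that exposes `list_blobs(bucket_name, max_results=...)` in the way `google.cloud.storage.Client` does.

```python
from typing import Any, List, Optional


def parse_bucket_names(raw: Optional[str]) -> List[str]:
    """Split a comma-separated credential value into clean bucket names."""
    if not raw:
        return []
    return [name.strip() for name in raw.split(",") if name.strip()]


def find_inaccessible_buckets(client: Any, bucket_names: List[str]) -> List[str]:
    """Return one 'bucket: error' message per bucket that cannot be listed."""
    failed = []
    for name in bucket_names:
        try:
            # list_blobs returns a lazy iterator, so pull one item to force
            # the request; this needs only storage.objects.list on the bucket.
            next(iter(client.list_blobs(name, max_results=1)), None)
        except Exception as exc:  # the real client raises google.api_core exceptions
            failed.append(f"{name}: {exc}")
    return failed
```

Collecting failures in a list rather than raising on the first one is what allows every inaccessible bucket to be reported in a single validation error.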

Backward compatibility

When bucket_names is left empty, the plugin behaves exactly as before (calls list_buckets() for validation and browsing). No existing functionality is affected.
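
The fallback can be pictured with a minimal sketch, assuming the configured names have already been parsed into a list. `browsable_bucket_names` is a hypothetical name rather than a function from this PR, and `client` stands for any object with a `list_buckets()` method whose items have a `name` attribute, as `google.cloud.storage.Client` provides.

```python
from typing import Any, List


def browsable_bucket_names(client: Any, configured: List[str]) -> List[str]:
    """Return the bucket names to show when no bucket is selected."""
    if configured:
        # Bucket-level IAM: use the configured names and never call list_buckets().
        return list(configured)
    # Legacy path: requires project-level storage.buckets.list.
    return [bucket.name for bucket in client.list_buckets()]
```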

Test cases verified

| # | Condition | Result |
|---|-----------|--------|
| 1 | bucket_names with single bucket + bucket-level IAM SA | ✅ Validation succeeds |
| 2 | bucket_names with multiple comma-separated buckets | ✅ Validation succeeds |
| 3 | Browse files with no bucket selected | ✅ Configured buckets shown as list |
| 4 | Select a bucket and browse | ✅ Files/folders displayed correctly |
| 5 | Download a file from bucket | ✅ Downloads successfully |
| 6 | bucket_names with unauthorized bucket | ✅ Validation fails with clear per-bucket error |
| 7 | bucket_names empty + project-level IAM SA | ✅ All buckets listed (backward compatible) |

Version Control (Any Changes to the Plugin Will Require Bumping the Version)

  • I have Bumped Up the Version in Manifest.yaml (Top-Level Version Field, Not in Meta Section)

Version bumped from 0.2.8 to 0.2.9 (PATCH: backward-compatible feature addition)

Dify Plugin SDK Version

  • I have Ensured dify_plugin>=0.3.0,<0.6.0 is in requirements.txt (SDK docs)

Environment Verification (If Any Code Changes)

Local Deployment Environment

  • Dify Version is: 1.9.0, I have Tested My Changes on Local Deployment Dify with a Clean Environment That Matches the Production Configuration.

Add optional `bucket_names` credential field to allow service accounts
with only bucket-level IAM permissions (e.g. Storage Object Viewer on
specific buckets) to work without requiring project-level
storage.buckets.list permission.

Changes:
- Add `bucket_names` text-input field to provider credential schema
- Validate credentials using list_blobs instead of list_buckets when
  bucket_names is specified, with per-bucket error reporting
- Use configured bucket names for browse file listing instead of
  calling list_buckets when bucket_names is set
- Maintain full backward compatibility when bucket_names is not set
- Bump version to 0.2.9
@dosubot (bot) added the size:M label (This PR changes 30-99 lines, ignoring generated files.) on Apr 6, 2026
@Kota-Maeda Kota-Maeda deployed to datasources/google_cloud_storage April 6, 2026 08:12 — with GitHub Actions Active
@dosubot (bot) added the enhancement label (New feature or request) on Apr 6, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds a bucket_names configuration option to the Google Cloud Storage datasource, enabling access to specific buckets when project-level permissions are restricted. The implementation includes updates to file browsing, credential validation, and the provider schema. Feedback indicates that Chinese translations should be added for the new configuration fields to maintain consistency with the rest of the project. Additionally, the credential validation logic needs adjustment because list_buckets() is a lazy iterator and does not verify connectivity unless an item is explicitly fetched.

Comment on lines +29 to +40
- name: bucket_names
  type: text-input
  required: false
  label:
    en_US: Bucket Names (optional, comma-separated)
    ja_JP: バケット名(任意、カンマ区切りで複数指定可)
  description:
    en_US: "Restrict access to specific GCS buckets. Enter one or more bucket names separated by commas. This setting is REQUIRED when the service account only has bucket-level IAM permissions (e.g. 'Storage Object Viewer' granted on individual buckets). Without this, the plugin tries to list all project buckets, which fails with 403 if the service account lacks the project-level 'storage.buckets.list' permission. Leave empty only if the service account has project-level storage permissions."
    ja_JP: "アクセス対象のGCSバケットを指定します。複数のバケットはカンマ(,)で区切って入力してください。サービスアカウントに特定バケットへのバケットレベルIAM権限(例: 「Storage Object Viewer」)のみ付与されている場合、この設定は必須です。未設定の場合、プロジェクト内の全バケット一覧を取得しようとするため、プロジェクトレベルの「storage.buckets.list」権限がないと403エラーになります。プロジェクトレベルのストレージ権限がある場合のみ空欄にしてください。"
  placeholder:
    en_US: "my-bucket-1, my-bucket-2, my-bucket-3"
    ja_JP: "バケット名1, バケット名2, バケット名3"

medium

The new bucket_names field is missing zh_Hans translations, which are consistently provided for all other fields in this file (e.g., lines 6, 9, 15, 24, 27). To maintain consistency with the rest of the plugin and support the project's primary user base, please add zh_Hans translations for the label, description, and placeholder keys.

        "Failed to access the following bucket(s):\n" + "\n".join(failed_buckets)
    )
else:
    google_client.list_buckets()

medium

google_client.list_buckets() returns a lazy iterator and does not perform a network request upon invocation. Consequently, this line does not actually validate the credentials when bucket_names is not provided. To ensure validation occurs, you should attempt to fetch at least one item from the iterator.

Suggested change
- google_client.list_buckets()
+ next(google_client.list_buckets(max_results=1), None)
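
The laziness the reviewer describes can be shown with a small stand-in. The sketch below does not use the Google client at all: `lazy_list_buckets` is a hypothetical generator that records a simulated request, which is enough to show that creating the iterator does no work until an item is fetched.

```python
from typing import Iterator, List

requests_made: List[str] = []


def lazy_list_buckets() -> Iterator[str]:
    """Stand-in for a paged listing call: the 'request' runs on first next()."""
    requests_made.append("GET /b")  # simulated network call
    yield "bucket-a"


listing = lazy_list_buckets()   # nothing has been requested yet
calls_before = len(requests_made)

first = next(listing, None)     # forces the first (simulated) request
calls_after = len(requests_made)

print(calls_before, calls_after, first)  # prints "0 1 bucket-a"
```

Passing `None` as the default to `next` keeps an empty listing from raising `StopIteration`, so an account with zero buckets would still validate as long as the request itself succeeds.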

Development

Successfully merging this pull request may close these issues.

Google Cloud Storage plugin fails with bucket-level IAM service accounts (403 on list_buckets)