Port: client-supplied S3 credentials (s3_key, s3_secret, s3_endpoint, s3_scope) to query tool #7

@boettiger-lab-llm-agent

Summary

mcp-data-server added support for client-supplied S3 credentials on the query tool in #20, followed by a scoping fix in b67b168. This lets clients query private S3 data without storing credentials in the server config.

Changes needed

The GPU server's query tool currently accepts only sql_query. It needs to accept the optional credential parameters as well:

from typing import Optional
def query(sql_query: str, s3_key: Optional[str] = None, s3_secret: Optional[str] = None,
          s3_endpoint: Optional[str] = None, s3_scope: Optional[str] = None) -> str:
    ...

In the DuckDB version, this injects credentials via CREATE OR REPLACE SECRET in an isolated per-request connection. For the GPU/Polars version, the equivalent is to inject them into storage_options: a per-request dict that overrides S3_STORAGE_OPTIONS at query time, is scoped to the request, and is never persisted.
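A minimal sketch of the per-request override, assuming S3_STORAGE_OPTIONS is a plain dict of server defaults. The helper name build_storage_options is hypothetical, and the key names (aws_access_key_id, aws_secret_access_key, aws_endpoint_url) are the object_store-style keys that Polars is assumed to accept here; adjust them to whatever the GPU server already passes.

```python
from typing import Optional

# Hypothetical stand-in for the server-level defaults named in this issue.
S3_STORAGE_OPTIONS: dict[str, str] = {"aws_region": "us-east-1"}


def build_storage_options(
    s3_key: Optional[str] = None,
    s3_secret: Optional[str] = None,
    s3_endpoint: Optional[str] = None,
    base: Optional[dict[str, str]] = None,
) -> dict[str, str]:
    """Return a fresh per-request dict; the shared defaults are never mutated."""
    # Copy first so client credentials cannot leak into the module-level dict.
    options = dict(S3_STORAGE_OPTIONS if base is None else base)
    overrides = {
        "aws_access_key_id": s3_key,
        "aws_secret_access_key": s3_secret,
        "aws_endpoint_url": s3_endpoint,
    }
    # Only parameters the client actually supplied override the defaults.
    options.update({k: v for k, v in overrides.items() if v is not None})
    return options
```

Because the function always returns a copy, dropping the returned dict at the end of the request is enough to discard the credentials.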

Design notes

  • The query_engine.execute() function would need to accept an optional storage_options override
  • The s3_scope parameter is used when a query mixes private and public S3 paths: credentials should apply only to paths matching the scope prefix
  • Credentials must never appear in logs
  • The connection/storage_options must be discarded after each request
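The scope rule above could be sketched as a per-path selection between the request's options and the server defaults. The function name options_for_path is hypothetical, and treating a missing s3_scope as "apply the credentials to every path" is an assumption, not something this issue specifies.

```python
from typing import Optional


def options_for_path(
    path: str,
    default_options: dict[str, str],
    request_options: Optional[dict[str, str]] = None,
    s3_scope: Optional[str] = None,
) -> dict[str, str]:
    """Pick the storage options to use when reading a single S3 path."""
    if request_options is None:
        # No client credentials on this request: use the server defaults.
        return dict(default_options)
    if s3_scope is None or path.startswith(s3_scope):
        # Assumption: with no scope given, the credentials apply everywhere.
        return dict(request_options)
    # Path is outside the scope prefix: do not send the client credentials.
    return dict(default_options)
```

This is a plain string-prefix match, so a scope of s3://bucket/private also matches s3://bucket/private-other; a scope with a trailing slash avoids that.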

Security properties to preserve

| Property | Requirement |
| --- | --- |
| Request isolation | Credentials apply only to the current query |
| No logging | s3_key and s3_secret must not appear in stderr/logs |
| Scope enforcement | s3_scope prefix limits which paths use the credential |
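One way to back the no-logging property is a logging filter that masks the request's secret values before any handler writes them. This is a sketch using only the standard library; the class name RedactSecrets and the logger name query_demo are hypothetical, and the server's actual logging setup is not described in this issue.

```python
import io
import logging


class RedactSecrets(logging.Filter):
    """Replace known secret values in log messages with a placeholder."""

    def __init__(self, secrets):
        super().__init__()
        # Ignore None/empty values so unset parameters are harmless.
        self._secrets = [s for s in secrets if s]

    def filter(self, record):
        # Render the message first so secrets passed as %-args are caught too.
        message = record.getMessage()
        for secret in self._secrets:
            message = message.replace(secret, "[REDACTED]")
        record.msg, record.args = message, None
        return True  # keep the record, now with secrets masked


# Usage: attach the filter to the handler for the duration of one request.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactSecrets(["AKIAEXAMPLE", "topsecret"]))
logger = logging.getLogger("query_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False
logger.info("connecting with key=%s secret=%s", "AKIAEXAMPLE", "topsecret")
```

The filter only rewrites the message text; exception tracebacks attached to a record are not covered, so errors raised while credentials are in scope would still need separate handling.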

Reference implementation

See get_isolated_db in mcp-data-server's server.py for the DuckDB pattern.
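For orientation only, the kind of statement the DuckDB pattern relies on can be sketched as below. This is not the actual get_isolated_db code: the helper name create_secret_sql and the secret name request_secret are hypothetical, and the sketch only builds the SQL string that a fresh per-request connection would then execute.

```python
from typing import Optional


def _quote(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def create_secret_sql(
    s3_key: str,
    s3_secret: str,
    s3_endpoint: Optional[str] = None,
    s3_scope: Optional[str] = None,
    name: str = "request_secret",  # hypothetical secret name
) -> str:
    """Build a DuckDB CREATE OR REPLACE SECRET statement for one request."""
    fields = ["TYPE S3", f"KEY_ID {_quote(s3_key)}", f"SECRET {_quote(s3_secret)}"]
    if s3_endpoint:
        fields.append(f"ENDPOINT {_quote(s3_endpoint)}")
    if s3_scope:
        # SCOPE restricts the secret to paths under this prefix.
        fields.append(f"SCOPE {_quote(s3_scope)}")
    return f"CREATE OR REPLACE SECRET {name} ({', '.join(fields)})"
```

The resulting string contains the secret in clear text, so it should be executed directly and never logged.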
