
feat(cli): add agent-first Airbyte CLI for Cloud operations#1010

Draft
Aaron ("AJ") Steers (aaronsteers) wants to merge 11 commits into main from devin/1775171846-airbyte-cli

Conversation

@aaronsteers
Member

@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented Apr 2, 2026

Summary

Adds a new CLI entry point (airbyte / uvx airbyte) designed for agent and programmatic consumption of Airbyte Cloud. Two new files:

  • airbyte/cli/_cli_auth.py — Credential resolution with cascading fallbacks: CLI flags → short env vars (AIRBYTE_CLIENT_ID) → long env vars (AIRBYTE_CLOUD_CLIENT_ID) → ~/.airbyte/credentials YAML file → error.
  • airbyte/cli/cloud_cli.py — Click-based CLI with commands for workspaces, sources, destinations, connections, and jobs. All output is structured JSON. Every command supports --describe for schema discovery.

The CLI is a thin wrapper over the existing airbyte._util.api_util module (no business logic in the CLI layer). A new [project.scripts] entry point is registered in pyproject.toml.
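The fallback cascade described above could be sketched roughly as follows. This is a minimal illustration, not the code in _cli_auth.py: the function name `resolve_setting`, the `client_id` file key, the use of `ValueError` in place of the project's own error type, and passing the parsed credentials file in as a plain dict are all assumptions made for the example. Only the two environment variable names come from the description.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def resolve_setting(
    explicit: str | None,
    env_var_names: list[str],
    file_values: Mapping[str, str],
    file_key: str,
    environ: Mapping[str, str] | None = None,
) -> str:
    """Return the first non-empty value in precedence order, or raise."""
    env = os.environ if environ is None else environ
    # 1. An explicit CLI flag always wins.
    if explicit:
        return explicit
    # 2. Environment variables, checked in the order given (short name first).
    for name in env_var_names:
        value = env.get(name)
        if value:
            return value
    # 3. The parsed credentials file (hypothetical key name).
    value = file_values.get(file_key)
    if value:
        return value
    # 4. Nothing matched: fail loudly instead of returning an empty value.
    raise ValueError(
        f"Could not resolve '{file_key}' from flags, env vars, or credentials file."
    )


# Short name first, then the long name, as listed in the summary.
CLIENT_ID_ENV_VARS = ["AIRBYTE_CLIENT_ID", "AIRBYTE_CLOUD_CLIENT_ID"]
```

Passing `environ` explicitly keeps the sketch easy to exercise without touching the real process environment.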

Local Python Requirements

None. This works even if no Python is installed. The only prerequisite is uv, which you can install with brew install uv or similar.

Local Pre-Merge CLI Testing

You can test the CLI directly from this branch without cloning:

Basic testing with no pre-install:

REPO_REF=git+https://github.com/airbytehq/PyAirbyte.git@devin/1775171846-airbyte-cli

# Show top-level help
uvx --from "airbyte @ $REPO_REF" airbyte --help

Testing with pre-install:

REPO_REF=git+https://github.com/airbytehq/PyAirbyte.git@devin/1775171846-airbyte-cli
uv tool install $REPO_REF

# Show top-level help
airbyte --help

These examples assume no pre-install. You can skip the --from clause if you have already pre-installed.

# Schema discovery
uvx --from "airbyte @ $REPO_REF" airbyte workspaces list --describe
uvx --from "airbyte @ $REPO_REF" airbyte sources list --describe
uvx --from "airbyte @ $REPO_REF" airbyte connections create --describe

# With credentials (via env vars or ~/.airbyte/credentials)
export AIRBYTE_CLIENT_ID="your-client-id"
export AIRBYTE_CLIENT_SECRET="your-client-secret"
uvx --from "airbyte @ $REPO_REF" airbyte workspaces list --workspace-id <id>
uvx --from "airbyte @ $REPO_REF" airbyte sources list --workspace-id <id>

# Or with inline flags
uvx --from "airbyte @ $REPO_REF" airbyte --client-id "..." --client-secret "..." workspaces list --workspace-id <id>

Review & Testing Checklist for Human

  • Verify api_util function signatures match the call sites. This is the highest-risk area — the CLI passes positional args to run_connection(workspace_id, connection_id, ...), delete_connection(connection_id, ...), and get_job_logs(workspace_id, connection_id, limit, ...). These were read from source but never executed. A mismatch here will only surface at runtime.
  • workspace_id is unconditionally resolved in workspaces_list. The --describe output advertises workspace_id as optional, but the handler always calls resolve_workspace_id() which errors when no workspace is configured. The "list all accessible workspaces" path is unreachable.
  • sources get, destinations get, jobs get unnecessarily require workspace_id. These handlers call _get_auth_context() and destructure workspace_id only to discard it (_ = workspace_id). The downstream API calls don't use it. These commands will fail without a default workspace configured even though credentials alone are sufficient.
  • Click parse/usage errors bypass the JSON error contract. main() invokes cli() with default standalone_mode=True, so unknown flags, missing required options, and Abort produce human-readable stderr + SystemExit instead of structured JSON. The except SystemExit: raise re-raises these unmodified.
  • connections create --describe advertises stream_configurations but the handler ignores it. Only selected_stream_names and prefix are extracted from --json and forwarded to api_util.create_connection.

Suggested test plan: Run the uvx --from commands above to verify --describe works without credentials. Then test at least workspaces list, sources list, sources get, and connections sync with real credentials against a dev workspace. Verify that error output is JSON when credentials are missing or invalid.
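For the last step of the test plan, a small helper along these lines could check captured stderr. It is only a sketch: the helper name is made up, and it checks nothing more than "parses as a JSON object", because the exact shape of the error payload is not spelled out in this description.

```python
import json


def is_json_object(text: str) -> bool:
    """Return True if `text` parses as a single JSON object."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Human-readable usage text, tracebacks, and empty output land here.
        return False
    return isinstance(parsed, dict)
```

One way to use it would be to run the CLI with `subprocess.run([...], capture_output=True, text=True)` and pass `result.stderr` to the helper.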

Notes

  • --describe exits via sys.exit(0) before credential resolution, so it works without auth configured — useful for agents doing discovery.
  • _json_output uses default=str as a serialization fallback, which silently stringifies unexpected types rather than failing.
  • _cli_auth.py reimplements credential resolution rather than delegating to the existing cloud/auth.py resolve_cloud_client_id / resolve_cloud_client_secret functions. This is intentional to support the CLI's additional resolution sources (short env vars, credentials file) but could be consolidated later.
  • Delete commands support --force to bypass the safe_mode naming guard. Without --force, safe_mode=True is the default.
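To make the `default=str` note concrete, here is a small standalone illustration of what that fallback does in the standard `json` module. The payload is invented for the example and is unrelated to the CLI's real response models.

```python
import json
from datetime import date
from pathlib import PurePosixPath

payload = {"created": date(2026, 4, 2), "path": PurePosixPath("/tmp/x"), "count": 3}

# With default=str, values json cannot encode natively are passed through str(),
# so the date and the path come out as plain strings with no warning.
lenient = json.dumps(payload, default=str)

# Without a fallback, the same payload raises TypeError instead.
try:
    json.dumps(payload)
    strict_failed = False
except TypeError:
    strict_failed = True
```

The trade-off is that a consumer cannot tell a stringified object from a field that was a string all along.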

Link to Devin session: https://app.devin.ai/sessions/95a68701a83c4b458be703ee54c0b4f4
Requested by: Aaron ("AJ") Steers (@aaronsteers)

Summary by CodeRabbit

  • New Features
    • Introduced new airbyte command-line tool for Airbyte Cloud resource management
    • Supports managing workspaces, sources, destinations, connections, and job monitoring
    • JSON output format for machine-readable results and tool integration
    • Multiple authentication methods: CLI flags, environment variables, or local credentials file

Implements a new CLI invokable as 'uvx airbyte ...' or 'airbyte' when installed.

Commands:
- airbyte workspaces list/get
- airbyte sources list/get/create/delete
- airbyte destinations list/get/create/delete
- airbyte connections list/get/create/delete/sync
- airbyte jobs list/get

Features:
- Structured JSON output for agent consumption
- --describe flag for schema discovery
- Credential resolution: env vars -> ~/.airbyte/credentials file
- Thin wrappers over existing api_util core module
@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions

github-actions bot commented Apr 2, 2026

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1775171846-airbyte-cli' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1775171846-airbyte-cli'

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /uv-lock - Updates uv.lock file
  • /test-pr - Runs tests with the updated PyAirbyte
  • /prerelease - Builds and publishes a prerelease version to PyPI
📚 Show Repo Guidance

Helpful Resources

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.


@coderabbitai
Contributor

coderabbitai bot commented Apr 2, 2026

Warning

Rate limit exceeded

@devin-ai-integration[bot] has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 9 minutes and 52 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 9 minutes and 52 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c8d32d8b-7d7b-4ad4-a4f4-e9f220a33540

📥 Commits

Reviewing files that changed from the base of the PR and between afd3645 and f090fa8.

📒 Files selected for processing (1)
  • airbyte/cli/cloud_cli.py
📝 Walkthrough

Walkthrough

Adds a new Click-based Airbyte Cloud CLI with JSON-first outputs and structured error handling, a new authentication resolver module that sources credentials from explicit args, environment variables, or ~/.airbyte/credentials YAML, and registers a new airbyte console script entry point.

Changes

Cohort / File(s) | Summary
Auth Infrastructure (airbyte/cli/_cli_auth.py) | New module implementing ordered credential resolution from explicit arguments, short and long env vars, and a YAML credentials file at ~/.airbyte/credentials. Exports resolve_client_id, resolve_client_secret, resolve_workspace_id, and resolve_api_url. Required resolvers raise PyAirbyteInputError when missing; api_url defaults to CLOUD_API_ROOT.
Cloud CLI Implementation (airbyte/cli/cloud_cli.py) | New Click-based Cloud CLI root group with global auth options and --format JSON mode. Adds JSON output/error helpers, JSON option parsing and help, lazy auth-context resolution using the auth resolvers, serializers for API models, subcommands (workspaces, sources, destinations, connections, jobs), and main() entrypoint with structured error handling.
Project Configuration (pyproject.toml) | Added console script airbyte = "airbyte.cli.cloud_cli:main" under [project.scripts].

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CLI as Cloud CLI
    participant Auth as Auth Resolver
    participant Env as Env / Credentials File
    participant API as API Utility
    participant Cloud as Airbyte Cloud API

    User->>CLI: Run command with options
    CLI->>Auth: resolve_* (explicit)
    Auth->>Env: check explicit → short env vars → long env vars → credentials file
    Env-->>Auth: return value or empty
    Auth-->>CLI: resolved credentials or raise
    CLI->>API: call operation (workspace/source/... endpoint)
    API->>Cloud: HTTP request
    Cloud-->>API: HTTP response
    API-->>CLI: SDK response
    CLI->>CLI: serialize to dict/JSON
    CLI-->>User: emit JSON (stdout) or error JSON (stderr)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Would you like me to flag specific places to validate error messages and environment variable name consistency, wdyt?

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately summarizes the main change: introducing a new agent-first CLI for Airbyte Cloud operations, which aligns with the additions of credential resolution and the Cloud CLI modules.
Docstring Coverage | ✅ Passed | Docstring coverage is 84.09%, which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch devin/1775171846-airbyte-cli


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions bot commented Apr 2, 2026

PyTest Results (Fast Tests Only, No Creds)

343 tests  ±0   343 ✅ ±0   5m 29s ⏱️ -21s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit f090fa8. ± Comparison against base commit ce1a589.

♻️ This comment has been updated with latest results.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (1)
airbyte/cli/_cli_auth.py (1)

29-35: Could we explore consolidating the auth resolution logic to reduce drift risk, wdyt?

Both modules implement similar env-var and credentials-resolution chains (e.g., env var precedence, credentials file lookup), and duplicating this logic risks future divergence. That said, consolidation may be non-trivial: _cli_auth.py uses direct env vars + YAML file parsing, while cloud/auth.py uses the secret management utilities (which don't currently support credentials file reading) and wraps sensitive values in SecretString. Would it be worth designing a shared resolution layer, or does the architectural separation between CLI and Cloud SDK make independent implementations preferable?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte/cli/_cli_auth.py` around lines 29 - 35, Extract the duplicated
env-var and credentials-file resolution into a shared auth resolution utility
and update both _cli_auth.py and cloud/auth.py to use it: create a single
resolver that honors the same precedence (explicit env vars
CLI_CLIENT_ID_ENV_VAR, CLI_CLIENT_SECRET_ENV_VAR, CLI_WORKSPACE_ID_ENV_VAR,
CLI_API_URL_ENV_VAR; then credentials file at CREDENTIALS_FILE_PATH; then
fallbacks), have the resolver optionally return plain strings or
SecretString-wrapped values depending on a flag so cloud/auth.py can keep secret
handling, and migrate YAML parsing from _cli_auth.py and any secret-manager
logic from cloud/auth.py into that shared module to avoid drift while preserving
each module's external behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@airbyte/cli/_cli_auth.py`:
- Around line 38-56: The _read_credentials_file function currently calls
CREDENTIALS_FILE_PATH.read_text() and yaml.safe_load() without handling errors;
update _read_credentials_file to wrap the file read and YAML parse in a
try/except that catches file I/O errors (OSError) and YAML parse errors
(yaml.YAMLError) and returns an empty dict on any such exception so the
function's behavior matches its docstring; ensure you still check for empty
content and for parsed being a dict (keep existing checks) and reference
CREDENTIALS_FILE_PATH, read_text(), and yaml.safe_load() when making the change.

In `@airbyte/cli/cloud_cli.py`:
- Around line 379-387: The delete calls (e.g., api_util.delete_source) are
currently forcing safe_mode=False which removes the CLI's guardrail; change the
default to keep safe_mode=True and only pass safe_mode=False when the user
explicitly supplies a --force flag. Update the CLI command handlers that call
api_util.delete_source (and the similar delete_* calls around the other spots)
to read the parsed --force boolean and forward safe_mode=not force (or
safe_mode= True unless force), so mistyped IDs trigger validation unless the
caller opted in with --force.
- Around line 209-237: The command treats workspace_id as optional but still
calls resolve_workspace_id(None), breaking the no-flag path; change the logic in
the workspaces list handler to not call resolve_workspace_id when
ctx.obj["_raw_workspace_id"] is falsy and instead pass the raw value through to
api_util.list_workspaces (or explicitly pass None/empty to represent “no
filter”), or alternately make the CLI flag required; update the branch that
currently does workspace_id = resolve_workspace_id(raw_ws) so it only resolves
when raw_ws is truthy and otherwise forwards raw_ws (or raises/marks required)
for list_workspaces, referencing resolve_workspace_id, api_util.list_workspaces,
_get_auth_no_workspace, and ctx.obj["_raw_workspace_id"] to locate the code to
change.
- Around line 49-53: Uncaught exceptions are currently allowed to bubble as
tracebacks, breaking the structured JSON error contract; wrap the CLI entrypoint
(the top-level click command function / main/cli handler that invokes the
codepath using _error_json) with a broad try/except that catches common failure
types (e.g., json.JSONDecodeError for --json parsing, credential/workspace
missing errors, and any SDK/RuntimeError) and funnels them into
_error_json(message=str(err), type=err.__class__.__name__) to print a normalized
JSON error and exit with code 1; ensure the handler re-raises
KeyboardInterrupt/SystemExit where appropriate or maps them to sensible
messages.
- Around line 183-189: The group callback eagerly calls resolve_client_id,
resolve_client_secret, and resolve_api_url and stores resolved values in ctx.obj
which forces auth/workspace resolution for every invocation; instead, store the
raw inputs (client_id, client_secret, api_url, _raw_workspace_id) on ctx.obj and
remove eager calls to resolve_client_id/resolve_client_secret/resolve_api_url so
subcommands can call
resolve_client_id/resolve_client_secret/resolve_api_url/resolve_workspace_id
lazily when they actually need credentials or workspace; update the other
callback sites that currently resolve these values to follow the same pattern
(i.e., stop calling resolve_* in the group callback and let subcommands invoke
resolve_*), and ensure any subcommand that requires auth/workspace explicitly
calls the appropriate resolve_ function early to surface errors.
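Two of the suggestions above (lazy resolution and the `--force` guard) could look something like the sketch below. None of these names exist in cloud_cli.py: `RawAuth`, `describe_command`, `list_sources`, and `delete_kwargs` are hypothetical, and the resolver is passed in as a plain callable so the example stays self-contained.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class RawAuth:
    """Raw, unresolved values as captured by the root command group."""

    client_id: str | None = None
    client_secret: str | None = None


def describe_command() -> dict:
    # Schema discovery never touches credentials, so it cannot fail on auth.
    return {
        "description": "List sources.",
        "required_params": {"workspace_id": "The workspace ID."},
    }


def list_sources(raw: RawAuth, resolve: Callable[[str | None], str]) -> dict:
    # Resolution happens here, inside the subcommand that actually needs it,
    # so a missing credential surfaces only when this command runs.
    return {"client_id": resolve(raw.client_id)}


def delete_kwargs(force: bool) -> dict[str, bool]:
    # safe_mode stays on unless the caller explicitly opted in with --force.
    return {"safe_mode": not force}
```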


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: efc7d204-d6a9-4c0f-8057-92915a2a3af3

📥 Commits

Reviewing files that changed from the base of the PR and between ce1a589 and 3b13032.

📒 Files selected for processing (3)
  • airbyte/cli/_cli_auth.py
  • airbyte/cli/cloud_cli.py
  • pyproject.toml

…ructured errors

- Store raw auth values in ctx.obj; resolve lazily in subcommands so
  --describe works without credentials configured
- Catch OSError and yaml.YAMLError separately in _read_credentials_file
  using guard statements instead of broad try/except
- Add --force flag to delete commands (sources, destinations, connections)
  with safe_mode=True by default
- Wrap main() entry point to produce structured JSON errors on stderr
- Add PyAirbyteInputError import for structured error handling
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (3)
airbyte/cli/cloud_cli.py (3)

213-216: ⚠️ Potential issue | 🟠 Major

Could we keep the workspace_id filter truly optional here, wdyt?

Because resolve_workspace_id() falls back to env/file when the raw flag is unset, this path either errors or silently narrows the results instead of doing the advertised “list all accessible workspaces.” I’d either only resolve when the flag was explicitly passed, or make the filter required so the contract matches the behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte/cli/cloud_cli.py` around lines 213 - 216, The code currently always
calls resolve_workspace_id(ctx.obj["_raw_workspace_id"]) which falls back to
env/file and unintentionally filters results; change to only resolve when the
raw flag was explicitly provided: check ctx.obj["_raw_workspace_id"] (or
equivalent presence indicator) and if it's set call resolve_workspace_id to
obtain workspace_id, otherwise set workspace_id = None (or omit the filter) and
pass that into api_util.list_workspaces so the SDK will list all accessible
workspaces; update references around resolve_workspace_id and list_workspaces
accordingly.
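The "only resolve when the flag was explicitly passed" option might be sketched like this, assuming that no flag should mean no filter. The function name `workspace_filter` is hypothetical, and the resolver is injected so the example does not depend on the real resolve_workspace_id.

```python
from __future__ import annotations

from typing import Callable


def workspace_filter(
    raw_workspace_id: str | None,
    resolve: Callable[[str], str],
) -> str | None:
    """Resolve only when the flag was explicitly passed; otherwise apply no filter."""
    if raw_workspace_id:
        return resolve(raw_workspace_id)
    # No flag: skip the env/file fallback entirely so the caller lists everything.
    return None
```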

296-297: ⚠️ Potential issue | 🟠 Major

Could we stop resolving workspace_id for these ID-only reads, wdyt?

These handlers only use _ = workspace_id, and the downstream calls don’t accept a workspace argument. Requiring _get_auth_context() here makes sources get, destinations get, and jobs get fail unless a default workspace is configured, even though credentials alone are enough for the actual API calls. The --describe output for the first two should drop workspace_id too if you make this change.

Also applies to: 426-427, 734-735

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte/cli/cloud_cli.py` around lines 296 - 297, The handlers currently call
_get_auth_context(ctx) and destructure workspace_id even though it's unused,
causing failures when no default workspace exists; change those callers (the
"sources get", "destinations get", and "jobs get" handlers that assign api_url,
client_id, client_secret, workspace_id = _get_auth_context(ctx) and then do _ =
workspace_id) to use an auth helper that does not require resolving a workspace
(either add a new _get_credentials(ctx) that returns only api_url, client_id,
client_secret or add a skip_workspace flag to _get_auth_context), update the
affected call sites to only request the three credentials, and remove
workspace_id from the --describe output for sources and destinations so the CLI
no longer expects or prints workspace IDs for these ID-only reads.

757-768: ⚠️ Potential issue | 🟠 Major

Could we invoke Click in non-standalone mode here, wdyt?

cli() still lets Click turn parse/usage failures into human-readable stderr plus SystemExit, so cases like unknown flags, missing required options, and Abort won’t consistently follow the JSON error contract. Calling cli(standalone_mode=False) and catching click.ClickException / click.Abort here would normalize those too.

💡 Possible patch
 def main() -> None:
@@
     try:
-        cli()
+        cli(standalone_mode=False)
     except SystemExit:
         raise
-    except KeyboardInterrupt:
+    except (KeyboardInterrupt, click.Abort):
         _error_json("Operation cancelled.")
+    except click.ClickException as exc:
+        _error_json(exc.format_message(), type=exc.__class__.__name__)
     except json.JSONDecodeError as exc:
         _error_json(str(exc), type="JSONDecodeError")
     except PyAirbyteInputError as exc:
         _error_json(str(exc), type="PyAirbyteInputError")
In Click, when a command is invoked as `cli()` with default `standalone_mode=True`, are usage and parse errors converted into human-readable stderr output plus `SystemExit` instead of being raised to the caller? How does `standalone_mode=False` change that behavior for `ClickException` and `Abort`?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte/cli/cloud_cli.py` around lines 757 - 768, The current invocation
cli() allows Click to exit with SystemExit and print human-readable stderr for
parse/usage errors, so update the call to cli(standalone_mode=False) and add
exception handlers for click.ClickException and click.Abort (alongside the
existing SystemExit/KeyboardInterrupt handlers) so those Click errors are caught
and forwarded to _error_json with an appropriate type (e.g., "ClickException" or
"Abort"); ensure click is imported and that click.ClickException provides its
message when passed to _error_json.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@airbyte/cli/cloud_cli.py`:
- Around line 584-617: The describe output advertises "stream_configurations"
but the handler doesn't read or forward it; update the JSON parsing logic in the
CLI handler (where _parse_json_option populates config and variables like
selected_streams, prefix, name, source_id, destination_id are read) to also
extract config.get("stream_configurations") into a local variable (e.g.,
stream_configurations) and pass that through to api_util.create_connection
(alongside selected_stream_names and prefix) so the create_connection call
forwards the stream_configurations the schema describes; ensure you use the
exact symbol names from this file (config, _parse_json_option, selected_streams,
api_util.create_connection) when making the change.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2236b45d-ca71-404e-ace5-01aea74f773f

📥 Commits

Reviewing files that changed from the base of the PR and between 3b13032 and 30b016e.

📒 Files selected for processing (2)
  • airbyte/cli/_cli_auth.py
  • airbyte/cli/cloud_cli.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte/cli/_cli_auth.py

Comment on lines +584 to +617
    if describe:
        _describe_output(
            description="Create a new connection between a source and destination.",
            required_params={
                "workspace_id": "The workspace ID.",
                "name": "Display name for the connection.",
                "source_id": "The source ID.",
                "destination_id": "The destination ID.",
            },
            optional_params={
                "stream_configurations": "List of stream configuration objects.",
            },
        )
    api_url, client_id, client_secret, workspace_id = _get_auth_context(ctx)
    config = _parse_json_option(json_str)
    name = config.get("name")
    source_id = config.get("source_id")
    destination_id = config.get("destination_id")
    if not name or not source_id or not destination_id:
        _error_json("'name', 'source_id', and 'destination_id' are required in --json config.")
    selected_streams: list[str] = config.get("selected_stream_names", [])
    prefix: str = config.get("prefix", "")
    result = api_util.create_connection(
        name=str(name),
        source_id=str(source_id),
        destination_id=str(destination_id),
        workspace_id=workspace_id,
        prefix=prefix,
        selected_stream_names=selected_streams,
        api_root=api_url,
        client_id=client_id,
        client_secret=client_secret,
        bearer_token=None,
    )
Contributor


⚠️ Potential issue | 🟠 Major

Could we align connections create --describe with what the handler actually honors, wdyt?

The schema advertises stream_configurations, but this implementation never reads or forwards it; it only uses name, source_id, destination_id, selected_stream_names, and prefix. An agent that builds its payload from --describe can end up creating a connection with the default stream selection instead of the requested config.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte/cli/cloud_cli.py` around lines 584 - 617, The describe output
advertises "stream_configurations" but the handler doesn't read or forward it;
update the JSON parsing logic in the CLI handler (where _parse_json_option
populates config and variables like selected_streams, prefix, name, source_id,
destination_id are read) to also extract config.get("stream_configurations")
into a local variable (e.g., stream_configurations) and pass that through to
api_util.create_connection (alongside selected_stream_names and prefix) so the
create_connection call forwards the stream_configurations the schema describes;
ensure you use the exact symbol names from this file (config,
_parse_json_option, selected_streams, api_util.create_connection) when making
the change.

Contributor


Already fixed in d155f0a. The --describe output now accurately reflects the actual handler parameters:

optional_params={
    "selected_stream_names": "List of stream names to sync.",
    "prefix": "Optional table prefix for destination.",
},

The old stream_configurations was replaced with selected_stream_names and prefix — which are the two values the handler actually extracts from --json and forwards to api_util.create_connection. An agent using --describe will now get the correct parameter names.
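One way to keep the advertised schema and the handler from drifting apart again would be to drive both from the same dictionary. The sketch below is hypothetical (neither `CREATE_CONNECTION_SCHEMA` nor `extract_params` exists in this PR) and covers only the keys read from --json, so workspace_id, which comes from the auth context, is left out.

```python
from __future__ import annotations

# Parameter names and descriptions taken from the --describe output in this thread.
CREATE_CONNECTION_SCHEMA = {
    "required_params": {
        "name": "Display name for the connection.",
        "source_id": "The source ID.",
        "destination_id": "The destination ID.",
    },
    "optional_params": {
        "selected_stream_names": "List of stream names to sync.",
        "prefix": "Optional table prefix for destination.",
    },
}


def extract_params(config: dict, schema: dict) -> dict:
    """Validate a --json payload against the same schema that is advertised."""
    missing = [key for key in schema["required_params"] if not config.get(key)]
    if missing:
        raise ValueError(f"Missing required keys in --json config: {missing}")
    allowed = set(schema["required_params"]) | set(schema["optional_params"])
    unknown = sorted(set(config) - allowed)
    if unknown:
        # Reject keys the handler would otherwise silently ignore.
        raise ValueError(f"Unsupported keys in --json config: {unknown}")
    return dict(config)
```

With this shape, a payload carrying stream_configurations would fail fast instead of creating a connection with the default stream selection.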


Devin session

All commands now defer required-option validation until after the
--describe check. This lets agents discover schemas without providing
auth or resource IDs.

- Add _register_schema() and _emit_json_help() helpers
- Create _JsonHelpGroup and _JsonHelpCommand Click classes
- Add --format option to root cli group (text|json)
- Remove all --describe flags from all commands
- All commands now support --format json --help for JSON schemas
- Root-level --format json --help works via sys.argv fallback
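The ordering these commits rely on (answer the schema request first, validate required options afterwards) can be sketched without Click. This is an illustration only, assuming a plain dictionary registry: `register_schema`, `run_command`, the `want_schema` flag, and the `MissingOption` error type are hypothetical and are not the `_register_schema()` or `_emit_json_help()` helpers from this PR.

```python
from __future__ import annotations

import json

# Hypothetical registry mapping a command name to its JSON schema.
_SCHEMAS: dict[str, dict] = {}


def register_schema(command: str, required: list[str], optional: list[str]) -> None:
    """Record which options a command requires and which are optional."""
    _SCHEMAS[command] = {"command": command, "required": required, "optional": optional}


def run_command(command: str, options: dict, *, want_schema: bool = False) -> str:
    """Return JSON output; schema requests short-circuit before validation."""
    schema = _SCHEMAS[command]
    if want_schema:
        # No auth or resource IDs are needed to reach this branch.
        return json.dumps(schema)
    missing = [name for name in schema["required"] if options.get(name) is None]
    if missing:
        return json.dumps({"error": f"Missing required options: {missing}", "type": "MissingOption"})
    return json.dumps({"ok": True, "command": command})
```

The point of the sketch is the branch order: an agent can request the schema with an empty `options` dict and still get a useful answer.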
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
airbyte/cli/cloud_cli.py (1)

866-882: ⚠️ Potential issue | 🟠 Major

Could we re-add a final except Exception JSON normalization path for agent consistency, wdyt?

Lines 867-868 say that unknown exceptions should produce a traceback, but that breaks the structured JSON error contract for machine consumers on unexpected failures.

💡 Proposed fix
 def main() -> None:
@@
     except PyAirbyteInputError as exc:
         _error_json(str(exc), type="PyAirbyteInputError")
+    except Exception as exc:
+        _error_json(str(exc), type=exc.__class__.__name__)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte/cli/cloud_cli.py` around lines 866 - 882, Add a final catch-all
exception handler after the existing specific except blocks in the wrapper that
calls cli(standalone_mode=False) so unexpected errors are also emitted as
structured JSON; specifically, add an except Exception as exc: that calls
_error_json with the exception message and a type (e.g., exc.__class__.__name__)
to preserve the JSON error contract alongside the existing handlers (refer to
cli(), _error_json, and the PyAirbyteInputError/ClickException handlers to place
and mirror behavior).
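For reference, the handler ordering being proposed could look roughly like the sketch below. It is self-contained and makes several assumptions: the `_error_json` body here (JSON on stderr, exit code 1) is a guess at its behavior, since only its call shape appears in this thread; `ValueError` stands in for `PyAirbyteInputError`; and `run_with_json_errors` is a hypothetical wrapper, not the real `main()`.

```python
import json
import sys
from typing import NoReturn


def _error_json(message: str, *, type: str) -> NoReturn:
    """Assumed behavior: emit a structured error on stderr and exit non-zero."""
    json.dump({"error": message, "type": type}, sys.stderr)
    sys.stderr.write("\n")
    sys.exit(1)


def run_with_json_errors(entry_point) -> None:
    """Hypothetical wrapper mirroring the proposed except ordering."""
    try:
        entry_point()
    except ValueError as exc:  # Stand-in for PyAirbyteInputError.
        _error_json(str(exc), type="PyAirbyteInputError")
    except Exception as exc:  # Catch-all keeps the output machine-readable.
        _error_json(str(exc), type=exc.__class__.__name__)
```

Because `sys.exit` raises `SystemExit`, which derives from `BaseException`, the catch-all does not intercept the exit itself. Whether a broad `except Exception` is acceptable in this codebase remains a team decision.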
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@airbyte/cli/cloud_cli.py`:
- Around line 298-305: The code only calls resolve_workspace_id() when
--workspace-id is supplied, preventing env/credentials fallbacks; change to
always call resolve_workspace_id(raw_ws) (even if raw_ws is None) to let the
resolver consult AIRBYTE_WORKSPACE_ID and credentials, then keep the existing
None check and call _error_json("workspace_id is required...",
type="MissingWorkspaceId") if resolve returns None; refer to
resolve_workspace_id, raw_ws, workspace_id and _error_json to locate and update
the logic.
- Around line 50-54: The function _error_json always exits the process, so
change its return annotation from -> None to -> NoReturn and add the required
import for NoReturn from typing; update the function signature in cloud_cli.py
to use NoReturn (and add "from typing import NoReturn" near other imports) so
type checkers can narrow types after guards that call _error_json.
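The first inline comment depends on the resolver being called even when no flag is given. A minimal sketch of that fallback chain follows, assuming a simplified signature: this `resolve_workspace_id` is a hypothetical stand-in for the real one in `_cli_auth.py`, and both the `credentials` argument and its `workspace_id` key are assumptions about how the credentials file is exposed.

```python
from __future__ import annotations

import os


def resolve_workspace_id(explicit: str | None, credentials: dict | None = None) -> str | None:
    """Resolve a workspace ID: CLI flag first, then env var, then credentials."""
    if explicit:
        return explicit
    env_value = os.environ.get("AIRBYTE_WORKSPACE_ID")
    if env_value:
        return env_value
    # Assumed key name; the real credentials layout may differ.
    return (credentials or {}).get("workspace_id")


def require_workspace_id(raw_ws: str | None, credentials: dict | None = None) -> str:
    """Always consult the resolver, even when raw_ws is None."""
    workspace_id = resolve_workspace_id(raw_ws, credentials)
    if workspace_id is None:
        raise ValueError("workspace_id is required")
    return workspace_id
```

If the caller only invoked the resolver when `raw_ws` was set, the environment variable and credentials branches would never run, which is the gap the comment describes.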

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a21ee381-df38-4638-baa8-41de3b62a527

📥 Commits

Reviewing files that changed from the base of the PR and between 30b016e and afd3645.

📒 Files selected for processing (2)
  • airbyte/cli/_cli_auth.py
  • airbyte/cli/cloud_cli.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte/cli/_cli_auth.py

- Change _error_json return type to NoReturn for type narrowing
- Always call resolve_workspace_id() in workspaces_list for env/creds fallback
- Add catch-all except Exception handler in main() for JSON error contract

Broad exception catching violates team coding standards. Only catch
specific exceptions that can be handled meaningfully.

Pyrefly does not narrow Optional types after NoReturn calls without
an explicit return statement in the guard clause.
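The guard-clause pattern from the last commit message can be illustrated as follows. This is a sketch, not code from the PR: `_fail` is a hypothetical stand-in for `_error_json`, `show_workspace` is an invented caller, and the claim about Pyrefly's narrowing behavior is taken from the commit message rather than verified here.

```python
from __future__ import annotations

import sys
from typing import NoReturn


def _fail(message: str) -> NoReturn:
    """Stand-in for _error_json: report the problem and exit."""
    print(message, file=sys.stderr)
    sys.exit(1)


def show_workspace(workspace_id: str | None) -> None:
    """Print the workspace ID in upper case, or exit if it is missing."""
    if workspace_id is None:
        _fail("workspace_id is required")
        # Unreachable at runtime; kept so a checker that does not narrow
        # after a NoReturn call still sees the guard end in a return.
        return
    # Past the guard, workspace_id can be treated as str.
    print(workspace_id.upper())
```

The extra `return` has no runtime effect because `_fail` never returns; it exists purely for the type checker.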

github-actions bot commented Apr 3, 2026

PyTest Results (Full)

413 tests ±0: 395 passed ✅ ±0, 18 skipped 💤 ±0, 0 failed ❌ ±0
1 suite ±0, 1 file ±0
Duration ⏱️: 25m 47s (- 1m 40s)

Results for commit f090fa8. ± Comparison against base commit ce1a589.
