feat: add Nvidia backend and OpenAI-compatible adapter#20
Conversation
|
Warning Review limit reached
More reviews will be available in 3 minutes and 6 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (6)
📝 WalkthroughWalkthroughThis PR adds invocation and transport negotiation metadata, extracts a shared OpenAI-compatible backend layer, adds NVIDIA backend support and emulation, wires transport-aware runtime behavior and metrics, and updates docs, configs, tests, and review automation. ChangesTransport negotiation and NVIDIA backend support
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Poem
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (5)
internal/core/runtime/executor_open_attempt.go (1)
208-210: 💤 Low valueConsider adding a comment for the transport mode fallback logic.
The fallback from
transportRes.SelectedtotransportRes.ModewhenSelectedis empty might not be immediately obvious to future readers. A brief comment explaining when this occurs (e.g., during rejection when no mode is selected) would improve code clarity.📝 Suggested comment
transportRes := lipapi.NegotiateTransport(attempt.Invocation, transportCaps, e.effectiveTransportFallbackPolicy()) transportMode := transportRes.Selected if transportMode == "" { + // Fall back to Mode when Selected is empty (e.g., during rejection when no specific mode was chosen) transportMode = transportRes.Mode }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/core/runtime/executor_open_attempt.go` around lines 208 - 210, Add a brief explanatory comment above or within the transportMode fallback logic (where transportMode is checked for empty string and assigned to transportRes.Mode) to clarify when this fallback occurs and why. The comment should explain that this fallback happens during rejection scenarios when no specific mode is selected, so that future readers understand the context and purpose of this conditional assignment.pkg/lipapi/call.go (1)
66-66: ⚡ Quick winClarify JSON serialization intent for internal metadata.
The
Invocationfield is framework-populated metadata (frontends derive it from client request parameters), not a direct client-provided JSON field.MaxPendingWireEventson line 70 follows a similar pattern and includes both ajson:"-"tag and a comment explaining it's "Not client API; the core executor sets this...".Consider adding
json:"-"toInvocationif it should not appear in marshaled Call JSON, or document its wire-format role if it should be serialized.Proposed consistency fix
Extensions map[string]json.RawMessage - Invocation Invocation + // Invocation carries protocol operation and delivery metadata populated by frontend decoders. + Invocation Invocation `json:"-"` // MaxPendingWireEvents caps backend adapter-internal pending event queues per stream (0 = unlimited).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/lipapi/call.go` at line 66, The Invocation field is framework-populated metadata that should be handled consistently with other internal fields like MaxPendingWireEvents. Add a json:"-" tag to the Invocation field declaration to indicate it should not be included in marshaled JSON, matching the pattern used by MaxPendingWireEvents on line 70. Alternatively, if the Invocation field is intended to be serialized, add a comment above it explaining its wire-format role and why it differs from the json:"-" approach, similar to the comment provided for MaxPendingWireEvents.internal/plugins/openrouterwire/extensions.go (1)
114-116: 💤 Low valueConsider the naming/organizational alignment of this NVIDIA-specific constant.
The
ExtraBodyExtPrefix = "nvidia.extra_body."constant is NVIDIA-specific but resides in theopenrouterwirepackage (OpenRouter-focused). While the PR context shows this is intentional (NVIDIA backend support through OpenAI-compatible frontends), future maintainers may find this placement unexpected.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/plugins/openrouterwire/extensions.go` around lines 114 - 116, The `ExtraBodyExtPrefix` constant is NVIDIA-specific but located in the `openrouterwire` package which focuses on OpenRouter functionality, potentially confusing future maintainers about its purpose. Add a clarifying comment above the constant definition to explain that this NVIDIA-specific extension key is used to support NVIDIA backend integration through OpenAI-compatible frontend adapters, making the intentional placement and purpose clear to future readers.internal/plugins/backends/openaicompat/backend_test.go (1)
76-102: ⚡ Quick winAdd cleanup for httptest.Server instances.
The test creates an httptest.Server but doesn't register cleanup. For consistency with the integration tests (which use
t.Cleanup(srv.Close)at internal/plugins/backends/openrouter/integration_test.go:726) and to prevent resource leaks, add cleanup for the test server.🧹 Suggested fix
func TestNewBackend_routesByFlavor(t *testing.T) { t.Parallel() rec := newInvokeRecorder() srv := newInvokeServer(t, rec) + t.Cleanup(srv.Close) be := NewBackend(validBackendSpec(srv.URL))Apply the same pattern to the test servers created at lines 125, 165, 193, and 222.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/plugins/backends/openaicompat/backend_test.go` around lines 76 - 102, The TestNewBackend_routesByFlavor test creates an httptest.Server via the newInvokeServer function but does not register cleanup, causing a resource leak. Add t.Cleanup(srv.Close) immediately after the srv is created to ensure the server is properly closed after the test completes. Apply this same cleanup pattern consistently to all other test functions that create server instances using newInvokeServer.internal/plugins/backends/openaicompat/invoke_test.go (1)
20-48: ⚡ Quick winAdd cleanup for httptest.Server instances.
The test creates an httptest.Server but doesn't register cleanup. For consistency with the integration tests and to prevent resource leaks, add
t.Cleanup(srv.Close)after creating the server.🧹 Suggested fix
func TestOpenChat_nonStreamingUsesNonStreamEndpoint(t *testing.T) { t.Parallel() rec := newInvokeRecorder() srv := newInvokeServer(t, rec) + t.Cleanup(srv.Close) cli := openaicred.NewOpenAIClient(srv.URL, "sk-test", srv.Client(), intPtr(0))Apply the same pattern to test servers created at lines 53, 78, 103, and 128.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/plugins/backends/openaicompat/invoke_test.go` around lines 20 - 48, The httptest.Server instances created with newInvokeServer() in the TestOpenChat_nonStreamingUsesNonStreamEndpoint test function and similar test functions at lines 53, 78, 103, and 128 lack proper cleanup registration. After each srv := newInvokeServer(t, rec) call, immediately add t.Cleanup(srv.Close) to register the server's Close method as a cleanup function that will be executed when the test completes, preventing resource leaks.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/archtest/runtime_transport_boundaries_test.go`:
- Line 22: The exec.Command in the boundary check test uses the pattern
"./internal/core/runtime" which only scans that single package and misses the
subpackage "./internal/core/runtime/failclosed". Change the path argument from
"./internal/core/runtime" to "./internal/core/runtime/..." to use Go's recursive
pattern that includes all subpackages, ensuring all packages under runtime are
checked for forbidden imports as intended by the architectural boundary test.
In `@internal/pluginreg/keys.go`:
- Around line 106-134: The collectNvidiaEnvKeys function duplicates the
identical logic found in collectOpenRouterEnvKeys (lines 75-104), including the
1-indexed numbering loop, gap detection, and deduplication logic. Extract this
shared logic into a new helper function that accepts the environment variable
prefix as a parameter (for example, "NVIDIA_API_KEY" or "OPENROUTER_API_KEY"),
then refactor both collectOpenRouterEnvKeys and collectNvidiaEnvKeys to call
this shared helper function instead of duplicating the implementation.
---
Nitpick comments:
In `@internal/core/runtime/executor_open_attempt.go`:
- Around line 208-210: Add a brief explanatory comment above or within the
transportMode fallback logic (where transportMode is checked for empty string
and assigned to transportRes.Mode) to clarify when this fallback occurs and why.
The comment should explain that this fallback happens during rejection scenarios
when no specific mode is selected, so that future readers understand the context
and purpose of this conditional assignment.
In `@internal/plugins/backends/openaicompat/backend_test.go`:
- Around line 76-102: The TestNewBackend_routesByFlavor test creates an
httptest.Server via the newInvokeServer function but does not register cleanup,
causing a resource leak. Add t.Cleanup(srv.Close) immediately after the srv is
created to ensure the server is properly closed after the test completes. Apply
this same cleanup pattern consistently to all other test functions that create
server instances using newInvokeServer.
In `@internal/plugins/backends/openaicompat/invoke_test.go`:
- Around line 20-48: The httptest.Server instances created with
newInvokeServer() in the TestOpenChat_nonStreamingUsesNonStreamEndpoint test
function and similar test functions at lines 53, 78, 103, and 128 lack proper
cleanup registration. After each srv := newInvokeServer(t, rec) call,
immediately add t.Cleanup(srv.Close) to register the server's Close method as a
cleanup function that will be executed when the test completes, preventing
resource leaks.
In `@internal/plugins/openrouterwire/extensions.go`:
- Around line 114-116: The `ExtraBodyExtPrefix` constant is NVIDIA-specific but
located in the `openrouterwire` package which focuses on OpenRouter
functionality, potentially confusing future maintainers about its purpose. Add a
clarifying comment above the constant definition to explain that this
NVIDIA-specific extension key is used to support NVIDIA backend integration
through OpenAI-compatible frontend adapters, making the intentional placement
and purpose clear to future readers.
In `@pkg/lipapi/call.go`:
- Line 66: The Invocation field is framework-populated metadata that should be
handled consistently with other internal fields like MaxPendingWireEvents. Add a
json:"-" tag to the Invocation field declaration to indicate it should not be
included in marshaled JSON, matching the pattern used by MaxPendingWireEvents on
line 70. Alternatively, if the Invocation field is intended to be serialized,
add a comment above it explaining its wire-format role and why it differs from
the json:"-" approach, similar to the comment provided for MaxPendingWireEvents.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a8c2a9cf-12dd-431d-adc8-8bc19f985bd7
📒 Files selected for processing (96)
.coderabbit.yamlREADME.mdcmd/lipstd/testdata/dogfood-local-stub/inventory.golden.jsoncmd/lipstd/testdata/dogfood-local-stub/routes.golden.jsonconfig/config.yamlconfig/examples/anthropic-stub.yamlconfig/examples/dogfood-local-stub.yamlconfig/examples/gemini-stub.yamlconfig/examples/openai-legacy-stub.yamlconfig/examples/openai-responses-stub.yamldocs/architecture-guardrails.mddocs/backend-adapter-boundaries.mddocs/conformance-golden-coverage.mdinternal/archtest/backend_lifecycle_contract_test.gointernal/archtest/openaicompat_boundaries_test.gointernal/archtest/runtime_transport_boundaries_test.gointernal/core/config/model.gointernal/core/config/transport.gointernal/core/config/transport_test.gointernal/core/execbackend/backend.gointernal/core/runtime/attempt_stream.gointernal/core/runtime/completion_recv.gointernal/core/runtime/executor.gointernal/core/runtime/executor_invocation_metadata_test.gointernal/core/runtime/executor_open_attempt.gointernal/core/runtime/executor_open_attempt_diagnostics_test.gointernal/core/runtime/executor_test.gointernal/core/runtime/executor_transport_span_test.gointernal/core/runtime/executor_transport_test.gointernal/core/runtime/metrics_sink.gointernal/core/runtime/parallel_race.gointernal/core/runtime/parallel_race_test.gointernal/core/runtime/secure_session_recorder_mandatory_test.gointernal/core/runtime/secure_session_stream_record.gointernal/infra/metrics/bundle_test.gointernal/infra/metrics/executor_prom.gointernal/infra/runtimebundle/build.gointernal/pluginreg/backends_install.gointernal/pluginreg/default_wire_model.gointernal/pluginreg/hosted_backend_build_test.gointernal/pluginreg/keys.gointernal/pluginreg/resolve_upstream_api_keys_test.gointernal/pluginreg/spec_bundle_standard_inventory_test.gointernal/pluginreg/standard_bundle_posture_test.gointernal/pluginreg/standard_table.gointernal/plugins/backends/nvidia/doc.gointernal/plugins/backends/nvidia/integration_test.gointernal/plugins/backends/nvidia/payload_mutate.gointernal/plugins/backends/nvidia/payload_mutate_test.gointernal/plugins/backends/nvidia/plugin.gointernal/plugins/backends/nvidia/plugin_config_test.gointernal/plugins/backends/nvidia/resolve.gointernal/plugins/backends/nvidia/resolve_test.gointernal/plugins/backends/openaicompat/backend.gointernal/plugins/backends/openaicompat/backend_test.gointernal/plugins/backends/openaicompat/chat_events.gointernal/plugins/backends/openaicompat/chat_events_test.gointernal/plugins/backends/openaicompat/chat_stream.gointernal/plugins/backends/openaicompat/doc.gointernal/plugins/backends/openaicompat/invoke.gointernal/plugins/backends/openaicompat/invoke_test.gointernal/plugins/backends/openaicompat/lifecycle_contract_test.gointernal/plugins/backends/openaicompat/responses_events.gointernal/plugins/backends/openaicompat/responses_events_test.gointernal/plugins/backends/openrouter/integration_test.gointernal/plugins/backends/openrouter/invoke_chat.gointernal/plugins/backends/openrouter/invoke_responses.gointernal/plugins/backends/openrouter/plugin.gointernal/plugins/backends/openrouter/plugin_config_test.gointernal/plugins/frontends/openailegacy/decode.gointernal/plugins/frontends/openailegacy/decode_test.gointernal/plugins/frontends/openailegacy/encode.gointernal/plugins/frontends/openailegacy/encode_test.gointernal/plugins/frontends/openairesponses/decode.gointernal/plugins/frontends/openairesponses/decode_test.gointernal/plugins/frontends/openairesponses/encode.gointernal/plugins/frontends/openairesponses/encode_test.gointernal/plugins/openrouterwire/extensions.gointernal/plugins/openrouterwire/extensions_test.gointernal/refbackend/nvidia/doc.gointernal/refbackend/nvidia/server.gointernal/refbackend/nvidia/server_test.gointernal/stdhttp/testdata/dogfood_gemini_dual_stub_failover.yamlinternal/testkit/conformance/error_upstream.gointernal/testkit/conformance/harness.gointernal/testkit/conformance/matrix.gointernal/testkit/conformance/parity_evidence.gointernal/testkit/conformance/refparity.gointernal/testkit/conformance/sanity_emulator_wiring_test.gointernal/testkit/conformance/tools_refbackend.gopkg/lipapi/call.gopkg/lipapi/invocation.gopkg/lipapi/transport.gopkg/lipapi/transport_incomplete_test.gopkg/lipapi/transport_test.gopkg/lipsdk/standard_bundle.go
💤 Files with no reviewable changes (2)
- internal/plugins/backends/openrouter/invoke_responses.go
- internal/plugins/backends/openrouter/invoke_chat.go
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/lipapi/call.go (1)
66-66: ⚡ Quick winDocument
Call.Invocationas internal execution metadata.
Invocationis exported but excluded from JSON, so this boundary is easy to misread bypkg/lipapiconsumers. Add a short field comment clarifying it is runtime/internal-only metadata and not part of the wire contract.Suggested diff
type Call struct { @@ - Invocation Invocation `json:"-"` + // Invocation carries runtime-selected operation/transport metadata. + // Internal-only: excluded from wire JSON and set by execution paths. + Invocation Invocation `json:"-"`As per coding guidelines,
pkg/lipapipublic contracts “must be versionable, documented, and minimal,” so this explicit contract note helps keep the public boundary clear.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/lipapi/call.go` at line 66, Add a field documentation comment above the Invocation field in the Call struct to clarify that it contains internal execution metadata and is not part of the public wire contract. The comment should explicitly note that while the field is exported, it is excluded from JSON serialization and is intended for runtime use only, helping consumers understand the boundary between the public API contract and internal implementation details.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pkg/lipapi/call.go`:
- Line 66: Add a field documentation comment above the Invocation field in the
Call struct to clarify that it contains internal execution metadata and is not
part of the public wire contract. The comment should explicitly note that while
the field is exported, it is excluded from JSON serialization and is intended
for runtime use only, helping consumers understand the boundary between the
public API contract and internal implementation details.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cb58e41b-f363-4d1f-ae3a-85d51cc71976
📒 Files selected for processing (24)
Makefileinternal/archtest/runtime_transport_boundaries_test.gointernal/core/affinity/memorystore/store_test.gointernal/core/extensions/pre_request.gointernal/core/extensions/pre_request_test.gointernal/core/runtime/executor_affinity_test.gointernal/core/runtime/executor_open_attempt.gointernal/core/runtime/parallel_race.gointernal/core/runtime/parallel_race_test.gointernal/pluginreg/feature_yaml_test.gointernal/pluginreg/keys.gointernal/plugins/backends/nvidia/integration_test.gointernal/plugins/backends/openaicompat/backend_test.gointernal/plugins/backends/openaicompat/invoke_test.gointernal/plugins/backends/openrouter/integration_test.gointernal/plugins/openrouterwire/extensions.gointernal/refbackend/nvidia/server_test.gointernal/refbackend/openrouter/server_test.gointernal/testkit/conformance/conformance_multimodal_test.gointernal/testkit/conformance/refparity.gointernal/testkit/conformance/tools_refbackend.gopkg/lipapi/call.gopkg/lipsdk/prerequest/handler_test.gopkg/lipsdk/prerequest/sort_test.go
💤 Files with no reviewable changes (1)
- internal/core/runtime/executor_affinity_test.go
✅ Files skipped from review due to trivial changes (2)
- internal/core/extensions/pre_request.go
- internal/core/extensions/pre_request_test.go
🚧 Files skipped from review as they are similar to previous changes (11)
- internal/archtest/runtime_transport_boundaries_test.go
- internal/testkit/conformance/tools_refbackend.go
- internal/plugins/openrouterwire/extensions.go
- internal/core/runtime/executor_open_attempt.go
- internal/pluginreg/keys.go
- internal/testkit/conformance/refparity.go
- internal/refbackend/nvidia/server_test.go
- internal/core/runtime/parallel_race.go
- internal/plugins/backends/openaicompat/invoke_test.go
- internal/plugins/backends/openrouter/integration_test.go
- internal/plugins/backends/openaicompat/backend_test.go
Summary
Verification
Summary by CodeRabbit
Release Notes
New Features
exactvscompatibilityfallback behavior for matching requested streaming/non-streaming modes.Documentation