4 changes: 2 additions & 2 deletions README.md
@@ -84,8 +84,8 @@ Use YAML under [`config/examples/`](config/examples/) with the bundled **`local-
- [`lipapi.SessionRef`](pkg/lipapi/call.go) — the proxy-owned session id field is **`AuthoritativeSessionID`** (renamed from `SessionID` to avoid stutter with the struct name). JSON still uses the key **`SessionID`** for wire compatibility.
- [`lipsdk.FrontendMount`](pkg/lipsdk/factory.go) now takes a single [`lipsdk.FrontendMountOptions`](pkg/lipsdk/factory.go) after the [`http.ServeMux`](https://pkg.go.dev/net/http#ServeMux). [`pluginreg.(*Registry).MountFrontend`](internal/pluginreg/reg.go) matches that shape (factory id, mux, options).
- [`stdhttp.MountBundledFrontends`](internal/stdhttp/mount.go) takes [`stdhttp.MountBundledFrontendsInput`](internal/stdhttp/mount.go) instead of six separate parameters.
- Composition roots: [`pluginreg.InstallStandardBundleOn`](internal/pluginreg/standard_table.go) / [`InstallStandardBackendsOn`](internal/pluginreg/standard_table.go) take [`pluginreg.UpstreamAPIKeys`](internal/pluginreg/keys.go) (per-family **ordered slices**; use [`ResolveUpstreamAPIKeysFromEnv`](internal/pluginreg/keys.go) in `main`, or `UpstreamAPIKeys{}` in tests). Env resolution includes numbered keys (`OPENAI_API_KEY_2`, … **contiguous** suffixes only: scanning stops at the first missing or empty `_N`, so a gap like `_2` unset while `_3` is set will not load `_3`; same pattern for Anthropic/Gemini/OpenRouter/NVIDIA). Hosted backend YAML accepts optional `api_keys` alongside `api_key` (see [`config/config.multi-instance.example.yaml`](config/config.multi-instance.example.yaml)). [`runtime.New`](internal/core/runtime/app.go), [`runtimebundle.Build`](internal/infra/runtimebundle/build.go), [`stdhttp.Run`](internal/stdhttp/server.go), and [`stdhttp.RunWithRuntime`](internal/stdhttp/server.go) require a **non-nil** `*slog.Logger`. [`pluginreg.(*Registry).RegisterBackend`](internal/pluginreg/reg.go) factories return [`execbackend.Backend`](internal/core/execbackend/backend.go) directly (not `any`). [`sqlitestore.New`](internal/core/continuity/sqlitestore/store.go) accepts an existing `*sql.DB` for tests.
- **Hosted provider backends (multi-key pools)** — The bundled `openai-responses`, `openai-legacy`, `anthropic`, `gemini`, `openrouter`, and `nvidia` backends keep ordered credentials per instance, classify pre-output 401/429 from the official SDKs, and may return [`lipapi.RecoverablePreOutputError`](pkg/lipapi/upstream.go) from [`execbackend.Backend.Open`](internal/core/execbackend/backend.go) when no key is usable before the first canonical stream event (including single-key rate limit or auth failure). They do not return an [`lipapi.EventStream`](pkg/lipapi/events.go) that fails only during [`lipapi.Collect`](pkg/lipapi/events.go) for that case. **401 handling:** HTTP 401 from the hosted upstream is treated as **permanently invalid** for that credential inside the process (the pool marks it unusable until restart); this matches static API keys but is not suitable for short-lived tokens that might recover without a restart. **429 / Gemini:** OpenAI and Anthropic read `Retry-After` from the SDK error where available; the genai client does not attach response headers to [`genai.APIError`](https://pkg.go.dev/google.golang.org/genai#APIError), so the Gemini adapter also reads [`google.rpc.RetryInfo`](https://cloud.google.com/apis/design/errors#error_details) from error JSON `details` when present, otherwise it uses a conservative fixed cooldown fallback.
- Composition roots: [`pluginreg.InstallStandardBundleOn`](internal/pluginreg/standard_table.go) / [`InstallStandardBackendsOn`](internal/pluginreg/standard_table.go) take [`pluginreg.UpstreamAPIKeys`](internal/pluginreg/keys.go) (per-family **ordered slices**; use [`ResolveUpstreamAPIKeysFromEnv`](internal/pluginreg/keys.go) in `main`, or `UpstreamAPIKeys{}` in tests). Env resolution includes numbered keys (`OPENAI_API_KEY_2`, … **contiguous** suffixes only: scanning stops at the first missing or empty `_N`, so a gap like `_2` unset while `_3` is set will not load `_3`; same pattern for Anthropic/Gemini/OpenRouter/NVIDIA/Hugging Face). Hosted backend YAML accepts optional `api_keys` alongside `api_key` (see [`config/config.multi-instance.example.yaml`](config/config.multi-instance.example.yaml)). [`runtime.New`](internal/core/runtime/app.go), [`runtimebundle.Build`](internal/infra/runtimebundle/build.go), [`stdhttp.Run`](internal/stdhttp/server.go), and [`stdhttp.RunWithRuntime`](internal/stdhttp/server.go) require a **non-nil** `*slog.Logger`. [`pluginreg.(*Registry).RegisterBackend`](internal/pluginreg/reg.go) factories return [`execbackend.Backend`](internal/core/execbackend/backend.go) directly (not `any`). [`sqlitestore.New`](internal/core/continuity/sqlitestore/store.go) accepts an existing `*sql.DB` for tests.
- **Hosted provider backends (multi-key pools)** — The bundled `openai-responses`, `openai-legacy`, `anthropic`, `gemini`, `openrouter`, `nvidia`, and `huggingface` backends keep ordered credentials per instance, classify pre-output 401/429 from the official SDKs, and may return [`lipapi.RecoverablePreOutputError`](pkg/lipapi/upstream.go) from [`execbackend.Backend.Open`](internal/core/execbackend/backend.go) when no key is usable before the first canonical stream event (including single-key rate limit or auth failure). They do not return an [`lipapi.EventStream`](pkg/lipapi/events.go) that fails only during [`lipapi.Collect`](pkg/lipapi/events.go) for that case. **401 handling:** HTTP 401 from the hosted upstream is treated as **permanently invalid** for that credential inside the process (the pool marks it unusable until restart); this matches static API keys but is not suitable for short-lived tokens that might recover without a restart. **429 / Gemini:** OpenAI and Anthropic read `Retry-After` from the SDK error where available; the genai client does not attach response headers to [`genai.APIError`](https://pkg.go.dev/google.golang.org/genai#APIError), so the Gemini adapter also reads [`google.rpc.RetryInfo`](https://cloud.google.com/apis/design/errors#error_details) from error JSON `details` when present, otherwise it uses a conservative fixed cooldown fallback.
Comment on lines +87 to +88

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

README now disagrees with itself about supported env key families.

These lines add Hugging Face correctly, but the earlier cmd/lipstd startup-flow bullet in this same README still omits HUGGINGFACE_API_KEY / numbered suffixes. That leaves two conflicting descriptions of what ResolveUpstreamAPIKeysFromEnv loads. Please update the earlier bullet too so operators do not miss the new credential family.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 87 - 88, README has inconsistent documentation for
ResolveUpstreamAPIKeysFromEnv because the earlier startup-flow bullet still
omits the Hugging Face key family. Update that existing bullet to mention
HUGGINGFACE_API_KEY and its numbered suffixes, matching the later Composition
roots note, so both descriptions list the same supported env key families and
ordering rules.
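
For reference while aligning the two README bullets, the contiguous-suffix rule they describe can be sketched roughly as below. This is a minimal illustration, not the project's `ResolveUpstreamAPIKeysFromEnv`: the helper name `resolveNumberedKeys` and the injected `lookup` function are hypothetical, and continuing to scan `_2` when the unsuffixed key is missing is an assumption the README does not settle.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// resolveNumberedKeys is a hypothetical helper, not the project's code.
// It reads ROOT, then ROOT_2, ROOT_3, ... and stops at the first
// missing or empty numbered suffix, so later keys after a gap are ignored.
func resolveNumberedKeys(root string, lookup func(string) (string, bool)) []string {
	var keys []string
	// Assumption: a missing unsuffixed key does not stop the numbered scan.
	if v, ok := lookup(root); ok && strings.TrimSpace(v) != "" {
		keys = append(keys, strings.TrimSpace(v))
	}
	for n := 2; ; n++ {
		v, ok := lookup(root + "_" + strconv.Itoa(n))
		if !ok || strings.TrimSpace(v) == "" {
			break // first gap ends the scan
		}
		keys = append(keys, strings.TrimSpace(v))
	}
	return keys
}

func main() {
	env := map[string]string{
		"HUGGINGFACE_API_KEY":   "hf-a",
		"HUGGINGFACE_API_KEY_2": "hf-b",
		"HUGGINGFACE_API_KEY_4": "hf-d", // _3 is unset, so _4 is never reached
	}
	lookup := func(k string) (string, bool) { v, ok := env[k]; return v, ok }
	fmt.Println(resolveNumberedKeys("HUGGINGFACE_API_KEY", lookup)) // [hf-a hf-b]
}
```

Passing the lookup as a function keeps the sketch testable without touching the process environment; `os.LookupEnv` has the same signature and could be passed directly.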

- **Terminal stream errors** — [`lipapi.Collect`](pkg/lipapi/events.go) and bundled frontend SSE encoders surface terminal upstream failures as [`lipapi.ErrStreamTerminal`](pkg/lipapi/errors.go) / [`*lipapi.StreamError`](pkg/lipapi/errors.go) (stable `Error()` string `lipapi: stream error`; use [`errors.As`](https://pkg.go.dev/errors#As) for `Code` and `Message`). This replaces embedding provider text directly in `err.Error()`.
- **ACP JSON-RPC errors** — The bundled ACP client returns [`*acp.RPCError`](internal/plugins/backends/acp/rpc_error.go) with stable `Error()` text per RPC method; use `Code` / `Message` for vendor detail. Optional [`acp.Config.Log`](internal/plugins/backends/acp/plugin.go) enables debug logs when a best-effort cancel RPC fails after consumer cancellation.

27 changes: 19 additions & 8 deletions cmd/model-inventory-proof/main.go
@@ -15,6 +15,7 @@ import (
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/anthropic"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/bedrock"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/gemini"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/huggingface"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/nvidia"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/openailegacy"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/openairesponses"
@@ -97,8 +98,21 @@ func main() {
}

func credentialedBackendCandidates(keys pluginreg.UpstreamAPIKeys, awsEnv awsEnvironment) ([]backendCandidate, []skipReport) {
candidates := make([]backendCandidate, 0, 7)
skipped := make([]skipReport, 0, 7)
staticBackends := []struct {
id string
env string
values []string
}{
{openairesponses.ID, "OPENAI_API_KEY", keys.OpenAI},
{openailegacy.ID, "OPENAI_API_KEY", keys.OpenAI},
{anthropic.ID, "ANTHROPIC_API_KEY", keys.Anthropic},
{gemini.ID, "GEMINI_API_KEY", keys.Gemini},
{openrouter.ID, "OPENROUTER_API_KEY", keys.OpenRouter},
{nvidia.ID, "NVIDIA_API_KEY", keys.Nvidia},
{huggingface.ID, "HUGGINGFACE_API_KEY", keys.HuggingFace},
}
candidates := make([]backendCandidate, 0, len(staticBackends)+1)
skipped := make([]skipReport, 0, len(staticBackends)+1)

addStaticKeyBackend := func(id, envName string, values []string) {
if len(values) == 0 {
@@ -108,12 +122,9 @@ func credentialedBackendCandidates(keys pluginreg.UpstreamAPIKeys, awsEnv awsEnv
candidates = append(candidates, backendCandidate{ID: id, Config: "{}"})
}

addStaticKeyBackend(openairesponses.ID, "OPENAI_API_KEY", keys.OpenAI)
addStaticKeyBackend(openailegacy.ID, "OPENAI_API_KEY", keys.OpenAI)
addStaticKeyBackend(anthropic.ID, "ANTHROPIC_API_KEY", keys.Anthropic)
addStaticKeyBackend(gemini.ID, "GEMINI_API_KEY", keys.Gemini)
addStaticKeyBackend(openrouter.ID, "OPENROUTER_API_KEY", keys.OpenRouter)
addStaticKeyBackend(nvidia.ID, "NVIDIA_API_KEY", keys.Nvidia)
for _, backend := range staticBackends {
addStaticKeyBackend(backend.id, backend.env, backend.values)
}

if ok, reason := awsEnv.usableForBedrock(); ok {
candidates = append(candidates, backendCandidate{
7 changes: 6 additions & 1 deletion config/config.yaml
@@ -276,6 +276,11 @@ plugins:
config: {}
# base_url: https://integrate.api.nvidia.com/v1 # default
# api_key: "" # or use NVIDIA_API_KEY / NVIDIA_API_KEY_N env vars
- id: huggingface
enabled: false
config: {}
# base_url: https://router.huggingface.co/v1 # default
# api_key: "" # or use HUGGINGFACE_API_KEY / HUGGINGFACE_API_KEY_N env vars
- id: opencode-go
enabled: false
config: {}
@@ -352,7 +357,7 @@ plugins:
# timeout: 15s
# Custom compatible backends let operators add API-compatible providers without code.
# Use kind to select the generic factory; id remains the runtime route backend instance.
# backend_prefix must be unique, cannot contain / or :, and cannot use standard connector prefixes such as nvidia/openrouter/anthropic.
# backend_prefix must be unique, cannot contain / or :, and cannot use standard connector prefixes such as nvidia/openrouter/huggingface/anthropic.
# api_key_env_var_root reads ROOT, ROOT_2, ROOT_3, ... using the standard static key convention.
# - id: provider123
# kind: custom-openai-legacy-compatible
1 change: 1 addition & 0 deletions docs/backend-adapter-boundaries.md
@@ -19,6 +19,7 @@ Regression tests **must** cover mapping behavior (streaming order, tool events,
| `acp` | Parity + ACP subset (tools deferred per matrix) | HTTP client + ACP-specific session/update flows |
| `nvidia` | Parity + NVIDIA NIM chat/responses wire | `openai-go` client; `NVIDIA_API_KEY` env pool; `max_tokens` remap, `stream_options` strip, `extra_body` pass-through |
| `openrouter` | Parity + OpenRouter chat/responses wire | `openai-go` client; shared invoke/event mapping via [`openaicompat`](../internal/plugins/backends/openaicompat/); OpenRouter-specific headers and extensions |
| `huggingface` | Parity + Hugging Face Inference Providers chat wire | `openai-go` client; shared invoke/event mapping via [`openaicompat`](../internal/plugins/backends/openaicompat/); chat completions only (`TransportChatOnly`); `HUGGINGFACE_API_KEY` env pool |
| `local-stub` | Dogfood YAML + executor stub tests | No upstream credentials ([`CredentialNone`](../pkg/lipsdk/backend_security.go)); deterministic text |

## Shared OpenAI-compatible adapter layer
1 change: 1 addition & 0 deletions internal/archtest/backend_lifecycle_contract_test.go
@@ -20,6 +20,7 @@ func TestOfficialBackendsHaveLifecycleContractTests(t *testing.T) {
"lmstudio": "openaicompat",
"openrouter": "openaicompat",
"nvidia": "openaicompat",
"huggingface": "openaicompat",
"vllm": "openaicompat",
"opencodego": "opencodecommon",
"opencodezen": "opencodecommon",
4 changes: 2 additions & 2 deletions internal/pluginreg/backend_prefix_inventory_test.go
@@ -68,7 +68,7 @@ func TestReservedStandardBackendPrefixes_coverStandardBackendPrefixes(t *testing
t.Fatalf("BuildBackend(%q) error = %v", id, err)
}
for _, prefix := range be.BackendPrefixes {
if _, ok := reservedStandardBackendPrefixes[prefix]; !ok {
if !isReservedStandardBackendPrefix(prefix) {
t.Fatalf("standard backend %q exposes prefix %q not reserved for custom connectors", id, prefix)
}
}
@@ -89,7 +89,7 @@ func standardBackendFactoryIDs(t *testing.T) []string {

func standardBackendBuildYAML(id string) string {
switch id {
case "acp", "anthropic", "openai-legacy", "openai-responses", "openrouter", "nvidia", "opencode-go", "opencode-zen":
case "acp", "anthropic", "openai-legacy", "openai-responses", "openrouter", "nvidia", "huggingface", "opencode-go", "opencode-zen":
return "base_url: http://127.0.0.1:9\n"
case "openai-codex":
return "base_url: http://127.0.0.1:9\naccess_token: test\n"
18 changes: 18 additions & 0 deletions internal/pluginreg/backends_misc.go
@@ -9,6 +9,7 @@ import (
"github.com/matdev83/go-llm-interactive-proxy/internal/core/config"
"github.com/matdev83/go-llm-interactive-proxy/internal/core/execbackend"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/acp"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/huggingface"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/llamacpp"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/lmstudio"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/localstub"
@@ -93,3 +94,20 @@ func backendNvidia(n yaml.Node, upstream *http.Client, keys UpstreamAPIKeys) (ex
}
return applyConfiguredModelInventory(nvidia.New(cfg), y.Models)
}

func backendHuggingface(n yaml.Node, upstream *http.Client, keys UpstreamAPIKeys) (execbackend.Backend, error) {
var y openAIStyleYAML
if err := config.DecodeYAMLNode(n, &y); err != nil {
return execbackend.Backend{}, fmt.Errorf("huggingface backend config: %w", err)
}
base := cmp.Or(strings.TrimSpace(y.BaseURL), huggingface.DefaultBaseURL)
ek, primaryKey := firstAPIKey(y.APIKey, y.APIKeys, y.Credentials, keys.HuggingFace)
cfg := huggingface.Config{
BaseURL: base,
APIKey: primaryKey,
APIKeys: ek,
Credentials: hostedCredentials(y.Credentials),
HTTPClient: resolveUpstreamHTTP(upstream),
}
return applyConfiguredModelInventory(huggingface.New(cfg), y.Models)
}
2 changes: 1 addition & 1 deletion internal/pluginreg/custom_backend_prefix_test.go
@@ -26,7 +26,7 @@ func TestValidateCustomBackendPrefix_rejectsInvalidCharacters(t *testing.T) {

func TestValidateCustomBackendPrefix_rejectsReservedStandardPrefixes(t *testing.T) {
t.Parallel()
for _, prefix := range []string{"nvidia", "openrouter", "anthropic", "openai-legacy", "openai-responses", "opencode-go", "opencode-zen"} {
for prefix := range standardBackendPrefixSet() {
err := validateCustomBackendPrefix(prefix)
if err == nil {
t.Fatalf("expected error for reserved backend_prefix %q", prefix)
60 changes: 25 additions & 35 deletions internal/pluginreg/custom_backends.go
@@ -8,25 +8,9 @@ import (

"github.com/matdev83/go-llm-interactive-proxy/internal/core/config"
"github.com/matdev83/go-llm-interactive-proxy/internal/core/execbackend"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/acp"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/anthropic"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/bedrock"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/credpool"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/gemini"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/llamacpp"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/lmstudio"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/localstub"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/modeldiscover"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/nvidia"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/ollama"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/openaicodex"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/openaicompat"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/openailegacy"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/openairesponses"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/opencodego"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/opencodezen"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/openrouter"
"github.com/matdev83/go-llm-interactive-proxy/internal/plugins/backends/vllm"
"github.com/matdev83/go-llm-interactive-proxy/pkg/lipapi"
"gopkg.in/yaml.v3"
)
@@ -91,7 +75,7 @@ func validateCustomBackendPrefix(prefix string) error {
if strings.Contains(prefix, "/") || strings.Contains(prefix, ":") {
return fmt.Errorf("custom backend: backend_prefix %q must not contain '/' or ':'", prefix)
}
if _, reserved := reservedStandardBackendPrefixes[prefix]; reserved {
if isReservedStandardBackendPrefix(prefix) {
return fmt.Errorf("custom backend: backend_prefix %q is reserved by a standard connector", prefix)
}
return nil
@@ -217,22 +201,28 @@ func backendCustomOpenAIResponsesCompatible(n yaml.Node, upstream *http.Client)
return buildCustomOpenAICompatibleBackend(y, upstream, openaicompat.FlavorResponses, customOpenAIResponsesTransportCaps())
}

var reservedStandardBackendPrefixes = map[string]struct{}{
openairesponses.ID: {},
openailegacy.ID: {},
anthropic.ID: {},
gemini.ID: {},
bedrock.ID: {},
acp.ID: {},
openrouter.ID: {},
nvidia.ID: {},
opencodego.ID: {},
opencodezen.ID: {},
openaicodex.ID: {},
ollama.ID: {},
ollama.CloudID: {},
llamacpp.ID: {},
lmstudio.ID: {},
vllm.ID: {},
localstub.ID: {},
func isReservedStandardBackendPrefix(prefix string) bool {
_, ok := standardBackendPrefixSet()[prefix]
return ok
}

func standardBackendPrefixSet() map[string]struct{} {
backends := StandardBackendBundle(UpstreamAPIKeys{}).Backends
out := make(map[string]struct{}, len(backends))
for _, entry := range backends {
if IsCustomCompatibleBackendKind(entry.ID) {
continue
}
out[entry.ID] = struct{}{}
be, err := entry.Factory(yaml.Node{}, nil, BackendFactoryDeps{})
if err != nil {
continue
}
for _, prefix := range be.BackendPrefixes {
if prefix = strings.TrimSpace(prefix); prefix != "" {
out[prefix] = struct{}{}
}
}
}
return out
}