Releases: planetarium/vicoop-bridge
@vicoop-bridge/client@0.35.4
Patch Changes
- ab945bb: upgrade: resolve the latest release from the public GitHub Atom feed
(`github.com/<repo>/releases.atom`) instead of the `api.github.com` REST
API. The REST API caps unauthenticated requests at 60/hr per IP, so
`vicoop-client upgrade` / `upgrade --check` would fail with a hard
`403 rate limit exceeded` on shared provider egress IPs, exactly when
an operator is rolling out a release (#405). The web feed has its own,
far more generous anonymous limit and needs no `GITHUB_TOKEN`.
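
To illustrate the feed-based lookup described above, here is a minimal sketch of pulling the newest tag out of a releases Atom document. It is not the client's actual code: the helper name `latestTagFromAtom`, the regex-based parsing, and the assumptions that entries are listed newest-first and that each entry links to a URL-encoded `.../releases/tag/<tag>` are all illustrative.

```typescript
// Hypothetical helper (not from the client source): extract the newest release
// tag from a releases Atom feed. Assumes entries are ordered newest-first and
// each entry carries a link whose href ends in /releases/tag/<url-encoded tag>.
function latestTagFromAtom(xml: string): string | null {
  const entry = /<entry\b[\s\S]*?<\/entry>/.exec(xml);
  if (entry === null) return null;
  const href = /href="[^"]*\/releases\/tag\/([^"]+)"/.exec(entry[0]);
  if (href === null) return null;
  // Scoped package tags contain "@" and "/", which arrive percent-encoded.
  return decodeURIComponent(href[1]);
}

// Sample feed with a placeholder host; only the path shape matters here.
const sampleFeed = `<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>@vicoop-bridge/client@0.35.4</title>
    <link rel="alternate" type="text/html" href="https://example.invalid/owner/repo/releases/tag/%40vicoop-bridge%2Fclient%400.35.4"/>
  </entry>
  <entry>
    <title>@vicoop-bridge/client@0.35.3</title>
    <link rel="alternate" type="text/html" href="https://example.invalid/owner/repo/releases/tag/%40vicoop-bridge%2Fclient%400.35.3"/>
  </entry>
</feed>`;

const latestTag = latestTagFromAtom(sampleFeed);
```

Because the feed is plain XML served from the web front end, a lookup like this needs no token and never touches the REST quota.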
@vicoop-bridge/client@0.35.3
Patch Changes
- daebd59: fix(claude): classify claude's terminal reason into a structured failure code

  When claude exits non-zero with zero usage, it first emits a terminal `result`
  event whose `result` field carries the human-readable cause: "You've hit your
  session limit · resets 3pm (UTC)", "... · Rate limited", "529 Overloaded",
  "Prompt is too long". The bridge captured this in `finalText` but dropped it,
  so the only thing reaching the router was an opaque
  `claude exited with code 1 [stdout: <raw JSON tail>]`, forcing the router to
  scrape keywords out of a truncated stdout dump.

  Now that reason is used as the failure `message` verbatim (when present) and
  run through `normalizeTaskFailError`, so it maps onto a structured terminal
  code (`quota_exceeded` / `rate_limited` / `upstream_error` / `login_required`
  / …), which the router consumes directly via `reasonForTerminalCode` (it
  prefers the code over message-pattern matching). The cause travels as
  structured data in the `terminal_error.code` channel, not a string baked into
  the diagnostic. When claude emits no such reason (a real crash) the message
  falls back to the exit/stdout diagnostic dump so triage data is preserved
  (#119).

  Also teaches `normalizeTaskFailError` the claude subscription "session limit"
  cap → `quota_exceeded` (server-side "529 Overloaded" already classifies as
  `upstream_error` via the existing numeric-status match).
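
The message-to-code mapping described above can be sketched roughly as follows. This is a stand-in, not the real `normalizeTaskFailError`: the function name `classifyTerminalReason` and the exact patterns are assumptions, and only the three codes whose triggers the note spells out are covered.

```typescript
type TerminalCode = "quota_exceeded" | "rate_limited" | "upstream_error";

// Hypothetical classifier: the patterns are illustrative guesses, not the
// bridge's actual rules. Returns null when nothing matches, in which case the
// caller would fall back to the raw exit/stdout diagnostic.
function classifyTerminalReason(message: string): TerminalCode | null {
  if (/session limit/i.test(message)) return "quota_exceeded";
  if (/rate limited/i.test(message)) return "rate_limited";
  // Numeric-status match: any standalone 5xx token, e.g. "529 Overloaded".
  if (/\b5\d\d\b/.test(message)) return "upstream_error";
  return null;
}

const sessionLimitCode = classifyTerminalReason("You've hit your session limit · resets 3pm (UTC)");
const overloadedCode = classifyTerminalReason("529 Overloaded");
const unmatchedCode = classifyTerminalReason("segfault in child process");
```

Returning a code (or nothing) rather than a rewritten string keeps the cause machine-readable, which is the point of carrying it in `terminal_error.code`.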
@vicoop-bridge/client@0.35.2
Patch Changes
- b763c48: fix(vicoop-codex): surface in-band SSE error frames as task failures

  `vicoop-codex serve` relays an upstream `/responses` error (e.g. "input
  exceeds the context window") as a `{"error":{...}}` frame on an otherwise-200
  SSE stream. The stream consumer only understood `choices`-bearing chunks, so
  it silently dropped the error frame, accumulated nothing, and synthesized an
  empty `finish_reason:"stop"` completion with no usage: the silent
  "Response Generated" with an empty body (and a $0-billed turn).

  Detect an in-band error frame and fail the task with `upstream_error`
  carrying the upstream message, instead of completing empty.
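
A minimal sketch of the fix's idea, assuming the SSE stream has already been split into `data:` payload strings. The names `consumeFrames` and `StreamOutcome` are hypothetical, and the real consumer handles more frame types than shown here.

```typescript
type StreamOutcome =
  | { ok: true; text: string }
  | { ok: false; code: "upstream_error"; message: string };

// Hypothetical consumer over already-extracted SSE data payloads.
function consumeFrames(payloads: string[]): StreamOutcome {
  let text = "";
  for (const payload of payloads) {
    if (payload === "[DONE]") break;
    const frame = JSON.parse(payload);
    // In-band error frame: fail the task instead of silently skipping it.
    if (frame.error !== undefined && frame.error !== null) {
      const message: string =
        typeof frame.error.message === "string" ? frame.error.message : "upstream error";
      return { ok: false, code: "upstream_error", message };
    }
    // Ordinary chunk: accumulate any content deltas.
    const choices: any[] = Array.isArray(frame.choices) ? frame.choices : [];
    for (const choice of choices) {
      const delta = choice.delta;
      if (delta && typeof delta.content === "string") text += delta.content;
    }
  }
  return { ok: true, text };
}

const okOutcome = consumeFrames(['{"choices":[{"delta":{"content":"hi"}}]}', "[DONE]"]);
const errOutcome = consumeFrames(['{"error":{"message":"input exceeds the context window"}}']);
```

The key design point is that the error check runs before the `choices` handling, so a frame without `choices` can no longer fall through as an empty success.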
@vicoop-bridge/client@0.35.1
Patch Changes
- 5b47bf5: fix(client): fixed-segment splitting for openai-compat history cache

  The default-on history cache (#372) emitted the frozen prefix as a single
  growing block with one `cache_control` breakpoint. When the conversation
  advances a turn the block's bytes change, and because Anthropic's cache matches
  at block boundaries (read-lookback walks back up to 20 blocks), there is no
  boundary at the previous turn's freeze point, so the lookback only re-matches
  at the stable system+tools boundary and the entire history is re-created every
  forward turn.

  A controlled fresh-data A/B (no pre-warm) confirmed this and corrected the #372
  "caching already works" reading, which had measured pre-warmed cache from
  repeated deterministic runs:

  | rollover turn (200→210 entries) | non_cached (≈creation) | cache_read |
  | --- | --- | --- |
  | single growing block (before) | 220,929 | 0 |
  | fixed segments (after) | 16,013 | 180,468 |

  `formatChatHistoryBlocks` now serializes the frozen prefix as one block per
  `FREEZE_STEP_ENTRIES` entries at absolute boundaries (so older segments never
  re-flow), with `cache_control` on only the last complete segment. On a
  rollover the new breakpoint's read-lookback finds the prior turn's entry one
  block back at the previous segment boundary and reads the whole frozen prefix,
  re-creating only the new segment + tail, i.e. creation becomes O(step) per turn
  instead of O(full history). Validated at production depth (20 segments): the
  lookback match is always one block back, well within the 20-block window.

  This is what the rolling-anchor approach was reaching for, but it uses one
  breakpoint (reads ride the lookback, not a second explicit anchor), so claude's
  system+tools+1 plus this one = 4, which fits Anthropic's budget. That is why the
  rolling-anchor patch tripped `400 maximum of 4 blocks` and this does not. The
  concatenated text the model reads is byte-identical
  (`serialize(a) + ",\n" + serialize(b) == serialize(a ++ b)`); the existing latch
  still falls back to the unsplit block if a breakpoint is ever rejected.
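
The fixed-segment scheme can be sketched as below. This is an illustration under stated assumptions, not `formatChatHistoryBlocks` itself: the names `splitFrozenHistory`, `serializeEntries`, and `HistoryBlock` are hypothetical, entries are simplified to strings, and the step of 10 is only inferred from the 200→210 rollover in the table.

```typescript
interface HistoryBlock {
  text: string;
  cacheControl?: { type: "ephemeral" };
}

// Simplified serializer: one JSON string per entry, joined with ",\n".
function serializeEntries(entries: string[]): string {
  return entries.map((e) => JSON.stringify(e)).join(",\n");
}

// Hypothetical splitter: one block per `step` entries at absolute boundaries,
// a breakpoint on the last complete segment only, then the unfrozen tail.
function splitFrozenHistory(entries: string[], step: number): HistoryBlock[] {
  const complete = Math.floor(entries.length / step);
  const blocks: HistoryBlock[] = [];
  for (let i = 0; i < complete; i++) {
    const text = serializeEntries(entries.slice(i * step, (i + 1) * step));
    blocks.push(i === complete - 1 ? { text, cacheControl: { type: "ephemeral" } } : { text });
  }
  const tail = entries.slice(complete * step);
  if (tail.length > 0) blocks.push({ text: serializeEntries(tail) });
  return blocks;
}

function makeEntries(count: number): string[] {
  const out: string[] = [];
  for (let i = 0; i < count; i++) out.push("entry-" + i);
  return out;
}

const STEP = 10; // assumed value for illustration
const blocksAt25 = splitFrozenHistory(makeEntries(25), STEP);
const blocksAt32 = splitFrozenHistory(makeEntries(32), STEP);
const rejoinedAt25 = blocksAt25.map((b) => b.text).join(",\n");
```

Growing the history from 25 to 32 entries leaves the first two blocks byte-for-byte unchanged and only moves the breakpoint forward by one segment, which is what lets a lookback land on the previous boundary.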
@vicoop-bridge/client@0.35.0
Minor Changes
- 29b701d: The bridge usage API now reports a canonical, backend-agnostic shape (`BridgeUsage`) for both backends, so a single consumer can read remaining quota regardless of backend:

  - claude (new): reports the operator's Claude subscription quota. Reads the Claude Code OAuth token from the host (macOS Keychain, or `~/.claude/.credentials.json` on Linux/Windows), calls the authenticated `api/oauth/usage` endpoint (5-hour / weekly / Sonnet windows + monetary extra-usage), cached ~5 min to respect the endpoint's self-rate-limit. When the read fails it serves the last successful snapshot (stale, annotated) or an explicit `source: 'none'`.
  - vicoop-codex: its serve `/usage` payload is now normalised into the same shape (was forwarded verbatim).

  Canonical shape:
  `{ backend, source, fetchedAt, accounts: [{ id, label?, plan?, windows: [{ id, label, usedPercent, resetsAt, severity }], spend? }], note?, raw }`. Conventions are fixed: `usedPercent` is 0–100 percent used (remaining = 100 − usedPercent), `resetsAt` is ISO 8601, and the verbatim upstream payload is preserved under `raw`.

  The claude OAuth read path also gained, to match the reference monitor's robustness: `$CLAUDE_CONFIG_DIR` support for the credentials file; an official-client `User-Agent: claude-code/<version>` (discovered from the CLI) + `Content-Type`; `Retry-After`-aware backoff on 429; serving the last successful snapshot (stale, annotated) on a transient failure; best-effort CLI-delegated token refresh on auth expiry/401; and a retry-storm guard that won't re-send a known-dead token until it rotates.

  The stream's `rate_limit_event` is captured only to enrich `spend.resetsAt` (the monthly overage reset the oauth `extra_usage` block omits); it is deliberately NOT used as a usage fallback, because it reports only the single most-constrained window (e.g. a near-cap overage meter) and would misrepresent the subscription quota.
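
As a rough illustration of consuming the canonical shape, the sketch below types the fields named above and computes the tightest remaining quota. The field names come from the release note; the concrete TypeScript types, the helper `tightestRemainingPercent`, and the sample window ids and values are assumptions.

```typescript
// Field names follow the canonical shape; the types are illustrative guesses.
interface UsageWindow {
  id: string;
  label: string;
  usedPercent: number; // 0-100 percent used
  resetsAt: string;    // ISO 8601
  severity: string;
}

interface UsageAccount {
  id: string;
  label?: string;
  plan?: string;
  windows: UsageWindow[];
  spend?: unknown;
}

interface BridgeUsage {
  backend: string;
  source: string;
  fetchedAt: string;
  accounts: UsageAccount[];
  note?: string;
  raw: unknown;
}

// Hypothetical consumer: smallest (100 - usedPercent) across all windows,
// or null when no window is reported (e.g. source 'none').
function tightestRemainingPercent(usage: BridgeUsage): number | null {
  let remaining: number | null = null;
  for (const account of usage.accounts) {
    for (const w of account.windows) {
      const r = 100 - w.usedPercent;
      if (remaining === null || r < remaining) remaining = r;
    }
  }
  return remaining;
}

// Made-up sample payload for illustration only.
const sampleUsage: BridgeUsage = {
  backend: "claude",
  source: "oauth",
  fetchedAt: "2025-01-01T00:00:00Z",
  accounts: [
    {
      id: "default",
      windows: [
        { id: "five_hour", label: "5-hour", usedPercent: 40, resetsAt: "2025-01-01T05:00:00Z", severity: "ok" },
        { id: "weekly", label: "Weekly", usedPercent: 85, resetsAt: "2025-01-07T00:00:00Z", severity: "warn" },
      ],
    },
  ],
  raw: {},
};

const sampleRemaining = tightestRemainingPercent(sampleUsage);
```

Since both backends normalise into the same shape, a consumer like this does not need to branch on `backend`.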
@vicoop-bridge/client@0.34.0
Minor Changes
- 314a0a9: feat(client): openai-compat/v1 reasoning channel (claude + vicoop-codex) +
  shared liveness heartbeat (all backends).

  Reasoning channel. The claude and vicoop-codex backends now forward the
  model's reasoning on a dedicated openai-compat/v1 `reasoning` channel: a
  separate artifact (`claude-reasoning` / `vicoop-codex-reasoning`) carrying
  `metadata[openai-compat/v1] = { channel: "reasoning" }`, kept on a distinct
  artifact id so reasoning never co-mingles with the answer. The claude side
  surfaces `thinking_delta` stream events and injects a `MAX_THINKING_TOKENS`
  budget on openai-compat spawns so Claude Code emits thinking on the wire
  (budget defaults to 8000, configurable via `--claude-thinking-budget` /
  `backends.claude.thinking_budget`); the vicoop-codex side surfaces
  `delta.reasoning_content` chunks (already enabled serve-side via
  `summary:"auto"`, no thinking-enablement injection needed). This lets the
  a2x-internal-router treat a long silent reasoning turn as alive instead of
  false-failing-over it (planetarium/a2x-internal-router#95, #375, #376).

  Each reasoning channel is ON by default and individually disablable:
  `--no-claude-reasoning` / `backends.claude.reasoning: false` and
  `--no-vicoop-codex-reasoning` / `backends['vicoop-codex'].reasoning: false`.
  Disable the reasoning channel when the deployed oai2a2a codec predates
  0.6.0: an old codec doesn't understand the channel marker and would fold the
  reasoning artifact into the answer (the #95 rollout-order hazard). Claude
  redacted-thinking blocks are never forwarded.

  Liveness heartbeat. Every backend's shared task loop (claude, codex,
  openclaw, and now vicoop-codex) emits a tagged liveness heartbeat: the idle
  `working` `task.status` beat now fires every 10s of silence (was a per-backend
  30s beat) and carries `metadata[openai-compat/v1] = { heartbeat: true }`. The
  bridge server maps this onto the A2A `TaskStatusUpdateEvent.metadata`, where the
  oai2a2a codec (≥0.6.0) translates it to a `: a2a-heartbeat` SSE comment that
  re-arms the router's first-content / stall watchdog. This keeps a backend that
  is alive but byte-silent (long reasoning, tool runs) observably alive so it
  isn't false-failed-over, while a backend that errors (`task.fail`) ends the loop
  and stops heartbeating so failover still works
  (planetarium/a2x-internal-router#95). The 10s cadence sits at or below half the
  router's tightened 25–30s window; heartbeats carry no content and are safe to
  emit unconditionally.
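
The idle-beat behaviour described above can be sketched with an explicit clock so the logic is easy to follow. The class `IdleHeartbeat`, its method names, and the `StatusBeat` type are hypothetical; only the 10s interval and the `{ heartbeat: true }` metadata tag come from the release note.

```typescript
interface StatusBeat {
  state: "working";
  metadata: { "openai-compat/v1": { heartbeat: true } };
}

// Hypothetical idle-beat tracker driven by caller-supplied timestamps (ms).
class IdleHeartbeat {
  private readonly intervalMs: number;
  private lastOutputMs: number;

  constructor(intervalMs: number, startMs: number) {
    this.intervalMs = intervalMs;
    this.lastOutputMs = startMs;
  }

  // Any real output (content, reasoning, tool events) resets the silence window.
  noteActivity(nowMs: number): void {
    this.lastOutputMs = nowMs;
  }

  // Call periodically; yields a tagged beat once silence has lasted intervalMs.
  poll(nowMs: number): StatusBeat | null {
    if (nowMs - this.lastOutputMs < this.intervalMs) return null;
    this.lastOutputMs = nowMs;
    return { state: "working", metadata: { "openai-compat/v1": { heartbeat: true } } };
  }
}

const beat = new IdleHeartbeat(10000, 0);
const tooEarly = beat.poll(5000);        // only 5s of silence
const firstBeat = beat.poll(10000);      // 10s of silence
beat.noteActivity(18000);                // real output arrives
const afterActivity = beat.poll(27000);  // 9s since the output
const secondBeat = beat.poll(28000);     // 10s since the output
```

A failed task would simply stop calling `poll`, which matches the note's point that heartbeats end with the loop so failover still triggers.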
@vicoop-bridge/client@0.33.1
Patch Changes
- d153dd8: Cache openai-compat `chat_history` on the claude backend (on by default).
  Splits the replayed `<chat_history>` into a frozen prefix carrying a
  `cache_control` breakpoint plus a small tail, so stable conversation history
  reads from Anthropic's prompt cache instead of re-billing at full price every
  turn. The split is byte-identical to the previous single block, so the model
  reads the same history.

  It relies on claude's stream-json input forwarding caller `cache_control`
  (undocumented) and shares the API's 4-breakpoint budget with claude's own
  system/tools markers. If claude ever rejects the breakpoint (e.g. a future CLI
  build whose own markers exhaust the budget), a process-wide latch auto-disables
  the split: that task fails, every later task falls back to the unsplit block,
  and a daemon restart re-arms it. Hard-disable with
  `VICOOP_DISABLE_OAI_HISTORY_CACHE=1`.
- e136d46: vicoop-codex backend: emit a per-task `timing` breadcrumb (debug-gated) that
  stamps serveReady / firstByte / firstDelta / total milestones, so operators
  can split model-wait from streaming time on a slow turn. Opt in with
  `VICOOP_CLIENT_LOG_LEVEL=debug`; no new output at the default `info` level.
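
The process-wide latch from the d153dd8 entry above could look something like this sketch. The class `HistoryCacheLatch` and its methods are hypothetical; only the `VICOOP_DISABLE_OAI_HISTORY_CACHE=1` switch and the trip-once, re-arm-on-restart behaviour are taken from the note. The environment is passed in as a plain object so the example stays self-contained.

```typescript
// Hypothetical latch: one instance per daemon process.
class HistoryCacheLatch {
  private tripped = false;
  private readonly hardDisabled: boolean;

  constructor(env: { [key: string]: string | undefined }) {
    this.hardDisabled = env["VICOOP_DISABLE_OAI_HISTORY_CACHE"] === "1";
  }

  // True while the split (frozen prefix + breakpoint) may still be used.
  shouldSplit(): boolean {
    return !this.hardDisabled && !this.tripped;
  }

  // Called when claude rejects the breakpoint; later tasks use the unsplit block.
  onBreakpointRejected(): void {
    this.tripped = true;
  }
}

const latch = new HistoryCacheLatch({});
const splitBeforeRejection = latch.shouldSplit();
latch.onBreakpointRejected();
const splitAfterRejection = latch.shouldSplit();

// A daemon restart corresponds to constructing a fresh latch.
const splitAfterRestart = new HistoryCacheLatch({}).shouldSplit();
const splitWhenHardDisabled = new HistoryCacheLatch({ VICOOP_DISABLE_OAI_HISTORY_CACHE: "1" }).shouldSplit();
```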
@vicoop-bridge/client@0.33.0
Minor Changes
- 9627f0e: Add opt-in crash telemetry. Off by default: the client only loads or
initializes the Sentry SDK when config.json has `"telemetry": "on"`. Opt in
with `vicoop-client agent register --enable-telemetry` (persists the field) or
by hand-editing config.json; disable by removing the field. When on, only
crash reports are sent: exception class + stack trace with the operator's home
path redacted. Tracing is disabled, breadcrumbs/console capture are suppressed,
and `sendDefaultPii` is off, so prompts, code, agent output, tokens, and logs
are never transmitted. The daemon prints a one-line disclosure at registration
and at startup. DSN is configurable via `VICOOP_CLIENT_SENTRY_DSN`.
- 8b91fe3: vicoop-codex backend: report per-account Codex usage to the bridge on request. The client answers the new
`usage.request` frame by querying its local `vicoop-codex serve` `/usage` endpoint, which backs the server's admin/owner-only `GET /admin-api/agents/:id/usage` API.
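
For the telemetry entry above, the opt-in gate and home-path redaction can be sketched as two small helpers. Both `telemetryEnabled` and `redactHomePath` are hypothetical names, the `~` placeholder is an assumption, and no Sentry API is shown or implied.

```typescript
interface ClientConfig {
  telemetry?: string;
}

// Telemetry is on only for the exact value "on"; a missing field means off.
function telemetryEnabled(config: ClientConfig): boolean {
  return config.telemetry === "on";
}

// Hypothetical redaction: replace every occurrence of the home directory.
// The "~" placeholder is an illustrative choice.
function redactHomePath(stack: string, homeDir: string): string {
  if (homeDir === "") return stack;
  return stack.split(homeDir).join("~");
}

const sampleStack =
  "Error: boom\n    at run (/home/alice/app/dist/main.js:10:5)\n    at /home/alice/app/dist/cli.js:3:1";
const redactedStack = redactHomePath(sampleStack, "/home/alice");
```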
Patch Changes
- 7b92dbf: claude backend: report the actual response model in the OpenAI-compatible
envelope. The envelope's top-level `model` (and its embedded `usage.model`)
now resolve from model ids claude itself reports (the model named on the
`assistant` turn, falling back to the `system/init` resolved model) instead
of the `result.modelUsage` largest-output-share heuristic, which on short
responses could be dominated by an internal sub-model (e.g.
`claude-haiku-4-5-*` used for title generation) and mislabel the envelope even
when the requested override model handled the request. The requested
`envelope.model` is deliberately not used as a fallback, since it may be a
routing slug or an A2A card url rather than a real model id. `modelUsage` is
now used only to sum token counts (#348).
- 00c9a6a: codex backend: recover real token usage on OpenAI-compatible tool-call
turns by deferring `turn/interrupt` until codex's
`thread/tokenUsage/updated` lands (#351). Previously the bridge
interrupted the turn the moment the model invoked a caller tool, which
raced ahead of codex app-server's token accounting: the accounting only
runs after the bridge answers the `item/tool/call` request, and an
interrupt in flight at that point drops the turn's usage everywhere (no
notification, no `turn/completed` payload, `info: null` even in codex's
own rollout record), so the router billed the request as
`total_tokens=0`. The interrupt is now held until the usage notification
for the turn arrives (measured at 15–40ms on codex 0.139, well ahead of
the ~500ms a next model iteration needs to start) with a 1s backstop
timer, configurable via `toolCallUsageWaitMs`. When codex still reports
nothing, the `{0,0,0}` placeholder remains, and the bridge now logs a
`tokenUsage unavailable` diagnostic so zero-usage records are
explainable without `--openai-compat-trace`.
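
The model-resolution order from the 7b92dbf entry above can be sketched as follows. The `ClaudeRunInfo` shape and both helper names are hypothetical; the sketch only shows the described precedence (assistant turn, then `system/init`, never the requested slug) and that `modelUsage` contributes token sums only.

```typescript
// Hypothetical summary of what a claude run reported; field names are assumed.
interface ClaudeRunInfo {
  assistantModel?: string; // model named on the assistant turn
  initModel?: string;      // model resolved in the system/init event
  modelUsage: { [modelId: string]: { inputTokens: number; outputTokens: number } };
}

// Prefer the assistant turn's model, then system/init. The requested
// envelope model is intentionally not consulted.
function resolveResponseModel(run: ClaudeRunInfo): string | null {
  return run.assistantModel ?? run.initModel ?? null;
}

// modelUsage is used only to total tokens across every model that ran.
function sumTokens(run: ClaudeRunInfo): { inputTokens: number; outputTokens: number } {
  let inputTokens = 0;
  let outputTokens = 0;
  for (const id of Object.keys(run.modelUsage)) {
    inputTokens += run.modelUsage[id].inputTokens;
    outputTokens += run.modelUsage[id].outputTokens;
  }
  return { inputTokens, outputTokens };
}

// Short response where an internal sub-model produced more output tokens.
const shortRun: ClaudeRunInfo = {
  assistantModel: "claude-sonnet-4-6",
  initModel: "claude-sonnet-4-6",
  modelUsage: {
    "claude-sonnet-4-6": { inputTokens: 120, outputTokens: 8 },
    "claude-haiku-4-5": { inputTokens: 60, outputTokens: 20 },
  },
};

const resolvedModel = resolveResponseModel(shortRun);
const totalTokens = sumTokens(shortRun);
```

Under the old largest-output-share heuristic this sample would have been labelled with the haiku sub-model; resolving from the reported ids avoids that.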
@vicoop-bridge/client@0.32.0
Minor Changes
- 2dab8db: feat(client): multi-model support on the claude backend via
  `--claude-supported-models`

  Claude Code has no headless "list models" interface, so the claude backend
  used to advertise (and accept per-request openai-compat `model` overrides
  for) only a single model (the `--claude-model` pin or the startup-probed
  default). Operators can now declare additional models their install can
  serve with `--claude-supported-models claude-sonnet-4-6,claude-haiku-4-5`
  (comma-separated) or `backends.claude.supported_models` in `config.json`. Declared ids
  are advertised on the openai-compat `params.models[]` block after the
  default, and a matching per-request `model` rides to the spawned `claude` as
  `--model <id>`. The `envelope.model` gate now also matches on the normalized
  (tier-suffix-stripped) form, so a caller selecting e.g.
  `claude-opus-4-8[1m]` against an advertised `claude-opus-4-8` passes through
  with the tier selection intact.
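
A minimal sketch of the gate's normalized matching, assuming the tier suffix is a trailing bracketed token such as `[1m]`. The helpers `stripTierSuffix` and `gateRequestedModel` are hypothetical names, not the client's API.

```typescript
// Assumed normalization: drop one trailing "[...]" tier suffix if present.
function stripTierSuffix(modelId: string): string {
  return modelId.replace(/\[[^\]]*\]$/, "");
}

// Hypothetical gate: accept the request when it matches an advertised id
// exactly or after normalization, and pass the original id through so the
// tier selection survives. Returns null when nothing matches.
function gateRequestedModel(requested: string, advertised: string[]): string | null {
  const normalized = stripTierSuffix(requested);
  for (const id of advertised) {
    if (id === requested || stripTierSuffix(id) === normalized) return requested;
  }
  return null;
}

const advertisedModels = ["claude-opus-4-8", "claude-sonnet-4-6", "claude-haiku-4-5"];
const tieredPick = gateRequestedModel("claude-opus-4-8[1m]", advertisedModels);
const unknownPick = gateRequestedModel("some-routing-slug", advertisedModels);
```

Returning the requested id rather than the advertised one is what keeps the tier selection intact on the way to `--model <id>`.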
@vicoop-bridge/client@0.31.0
Minor Changes
- d2d4be4: Add short aliases for the two most-typed daemon flags:
`-c` for `--config` and `-d` for `--detach` (e.g. `vicoop-client start -d -c ./config.json`). The detached child is now kept in the foreground daemon path by the `VICOOP_DETACHED` env guard rather than by argv stripping, so the re-exec stays correct even for optique's bundled short flags (`-dc value` parses as `-d -c value`).