Add multi-host SDK to the TypeScript client #158
Merged
connor4312 merged 6 commits on May 28, 2026
Adds the multi-host SDK as a fourth subpath export `@microsoft/agent-host-protocol/hosts`, mirroring the Rust `ahp::hosts` module:
- `MultiHostClient`: per-host reconnect supervisor, generation-checked client handles, fan-in event streams, aggregated views, manual reconnect, scene-phase `reconnectAllUnavailable` helper.
- `HostConfig` / `HostHandle` / `HostState` / `HostEvent` / `HostSubscriptionEvent` and the `HostMulti*Error` family.
- `ReconnectPolicy` + `Backoff` with jitter, plus `disabled`/`exponential`/`immediateForever` factories.
- `HostTransportFactory`: pluggable transport factory threaded with an `AbortSignal` so consumers can cancel in-flight connects.
- `ClientIdStore` + `InMemoryClientIdStore` for pluggable per-host `clientId` persistence.
- `MultiHostStateMirror` + `hostedResourceKey`: host-aware reducer facade that keys per-resource state by `(hostId, uri)`.
Cancellation in a single-threaded JS runtime uses `AbortController`s threaded through factories, stores, and a `raceWithAbort` helper that attaches a no-op rejection handler so in-flight RPCs that surface `ClientClosedError` after a forced shutdown don't become `unhandledRejection`s.
Includes a `hosts.test.ts` suite (24 new tests, total 47 pass) covering single/multi-host lifecycle, ClientIdStore failure handling, generation invalidation across reconnects, `HostShutDownError` after removal, manual reconnect from the `failed` state, aggregated view sorting, hostEvents ordering, and the lossy fan-in semantics.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
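To make the cancellation paragraph concrete, here is a minimal sketch of what a `raceWithAbort`-style helper could look like. It is not the code from this PR: the `ABORTED` symbol, the `Aborted` type, and the listener bookkeeping are assumptions made for illustration. Only the general idea (race a promise against an `AbortSignal`, and attach a no-op rejection handler so a late rejection is never unhandled) comes from the commit message.

```typescript
// Hypothetical sentinel returned when the signal wins the race.
const ABORTED = Symbol('aborted');
type Aborted = typeof ABORTED;

function raceWithAbort<T>(promise: Promise<T>, signal: AbortSignal): Promise<T | Aborted> {
  // No-op handler: guarantees a rejection of `promise` is always observed,
  // including on the early-return path below where nothing else listens.
  promise.catch(() => {});

  if (signal.aborted) {
    return Promise.resolve(ABORTED);
  }

  return new Promise<T | Aborted>((resolve, reject) => {
    const onAbort = (): void => resolve(ABORTED);
    signal.addEventListener('abort', onAbort, { once: true });
    promise.then(
      (value) => {
        signal.removeEventListener('abort', onAbort);
        resolve(value);
      },
      (error) => {
        signal.removeEventListener('abort', onAbort);
        // If the abort already resolved the outer promise, this is a no-op.
        reject(error);
      },
    );
  });
}
```

A caller would compare the result against the sentinel (`if (result === ABORTED) return;`) rather than catching an exception, which keeps the shutdown path free of error handling for errors nobody cares about any more.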
d2db59c to
c9c2b91
Pull request overview
Adds a new TypeScript multi-host orchestration layer exposed as @microsoft/agent-host-protocol/hosts, bringing the TS client closer to the Rust/Swift SDK surface by providing per-host supervisors, reconnect policy/backoff, fan-in events, aggregated views, and host-aware state mirroring.
Changes:
- Introduces `MultiHostClient` + supporting types/errors (`Host*`, `ReconnectPolicy`, `ClientIdStore`, generation-checked `HostClientHandle`).
- Implements per-host runtime supervisor with reconnect/backoff, replay handling, and fan-in event publishing.
- Adds a comprehensive `hosts.test.ts` suite and updates docs/package exports to expose the new `/hosts` entry point.
Show a summary per file
| File | Description |
|---|---|
| clients/typescript/src/client/hosts/index.ts | Exposes the public /hosts surface (types, errors, helpers, mirrors). |
| clients/typescript/src/client/hosts/multi.ts | Implements the MultiHostClient registry façade and fan-in streams. |
| clients/typescript/src/client/hosts/runtime.ts | Implements the per-host supervisor loop, connect/reconnect, and event fan-out. |
| clients/typescript/src/client/hosts/types.ts | Defines public multi-host types, errors, and config resolution. |
| clients/typescript/src/client/hosts/policy.ts | Adds reconnect policy/backoff + deterministic helpers. |
| clients/typescript/src/client/hosts/host-client-handle.ts | Adds generation-checked client handles (HostClientHandle). |
| clients/typescript/src/client/hosts/state-mirror.ts | Adds host-aware reducer mirror keyed by (hostId, uri). |
| clients/typescript/src/client/hosts/factory.ts | Defines HostTransportFactory contract and docs. |
| clients/typescript/src/client/hosts/client-id-store.ts | Defines ClientIdStore + InMemoryClientIdStore. |
| clients/typescript/test/hosts.test.ts | Adds integration/unit tests for multi-host behavior and helpers. |
| clients/typescript/README.md | Documents new /hosts entry point and multi-host usage. |
| clients/typescript/package.json | Adds ./hosts subpath export to the package. |
Copilot's findings
- Files reviewed: 12/12 changed files
- Comments generated: 2
Two bugs flagged by the Copilot reviewer; both contradicted the module's own documented contracts.
1. Manual reconnect couldn't interrupt an in-flight `connectOnce`.
The supervisor raced transport/initialize/reconnect against `shutdownController.signal` only, so a manual `reconnectHost()` call would hang until the slow factory or handshake finished, even though `HostTransportFactory` documents that the signal is aborted on manual reconnect. Fix: build a combined signal (shutdown OR manualReconnect) at the top of `connectOnce`, pass it to the factory and every `raceWithAbort` call, and have the supervisor recognise the manual-reconnect path so it resets the controller and retries immediately without backoff or a phantom warning. New `linkAbortSignals` helper composes the signals.
2. `addHost` could leak a runtime when `shutdown` raced it.
If `shutdown()` ran while `addHost()` was awaiting `ClientIdStore.load`, the resumed `addHost` would still create and start a HostRuntime after the multi-host client had closed its queues. Fix: re-check `assertOpen()` after the store await so the same `HostShutDownError` is surfaced and no runtime is registered. Doc updated to mention the new throw condition.
Tests added:
- 'manual reconnectHost aborts a slow in-flight transport factory' pins a factory on a never-resolving promise, asserts the captured signal aborts on manual reconnect, and the second attempt connects.
- 'addHost throws HostShutDownError when shutdown lands during ClientIdStore.load' sequences shutdown between the load and its resolution and asserts no runtime is registered and the factory was never invoked.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
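The "shutdown OR manualReconnect" composition from the first fix can be sketched roughly as follows. This is an illustrative assumption, not the helper from `runtime.ts`: the `LinkedSignal` shape and the `dispose` return value are choices made here so that a long-lived source signal does not keep listeners from finished connect attempts.

```typescript
// Hypothetical return shape: the combined signal plus a cleanup hook.
interface LinkedSignal {
  signal: AbortSignal;
  dispose: () => void;
}

function linkAbortSignals(...sources: AbortSignal[]): LinkedSignal {
  const controller = new AbortController();
  const cleanups: Array<() => void> = [];

  // Detach every listener registered on the source signals.
  const dispose = (): void => {
    for (const cleanup of cleanups.splice(0)) cleanup();
  };

  for (const source of sources) {
    if (source.aborted) {
      // One source is already aborted: the combined signal aborts right away.
      controller.abort();
      break;
    }
    const onAbort = (): void => {
      controller.abort();
      dispose();
    };
    source.addEventListener('abort', onAbort, { once: true });
    cleanups.push(() => source.removeEventListener('abort', onAbort));
  }

  // Listeners added before hitting an already-aborted source are no longer needed.
  if (controller.signal.aborted) dispose();

  return { signal: controller.signal, dispose };
}
```

On runtimes that ship the built-in `AbortSignal.any`, that API covers the composition itself; a hand-rolled helper like this one is mainly useful where explicit listener cleanup or older runtime support matters.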
Five additional findings from the Copilot reviewer, all valid:
1. linkAbortSignals listener leak (runtime.ts)
Listeners were only removed on abort, so successful connects
accumulated abort listeners on the long-lived per-runtime
shutdownController.signal across reconnect cycles. Fixed by
returning a `dispose` cleanup function and calling it in
`connectOnce`'s outer finally.
2. & 3. HostShutDownError overloaded with 'not connected yet'
(runtime.ts + types.ts)
`HostRuntime.subscribe/dispatch` used HostShutDownError for both
permanent teardown and the transient 'currentClient is null'
case, making it impossible for callers to branch on recovery.
Introduced a new HostNotConnectedError, exported from the
/hosts entry, thrown when the host is registered but not
currently connected. HostShutDownError docs updated to clarify
it is for permanent teardown only.
4. ClientIdStore signal docs vs reality (multi.ts)
The ClientIdStore.load/store signal was documented as 'aborted
on shutdown', but MultiHostClient never actually passed one.
Added a private shutdownController to MultiHostClient that
aborts at the start of shutdown(), and threaded its signal
through resolveClientId into both load() and store().
5. hostedResourceKey collision (state-mirror.ts)
The previous `${hostId}\0${uri}` encoding could collide if a
HostId happened to contain a literal NUL. Switched to a
length-prefixed encoding `${hostId.length}\0${hostId}${uri}`
that is unambiguous for any string content (including \0),
plus a private hostedResourceKeyPrefix() helper consumed by
resetHost so prefix matching stays in sync with the encoding.
Tests added (49 -> 54):
- 'hostedResourceKey is collision-safe across awkward hostId / uri
pairs' exercises the new encoding.
- 'subscribe on a registered-but-not-yet-connected host throws
HostNotConnectedError' verifies the new error and that the URI
is still tracked for replay.
- 'dispatch on a registered-but-not-yet-connected host throws
HostNotConnectedError' covers the parallel dispatch path.
- 'ClientIdStore.load / store receive the multi-host shutdown
signal' captures the signals and asserts they abort on shutdown.
- 'repeated reconnect cycles do not accumulate abort listeners on
the shutdown signal' drives 20+ disconnect/reconnect cycles and
asserts Node never emits MaxListenersExceededWarning.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
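The collision described in finding 5 is easy to reproduce. The sketch below uses the two encodings quoted in the commit message; the function bodies are reconstructed from those template strings, and `naiveKey` is a hypothetical name for the previous encoding, so treat this as an illustration rather than the shipped code.

```typescript
type HostId = string;

// Previous encoding (hypothetical name): ambiguous when hostId or uri contains "\0".
function naiveKey(hostId: HostId, uri: string): string {
  return `${hostId}\0${uri}`;
}

// Length-prefixed encoding: the decimal length tells a reader exactly where hostId ends.
function hostedResourceKeyPrefix(hostId: HostId): string {
  return `${hostId.length}\0${hostId}`;
}

function hostedResourceKey(hostId: HostId, uri: string): string {
  return hostedResourceKeyPrefix(hostId) + uri;
}

// Two different (hostId, uri) pairs that the old encoding maps to the same key.
const pairA: [HostId, string] = ['a\0b', 'c'];
const pairB: [HostId, string] = ['a', 'b\0c'];

const naiveCollides = naiveKey(...pairA) === naiveKey(...pairB);
const prefixedCollides = hostedResourceKey(...pairA) === hostedResourceKey(...pairB);
```

Because the prefix pins both the length and the content of `hostId`, a `startsWith(hostedResourceKeyPrefix(hostId))` check selects exactly the keys of one host, which is what a `resetHost`-style sweep needs.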
…williams-typescript-hosts-module
Two findings from the Copilot reviewer, both valid:
1. HostHandle 'already-frozen arrays' doc was inaccurate (types.ts)
snapshotHandle() builds snapshots via spread/Array.from; there is no Object.freeze involved. Reworded the doc to 'shallow-cloned arrays' and explained that the supervisor doesn't mutate snapshots after construction, so consumers should treat them as immutable without claiming deep immutability.
2. generateClientId Math.random fallback (types.ts)
When crypto.randomUUID was unavailable we jumped straight to Math.random for the 16 random bytes, which is weaker than the crypto.getRandomValues path that most browsers without randomUUID still expose. Inserted getRandomValues() as the middle tier of the fallback chain; Math.random is only used as a last resort when no Web Crypto API is exposed at all.
Tests added (54 -> 56):
- 'generateClientId prefers crypto.getRandomValues over Math.random when randomUUID is missing' stubs globalThis.crypto with only getRandomValues, fills with a deterministic pattern, and asserts it was called once and the resulting UUID is correctly framed (version 4, variant 8/9/a/b).
- 'generateClientId still returns a UUIDv4-shaped string when only Math.random is available' stubs globalThis.crypto to undefined and asserts the format is preserved.
Both stub via a withStubbedCrypto() helper that uses Object.defineProperty since globalThis.crypto is a getter in Node and plain assignment throws.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
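The three-tier fallback from finding 2 could look something like the sketch below. It is an assumption-laden illustration, not the function in `types.ts`: the `MinimalCrypto` interface, the `ambientCrypto` helper, and the injectable `webCrypto` parameter are all hypothetical and exist only to keep the example self-contained and testable without stubbing globals.

```typescript
// Hypothetical minimal view of the Web Crypto surface this sketch relies on.
interface MinimalCrypto {
  randomUUID?: () => string;
  getRandomValues?: (array: Uint8Array) => Uint8Array;
}

function ambientCrypto(): MinimalCrypto | undefined {
  return (globalThis as unknown as { crypto?: MinimalCrypto }).crypto;
}

function generateClientId(webCrypto: MinimalCrypto | undefined = ambientCrypto()): string {
  // Tier 1: native UUID generation.
  if (webCrypto?.randomUUID) {
    return webCrypto.randomUUID();
  }

  const bytes = new Uint8Array(16);
  if (webCrypto?.getRandomValues) {
    // Tier 2: cryptographically strong bytes, formatted by hand below.
    webCrypto.getRandomValues(bytes);
  } else {
    // Tier 3 (last resort): Math.random, not cryptographically strong.
    for (let i = 0; i < bytes.length; i++) {
      bytes[i] = Math.floor(Math.random() * 256);
    }
  }

  // Frame as UUID version 4 with the 10xx variant bits.
  bytes[6] = (bytes[6] & 0x0f) | 0x40;
  bytes[8] = (bytes[8] & 0x3f) | 0x80;

  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join('-');
}
```

Passing an object that only implements `getRandomValues` exercises the middle tier, and passing `{}` exercises the `Math.random` tier; this is a simpler stand-in for the `Object.defineProperty` stubbing the commit describes.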
…williams-typescript-hosts-module
connor4312
approved these changes
May 28, 2026
What
Adds the multi-host SDK to the TypeScript client as a fourth subpath export, `@microsoft/agent-host-protocol/hosts`, mirroring the Rust `ahp::hosts` module and the Swift `MultiHostClient` actor.
Why
This is the largest remaining gap in the TypeScript client's surface relative to the Rust SDK. A real product that talks to two or more hosts at once (a local sessions server + a tunnel-attached remote, a personal host + a teammate's, N project hosts in a desktop sidebar, …) otherwise hand-rolls the same boilerplate: N `AhpClient`s, N transports + reconnect supervisors with backoff, a per-host metadata registry, per-host scoping of resource URIs, a fan-in of inbound events tagged with host of origin, and `clientId` persistence so reconnect identity survives launches. This PR ships that layer.
Surface
What ships:
- `MultiHostClient`: `single`, `addHost`, `removeHost`, `reconnectHost`, `reconnectAllUnavailable`, `host` / `hostsSnapshot`, `client` (returns a generation-checked `HostClientHandle`), `subscribe` / `unsubscribe` / `dispatch`, `events` / `hostEvents`, `aggregatedSessions` / `aggregatedAgents`, `shutdown`.
- `HostConfig` / `HostHandle` / `HostState` discriminated union (`disconnected` | `connecting` | `connected` | `reconnecting` | `failed`) / `HostEvent` / `HostSubscriptionEvent` / `HostedSessionSummary` / `HostedAgent`.
- `AhpClientError`: `HostMultiError`, `UnknownHostError`, `DuplicateHostError`, `HostReconnectedError` (carries both `handleGeneration` and `currentGeneration` for clean stale-handle recovery), `HostShutDownError`, `ClientIdStoreError`.
- `ReconnectPolicy` + `Backoff` with jitter, plus `disabledPolicy()` / `exponentialPolicy()` (default: 250 ms → 30 s, ×2, 25 % jitter, retry forever) / `immediateForeverPolicy()`. Pure functions `backoffDelayForAttempt`, `delayWithJitter`, `attemptsExhausted` are exposed for deterministic testing.
- `HostTransportFactory`: `(hostId, signal) => Promise<AhpTransport>`. The `AbortSignal` is aborted on remove / shutdown so factories can bail out of a slow handshake instead of blocking teardown.
- `ClientIdStore` + `InMemoryClientIdStore`: pluggable persistence for stable per-host `clientId`s. Explicit > stored > generated; resolved id is always written back. Store failures surface as `ClientIdStoreError` from `addHost`.
- `MultiHostStateMirror` + `hostedResourceKey`: host-aware reducer façade that keys session/terminal/changeset state by `(hostId, uri)` so URIs that legitimately collide across hosts (the normal case for session URIs) don't clobber.
Design notes
- `tokio::select!` racing futures against `shutdown_signal.notified()`. JS can't preempt an in-flight `await`, so the port threads `AbortSignal` through `HostTransportFactory` and `ClientIdStore.load` / `store`, and routes every internal handshake through a `raceWithAbort(promise, signal)` helper that resolves with an `ABORTED` sentinel on signal abort. That helper attaches a no-op rejection handler to the inner promise so late `ClientClosedError`s (from a half-built `AhpClient.initialize` whose client got `shutdown()`'d in the `finally`) don't surface as `unhandledRejection`s.
- `generation`. `HostClientHandle` reads `generation` + `shutdownReason` through a shared reference on every `checkAlive`: handle-after-reconnect surfaces `HostReconnectedError`; handle-after-`removeHost` surfaces `HostShutDownError`. Both errors are recoverable via a fresh `multi.client(hostId)` (or, for removal, the host being re-added).
- `connected`. Reconnect-replay envelopes are fanned through the per-host state mirror and the cross-host event tap before `transitionTo({status: 'connected'})` / `HostEvent.connected`, so consumers observing the connected event already see catch-up state. `ReconnectResult.Snapshot` only drops URIs that were in the prior subscriptions set AND missing from returned snapshots; URIs added locally between disconnect and reconnect survive.
- `addHost` race safety. A pending-host id set guards against concurrent `addHost` calls slipping past the duplicate check while one is awaiting the `ClientIdStore`. The reservation is cleared in `finally` so any error path (including store failures) frees the slot for a retry.
- `initialize` / `reconnect` so notifications pushed between the handshake response and the moment the supervisor enters its event loop are captured instead of dropped by the broadcast queue's no-replay-for-late-readers semantics.
- `HostId` as `string` alias. Kept a plain `string` for ergonomics: equality is `===` and tests pass string literals directly. A branded type was considered but added too much friction for the surface gain.
Not in this PR
- `FileClientIdStore`. A filesystem-backed `ClientIdStore` would pull in `node:fs` and break browser bundlers even if consumers never instantiate it. Browsers can implement `ClientIdStore` against `localStorage` / IndexedDB; Node/Electron consumers can wrap `node:fs/promises` or `safeStorage`. The interface is the contract. If a Node-only adapter ends up wanted later, it can ship as a separate `@microsoft/agent-host-protocol/hosts/file-client-id-store` subpath without re-exporting from the browser-safe `/hosts` entry point.
- `docs/guide/connecting-to-multiple-hosts.md` page. The TS client's README has a full Multi-host orchestration section with a runnable example; mirroring the Rust SDK's `MULTI_HOST.md` into the docs site can ship in a follow-up if the docs site wants a language-agnostic narrative.
Verification
From a fresh `npm run generate:typescript` at the repo root, then in `clients/typescript`:
- `npm run typecheck` clean.
- `npm test`: 47 tests pass (24 new). Coverage includes:
  - `MultiHostClient.single` / `addHost` / `removeHost` / `DuplicateHostError`.
  - `addHost` surfaces `ClientIdStoreError` and frees the reservation so a retry succeeds.
  - `HostClientHandle.generation` invalidates after a reconnect (carries both generations).
  - `HostClientHandle` after `removeHost` throws `HostShutDownError`.
  - `ReconnectPolicy.delayWithJitter` / `backoffDelayForAttempt` / `attemptsExhausted` shapes (mirrors the Rust unit tests).
  - `InMemoryClientIdStore` round-trip + overwrite.
  - `hostedResourceKey` distinguishes same-URI hosts; `MultiHostStateMirror.applySnapshot` / `resetHost` scoping.
  - `aggregatedSessions` sorts by `modifiedAt` descending and tags `hostLabel`; `aggregatedAgents` tags every agent with host.
  - `hostEvents` lifecycle ordering (`added` → `stateChanged` → `connected` → `removed`).
  - `events` fan-in delivers notifications tagged with `hostId`.
  - `reconnectAllUnavailable` skips connected/connecting hosts and returns per-host errors without throwing.
  - `reconnectHost` wakes a host whose `ReconnectPolicy` is exhausted (the post-`failed` resumption path).
  - `MultiHostClient.shutdown` is idempotent.
  - `subscribe` / `unsubscribe` / `dispatch` on an unknown host throws `UnknownHostError`.
- `npm run build` clean.
- `npm pack --dry-run` ships `dist/client/hosts/`.
- `npm run lint` + `npm run typecheck` unchanged (the ESLint config already excludes `clients/typescript/**` and the root `tsc` only covers `types/`).
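For readers who want a feel for the default `exponentialPolicy()` schedule listed under Surface (250 ms → 30 s, ×2, 25 % jitter), the following sketch shows one way the three pure helpers might be written. The function names come from the PR description, but the signatures, the `Backoff` field names, the 0-based attempt index, and the symmetric jitter range are all assumptions made for illustration; the actual exports may differ.

```typescript
// Hypothetical shape; field names are assumptions, values are the documented defaults.
interface Backoff {
  initialMs: number;
  maxMs: number;
  multiplier: number;
  jitter: number; // fraction of the base delay, e.g. 0.25 for 25 %
}

const DEFAULT_BACKOFF: Backoff = { initialMs: 250, maxMs: 30000, multiplier: 2, jitter: 0.25 };

// Base delay before retry number `attempt` (0-based), capped at maxMs.
function backoffDelayForAttempt(backoff: Backoff, attempt: number): number {
  const raw = backoff.initialMs * Math.pow(backoff.multiplier, attempt);
  return Math.min(raw, backoff.maxMs);
}

// Spreads the delay uniformly over [base * (1 - jitter), base * (1 + jitter)].
// `random` is injectable so tests can be deterministic.
function delayWithJitter(baseMs: number, jitter: number, random: () => number = Math.random): number {
  const offset = (random() * 2 - 1) * jitter;
  return Math.max(0, Math.round(baseMs * (1 + offset)));
}

// `undefined` maxAttempts models "retry forever".
function attemptsExhausted(attempt: number, maxAttempts: number | undefined): boolean {
  return maxAttempts !== undefined && attempt >= maxAttempts;
}
```

Under these assumptions the base delays run 250, 500, 1000, 2000 ms and so on, reaching the 30 s cap at the eighth attempt, and a fixed `random` source makes the jittered value fully predictable in a unit test.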