
Add multi-host SDK to the TypeScript client #158

Merged
connor4312 merged 6 commits into main from colbylwilliams/colbylwilliams-typescript-hosts-module
May 28, 2026

@colbylwilliams colbylwilliams commented May 27, 2026

Member

What

Adds the multi-host SDK to the TypeScript client as a fourth subpath export — @microsoft/agent-host-protocol/hosts — mirroring the Rust ahp::hosts module and the Swift MultiHostClient actor.

Why

This is the largest remaining gap in the TypeScript client's surface relative to the Rust SDK. A real product that talks to two or more hosts at once (a local sessions server + a tunnel-attached remote, a personal host + a teammate's, N project hosts in a desktop sidebar, …) otherwise hand-rolls the same boilerplate: N AhpClients, N transports + reconnect supervisors with backoff, a per-host metadata registry, per-host scoping of resource URIs, a fan-in of inbound events tagged with host of origin, and clientId persistence so reconnect identity survives launches. This PR ships that layer.

Originally drafted as a follow-up to #156 (the now-closed publish pipeline PR). Rebased onto main since #156 was closed without merging; the multi-host work stands on its own and doesn't depend on the publish pipeline.

Surface

```typescript
import {
  MultiHostClient,
  type HostTransportFactory,
} from '@microsoft/agent-host-protocol/hosts';

// openLocal / openTunnel are application-supplied HostTransportFactory functions (not shown).

// Single-host: never see "registry" concepts.
const { multi, host } = await MultiHostClient.single({
  id: 'local',
  label: 'Local sessions server',
  transportFactory: openLocal,
});

// Or multi-host: add as many as you need.
await multi.addHost({ id: 'tunnel', label: 'Tunnel', transportFactory: openTunnel });
for await (const event of multi.events()) {
  console.log(`[${event.hostId}] ${event.channel}`, event.event.type);
}
```

What ships:

  • MultiHostClient: single, addHost, removeHost, reconnectHost, reconnectAllUnavailable, host/hostsSnapshot, client (returns a generation-checked HostClientHandle), subscribe/unsubscribe/dispatch, events/hostEvents, aggregatedSessions/aggregatedAgents, shutdown.
  • HostConfig / HostHandle / HostState discriminated union (disconnected | connecting | connected | reconnecting | failed) / HostEvent / HostSubscriptionEvent / HostedSessionSummary / HostedAgent.
  • Error family extending AhpClientError: HostMultiError, UnknownHostError, DuplicateHostError, HostReconnectedError (carries both handleGeneration and currentGeneration for clean stale-handle recovery), HostShutDownError, ClientIdStoreError.
  • ReconnectPolicy + Backoff with jitter, plus disabledPolicy() / exponentialPolicy() (default: 250 ms → 30 s, ×2, 25 % jitter, retry forever) / immediateForeverPolicy(). Pure functions backoffDelayForAttempt, delayWithJitter, attemptsExhausted are exposed for deterministic testing.
  • HostTransportFactory: (hostId, signal) => Promise<AhpTransport>. The AbortSignal is aborted on remove / shutdown so factories can bail out of a slow handshake instead of blocking teardown.
  • ClientIdStore + InMemoryClientIdStore — pluggable persistence for stable per-host clientIds. Explicit > stored > generated; resolved id is always written back. Store failures surface as ClientIdStoreError from addHost.
  • MultiHostStateMirror + hostedResourceKey — host-aware reducer façade that keys session/terminal/changeset state by (hostId, uri) so URIs that legitimately collide across hosts (the normal case for session URIs) don't clobber.
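
The default reconnect numbers in the ReconnectPolicy bullet above (250 ms → 30 s, ×2, 25 % jitter, retry forever) can be illustrated with a minimal sketch. The type and function names below are hypothetical stand-ins, not the signatures shipped in `/hosts`; the 0-based attempt index and the symmetric ± jitter are assumptions, since the PR text does not spell out either.

```typescript
// Hypothetical shape; the real ReconnectPolicy / Backoff types are not shown in this PR text.
interface BackoffSketch {
  initialMs: number;
  maxMs: number;
  multiplier: number;
  jitter: number; // fraction of the delay, e.g. 0.25 for 25 %
}

// Mirrors the defaults quoted in the PR description.
const defaultBackoff: BackoffSketch = {
  initialMs: 250,
  maxMs: 30_000,
  multiplier: 2,
  jitter: 0.25,
};

// Un-jittered delay before the given attempt (assumed 0-based), capped at maxMs.
function sketchBackoffDelay(backoff: BackoffSketch, attempt: number): number {
  return Math.min(backoff.maxMs, backoff.initialMs * Math.pow(backoff.multiplier, attempt));
}

// Spreads the delay by ±jitter (assumed symmetric). `random` is injectable so
// tests can pin the outcome instead of depending on Math.random.
function sketchDelayWithJitter(
  baseMs: number,
  jitter: number,
  random: () => number = Math.random,
): number {
  const factor = 1 + jitter * (2 * random() - 1);
  return Math.max(0, Math.round(baseMs * factor));
}

// "Retry forever" is modelled here as an undefined attempt limit (an assumption).
function sketchAttemptsExhausted(maxAttempts: number | undefined, attempt: number): boolean {
  return maxAttempts !== undefined && attempt >= maxAttempts;
}
```

Under these assumptions the un-jittered sequence is 250, 500, 1000, … ms, and the 30 s cap applies from attempt 7 onward. Keeping the three functions pure is what makes the deterministic tests mentioned in the bullet straightforward.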

Design notes

  • Cancellation in single-threaded JS. The Rust runtime uses tokio::select! racing futures against shutdown_signal.notified(). JS can't preempt an in-flight await, so the port threads AbortSignal through HostTransportFactory and ClientIdStore.load/store, and routes every internal handshake through a raceWithAbort(promise, signal) helper that resolves with an ABORTED sentinel on signal abort. That helper attaches a no-op rejection handler to the inner promise so late ClientClosedErrors (from a half-built AhpClient.initialize whose client got shutdown()'d in the finally) don't surface as unhandledRejections.
  • Generation invalidation. Every successful (re)connect bumps a per-host generation. HostClientHandle reads generation + shutdownReason through a shared reference on every checkAlive — handle-after-reconnect surfaces HostReconnectedError; handle-after-removeHost surfaces HostShutDownError. Both errors are recoverable via a fresh multi.client(hostId) (or, for removal, the host being re-added).
  • Replay-before-connected. Reconnect-replay envelopes are fanned through the per-host state mirror and the cross-host event tap before transitionTo({status: 'connected'}) / HostEvent.connected, so consumers observing the connected event already see catch-up state.
  • Snapshot pruning. ReconnectResult.Snapshot only drops URIs that were in the prior subscriptions set AND missing from returned snapshots — URIs added locally between disconnect and reconnect survive.
  • addHost race safety. A pending-host id set guards against concurrent addHost calls slipping past the duplicate check while one is awaiting the ClientIdStore. The reservation is cleared in finally so any error path (including store failures) frees the slot for a retry.
  • Events stream lifetime. The client's all-events broadcast is attached before initialize / reconnect so notifications pushed between the handshake response and the moment the supervisor enters its event loop are captured instead of dropped by the broadcast queue's no-replay-for-late-readers semantics.
  • HostId as string alias. Kept a plain string for ergonomics — equality is === and tests pass string literals directly. A branded type was considered but added too much friction for the surface gain.
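
The generation-invalidation note above can be sketched as follows. This is a simplified illustration, not the shipped `HostClientHandle`: the class names, the shape of the shared cell, and the error constructors are all hypothetical. What it shows is the pattern of a handle capturing the generation at creation time and comparing it against a shared reference on every check.

```typescript
// Hypothetical stand-in for HostReconnectedError: carries both generations.
class StaleHandleErrorSketch extends Error {
  readonly handleGeneration: number;
  readonly currentGeneration: number;
  constructor(handleGeneration: number, currentGeneration: number) {
    super(`handle generation ${handleGeneration} is stale (current ${currentGeneration})`);
    this.name = 'StaleHandleErrorSketch';
    this.handleGeneration = handleGeneration;
    this.currentGeneration = currentGeneration;
  }
}

// Hypothetical stand-in for HostShutDownError.
class HostGoneErrorSketch extends Error {
  constructor(reason: string) {
    super(`host shut down: ${reason}`);
    this.name = 'HostGoneErrorSketch';
  }
}

// Mutable cell owned by the per-host supervisor and shared by reference with handles.
interface GenerationCell {
  generation: number;
  shutdownReason: string | undefined;
}

class HandleSketch {
  private readonly cell: GenerationCell;
  private readonly handleGeneration: number;

  constructor(cell: GenerationCell) {
    this.cell = cell;
    this.handleGeneration = cell.generation; // captured once, at handle creation
  }

  // Called before every operation; reads the live values through the shared cell.
  checkAlive(): void {
    if (this.cell.shutdownReason !== undefined) {
      throw new HostGoneErrorSketch(this.cell.shutdownReason);
    }
    if (this.cell.generation !== this.handleGeneration) {
      throw new StaleHandleErrorSketch(this.handleGeneration, this.cell.generation);
    }
  }
}
```

In this sketch, recovering from a stale handle means constructing a new `HandleSketch` from the same cell, which corresponds to calling `multi.client(hostId)` again in the real API.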

Not in this PR

  • FileClientIdStore. A filesystem-backed ClientIdStore would pull in node:fs and break browser bundlers even if consumers never instantiate it. Browsers can implement ClientIdStore against localStorage / IndexedDB; Node/Electron consumers can wrap node:fs/promises or safeStorage. The interface is the contract. If a Node-only adapter ends up wanted later, it can ship as a separate @microsoft/agent-host-protocol/hosts/file-client-id-store subpath without re-exporting from the browser-safe /hosts entry point.
  • A dedicated docs/guide/connecting-to-multiple-hosts.md page. The TS client's README has a full Multi-host orchestration section with a runnable example; mirroring the Rust SDK's MULTI_HOST.md into the docs site can ship in a follow-up if the docs site wants a language-agnostic narrative.

Verification

From a fresh npm run generate:typescript at the repo root, then in clients/typescript:

  • npm run typecheck clean.
  • npm test: 47 tests pass (24 new). Coverage includes:
    • MultiHostClient.single / addHost / removeHost / DuplicateHostError.
    • addHost surfaces ClientIdStoreError and frees the reservation so a retry succeeds.
    • HostClientHandle.generation invalidates after a reconnect (carries both generations).
    • HostClientHandle after removeHost throws HostShutDownError.
    • ReconnectPolicy.delayWithJitter / backoffDelayForAttempt / attemptsExhausted shapes (mirrors the Rust unit tests).
    • InMemoryClientIdStore round-trip + overwrite.
    • hostedResourceKey distinguishes same-URI hosts; MultiHostStateMirror.applySnapshot / resetHost scoping.
    • aggregatedSessions sorts by modifiedAt descending and tags hostLabel; aggregatedAgents tags every agent with host.
    • hostEvents lifecycle ordering (added → stateChanged → connected → removed).
    • events fan-in delivers notifications tagged with hostId.
    • reconnectAllUnavailable skips connected/connecting hosts and returns per-host errors without throwing.
    • Manual reconnectHost wakes a host whose ReconnectPolicy is exhausted (the post-failed resumption path).
    • MultiHostClient.shutdown is idempotent.
    • subscribe / unsubscribe / dispatch on an unknown host throws UnknownHostError.
  • npm run build clean. npm pack --dry-run ships dist/client/hosts/.
  • ✅ Root npm run lint + npm run typecheck unchanged (the ESLint config already excludes clients/typescript/** and the root tsc only covers types/).

Adds the multi-host SDK as a fourth subpath export
`@microsoft/agent-host-protocol/hosts`, mirroring the Rust
`ahp::hosts` module:

- `MultiHostClient` — per-host reconnect supervisor, generation-checked
  client handles, fan-in event streams, aggregated views, manual
  reconnect, scene-phase `reconnectAllUnavailable` helper.
- `HostConfig` / `HostHandle` / `HostState` / `HostEvent` /
  `HostSubscriptionEvent` and the `HostMulti*Error` family.
- `ReconnectPolicy` + `Backoff` with jitter, plus
  `disabled`/`exponential`/`immediateForever` factories.
- `HostTransportFactory` — pluggable transport factory threaded with
  an `AbortSignal` so consumers can cancel in-flight connects.
- `ClientIdStore` + `InMemoryClientIdStore` for pluggable per-host
  `clientId` persistence.
- `MultiHostStateMirror` + `hostedResourceKey` host-aware reducer
  facade that keys per-resource state by `(hostId, uri)`.

Cancellation in a single-threaded JS runtime uses `AbortController`s
threaded through factories, stores, and a `raceWithAbort` helper that
attaches a no-op rejection handler so in-flight RPCs that surface
`ClientClosedError` after a forced shutdown don't become
`unhandledRejection`s.
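
A minimal sketch of what a helper like `raceWithAbort` might look like, based only on the description above. The name `raceWithAbortSketch`, the symbol-based sentinel, and the exact listener handling are assumptions; the real helper may differ in all three.

```typescript
// Sentinel returned when the signal wins the race (assumed to be a unique symbol here).
const ABORTED = Symbol('aborted');

function raceWithAbortSketch<T>(
  promise: Promise<T>,
  signal: AbortSignal,
): Promise<T | typeof ABORTED> {
  // No-op rejection handler: if the inner promise rejects after we have already
  // resolved with ABORTED, the rejection is considered handled and never
  // surfaces as an unhandledRejection.
  promise.catch(() => {});

  if (signal.aborted) return Promise.resolve(ABORTED);

  return new Promise<T | typeof ABORTED>((resolve, reject) => {
    const onAbort = (): void => resolve(ABORTED);
    signal.addEventListener('abort', onAbort, { once: true });
    promise.then(
      (value) => {
        signal.removeEventListener('abort', onAbort);
        resolve(value);
      },
      (error) => {
        signal.removeEventListener('abort', onAbort);
        reject(error);
      },
    );
  });
}
```

Note that the inner work is not cancelled by this helper; it only stops the caller from waiting on it. That is why the signal is also handed to the factory and the store, so they can stop early on their own.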

Includes a `hosts.test.ts` suite (24 new tests, total 47 pass)
covering single/multi-host lifecycle, ClientIdStore failure handling,
generation invalidation across reconnects, `HostShutDownError` after
removal, manual reconnect from the `failed` state, aggregated view
sorting, hostEvents ordering, and the lossy fan-in semantics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@colbylwilliams colbylwilliams force-pushed the colbylwilliams/colbylwilliams-typescript-hosts-module branch from d2db59c to c9c2b91 Compare May 27, 2026 22:39
@colbylwilliams colbylwilliams changed the base branch from colbylwilliams/publish-typescript-package to main May 27, 2026 22:40
@colbylwilliams colbylwilliams requested a review from Copilot May 27, 2026 22:41

Copilot AI left a comment

Pull request overview

Adds a new TypeScript multi-host orchestration layer exposed as @microsoft/agent-host-protocol/hosts, bringing the TS client closer to the Rust/Swift SDK surface by providing per-host supervisors, reconnect policy/backoff, fan-in events, aggregated views, and host-aware state mirroring.

Changes:

  • Introduces MultiHostClient + supporting types/errors (Host*, ReconnectPolicy, ClientIdStore, generation-checked HostClientHandle).
  • Implements per-host runtime supervisor with reconnect/backoff, replay handling, and fan-in event publishing.
  • Adds a comprehensive hosts.test.ts suite and updates docs/package exports to expose the new /hosts entry point.

Summary per file:

| File | Description |
| --- | --- |
| clients/typescript/src/client/hosts/index.ts | Exposes the public /hosts surface (types, errors, helpers, mirrors). |
| clients/typescript/src/client/hosts/multi.ts | Implements the MultiHostClient registry façade and fan-in streams. |
| clients/typescript/src/client/hosts/runtime.ts | Implements the per-host supervisor loop, connect/reconnect, and event fan-out. |
| clients/typescript/src/client/hosts/types.ts | Defines public multi-host types, errors, and config resolution. |
| clients/typescript/src/client/hosts/policy.ts | Adds reconnect policy/backoff + deterministic helpers. |
| clients/typescript/src/client/hosts/host-client-handle.ts | Adds generation-checked client handles (HostClientHandle). |
| clients/typescript/src/client/hosts/state-mirror.ts | Adds host-aware reducer mirror keyed by (hostId, uri). |
| clients/typescript/src/client/hosts/factory.ts | Defines HostTransportFactory contract and docs. |
| clients/typescript/src/client/hosts/client-id-store.ts | Defines ClientIdStore + InMemoryClientIdStore. |
| clients/typescript/test/hosts.test.ts | Adds integration/unit tests for multi-host behavior and helpers. |
| clients/typescript/README.md | Documents new /hosts entry point and multi-host usage. |
| clients/typescript/package.json | Adds ./hosts subpath export to the package. |

Copilot's findings

  • Files reviewed: 12/12 changed files
  • Comments generated: 2

Comment thread clients/typescript/src/client/hosts/runtime.ts Outdated
Comment thread clients/typescript/src/client/hosts/multi.ts
Two bugs flagged by the Copilot reviewer; both contradicted the
module's own documented contracts.

1. Manual reconnect couldn't interrupt an in-flight `connectOnce`.
   The supervisor raced transport/initialize/reconnect against
   `shutdownController.signal` only, so a manual `reconnectHost()`
   call would hang until the slow factory or handshake finished,
   even though `HostTransportFactory` documents that the signal is
   aborted on manual reconnect.

   Fix: build a combined signal (shutdown OR manualReconnect) at the
   top of `connectOnce`, pass it to the factory and every
   `raceWithAbort` call, and have the supervisor recognise the
   manual-reconnect path so it resets the controller and retries
   immediately without backoff or a phantom warning. New
   `linkAbortSignals` helper composes the signals.

2. `addHost` could leak a runtime when `shutdown` raced it.
   If `shutdown()` ran while `addHost()` was awaiting
   `ClientIdStore.load`, the resumed `addHost` would still create
   and start a HostRuntime after the multi-host client had closed
   its queues. Fix: re-check `assertOpen()` after the store await
   so the same `HostShutDownError` is surfaced and no runtime is
   registered. Doc updated to mention the new throw condition.
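
The signal composition described in fix 1 could look roughly like the sketch below. The name `linkAbortSignalsSketch` and its return shape are hypothetical. The `dispose` function is included because a later commit in this thread reports that listeners must be removed after a successful connect as well as on abort.

```typescript
interface LinkedSignalSketch {
  signal: AbortSignal;   // aborts when any source aborts
  dispose: () => void;   // detaches listeners from every source
}

function linkAbortSignalsSketch(...sources: AbortSignal[]): LinkedSignalSketch {
  const controller = new AbortController();

  function dispose(): void {
    for (const source of sources) source.removeEventListener('abort', onAbort);
  }

  function onAbort(): void {
    controller.abort();
    dispose(); // nothing left to listen for once the combined signal has fired
  }

  if (sources.some((source) => source.aborted)) {
    // A source was already aborted before linking: abort immediately, attach nothing.
    controller.abort();
  } else {
    for (const source of sources) source.addEventListener('abort', onAbort);
  }

  return { signal: controller.signal, dispose };
}

// Usage shape (shutdown OR manual reconnect), with cleanup in finally:
//   const linked = linkAbortSignalsSketch(shutdownSignal, manualReconnectSignal);
//   try { /* pass linked.signal to the factory and handshake */ } finally { linked.dispose(); }
```

On runtimes that provide the built-in `AbortSignal.any`, that API covers the composition part; the explicit `dispose` here is what keeps a long-lived source signal from accumulating listeners across reconnect cycles.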

Tests added:
- 'manual reconnectHost aborts a slow in-flight transport factory'
  pins a factory on a never-resolving promise, asserts the captured
  signal aborts on manual reconnect, and the second attempt connects.
- 'addHost throws HostShutDownError when shutdown lands during
  ClientIdStore.load' sequences shutdown between the load and its
  resolution and asserts no runtime is registered and the factory
  was never invoked.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Copilot's findings

  • Files reviewed: 12/12 changed files
  • Comments generated: 5

Comment thread clients/typescript/src/client/hosts/runtime.ts Outdated
Comment thread clients/typescript/src/client/hosts/runtime.ts
Comment thread clients/typescript/src/client/hosts/types.ts Outdated
Comment thread clients/typescript/src/client/hosts/client-id-store.ts
Comment thread clients/typescript/src/client/hosts/state-mirror.ts Outdated
colbylwilliams and others added 2 commits May 27, 2026 18:13
Five additional findings from the Copilot reviewer, all valid:

1. linkAbortSignals listener leak (runtime.ts)
   Listeners were only removed on abort, so successful connects
   accumulated abort listeners on the long-lived per-runtime
   shutdownController.signal across reconnect cycles. Fixed by
   returning a `dispose` cleanup function and calling it in
   `connectOnce`'s outer finally.

2. & 3. HostShutDownError overloaded with 'not connected yet'
   (runtime.ts + types.ts)
   `HostRuntime.subscribe/dispatch` used HostShutDownError for both
   permanent teardown and the transient 'currentClient is null'
   case, making it impossible for callers to branch on recovery.
   Introduced a new HostNotConnectedError, exported from the
   /hosts entry, thrown when the host is registered but not
   currently connected. HostShutDownError docs updated to clarify
   it is for permanent teardown only.

4. ClientIdStore signal docs vs reality (multi.ts)
   The ClientIdStore.load/store signal was documented as 'aborted
   on shutdown', but MultiHostClient never actually passed one.
   Added a private shutdownController to MultiHostClient that
   aborts at the start of shutdown(), and threaded its signal
   through resolveClientId into both load() and store().

5. hostedResourceKey collision (state-mirror.ts)
   The previous `${hostId}\0${uri}` encoding could collide if a
   HostId happened to contain a literal NUL. Switched to a
   length-prefixed encoding `${hostId.length}\0${hostId}${uri}`
   that is unambiguous for any string content (including \0),
   plus a private hostedResourceKeyPrefix() helper consumed by
   resetHost so prefix matching stays in sync with the encoding.
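
To make finding 5 concrete, here is a small sketch contrasting the two encodings quoted above. The function names carry a `Sketch` suffix because the real `hostedResourceKey` and `hostedResourceKeyPrefix` signatures are not shown in this thread; only the template strings are taken from the commit message.

```typescript
// Previous encoding: ambiguous when either part contains a NUL.
function naiveKeySketch(hostId: string, uri: string): string {
  return `${hostId}\0${uri}`;
}

// Length-prefixed encoding: decimal digits contain no NUL, so the first NUL ends
// the length field, the next `length` characters are the hostId, and whatever
// remains is the uri.
function hostedResourceKeySketch(hostId: string, uri: string): string {
  return `${hostId.length}\0${hostId}${uri}`;
}

// Prefix shared by every key of one host; a resetHost-style sweep can match on it.
function hostedResourceKeyPrefixSketch(hostId: string): string {
  return `${hostId.length}\0${hostId}`;
}

// Two different (hostId, uri) pairs that collide under the naive encoding.
const pairA: [string, string] = ['a\0b', 'c'];
const pairB: [string, string] = ['a', 'b\0c'];

const naiveCollides = naiveKeySketch(...pairA) === naiveKeySketch(...pairB);
const prefixedCollides = hostedResourceKeySketch(...pairA) === hostedResourceKeySketch(...pairB);
```

With these inputs the naive keys are identical, while the length-prefixed keys differ because the length fields (3 versus 1) already disagree. Deriving the prefix from the same template as the key is what keeps prefix matching aligned with the encoding.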

Tests added (49 -> 54):
- 'hostedResourceKey is collision-safe across awkward hostId / uri
  pairs' exercises the new encoding.
- 'subscribe on a registered-but-not-yet-connected host throws
  HostNotConnectedError' verifies the new error and that the URI
  is still tracked for replay.
- 'dispatch on a registered-but-not-yet-connected host throws
  HostNotConnectedError' covers the parallel dispatch path.
- 'ClientIdStore.load / store receive the multi-host shutdown
  signal' captures the signals and asserts they abort on shutdown.
- 'repeated reconnect cycles do not accumulate abort listeners on
  the shutdown signal' drives 20+ disconnect/reconnect cycles and
  asserts Node never emits MaxListenersExceededWarning.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Copilot's findings

  • Files reviewed: 12/12 changed files
  • Comments generated: 2

Comment thread clients/typescript/src/client/hosts/types.ts Outdated
Comment thread clients/typescript/src/client/hosts/types.ts Outdated
Two findings from the Copilot reviewer, both valid:

1. HostHandle 'already-frozen arrays' doc was inaccurate (types.ts)
   snapshotHandle() builds snapshots via spread/Array.from — there is
   no Object.freeze involved. Reworded the doc to 'shallow-cloned
   arrays' and explained that the supervisor doesn't mutate snapshots
   after construction, so consumers should treat them as immutable
   without claiming deep immutability.

2. generateClientId Math.random fallback (types.ts)
   When crypto.randomUUID was unavailable we jumped straight to
   Math.random for the 16 random bytes, which is weaker than the
   crypto.getRandomValues path that most browsers without
   randomUUID still expose. Inserted getRandomValues() as the
   middle tier of the fallback chain; Math.random is only used as
   a last resort when no Web Crypto API is exposed at all.
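
The three-tier fallback from finding 2 can be sketched as below. This is an illustration under assumptions, not the code in types.ts: the `CryptoLike` interface and the injectable parameter are hypothetical conveniences added so the tiers can be exercised without stubbing `globalThis.crypto`.

```typescript
// Minimal structural view of the Web Crypto surface this sketch needs.
interface CryptoLike {
  randomUUID?: () => string;
  getRandomValues?: (array: Uint8Array) => Uint8Array;
}

function ambientCrypto(): CryptoLike {
  return (globalThis as unknown as { crypto?: CryptoLike }).crypto ?? {};
}

function generateClientIdSketch(webCrypto: CryptoLike = ambientCrypto()): string {
  // Tier 1: native UUID generation.
  if (typeof webCrypto.randomUUID === 'function') return webCrypto.randomUUID();

  const bytes = new Uint8Array(16);
  if (typeof webCrypto.getRandomValues === 'function') {
    // Tier 2: cryptographically strong bytes, UUID framing done by hand.
    webCrypto.getRandomValues(bytes);
  } else {
    // Tier 3 (last resort): Math.random, not cryptographically strong.
    for (let i = 0; i < bytes.length; i++) bytes[i] = Math.floor(Math.random() * 256);
  }

  bytes[6] = (bytes[6]! & 0x0f) | 0x40; // version nibble = 4
  bytes[8] = (bytes[8]! & 0x3f) | 0x80; // variant bits = 10xx, i.e. 8/9/a/b

  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join('-');
}
```

Calling the crypto functions as methods (`webCrypto.getRandomValues(bytes)`) matters in browsers, where detaching them from the `crypto` object typically throws an illegal-invocation error.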

Tests added (54 -> 56):
- 'generateClientId prefers crypto.getRandomValues over Math.random
  when randomUUID is missing' stubs globalThis.crypto with only
  getRandomValues, fills with a deterministic pattern, and asserts
  it was called once and the resulting UUID is correctly framed
  (version 4, variant 8/9/a/b).
- 'generateClientId still returns a UUIDv4-shaped string when only
  Math.random is available' stubs globalThis.crypto to undefined
  and asserts the format is preserved.

Both stub via a withStubbedCrypto() helper that uses
Object.defineProperty since globalThis.crypto is a getter in Node
and plain assignment throws.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@colbylwilliams colbylwilliams requested a review from connor4312 May 27, 2026 23:27
@connor4312 connor4312 merged commit 32c0c4f into main May 28, 2026
7 checks passed
@connor4312 connor4312 deleted the colbylwilliams/colbylwilliams-typescript-hosts-module branch May 28, 2026 15:27

3 participants