feat(atoms): realtime agent — register-call + wire-truth corrections + Python guide by abhishekmishragithub · Pull Request #274 · smallest-inc/smallest-ai-documentation

abhishekmishragithub · 2026-06-26T21:27:37Z

Why

Three independent gaps in the Realtime Agent WebSocket docs surfaced when an FDE tried to build a Python client:

1. Wrong wire event name in the spec. Streaming text fires as transcript.delta (dot notation), but the current spec documents only transcript (the final event, which rarely fires on prod webcall). The JS SDK normalizes dot to underscore client-side, so raw WS clients (Python, mobile) that follow the spec miss the streaming events entirely.
2. Wrong role enum. The spec says user | agent; the wire emits user | assistant (pipecat owns the role string, and the platform's typed enum is stale).
3. POST /conversation/register-call is missing from the spec. The endpoint has been live on platform main since 2026-06-21 (atoms-platform PR #2870) and is referenced by the JS cookbook, but it is not in our OpenAPI.
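
To make the first two gaps concrete, here is a minimal sketch of how a raw WebSocket client might normalize incoming messages. It assumes each message is a JSON object with a `type` field and that transcript messages carry a `role` field; the function name and the defensive `agent` to `assistant` mapping are illustrative assumptions, not part of the spec.

```python
import json

# Event types named in this PR; everything else passes through untouched.
TRANSCRIPT_EVENTS = {"transcript", "transcript.delta"}

# The wire uses the dot form; the JS SDK exposes an underscore form.
# Accepting both lets one handler serve either source (assumption).
TYPE_ALIASES = {"transcript_delta": "transcript.delta"}

# The stale enum said "agent"; the wire emits "assistant".
# Mapping the old value is purely defensive (assumption).
ROLE_ALIASES = {"agent": "assistant"}


def parse_event(raw: str) -> dict:
    """Decode one WebSocket text frame and normalize type and role."""
    event = json.loads(raw)
    event_type = event.get("type", "")
    event["type"] = TYPE_ALIASES.get(event_type, event_type)
    if event["type"] in TRANSCRIPT_EVENTS and "role" in event:
        event["role"] = ROLE_ALIASES.get(event["role"], event["role"])
    return event
```

Only the transcript alias is rewritten, so other dotted types such as `output_audio.delta` are left exactly as received.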

What

AsyncAPI (`agent-ws.yaml`)

Add transcript.delta message: cumulative streaming text per role.
Update transcript and transcript.delta role enum to user | assistant.
Add explicit "PCM 16-bit signed little-endian, mono" to audio messages.
Add 200-300 ms jitter-buffer guidance on output_audio.delta.
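
The jitter-buffer guidance can be sketched as follows. This is one possible shape, not the code from the new docs page: it assumes audio arrives as base64-encoded PCM16 (as the commit message describes), and the class name, the `pull` interface, and the re-prime-on-underrun behavior are assumptions made for illustration.

```python
import base64

BYTES_PER_SAMPLE = 2  # PCM 16-bit signed little-endian, mono


class JitterBuffer:
    """Hold back playback until a pre-buffer of audio has accumulated."""

    def __init__(self, sample_rate: int, prebuffer_ms: int = 250):
        # Bytes needed before playback starts, e.g. 24000 Hz * 2 B * 0.25 s.
        self.threshold = sample_rate * BYTES_PER_SAMPLE * prebuffer_ms // 1000
        self.buffer = bytearray()
        self.primed = False

    def push(self, b64_audio: str) -> None:
        """Append one base64 audio payload from output_audio.delta."""
        self.buffer.extend(base64.b64decode(b64_audio))
        if len(self.buffer) >= self.threshold:
            self.primed = True

    def pull(self, n_bytes: int) -> bytes:
        """Return exactly n_bytes for the speaker; silence until primed."""
        if not self.primed:
            return bytes(n_bytes)
        chunk = bytes(self.buffer[:n_bytes])
        del self.buffer[:n_bytes]
        if not self.buffer:
            # Drained completely: wait for a fresh pre-buffer (assumption).
            self.primed = False
        # Zero-pad a short final chunk so the caller always gets n_bytes.
        return chunk + bytes(n_bytes - len(chunk))
```

At the 24000 Hz sample rate reported in the verification run, a 250 ms pre-buffer works out to 12000 bytes.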

OpenAPI (`openapi.yaml`)

Add POST /conversation/register-call as GA. Body: agent_id, optional mode (webcall|chat), optional variables. Returns access_token (wct_ prefix, 30 s TTL, single-use) + expires_in + sample_rate.
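
A hedged sketch of calling this endpoint from Python with only the standard library is shown here. The path, body fields, and response fields come from the description above; the base URL, the Bearer-style Authorization header, and both function names are assumptions and should be checked against the published OpenAPI.

```python
import json
import urllib.request

VALID_MODES = {"webcall", "chat"}


def build_register_call_request(base_url, api_key, agent_id,
                                mode="webcall", variables=None):
    """Build (but do not send) the POST /conversation/register-call request."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    body = {"agent_id": agent_id, "mode": mode}
    if variables:
        body["variables"] = variables  # string | number | boolean values
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/conversation/register-call",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: API key sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def register_call(request) -> dict:
    """Send the request; expected keys: access_token, expires_in, sample_rate."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())
```

Because the token is single-use with a 30 s TTL, the call belongs on a server or in a trusted script, immediately before the WebSocket is opened.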

New page

/atoms/developer-guide/integrate/realtime-agent-python — full working Python client mirroring the JS Web SDK guide. Uses the two-step flow (POST /register-call → wct_ → WS ?token=). Includes jitter-buffer rationale, event-handling table, common-errors table. No numpy dependency.
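
The second step of the two-step flow, attaching the wct_ token as a `token` query parameter, might look like the sketch below. The WebSocket base URL is deliberately left as an argument because it is not stated here, and the helper name is hypothetical.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_ws_url(ws_base: str, token: str) -> str:
    """Return ws_base with ?token=<wct_ token> as its query string."""
    if not token.startswith("wct_"):
        # Guard against passing a long-lived API key by mistake.
        raise ValueError("expected a wct_ token from register-call")
    parts = urlsplit(ws_base)
    query = urlencode({"token": token})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))
```

The prefix check is a cheap way to avoid leaking an API key into a URL, where it could end up in logs.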

Verification

Created a fresh test agent from the sp-medical-centre-receptionist-in template. Ran the Python client from the new docs page end-to-end against prod:

POST /register-call returned 201 with valid wct_ token, expires_in=30, sample_rate=24000
WS ?token=wct_… accepted, session.created fired
Agent fired opening turn, ~1 MB of output_audio.delta decoded cleanly, zero buffer underruns with the 250 ms prebuffer
transcript.delta events captured with role: "assistant", displayed correctly
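
For display, "cumulative" is read here as meaning each transcript.delta carries the full text so far for its role, so a client replaces rather than appends. The sketch below rests on that reading and on a hypothetical `text` field name; neither is confirmed by the spec excerpted in this PR.

```python
def apply_transcript_delta(view: dict, event: dict) -> dict:
    """Update a role -> text mapping from one transcript.delta event."""
    if event.get("type") == "transcript.delta":
        # Cumulative payload: overwrite the role's line instead of appending.
        view[event["role"]] = event["text"]
    return view
```

Appending instead would duplicate text on every event if the payload is indeed cumulative.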

fern check: 0 errors.

Closes

docs(waves): Pulse STT — keywords format + punctuate/capitalize params #50, docs(atoms): add AsyncAPI spec for realtime agent WebSocket endpoint (PR A of 4) #53, docs(lightning-v3.1 + pulse-stt): 15 languages + pre-recorded encoding param #125

…rections + Python guide

Verified end-to-end against a fresh agent created from the sp-medical-centre-receptionist-in template via POST /agent/from-template.

Spec changes (AsyncAPI agent-ws.yaml):
- Add `transcript.delta` message (cumulative streaming text per role). Wire emits dot notation; the JS SDK normalizes to underscore client-side. Raw WS clients (Python, mobile) must listen for the dot form.
- Update `role` enum on `transcript` and `transcript.delta` to `user | assistant`; pipecat emits `assistant` for the agent.
- Make audio format explicit on both `input_audio_buffer.append` and `output_audio.delta`: PCM 16-bit signed little-endian, mono, at the sample rate echoed in `session.created`.
- Add jitter-buffer guidance on `output_audio.delta` (200-300 ms pre-buffer).

Spec changes (OpenAPI openapi.yaml):
- Add POST /conversation/register-call. Issues a short-lived `wct_` token (30 s TTL, single-use) for browser/client WebSocket auth. Live on platform main since 2026-06-21 (atoms-platform PR #2870). Body accepts `agent_id` (required), `mode` (webcall|chat, default webcall), and `variables` (string|number|boolean values for prompt templating).

New page:
- Agent WebSocket (Python) at /atoms/developer-guide/integrate/realtime-agent-python. Mirror of the JS Web SDK guide for headless Python clients. Self-contained working script: register-call → wct_ token → WS, plus mic/speaker callbacks, base64 PCM16 handling, transcript.delta + transcript event handling, 250 ms jitter buffer. No numpy dependency: sounddevice CFFI buffers support direct slice assignment.

Closes #50, #53, #125.

github-actions · 2026-06-26T21:29:00Z

🌿 Preview your docs: https://smallest-ai-preview-docs-atoms-realtime-agent-cookbook-flow.docs.buildwithfern.com

Here are the markdown pages you've updated:

https://smallest-ai-preview-docs-atoms-realtime-agent-cookbook-flow.docs.buildwithfern.com/atoms/atoms-platform/integrate/realtime-agent-python

…rity note

- New realtime-agent-token.sh: shell helper that fetches a wct_ token from ATOMS_API_KEY + ATOMS_AGENT_ID. Useful when testing the JS playground, Framer voice components, or one-off HTML files without writing a server.
- Document webcall vs chat mode on the same WebSocket (mode query param).
- Add explicit security warning against hard-coding sk_ keys in browser/Framer/no-code embeds. Always issue wct_ tokens server-side.

abhishekmishragithub wants to merge 2 commits into main from docs/atoms-realtime-agent-cookbook-flow
