feat(voice): widget voice-as-chat — stream mode through the chat brain by swaroopvarma1 · Pull Request #845 · juspay/clairvoyance

swaroopvarma1 · 2026-06-18T11:45:06Z

What

Re-architects Buddy Assist widget voice from a dual-brain design (a separate voice FlowManager+LLM kept in sync with the chat ChatAgent) to stream mode (ExecutionMode.DAILY_STREAM — STT in / TTS out, no LLM in the pipeline) driven by the existing chat ChatAgent. Voice becomes pure audio I/O around one brain, so cart_id, content-block history, agent_state, carousels, and HITL are all inherited from chat for free.

Telephony (Twilio/Plivo/Exotel) is untouched — it never uses DAILY_STREAM and runs no chat session.

Key changes

chat/turn_core.py (new) — channel-agnostic run_chat_turn (+ approval continuation) factored out of the chat HTTP handler so the voice subprocess drives the same brain.
chat/voice_bridge.py (new) — WidgetVoiceBridge taps on_user_turn_stopped → run_chat_turn → adapts the SSE stream into TTSSpeakFrame + RTVI events; holds the per-session Redis lock per turn; barge-in cancels the in-flight turn.
HITL over voice rides the chat approval path: the gated call surfaces an inline, persistent approval card (same kind:'approval' message as chat); the prompt is spoken audio-only so the card text isn't duplicated.
Forward function-call-started/function-call-completed RTVI events so the widget can show a "thinking / executing <tool>" state on the voice orb.
Carousel-click injection (ui-action) during voice.
Deletes the dual-brain sync layer (prepare_resume_node, voice drain, ui-blocks-for-voice, agent_state seeding).
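
The SSE-to-speech adaptation described above can be pictured with a minimal sketch. It assumes the shared turn is an async generator of `(event, data)` pairs; `fake_chat_turn`, `speak_turn`, and the naive sentence-boundary regex are hypothetical stand-ins, not the actual `run_chat_turn` or `WidgetVoiceBridge` code.

```python
import asyncio
import re
from typing import AsyncIterator, Dict, List, Tuple

Event = Tuple[str, Dict[str, str]]

# Deliberately naive rule: a sentence ends at ., ! or ? followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


async def fake_chat_turn() -> AsyncIterator[Event]:
    """Hypothetical stand-in for the shared turn generator."""
    for delta in ["Your cart has ", "two items. Shall I ", "check out?"]:
        yield ("assistant_token", {"delta": delta})
    yield ("turn_end", {"session_status": "ACTIVE"})


async def speak_turn(events: AsyncIterator[Event]) -> List[str]:
    """Buffer token deltas and emit one string per complete sentence."""
    spoken: List[str] = []
    buffer = ""
    async for name, data in events:
        if name == "assistant_token":
            buffer += data["delta"]
            parts = _SENTENCE_END.split(buffer)
            # Every part except the last one is a finished sentence.
            spoken.extend(p.strip() for p in parts[:-1] if p.strip())
            buffer = parts[-1]
        elif name == "turn_end":
            # Flush whatever is left when the turn closes.
            if buffer.strip():
                spoken.append(buffer.strip())
            buffer = ""
    return spoken


def run_demo() -> List[str]:
    return asyncio.run(speak_turn(fake_chat_turn()))
```

In the real bridge each flushed sentence would presumably become a `TTSSpeakFrame`; here the sentences are only collected in a list so the chunking can be inspected.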

Deferred (out of this PR)

Generative UI output over RTVI for non-widget Daily agent-mode voice (VoiceUiStreamProcessor) is deferred. The processor module is kept but not plugged into the pipeline; the wiring was removed from agent/__init__.py, agent/pipeline.py, agent/flow.py. Widget voice (stream mode) doesn't use it — the chat ChatAgent emits/persists ui_op itself. See docs/widget/VOICE_GENERATIVE_UI_TODO.md for the re-wiring checklist. (coerce_ui_action_text / click-to-talk is unrelated and stays.)

Testing

uv run pyrefly check — 0 errors
uv run black --check / isort / autoflake — clean
JWT_SECRET_KEY=test JWT_ALGORITHM=HS256 uv run pytest tests/ -q — 553 passed, 1 xfailed
Manual E2E on the local harness: voice↔chat parity (cart inheritance, carousels), inline HITL approval (appears mid-call, persists with resolved badge), barge-in stops TTS mid-sentence.

Pairs with the loom frontend PR (SDK + widget).

🤖 Generated with Claude Code

coderabbitai · 2026-06-18T11:45:21Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e193bdaa-51d9-40be-8be5-f5b6987b515b

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Walkthrough

Implements widget voice-as-chat Architecture v2: a new WidgetVoiceBridge connects stream-mode Daily voice sessions to the existing ChatAgent via an extracted turn_core module, replacing the v1 resume-seed/drain path. A VoiceUiStreamProcessor strips <ui_stream> markers from LLM text and emits RTVI ui-op events; claim_tool_approval provides atomic HITL decision claiming; AccumulatingSpeechTimeoutStrategy is removed in favor of the Pipecat 1.1.0 native strategy.

Changes

Widget Voice-as-Chat v2: WidgetVoiceBridge + Generative Voice UI

Layer / File(s)	Summary
VoiceUiStreamProcessor and coerce_ui_action_text `app/ai/voice/agents/breeze_buddy/processors/voice_ui_stream.py`, `app/ai/voice/agents/breeze_buddy/processors/__init__.py`	New `VoiceUiStreamProcessor` buffers LLM text frames, extracts `<ui_stream>` op lines via `UiStreamExtractor`, strips markers from prose forwarded to TTS, emits validated RTVI `ui-op` events via an async callback, and resets dedup state per response. `coerce_ui_action_text` validates/trims client `ui-action` msg payloads. Registered in the package `__all__`.
Shared chat brain: turn_core + claim_tool_approval `app/ai/voice/agents/breeze_buddy/chat/turn_core.py`, `app/ai/voice/agents/breeze_buddy/chat/approvals.py`	`turn_core.py` is a new channel-agnostic module exporting `run_chat_turn`, `run_chat_approval_turn`, `run_chat_approval_continuation`, `build_render_template_vars`, and `resolve_llm_configuration`—encapsulating history replay, dangling tool repair, agent state loading, and `ChatAgent` streaming. `approvals.py` adds `ApprovalClaim` dataclass and `claim_tool_approval` with atomic `decide_tool_approval` + race-fallback re-read + sibling tracking.
WidgetVoiceBridge: SSE→TTS/RTVI adaptation `app/ai/voice/agents/breeze_buddy/chat/voice_bridge.py`	New `WidgetVoiceBridge` manages a single in-flight turn task with generation counters for barge-in cancellation, serializes session writes via a Redis lock (skipping with RTVI `error` when busy), adapts chat SSE events into sentence-chunked `TTSSpeakFrame` audio and RTVI events (`ui-op`, HITL approval request/resolution, `turn-end`, `error`), speaks a filler phrase before slow tool calls, and gates greeting on first attachment.
Agent state, pipeline wiring, and flow config `app/ai/voice/agents/breeze_buddy/agent/__init__.py`, `app/ai/voice/agents/breeze_buddy/agent/pipeline.py`, `app/ai/voice/agents/breeze_buddy/agent/flow.py`, `app/ai/voice/agents/breeze_buddy/template/interruption.py`, `app/ai/voice/agents/breeze_buddy/template/session_state.py`	`Agent` adds `_voice_ui_allowlist` and `_voice_bridge` state, wires `WidgetVoiceBridge` to user aggregator events, routes RTVI `approval`/`ui-action` messages to bridge or agent path, and adds `_resolve_voice_ui_allowlist`. `build_pipeline` gains `ui_emit`/`ui_allowlist` params with conditional `VoiceUiStreamProcessor` splice; `create_pipeline_task` adds `ignored_rtvi_sources`. `build_flow_config` gains `ui_allowlist`; `prepare_resume_node` is removed. `AccumulatingSpeechTimeoutStrategy` is replaced by `SpeechTimeoutUserTurnStopStrategy`.
Chat HTTP handler delegation to turn_core `app/api/routers/breeze_buddy/chat/handlers.py`, `app/ai/voice/agents/breeze_buddy/handlers/internal/end_conversation.py`	`/message` handler delegates to `run_chat_turn`; `/approve` handler uses `claim_tool_approval` and delegates resume to `run_chat_approval_continuation`; inline history replay, `ChatAgent` construction, and approval row logic are removed. `end_conversation` replaces the drain pathway with `voice_bridge.aclose()` then `flip_chat_session_to_chat`.
Widget session management, execution mode, and DB cleanup `app/api/routers/breeze_buddy/widget/handlers.py`, `app/database/accessor/breeze_buddy/...`, `app/database/queries/breeze_buddy/...`, `app/schemas/breeze_buddy/chat.py`	Widget handlers add `_template_voice_enabled` gating, set `ExecutionMode.DAILY_STREAM` for lead creation/reset, simplify `voice_connect_handler` seed construction, and add guarded `flip_chat_session_to_chat` in `voice_end_handler`. `reset_widget_voice_lead` gains `execution_mode` param. `drain_voice_into_chat_session` accessor and query are deleted. `voice_enabled: bool` added to `CreateWidgetSessionResponse` and `WidgetSessionStateResponse`.
Tests and documentation `tests/test_turn_core.py`, `tests/test_voice_bridge.py`, `tests/test_voice_ui_stream.py`, `tests/test_turn_stop_strategy.py`, `docs/DAILY_RTVI_EVENTS.md`, `docs/widget/VOICE_AS_CHAT.md`	Tests for `turn_core` (missing session/template, supersede ordering, approval outcomes), `VoiceUiStreamProcessor` (marker stripping, allowlist gating, known-id reset per response), `WidgetVoiceBridge` (sentence aggregation, filler, barge-in, lock semantics, HITL cards), and `SpeechTimeoutUserTurnStopStrategy` (regression confirming `AccumulatingSpeechTimeoutStrategy` removal). `DAILY_RTVI_EVENTS.md` documents `ui-op`/`ui-action` events; `VOICE_AS_CHAT.md` defines Architecture v2.
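
The atomic claim with race fallback mentioned for `approvals.py` might look roughly like the sketch below. The `outcome` and `winning_status` field names and the `"already_decided"` and `"denied"` values appear in the PR's tests; everything else (the in-memory store, the `"claimed"` and `"not_found"` outcomes, the `"approved"` status) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ClaimResult:
    outcome: str  # "claimed", "already_decided" or "not_found" (assumed values)
    winning_status: Optional[str] = None


class ApprovalStore:
    """In-memory stand-in for the approvals table (hypothetical)."""

    def __init__(self) -> None:
        self._status: Dict[str, str] = {}

    def add_pending(self, tool_call_id: str) -> None:
        self._status[tool_call_id] = "pending"

    def decide_if_pending(self, tool_call_id: str, status: str) -> bool:
        """Mimics UPDATE ... WHERE status = 'pending': True only for the winner."""
        if self._status.get(tool_call_id) != "pending":
            return False
        self._status[tool_call_id] = status
        return True

    def read_status(self, tool_call_id: str) -> Optional[str]:
        return self._status.get(tool_call_id)


def claim(store: ApprovalStore, tool_call_id: str, approved: bool) -> ClaimResult:
    wanted = "approved" if approved else "denied"
    if store.decide_if_pending(tool_call_id, wanted):
        return ClaimResult("claimed", wanted)
    # Lost the race (or no such row): re-read to report the decision that won.
    current = store.read_status(tool_call_id)
    if current is None:
        return ClaimResult("not_found")
    return ClaimResult("already_decided", current)


def run_demo():
    store = ApprovalStore()
    store.add_pending("tc1")
    first = claim(store, "tc1", approved=True)
    second = claim(store, "tc1", approved=False)
    return first, second
```

The point of the conditional update is that two channels (an HTTP `/approve` call and a voice decision) can race on the same tool call, and only one of them resumes the turn.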

Sequence Diagram(s)

sequenceDiagram
  participant Widget as Widget Client
  participant VoiceBot as Daily Voice Bot (Agent)
  participant WidgetVoiceBridge
  participant TurnCore as turn_core (ChatAgent)
  participant Redis as Redis Lock
  participant DB as Database

  Widget->>VoiceBot: user speech (STT finalized)
  VoiceBot->>WidgetVoiceBridge: handle_user_turn(transcript)
  WidgetVoiceBridge->>WidgetVoiceBridge: cancel_inflight(), bump generation
  WidgetVoiceBridge->>Redis: acquire(session_lock)
  Redis-->>WidgetVoiceBridge: acquired
  WidgetVoiceBridge->>TurnCore: run_chat_turn(session_id, content)
  TurnCore->>DB: load session, template, history, agent_state
  TurnCore->>TurnCore: ChatAgent.run_turn(history, agent_state)
  TurnCore-->>WidgetVoiceBridge: SSEEvent stream (assistant_token, ui_op, turn_end)
  WidgetVoiceBridge->>VoiceBot: TTSSpeakFrame(sentence)
  WidgetVoiceBridge->>Widget: RTVI emit("ui-op", op)
  WidgetVoiceBridge->>Widget: RTVI emit("turn-end", status)
  WidgetVoiceBridge->>Redis: release(session_lock)

  Widget->>VoiceBot: RTVI "ui-action" (carousel click)
  VoiceBot->>WidgetVoiceBridge: handle_user_turn(coerced_text)
  note over WidgetVoiceBridge,TurnCore: same turn flow as above
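
The `cancel_inflight(), bump generation` step in the diagram can be sketched as follows. This is an illustrative model under stated assumptions, not the bridge's code: `TurnRunner`, `wait_idle`, and the two demo generators are hypothetical names, and sentences are appended to a list instead of being spoken.

```python
import asyncio
from typing import AsyncIterator, List, Optional


class TurnRunner:
    """Keeps at most one turn in flight; a newer turn supersedes the older one."""

    def __init__(self) -> None:
        self.spoken: List[str] = []
        self._generation = 0
        self._inflight: Optional[asyncio.Task] = None

    async def cancel_inflight(self) -> None:
        self._generation += 1  # anything tagged with an older number is now stale
        task, self._inflight = self._inflight, None
        if task is not None and not task.done():
            task.cancel()
            try:
                await task
            except asyncio.CancelledError:
                pass

    async def handle_user_turn(self, source: AsyncIterator[str]) -> None:
        await self.cancel_inflight()
        generation = self._generation
        self._inflight = asyncio.create_task(self._drive(source, generation))

    async def wait_idle(self) -> None:
        if self._inflight is not None:
            await self._inflight

    async def _drive(self, source: AsyncIterator[str], generation: int) -> None:
        async for sentence in source:
            if generation != self._generation:
                return  # superseded: drop the tail instead of speaking it
            self.spoken.append(sentence)


async def slow_turn(started: asyncio.Event) -> AsyncIterator[str]:
    yield "Let me check that."
    started.set()
    await asyncio.Event().wait()  # never set: simulates a stalled stream
    yield "This tail must never be spoken."


async def quick_turn() -> AsyncIterator[str]:
    yield "Sure, switching to that instead."


async def demo() -> List[str]:
    runner = TurnRunner()
    started = asyncio.Event()
    await runner.handle_user_turn(slow_turn(started))
    await started.wait()  # first sentence is out; now the user barges in
    await runner.handle_user_turn(quick_turn())
    await runner.wait_idle()
    return runner.spoken
```

Task cancellation stops the old turn in the common case; the generation check is the second line of defence for a frame that is already past its last await when the new turn arrives.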

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

juspay/clairvoyance#778: Introduced prepare_resume_node and widget-mode resume/seed logic in agent/__init__.py and flow.py that this PR removes in favor of the WidgetVoiceBridge stream path.
juspay/clairvoyance#824: Established ApprovalManager wiring and on_client_message decision validation that this PR extends with the _voice_bridge.handle_approval_decision(...) branch for stream widget voice.
juspay/clairvoyance#835: Added agent state persistence and rehydration during CHAT↔VOICE resume that this PR supersedes by removing the resume seed/carry-forward pattern entirely.

Suggested reviewers

Tara-ag

Poem

🐰 Hop, hop! The drain is gone, no seeds to stow,
The bridge now streams the chat brain's SSE flow.
ui_stream markers stripped, so TTS won't say JSON—
The filler says "Just a sec!" and then the turn rolls on.
One chat brain rules them all, from widget voice to chat,
Generation counters guard the barge-in, just like that! 🎙️

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 35.40% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately and specifically describes the main architectural change: widget voice now operates through the chat brain in stream mode, which is the primary objective of the PR.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.


coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

app/ai/voice/agents/breeze_buddy/chat/turn_core.py (1)

366-372: 💤 Low value

Consider sorting __all__ alphabetically.

Ruff flags this as unsorted. While not a functional issue, sorting aids readability and reduces merge conflicts.

♻️ Suggested sort

 __all__ = [
-    "run_chat_turn",
-    "run_chat_approval_turn",
+    "build_render_template_vars",
+    "resolve_llm_configuration",
     "run_chat_approval_continuation",
-    "build_render_template_vars",
-    "resolve_llm_configuration",
+    "run_chat_approval_turn",
+    "run_chat_turn",
 ]

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/chat/turn_core.py` around lines 366 - 372,
The __all__ list in turn_core.py is not sorted alphabetically, which Ruff flags
as unsorted. Rearrange the items in the __all__ list (containing
"run_chat_turn", "run_chat_approval_turn", "run_chat_approval_continuation",
"build_render_template_vars", and "resolve_llm_configuration") in alphabetical
order to comply with Ruff's linting requirements and improve readability.

app/ai/voice/agents/breeze_buddy/handlers/internal/end_conversation.py (1)

21-21: 💤 Low value

Add noqa comment for consistency with the drain exception handler.

The exception handler at line 154 intentionally catches a blind Exception for best-effort behavior (matching line 141's pattern), but is missing the # noqa: BLE001 comment for consistency and to silence the static analysis warning.
Suggested fix
-            except Exception as flip_err:
+            except Exception as flip_err:  # noqa: BLE001
Also applies to: 120-159
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/ai/voice/agents/breeze_buddy/handlers/internal/end_conversation.py` at
line 21, Add the `# noqa: BLE001` comment to the bare exception handler(s) in
the end_conversation.py file to suppress the static analysis warning about
catching a blind Exception. Locate the exception handler(s) that catch generic
Exception (particularly around line 154) and add the noqa comment at the end of
the except clause line to maintain consistency with other intentional bare
exception catches in the file that already have this annotation for best-effort
error handling behavior.
Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/widget/VOICE_AS_CHAT.md`:
- Around line 259-264: The fenced code block containing the conditional logic
that checks voiceLive and voiceSession is missing a language identifier, which
causes markdownlint MD040 violations. Add the language identifier "typescript"
to the opening fence of this code block by changing the opening triple backticks
from ``` to ```typescript to properly mark the code language.

In `@tests/test_turn_core.py`:
- Around line 186-196: The test function `test_approval_turn_already_decided`
only validates the first event and is missing an assertion for the terminal
`turn_end` event that should close the turn. Add an additional assertion after
the existing assertions to verify that the last event in the events list has
event type equal to "turn_end" to ensure the turn properly completes and prevent
regressions that drop this terminal event.

In `@tests/test_voice_bridge.py`:
- Around line 244-263: The test function test_barge_in_cancel_drops_tail uses a
fixed asyncio.sleep(0.02) call to coordinate timing between when the first
sentence is spoken and when the barge-in should occur, which makes the test
unreliable on slower CI runners. Replace the fixed sleep with a deterministic
synchronization mechanism by introducing an asyncio.Event that signals when the
first sentence has been fully flushed and spoken, then await that event instead
of using the hardcoded sleep duration. This ensures the test waits for the
actual condition to be met rather than guessing at timing.

---

Nitpick comments:
In `@app/ai/voice/agents/breeze_buddy/chat/turn_core.py`:
- Around line 366-372: The __all__ list in turn_core.py is not sorted
alphabetically, which Ruff flags as unsorted. Rearrange the items in the __all__
list (containing "run_chat_turn", "run_chat_approval_turn",
"run_chat_approval_continuation", "build_render_template_vars", and
"resolve_llm_configuration") in alphabetical order to comply with Ruff's linting
requirements and improve readability.

In `@app/ai/voice/agents/breeze_buddy/handlers/internal/end_conversation.py`:
- Line 21: Add the `# noqa: BLE001` comment to the bare exception handler(s) in
the end_conversation.py file to suppress the static analysis warning about
catching a blind Exception. Locate the exception handler(s) that catch generic
Exception (particularly around line 154) and add the noqa comment at the end of
the except clause line to maintain consistency with other intentional bare
exception catches in the file that already have this annotation for best-effort
error handling behavior.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 46690370-eb85-4293-984e-790248b7bae3

📥 Commits

Reviewing files that changed from the base of the PR and between 7c7f891 and baa991c.

📒 Files selected for processing (24)

app/ai/voice/agents/breeze_buddy/agent/__init__.py
app/ai/voice/agents/breeze_buddy/agent/flow.py
app/ai/voice/agents/breeze_buddy/agent/pipeline.py
app/ai/voice/agents/breeze_buddy/chat/approvals.py
app/ai/voice/agents/breeze_buddy/chat/turn_core.py
app/ai/voice/agents/breeze_buddy/chat/voice_bridge.py
app/ai/voice/agents/breeze_buddy/handlers/internal/end_conversation.py
app/ai/voice/agents/breeze_buddy/processors/__init__.py
app/ai/voice/agents/breeze_buddy/processors/voice_ui_stream.py
app/ai/voice/agents/breeze_buddy/template/interruption.py
app/ai/voice/agents/breeze_buddy/template/session_state.py
app/api/routers/breeze_buddy/chat/handlers.py
app/api/routers/breeze_buddy/widget/handlers.py
app/database/accessor/breeze_buddy/chat_session.py
app/database/accessor/breeze_buddy/lead_call_tracker.py
app/database/queries/breeze_buddy/chat_session.py
app/database/queries/breeze_buddy/lead_call_tracker.py
app/schemas/breeze_buddy/chat.py
docs/DAILY_RTVI_EVENTS.md
docs/widget/VOICE_AS_CHAT.md
tests/test_turn_core.py
tests/test_turn_stop_strategy.py
tests/test_voice_bridge.py
tests/test_voice_ui_stream.py

💤 Files with no reviewable changes (2)

app/database/accessor/breeze_buddy/chat_session.py
app/database/queries/breeze_buddy/chat_session.py

coderabbitai · 2026-06-18T11:54:31Z

+```
+if (voiceLive && voiceSession) {
+  store.appendUserBubble(action.display ?? action.msg);   // optimistic (no server echo)
+  store.sendUserAction({type:'to_assistant', msg: action.msg, display: action.display});
+} else { send(cleaned, bubble); }                           // chat path unchanged
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language identifier to the fenced code block.

This block is missing a language tag and trips markdownlint MD040.

✅ Suggested lint fix

-```
+```typescript
 if (voiceLive && voiceSession) {
   store.appendUserBubble(action.display ?? action.msg);   // optimistic (no server echo)
   store.sendUserAction({type:'to_assistant', msg: action.msg, display: action.display});
 } else { send(cleaned, bubble); }                           // chat path unchanged

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 259-259: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/widget/VOICE_AS_CHAT.md` around lines 259 - 264, The fenced code block containing the conditional logic that checks voiceLive and voiceSession is missing a language identifier, which causes markdownlint MD040 violations. Add the language identifier "typescript" to the opening fence of this code block by changing the opening triple backticks from ``` to ```typescript to properly mark the code language.

Source: Linters/SAST tools

coderabbitai · 2026-06-18T11:54:31Z

+async def test_approval_turn_already_decided(monkeypatch):
+    async def _claim(session_id, tool_call_id, approved, reason):
+        return ApprovalClaim(outcome="already_decided", winning_status="denied")
+
+    monkeypatch.setattr(tc, "claim_tool_approval", _claim)
+    events = await _collect(
+        tc.run_chat_approval_turn(session_id="s", tool_call_id="tc1", approved=False)
+    )
+    assert events[0].event == "function_approval_resolved"
+    assert events[0].data["status"] == "denied"
+


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert terminal turn_end in the already_decided approval test.

This case currently validates only the first event, so a regression that drops the terminal completion event would still pass.

✅ Suggested test hardening

 async def test_approval_turn_already_decided(monkeypatch):
@@
     events = await _collect(
         tc.run_chat_approval_turn(session_id="s", tool_call_id="tc1", approved=False)
     )
-    assert events[0].event == "function_approval_resolved"
+    assert [e.event for e in events] == ["function_approval_resolved", "turn_end"]
     assert events[0].data["status"] == "denied"
+    assert events[1].data["session_status"] == "ACTIVE"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_turn_core.py` around lines 186 - 196, The test function `test_approval_turn_already_decided` only validates the first event and is missing an assertion for the terminal `turn_end` event that should close the turn. Add an additional assertion after the existing assertions to verify that the last event in the events list has event type equal to "turn_end" to ensure the turn properly completes and prevent regressions that drop this terminal event.

coderabbitai · 2026-06-18T11:54:31Z

+async def test_barge_in_cancel_drops_tail(monkeypatch):
+    gate = asyncio.Event()
+
+    async def _gen(*, session_id, user_content, llm=None, context_placement=None):
+        yield SSEEvent(
+            "assistant_token", {"delta": "This is the first sentence here. "}
+        )
+        await gate.wait()  # block until the test releases (it won't — cancelled)
+        yield SSEEvent("assistant_token", {"delta": "Tail that must be dropped."})
+        yield SSEEvent("turn_end", {"session_status": "ACTIVE"})
+
+    monkeypatch.setattr(vb, "run_chat_turn", _gen)
+    bridge, task, _ = _make_bridge()
+    await bridge.handle_user_turn("hi")
+    inflight = bridge._inflight
+    assert inflight is not None
+    # Let the first sentence flush, then barge in.
+    await asyncio.sleep(0.02)
+    assert _spoken(task) == ["This is the first sentence here."]
+    await bridge.cancel_inflight()


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid fixed sleep in barge-in test to prevent flakiness.

Using asyncio.sleep(0.02) makes this test timing-sensitive across slower CI runners.

✅ Deterministic synchronization approach

 async def test_barge_in_cancel_drops_tail(monkeypatch):
     gate = asyncio.Event()
+    first_chunk_seen = asyncio.Event()
@@
     async def _gen(*, session_id, user_content, llm=None, context_placement=None):
         yield SSEEvent(
             "assistant_token", {"delta": "This is the first sentence here. "}
         )
+        first_chunk_seen.set()
         await gate.wait()  # block until the test releases (it won't — cancelled)
         yield SSEEvent("assistant_token", {"delta": "Tail that must be dropped."})
         yield SSEEvent("turn_end", {"session_status": "ACTIVE"})
@@
-    await asyncio.sleep(0.02)
+    await asyncio.wait_for(first_chunk_seen.wait(), timeout=1.0)
     assert _spoken(task) == ["This is the first sentence here."]

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_voice_bridge.py` around lines 244 - 263, The test function test_barge_in_cancel_drops_tail uses a fixed asyncio.sleep(0.02) call to coordinate timing between when the first sentence is spoken and when the barge-in should occur, which makes the test unreliable on slower CI runners. Replace the fixed sleep with a deterministic synchronization mechanism by introducing an asyncio.Event that signals when the first sentence has been fully flushed and spoken, then await that event instead of using the hardcoded sleep duration. This ensures the test waits for the actual condition to be met rather than guessing at timing.
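
A self-contained sketch of why the event-based approach is deterministic: code placed after a `yield` in an async generator runs only when the consumer asks for the next item, so setting the event there proves the first chunk has already been handled. The names `token_stream`, `consume`, and `main` are hypothetical and stand in for the test's `_gen` and the bridge.

```python
import asyncio
from typing import AsyncIterator, List, Tuple


async def token_stream(
    first_seen: asyncio.Event, gate: asyncio.Event
) -> AsyncIterator[str]:
    yield "This is the first sentence here."
    # Reached only once the consumer has processed the first chunk
    # and requested the next one.
    first_seen.set()
    await gate.wait()  # never released in this demo; the consumer is cancelled
    yield "Tail that must be dropped."


async def consume(stream: AsyncIterator[str], spoken: List[str]) -> None:
    async for chunk in stream:
        spoken.append(chunk)


async def main() -> Tuple[List[str], List[str]]:
    first_seen, gate = asyncio.Event(), asyncio.Event()
    spoken: List[str] = []
    task = asyncio.create_task(consume(token_stream(first_seen, gate), spoken))
    # Wait for the actual condition instead of sleeping for a guessed duration.
    await asyncio.wait_for(first_seen.wait(), timeout=1.0)
    snapshot = list(spoken)
    task.cancel()  # the barge-in
    try:
        await task
    except asyncio.CancelledError:
        pass
    return snapshot, spoken
```

The timeout on `wait_for` only bounds how long a broken test can hang; a passing run never waits for it.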

Tara-ag

Review Summary: PR #845 — Widget Voice-as-Chat (Stream Mode)

Overview

This PR re-architects widget voice from a dual-brain design to stream mode (ExecutionMode.DAILY_STREAM), where voice becomes pure audio I/O around the existing chat ChatAgent. Key achievements:

New Core Modules:
- chat/turn_core.py — Channel-agnostic run_chat_turn factored out of HTTP handler
- chat/voice_bridge.py — WidgetVoiceBridge adapting SSE stream to TTS + RTVI events
- processors/voice_ui_stream.py — Generative UI over RTVI (deferred but kept)
Test Coverage: 4 new test files (1,082 lines) covering turn core, voice bridge, turn stop strategy, and voice UI stream

Security & Safety Analysis

Area	Status	Notes
SQL Injection	✅ Safe	All queries use `$1, $2...` positional placeholders via `run_parameterized_query()`
Migration Integrity	✅ Safe	No existing migration files modified; no new migrations needed (uses existing schema)
Secrets	✅ Clean	No hardcoded credentials; secrets flow through KMS-encrypted DB
Auth/Authorization	✅ Consistent	Voice bridge uses same Redis lock key as HTTP paths (`chat:session:{id}:lock`)
Input Validation	✅ Present	`coerce_ui_action_text()` validates/truncates ui-action messages
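
For illustration, validation and truncation of a `ui-action` payload could be sketched like this. The payload shape (`type`, `msg`, `display`) follows the widget snippet quoted earlier in the review; the function name `coerce_action_text`, the 500-character cap, and the whitespace collapsing are assumptions, since the real signature of `coerce_ui_action_text()` is not shown in this PR page.

```python
from typing import Any, Optional

MAX_ACTION_CHARS = 500  # hypothetical cap, not taken from the PR


def coerce_action_text(
    payload: Any, max_chars: int = MAX_ACTION_CHARS
) -> Optional[str]:
    """Return a safe text turn for a client ui-action, or None if unusable."""
    if not isinstance(payload, dict):
        return None
    msg = payload.get("msg")
    if not isinstance(msg, str):
        return None
    # Trim the ends and collapse internal runs of whitespace.
    cleaned = " ".join(msg.split())
    if not cleaned:
        return None
    return cleaned[:max_chars]
```

Returning `None` rather than raising lets the caller drop a malformed client message without disturbing the voice pipeline.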

Key Implementation Highlights

SQL Safety Verified:
- app/database/queries/breeze_buddy/chat_session.py — All queries use parameterized placeholders
- No f-string or %-formatting in SQL construction
- JSONB values passed as bound parameters ($N::jsonb)
Race Condition Prevention:
- Per-session Redis lock (_SESSION_LOCK_TTL_SECONDS = 180) serializes writers
- Lock acquired in _drive() before DB operations, released in finally with asyncio.shield()
- Generation counter prevents tail frame leakage on barge-in cancel
Error Handling:
- CancelledError uncancel pattern (Python 3.11 safe)
- Lock release shielded from cancellation
- Turn errors emit RTVI error events rather than crashing the pipeline
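
The lock handling listed above (acquire before work, skip with an error when busy, shielded release in `finally`) can be sketched with an in-memory stand-in. `FakeSessionLock`, `drive_turn`, and the event strings are hypothetical; the actual bridge uses a Redis lock and RTVI events.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


class FakeSessionLock:
    """In-memory stand-in for the per-session Redis lock (hypothetical)."""

    def __init__(self) -> None:
        self.held = False

    async def acquire(self) -> bool:
        if self.held:
            return False
        self.held = True
        return True

    async def release(self) -> None:
        await asyncio.sleep(0)  # simulate a network round trip
        self.held = False


async def drive_turn(
    lock: FakeSessionLock,
    work: Callable[[], Awaitable[None]],
    events: List[str],
) -> None:
    if not await lock.acquire():
        events.append("error: session busy")  # skip the turn instead of queueing
        return
    try:
        await work()
        events.append("turn-end")
    finally:
        # shield() lets the release finish even if this task is being cancelled.
        await asyncio.shield(lock.release())


async def demo() -> Tuple[bool, List[str]]:
    lock, events = FakeSessionLock(), []

    async def slow() -> None:
        await asyncio.sleep(10)

    task = asyncio.create_task(drive_turn(lock, slow, events))
    await asyncio.sleep(0)  # let the first turn take the lock
    await drive_turn(lock, slow, events)  # second writer finds the session busy
    task.cancel()  # barge-in while the first turn is still working
    try:
        await task
    except asyncio.CancelledError:
        pass
    return lock.held, events
```

Without the shield, a cancellation arriving during the release round trip could leave the lock held until its TTL expires.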

Existing Comments Acknowledged

CodeRabbit's 3 minor suggestions (markdown lint, test assertion, deterministic test sync) are non-blocking and can be addressed in a follow-up.

Approval

No blocking issues found. The implementation follows project conventions for:

SQL parameterization (asyncpg $N style)
Layered architecture (queries → accessor → decoder)
Redis-backed distributed locking
Fail-open degradation with proper logging

Approved for merge.

Re-architect Buddy Assist widget voice from a dual-brain design (a separate voice FlowManager+LLM kept in sync with the chat ChatAgent) to stream mode (ExecutionMode.DAILY_STREAM: STT in / TTS out, no LLM in the pipeline) driven by the existing chat ChatAgent. Voice becomes pure audio I/O around one brain, so cart_id, content-block history, agent_state, carousels and HITL are all inherited from chat for free. Telephony (Twilio/Plivo/Exotel) is untouched.

- chat/turn_core.py: channel-agnostic run_chat_turn (+ approval continuation) factored out of the chat HTTP handler so the voice subprocess drives the same brain.
- chat/voice_bridge.py: WidgetVoiceBridge taps on_user_turn_stopped -> run_chat_turn -> adapts the SSE stream into TTSSpeakFrame + RTVI events; holds the per-session Redis lock per turn; barge-in cancels the in-flight turn.
- HITL over voice rides the chat approval path: the gated call surfaces an inline, persistent approval card (same kind:'approval' message as chat) and the prompt is spoken audio-only so the card text isn't duplicated.
- Forward function-call-started/completed RTVI events so the widget can show a "thinking / executing <tool>" state on the voice orb (the bridge previously consumed function_call_started only for the TTS filler).
- Carousel-click injection (ui-action) during voice. Generative-UI *output* over RTVI for non-widget Daily agent-mode is DEFERRED; VoiceUiStreamProcessor is kept but not plugged into the agent pipeline; see docs/widget/VOICE_GENERATIVE_UI_TODO.md.
- Delete the dual-brain sync layer (prepare_resume_node, voice drain, ui-blocks-for-voice, agent_state seeding).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai Bot reviewed Jun 18, 2026

swaroopvarma1 force-pushed the feat/widget-voice-as-chat-backend branch from baa991c to ad7e4b7 Compare June 18, 2026 12:19

Tara-ag approved these changes Jun 18, 2026

swaroopvarma1 force-pushed the feat/widget-voice-as-chat-backend branch 3 times, most recently from 8ad4700 to b97e865 Compare June 18, 2026 15:26

swaroopvarma1 force-pushed the feat/widget-voice-as-chat-backend branch from b97e865 to 63c1810 Compare June 18, 2026 15:29

swaroopvarma1 merged commit 3990db9 into release Jun 18, 2026
7 of 8 checks passed

swaroopvarma1 merged 1 commit into release from feat/widget-voice-as-chat-backend
