Delay transfer and end_call until agent finishes speaking by drago-balto · Pull Request #204 · cartesia-ai/line

drago-balto · 2026-04-13T22:17:48Z

Summary

Adds an after_speech: bool = False flag to the AgentTransferCall and AgentEndCall events
When after_speech=True, ConversationRunner waits for the TTS idle signal before sending the event over the websocket, preventing the agent from being cut off mid-sentence
Built-in transfer_call and end_call tools set after_speech=True by default
Custom tools can opt out by setting after_speech=False
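
To illustrate the flag described above, here is a minimal sketch using plain dataclasses. These classes are stand-ins rather than the actual definitions in the line package: only the after_speech field and its False default come from this PR, while target_phone_number and the two helper functions are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class AgentEndCall:
    """Stand-in for the real event; only after_speech comes from this PR."""
    after_speech: bool = False


@dataclass
class AgentTransferCall:
    """Stand-in for the real event; target_phone_number is a hypothetical field."""
    target_phone_number: str
    after_speech: bool = False


def builtin_transfer(number: str) -> AgentTransferCall:
    # Mirrors the described built-in default: defer until speech has finished.
    return AgentTransferCall(target_phone_number=number, after_speech=True)


def custom_immediate_transfer(number: str) -> AgentTransferCall:
    # A custom tool opting out: the event is sent without waiting for TTS.
    return AgentTransferCall(target_phone_number=number, after_speech=False)
```

Keeping the default at False means existing code that constructs these events directly keeps its current timing, and only the built-in tools change behavior.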

Problem

When the LLM generates speech text and a transfer/end_call tool call in the same turn, the transfer fires immediately while TTS is still playing, cutting off the agent mid-sentence. A fixed sleep delay is not a good solution since speech length varies.

How it works

ConversationRunner tracks TTS state via AgentStateInput (speaking/idle) using an asyncio.Event
Before sending an after_speech=True event, the runner waits for the idle signal (with a 30s safety timeout)
If no text was sent in the current turn, the wait is skipped entirely
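
The three steps above can be sketched roughly as follows. This is an approximation of the described logic, not the code from the PR: wait_for_speech, its return values, and SPEECH_WAIT_TIMEOUT_S are hypothetical names, and the real ConversationRunner may structure this differently.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

# The PR describes a 30s safety timeout; the constant name is an assumption.
SPEECH_WAIT_TIMEOUT_S = 30.0


async def wait_for_speech(
    speech_done: asyncio.Event,
    has_sent_text: bool,
    timeout: float = SPEECH_WAIT_TIMEOUT_S,
) -> str:
    """Return 'skipped', 'idle', or 'timeout' (hypothetical helper)."""
    if not has_sent_text:
        # Nothing was spoken this turn, so there is nothing to wait for.
        return "skipped"
    try:
        # Block until the TTS idle signal sets the event, or give up.
        await asyncio.wait_for(speech_done.wait(), timeout=timeout)
        return "idle"
    except asyncio.TimeoutError:
        logger.warning("Timed out waiting for speech to complete")
        return "timeout"
```

In this sketch the caller would send the transfer or end_call event in all three cases; the timeout only bounds how long a missing idle signal can hold up the call.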

Test plan

All existing tests pass (380 passed)
Manual test: agent says long farewell + transfers in same turn — speech completes before transfer
Manual test: after_speech=False — transfer fires immediately, cutting off speech
Manual test: transfer with no preceding text — no hang, transfers immediately

🤖 Generated with Claude Code

Note

Medium Risk
Changes call-control timing by delaying end_call/transfer_call until TTS reports idle, which could introduce waits/timeouts or ordering differences in live call flows.

Overview
Adds an after_speech flag to AgentEndCall and AgentTransferCall output events so call-control actions can be deferred until the agent finishes speaking.

Updates ConversationRunner to track TTS speaking/idle via an asyncio.Event and, when an output event has after_speech=True (and text was sent this turn), waits up to 30s for speech to complete before sending the event over the websocket.

Sets the built-in end_call and transfer_call tools to emit after_speech=True by default, preventing transfers/hangups from cutting off same-turn TTS.

Reviewed by Cursor Bugbot for commit e5d41ca. Bugbot is set up for automated code reviews on this repo.

When the LLM generates speech text and a transfer/end_call tool call in the same turn, the transfer would fire immediately over the websocket while TTS was still playing, cutting off the agent mid-sentence. This adds an `after_speech` flag to AgentTransferCall and AgentEndCall that makes the ConversationRunner wait for the TTS idle signal before sending the event.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cursor · 2026-04-13T22:22:41Z

+                        except asyncio.TimeoutError:
+                            logger.warning(
+                                f"Timed out waiting for speech to complete before {type(output).__name__}"
+                            )


Race condition: speech_done wait returns immediately, defeating feature

High Severity

The speech_done event is initialized as SET and only cleared when AgentStateInput(SPEAKING) arrives from the remote client. But in the primary use case (LLM generates text + end_call/transfer in the same turn), the runner task sends text and then immediately checks speech_done.wait() — long before the client has received the text, started TTS, and sent back the SPEAKING signal over the network. Since speech_done is still SET, wait() returns instantly, and the transfer/end_call fires immediately, cutting off the agent mid-sentence. The speech_done event needs to be cleared locally when AgentSendText is sent (where has_sent_text is set), not when the remote SPEAKING acknowledgement arrives.

Additional Locations (2)

line/voice_agent_app.py#L305-L312

line/voice_agent_app.py#L502-L508
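
The fix suggested in this report can be sketched as below. This is an illustration under assumptions, not the repository's code: SpeechTracker and its method names are hypothetical, and the "speaking"/"idle" strings stand in for the AgentStateInput values.

```python
import asyncio


class SpeechTracker:
    """Hypothetical helper; not part of the line codebase."""

    def __init__(self) -> None:
        self.speech_done = asyncio.Event()
        self.speech_done.set()  # idle until some text has been sent
        self.has_sent_text = False

    def on_text_sent(self) -> None:
        # Clear the event locally at send time, so a wait() issued right
        # after cannot return before the remote SPEAKING signal arrives.
        self.has_sent_text = True
        self.speech_done.clear()

    def on_agent_state(self, state: str) -> None:
        # Remote TTS state updates still drive the event afterwards.
        if state == "speaking":
            self.speech_done.clear()
        elif state == "idle":
            self.speech_done.set()
```

One caveat of this sketch: an idle signal that belongs to an earlier utterance and arrives after on_text_sent would set the event too early, so a fuller version would likely need to correlate idle signals with the text they refer to.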


drago-balto · 2026-04-13T22:30:04Z

Note: this implements the behavior for both end_call and transfer_call. However, the observation from testing is that the Cartesia harness may already be waiting for the agent to finish speaking before ending the call, and thus only the transfer_call changes are of any consequence.

That also suggests that this PR may not be needed at all, and that a better way of handling this might be to extend the harness behavior (of waiting for agent to stop speaking) to both end_call and transfer_call.

sauhardjain · 2026-04-18T00:58:49Z

This will be fixed on the harness-side! We're also working on making these events interruptible/uninterruptible, so that you can reliably control speech before the terminal states play out. #210

Thank you for flagging and putting up the PR anyway!

cursor bot reviewed Apr 13, 2026

sauhardjain closed this Apr 18, 2026

drago-balto wants to merge 1 commit into cartesia-ai:main from drago-balto:dmm/delay-transfer-end-call

2 participants
