feat: streaming synthesis by askalf · Pull Request #7 · askalf/deepdive

askalf · 2026-04-23T01:23:38Z

Summary

A deep query's final synthesis is a single 30–60s LLM call. Until now the user stared at a frozen terminal for the whole duration, then got the full answer at once. Now tokens land on stdout as the model writes them.

When it's on

On by default when ALL of the following hold:

- Not --json (JSON is buffered into the envelope)
- Not --deep (intermediate rounds would print multiple full drafts back-to-back)
- stdout.isTTY (piped output shouldn't interleave with progress events)
- Not --no-stream (explicit opt-out)

Off otherwise. Setting the DEEPDIVE_FORCE_STREAM=1 environment variable bypasses the TTY check, which is useful for CI and testing.
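
The conditions above reduce to a small pure function. The following is a minimal sketch, not the PR's actual code: the `StreamInputs` shape and the name `deriveStreamEnabled` are hypothetical, and it assumes DEEPDIVE_FORCE_STREAM only replaces the TTY check while the other three opt-outs still apply.

```typescript
// Hypothetical input shape; the field names are assumptions, not the PR's types.
interface StreamInputs {
  json: boolean;        // --json was passed
  deep: boolean;        // --deep was passed
  noStream: boolean;    // --no-stream was passed
  isTTY: boolean;       // process.stdout.isTTY
  forceStream: boolean; // DEEPDIVE_FORCE_STREAM === "1"
}

// Streaming is on only when no flag disables it AND stdout is a TTY
// (or the TTY check is bypassed via the env var).
function deriveStreamEnabled(i: StreamInputs): boolean {
  if (i.json || i.deep || i.noStream) return false;
  return i.isTTY || i.forceStream;
}
```

Deriving the value once from plain booleans keeps the decision testable as a matrix, without touching process.stdout or process.env in the test itself.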

CLI output structure under streaming

1. Write # {question}\n\n up front
2. Stream answer tokens straight to stdout
3. After synthesis, write the ## Sources\n... block

If --out is also set, the full markdown (re-rendered from the buffered result) goes to the file, so the file stays atomically complete.
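
The output ordering above can be sketched as follows. This is an illustration under assumptions, not the code in src/cli.ts: `streamAnswer`, `renderSources`, the `Source` shape, and the exact layout of the sources block are all hypothetical. The function writes the header, forwards each token, then writes the sources, and returns the full document re-rendered from the buffered answer so that a caller could write it to the --out file in one piece.

```typescript
type Write = (s: string) => void;

// Hypothetical source record; the PR does not show the real shape.
interface Source {
  title: string;
  url: string;
}

// Assumed layout for the "## Sources" block.
function renderSources(sources: Source[]): string {
  const lines = sources.map((s, i) => `${i + 1}. ${s.title} (${s.url})\n`);
  return "## Sources\n" + lines.join("");
}

function streamAnswer(
  question: string,
  tokens: Iterable<string>,
  sources: Source[],
  write: Write,
): string {
  // 1. Header goes out before the first token arrives.
  write(`# ${question}\n\n`);

  // 2. Tokens are forwarded as-is while also being buffered.
  let answer = "";
  for (const token of tokens) {
    write(token);
    answer += token;
  }

  // 3. Sources follow once synthesis is complete.
  write("\n\n" + renderSources(sources));

  // Full markdown rebuilt from the buffer, suitable for a single file write.
  return `# ${question}\n\n${answer}\n\n${renderSources(sources)}`;
}
```

In the real CLI the tokens arrive asynchronously; a synchronous iterable is used here only to keep the sketch short.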

Implementation

src/llm-stream.ts — new callLLMStream with async-generator SSE parser. Reuses the existing retry helper for the initial connect; mid-stream failures propagate (can't undo already-emitted tokens).
src/synthesize.ts — optional onToken param; streams when set, buffered callLLM path otherwise.
src/agent.ts — new AgentConfig.onSynthesizeToken?: (chunk, round) => void hook.
src/cli.ts — picks streaming mode, writes header + sources around the stream.
src/config.ts — streamEnabled derived once; auto-off for JSON/deep/env-opt-out.
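
To make the frame-parsing part of src/llm-stream.ts concrete, here is a minimal sketch of parsing one SSE block. It follows the general server-sent events format (comment lines start with a colon, one leading space after the colon is stripped, multiple data lines are joined with a newline) and the behaviors named in the test plan. The name `parseBlockSketch` and the `BlockResult` type are hypothetical; the PR's own parseBlocks may differ in signature and return shape.

```typescript
// Hypothetical result type for one SSE block.
type BlockResult =
  | { kind: "event"; data: unknown } // JSON payload parsed successfully
  | { kind: "done" }                 // [DONE] sentinel
  | { kind: "skip" };                // empty, comment-only, or malformed block

function parseBlockSketch(block: string): BlockResult {
  const dataLines: string[] = [];
  for (const line of block.split(/\r?\n/)) {
    // Blank lines and comment lines (leading colon) carry no data.
    if (line === "" || line.startsWith(":")) continue;
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    // Strip exactly one leading space after the colon.
    if (value.startsWith(" ")) value = value.slice(1);
    if (field === "data") dataLines.push(value);
  }
  if (dataLines.length === 0) return { kind: "skip" };

  // Multiple data lines fold into one payload joined by newlines.
  const payload = dataLines.join("\n");
  if (payload === "[DONE]") return { kind: "done" };
  try {
    return { kind: "event", data: JSON.parse(payload) };
  } catch {
    // Malformed JSON is dropped rather than thrown.
    return { kind: "skip" };
  }
}
```

Keeping this step pure (string in, value out) is what lets it be tested without a network connection or a stream.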

Test plan

npm run build — clean under strict: true
npm test — 180 pass (up from 164), 0 fail
16 new assertions:

- parseBlocks (6): single-line, multi-line data folding, leading-space stripping, empty/comment blocks, [DONE] sentinel, malformed-JSON drop
- parseSSE (4): multi-frame, chunk boundary splits, trailing event without blank line, CRLF endings
- callLLMStream integration (4): token order + text aggregation + usage parsing, 500→200 retry, 401 does not retry, non-text_delta events ignored
- CLI/config (2): --no-stream flag, config derivation matrix
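
The parseSSE cases listed above (frames split across chunk boundaries, CRLF endings, a trailing event with no closing blank line) come down to buffering. A minimal sketch follows, with two assumptions: the class name `FrameSplitter` is hypothetical, and the trailing event is emitted at end of stream rather than discarded (the test name says the case is handled but not how). Bare CR line endings are not handled here.

```typescript
// Hypothetical helper: accumulates raw text chunks and yields complete frames.
class FrameSplitter {
  private buf = "";

  // Feed one chunk; returns every frame closed by a blank line so far.
  feed(chunk: string): string[] {
    // Normalize CRLF after appending, so a "\r" at the end of one chunk
    // still pairs with a "\n" at the start of the next.
    this.buf = (this.buf + chunk).replace(/\r\n/g, "\n");
    const frames: string[] = [];
    let idx: number;
    while ((idx = this.buf.indexOf("\n\n")) !== -1) {
      const frame = this.buf.slice(0, idx);
      this.buf = this.buf.slice(idx + 2);
      if (frame !== "") frames.push(frame);
    }
    return frames;
  }

  // Call at end of stream; emits a trailing frame that never got its blank line.
  flush(): string[] {
    const rest = this.buf.replace(/[\r\n]+$/, "");
    this.buf = "";
    return rest === "" ? [] : [rest];
  }
}
```

An async generator like the PR's parseSSE could wrap the same logic by calling `feed` for each decoded chunk and `flush` once the stream ends.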

… them

A deep query's final synthesis is a single 30–60s LLM call. Until now the user stared at a frozen terminal for the whole duration, then got the entire answer at once. Now the tokens stream to stdout as soon as they arrive.

Turned on by default when ALL of:
- not --json (JSON is buffered into the envelope)
- not --deep (intermediate rounds would print multiple full drafts back-to-back)
- stdout.isTTY (piped output shouldn't interleave with progress events)
- not --no-stream (explicit opt-out)

Turned off otherwise. Env-var DEEPDIVE_FORCE_STREAM=1 bypasses the TTY check for CI-style testing.

CLI output structure under streaming:
1. Write "# {question}\n\n" up front
2. Stream answer tokens straight to stdout as they arrive
3. After synthesis completes, write the "## Sources\n..." block

If --out is also set, the full markdown (re-rendered from the buffered result) goes to the file so the file remains atomically complete.

Implementation:
- src/llm-stream.ts: callLLMStream + parseSSE async generator + pure parseBlocks frame parser. Reuses the existing retry helper for the initial connect; mid-stream failures propagate.
- src/synthesize.ts: added optional onToken param; passes through to streaming variant when set, falls back to buffered callLLM otherwise.
- src/agent.ts: added AgentConfig.onSynthesizeToken, which forwards chunks with the current round number.
- src/cli.ts: picks streaming mode, writes header + sources around the stream.
- src/config.ts: streamEnabled derived once and auto-off for json/deep/env opt-out.

Tests: 16 new assertions (180 total, up from 164 pre-branch, 96 before the production-grade track started).
- parseBlocks (6): single-line, multi-line data folding (spec-compliant), leading-space stripping, empty/comment blocks, [DONE] sentinel, malformed-JSON drop
- parseSSE (4): multi-frame stream, chunk boundary splitting, trailing event without blank line, CRLF line endings
- callLLMStream integration (4): token order + full-text aggregation + usage parsing, 500-retry-then-succeed, 401-does-not-retry, non-text_delta events ignored gracefully
- CLI/config (2): --no-stream flag, streamEnabled derivation matrix

askalf enabled auto-merge (squash) April 23, 2026 01:23

askalf merged commit a942e41 into master Apr 23, 2026
4 checks passed

askalf deleted the feat/streaming-synth branch April 23, 2026 01:24
