Summary
The livepeer_gateway.trickle_publisher is throwing TrickleSegmentWriteError during active sessions (not on teardown), failing to POST trickle segments to the orchestrator. This is distinct from #846, which covers 404 errors on session teardown. Here the stream is confirmed live (hundreds of prior segments had succeeded) when a segment POST fails mid-session.
cc @mjh1 @emranemran
Error Logs (Grafana Loki, 2026-04-10 ~20:18–20:21 UTC)
2026-04-10 20:18:47,802 - livepeer_gateway.trickle_publisher - ERROR - Trickle POST exception url=https://orch-staging-1.daydream.monster:8935/ai/trickle/d21a61c3-4-out/583 error=Trickle POST exception ...
2026-04-10 20:20:52,724 - livepeer_gateway.trickle_publisher - WARNING - Trickle POST retrying same segment url=.../d21a61c3-4-out/584 (no request body consumed)
2026-04-10 20:21:52,932 - livepeer_gateway.trickle_publisher - ERROR - Trickle POST exception url=.../d21a61c3-4-out/584 error=...
Stack Trace
livepeer_gateway.trickle_publisher.TrickleSegmentWriteError: Trickle POST exception url=https://orch-staging-1.daydream.monster:8935/ai/trickle/d21a61c3-4-out/583
File "/app/.venv/lib/python3.12/site-packages/livepeer_gateway/media_publish.py", line 856, in _stream_pipe_to_trickle
File "/app/.venv/lib/python3.12/site-packages/livepeer_gateway/trickle_publisher.py", line 574, in write
File "/app/.venv/lib/python3.12/site-packages/livepeer_gateway/trickle_publisher.py", line 224, in _run_post
Context
- Session: d21a61c3 on orch-staging-1.daydream.monster:8935
- Frequency: 2 segment failures (seq 583 and 584) in the same session
- Session stats at time of error: 583 segments started, 582 completed, 1 failed; the stream had been active ~25 minutes (elapsed_s=1490)
- App: github_f1lhgmk5v76a0ev1w0u378by-scope-livepeer-staging
- Pattern: segment 583 POST failed → retry of 584 also failed with 'no request body consumed' → both end in TrickleSegmentWriteError
Probable Cause
The orchestrator dropped the connection or rejected the write mid-session (not EOF/404). The 'no request body consumed' warning on retry suggests the orch-side HTTP server may have closed the connection before reading the body — possibly a timeout, transient network issue, or orch-staging-1 hiccup.
Related Issues
- #846 — 404 errors on session teardown (distinct from this mid-session failure)
Suggested Fix
- Distinguish explicitly between mid-session write failures and teardown 404s in error handling
- Add retry logic with backoff for TrickleSegmentWriteError (the publisher currently appears to do one retry, then fail the segment)
- Add alerting/metrics when segments_failed > 0 in MediaPublishStats
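The backoff suggestion could look roughly like the sketch below. This is not the real publisher code: `post_segment`, the retry count, and the delays are assumptions.

```python
import time


class TrickleSegmentWriteError(Exception):
    """Raised when a trickle segment POST ultimately fails."""


def post_with_backoff(post_segment, url, data, retries=3, base_delay=0.5):
    """Retry a failing segment POST with exponential backoff.

    `post_segment` is a hypothetical callable that raises on transport
    errors; today the publisher appears to retry once and then fail.
    """
    for attempt in range(retries + 1):
        try:
            return post_segment(url, data)
        except Exception as exc:
            if attempt == retries:
                raise TrickleSegmentWriteError(
                    f"Trickle POST exception url={url}") from exc
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

A bounded retry count keeps a hard orchestrator outage from stalling the segment pipeline indefinitely.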