fix(connect): keep spirc running after transient connection-id update failure #1716
The `connection_id_update` arm of the spirc `select!` loop is the only arm that treats a handler error as fatal: it `break`s out of the loop, which leads to the "unexpected shutdown" path and terminates the spirc while the librespot process stays alive. The device then silently disappears from Spotify Connect until the process is manually restarted.

Such failures are typically transient: `handle_connection_id_update` issues the connect-state PUT (`notify_new_device_appeared`), which can race a dealer websocket reset and surface as `SpircError::FailedDealerSetup`.

The connection_id arrives via a long-lived dealer subscription to `hm://pusher/v1/connections/`, and the pusher re-sends a fresh connection_id on every dealer reconnect, so logging and continuing lets the next update retry registration, exactly like every other arm in this loop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR changes Spirc’s behavior to avoid shutting down when handling a connection-id update fails, and documents the behavior change in the changelog.
Changes:
- Stop breaking out of the Spirc loop on connection-id update handling errors.
- Add a changelog entry describing the updated behavior and user-facing impact.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| connect/src/spirc.rs | Removes the break on connection-id update handling failure so Spirc continues running. |
| CHANGELOG.md | Documents the new behavior so releases capture the operational/user impact. |
Heads-up for reviewers: the red CI status is already addressed by the open clippy PRs. #1711 fixes the sort lint that fails here, and #1712 fixes a second stable-clippy lint (clippy aborts on the first error, so that one would surface next). Merging those would turn CI green here without touching this PR, so I've kept this diff to the single relevant line rather than duplicating their fixes.
Problem

In `SpircTask::run`, the `connection_id_update` arm of the main `tokio::select!` loop is the only arm that treats a handler error as fatal: it `break`s out of the loop.

Breaking leads straight to the `unexpected shutdown` path and terminates the spirc, but the librespot process stays alive. So process-level supervision (`Restart=on-failure`, etc.) never fires, and the device silently disappears from Spotify Connect until it is manually restarted.

The failure is usually transient: `handle_connection_id_update` issues the connect-state PUT (`notify_new_device_appeared`), which can race a dealer websocket reset and surface as `SpircError::FailedDealerSetup`.

Fix

Drop the `break;` so the arm logs and continues, identical in shape to the seven sibling arms in the same `select!`. The connection_id is delivered through a long-lived dealer subscription to `hm://pusher/v1/connections/`, and the pusher re-sends a fresh connection_id on every dealer reconnect, so a transient failure is retried on the next update instead of taking down the whole controller.

Real-world impact

A single transient `failed to put connect state for new device` left a `raspotify` instance wedged (process healthy, `systemctl` reporting the unit `active`, discovery socket still listening, but gone from Connect) for 13 days before it was noticed and restarted.

Notes

- `break` and adds separate reconnect machinery). This can land as a small standalone robustness fix.
- `connect/src/spirc.rs` has no test coverage around `run()`, so there is no unit test to update; verified by reproducing the wedge against a live instance.

🤖 Generated with Claude Code
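
For readers who want to see the behavioral difference in isolation, here is a minimal, std-only sketch of the idea. It is not the librespot code: `Event`, `OnError`, `register`, and `run_loop` are hypothetical names invented for illustration, a plain `for` loop stands in for the `tokio::select!` loop, and a list of "failing" ids stands in for a transient failure of the connect-state PUT.

```rust
// Simplified model only; all names below are hypothetical, not librespot APIs.

/// Events the modelled loop can receive.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    /// A fresh connection id, as re-sent after each (modelled) dealer reconnect.
    ConnectionIdUpdate(String),
    /// Any other event; handled as a no-op here.
    Other,
}

/// What the connection-id arm does when its handler fails.
#[derive(Clone, Copy)]
enum OnError {
    /// Leave the loop (the behavior described under "Problem").
    Break,
    /// Log the error and keep looping (the behavior described under "Fix").
    LogAndContinue,
}

/// Stand-in for device registration: fails for ids listed in `failing`.
fn register(id: &str, failing: &[&str]) -> Result<(), String> {
    if failing.contains(&id) {
        Err(format!("failed to put connect state for new device ({id})"))
    } else {
        Ok(())
    }
}

/// Processes events in order. Returns the ids that registered successfully
/// and the number of events the loop looked at before it ended.
fn run_loop(events: &[Event], failing: &[&str], on_error: OnError) -> (Vec<String>, usize) {
    let mut registered = Vec::new();
    let mut handled = 0;
    for event in events {
        handled += 1;
        match event {
            Event::ConnectionIdUpdate(id) => {
                if let Err(e) = register(id, failing) {
                    eprintln!("error: {e}");
                    match on_error {
                        // Exits the `for` loop: later events are never seen.
                        OnError::Break => break,
                        // Skips to the next event: a later update can retry.
                        OnError::LogAndContinue => continue,
                    }
                }
                registered.push(id.clone());
            }
            Event::Other => {}
        }
    }
    (registered, handled)
}
```

Feeding the model a failing update followed by a second update shows the difference: with `OnError::Break` the loop stops at the first event and nothing is ever registered, while with `OnError::LogAndContinue` the second update is reached and registers.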