Bound all unbounded retry and polling loops; validate LLM-built camera tree #57
Open
fviolette26 wants to merge 1 commit into
…a tree

Several failure paths previously hung forever instead of erroring:

- utils download_image/download_video used bare @retry (retry forever, no wait) around requests.get with no timeout: an expired signed URL became an infinite hot loop. Now: 3 attempts, exponential backoff, fail-fast on 4xx, connect/read timeouts.
- The doubao-seedance client retried task creation every second forever on any exception (including auth errors, which it never surfaced) and polled with no deadline. The omni client had the same create loop and an unbounded default poll. Both now check HTTP status, fail fast on 4xx, bound create retries, and default the poll deadline (300 polls).
- Event/scene extraction looped on the LLM-asserted is_last flag with no cap, so a model that never set it spent tokens without bound. The extractor and both pipeline loops now abort at a hard cap.
- construct_camera_tree accepted whatever parent graph the LLM emitted; a cycle deadlocked frame generation forever (cameras awaiting events only their descendants would set). The tree is now validated for length, unknown parents, self-parents, and cycles. Also removes a duplicated parent_shot_idx assignment.

Adds regression tests for every bound.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
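For reviewers who want a concrete picture of the camera-tree checks listed in the commit message, here is a minimal sketch. It assumes the LLM output has been reduced to a list where `parents[i]` is the parent index of camera `i`, or `None` for a root; that representation, the `validate_camera_tree` name, the `expected_len` parameter, and `CameraTreeError` are all hypothetical and may not match what `construct_camera_tree` actually does.

```python
from typing import List, Optional, Sequence


class CameraTreeError(ValueError):
    """Raised when the parent graph is not a valid forest (hypothetical name)."""


def validate_camera_tree(parents: Sequence[Optional[int]], expected_len: int) -> None:
    """Check length, unknown parents, self-parents, and cycles.

    Assumed layout: parents[i] is the parent index of camera i, or None for a root.
    """
    if len(parents) != expected_len:
        raise CameraTreeError(f"expected {expected_len} cameras, got {len(parents)}")

    for idx, parent in enumerate(parents):
        if parent is None:
            continue
        if parent == idx:
            raise CameraTreeError(f"camera {idx} is its own parent")
        if not 0 <= parent < len(parents):
            raise CameraTreeError(f"camera {idx} has unknown parent {parent}")

    # Walk upward from every camera. Reaching a camera that is still on the
    # current path means the parent links loop back on themselves.
    UNVISITED, ON_PATH, DONE = 0, 1, 2
    state = [UNVISITED] * len(parents)
    for start in range(len(parents)):
        path: List[int] = []
        node: Optional[int] = start
        while node is not None and state[node] == UNVISITED:
            state[node] = ON_PATH
            path.append(node)
            node = parents[node]
        if node is not None and state[node] == ON_PATH:
            raise CameraTreeError(f"cycle detected through camera {node}")
        for visited in path:
            state[visited] = DONE
```

Rejecting the graph before any frame generation starts turns the deadlock described above into an immediate, attributable error. Each camera is marked at most once, so the check is linear in the number of cameras.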
Summary
Several failure paths previously hung forever (or spent tokens without bound) instead of surfacing an error:
- `utils/image.py` / `utils/video.py`: `download_image` / `download_video` used bare `@retry` (tenacity: retry forever, zero wait) around a `requests.get` with no timeout, so an expired signed URL became an infinite hot loop. Now: 3 attempts, exponential backoff, fail-fast on 4xx, connect/read timeouts (shared policy in `utils/retry.py`).
- `tools/video_generator_doubao_seedance_yunwu_api.py`: task creation retried every second forever on any exception (a bad API key never surfaced; the exception wasn't even included in the log message), and polling had no deadline. Now checks HTTP status, fails fast on 4xx, bounds create retries, and polls with a deadline plus a consecutive-error cap.
- `tools/video_generator_omni_yunwu_api.py`: same unbounded create loop, fixed the same way; the poll deadline now defaults to 300 polls instead of unbounded `None`.
- `agents/event_extractor.py` / `pipelines/novel2movie_pipeline.py`: event/scene extraction looped on the LLM-asserted `is_last` flag with no cap; a model that never sets it spent tokens forever. Hard caps now abort with a clear error.
- `agents/camera_image_generator.py`: `construct_camera_tree` accepted whatever parent graph the LLM emitted; a cycle deadlocked frame generation forever (cameras awaiting events only their descendants would set), with no error. The tree is now validated for length, unknown parents, self-parents, and cycles. Also removes a duplicated `parent_shot_idx` assignment.

Test plan
`uv run --with pytest python -m pytest tests/`: 116 passed (102 existing + 14 new in `tests/test_hang_guards.py`). The new tests' fakes succeed after N calls, so the fixed code must give up before the fake would have succeeded; against the old behavior they fail fast instead of hanging.

🤖 Generated with Claude Code
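The test-plan idea (a fake that only succeeds after N calls, paired with a bound smaller than N) can be illustrated with a stdlib-only sketch. Everything named here is hypothetical: `call_with_bounded_retry`, `FatalError`, and `SucceedsAfter` stand in for the shared retry policy and the test fakes, and are not taken from `utils/retry.py` or `tests/test_hang_guards.py`.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class FatalError(Exception):
    """Stand-in for a non-retryable failure such as an HTTP 4xx (hypothetical)."""


def call_with_bounded_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn at most max_attempts times, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except FatalError:
            raise  # fail fast: retrying cannot fix a client error
        except Exception:
            if attempt == max_attempts:
                raise  # bound reached: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


class SucceedsAfter:
    """Fake that raises ConnectionError for its first n calls, then returns 'ok'."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.calls = 0

    def __call__(self) -> str:
        self.calls += 1
        if self.calls <= self.n:
            raise ConnectionError("transient failure")
        return "ok"


def test_gives_up_before_fake_would_succeed() -> None:
    fake = SucceedsAfter(10)
    waits: List[float] = []  # record sleeps instead of actually waiting
    try:
        call_with_bounded_retry(fake, max_attempts=3, sleep=waits.append)
    except ConnectionError:
        pass
    else:
        # An unbounded retry loop would reach call 11 and land here.
        raise AssertionError("retry loop did not give up")
    assert fake.calls == 3
    assert waits == [0.5, 1.0]
```

Because the fake eventually succeeds, a regression to unbounded retrying makes the test fail quickly in the `else` branch rather than hang the suite. Injecting `sleep` keeps the test instant while still checking the backoff schedule.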