Skip to content

chore(tests): simplify GracefulShutdownApp#963

Draft
efiop wants to merge 3 commits intomainfrom
ruslan/serv-579-fix-flaky-test_graceful_shutdown-and-test_worker_env_vars
Draft

chore(tests): simplify GracefulShutdownApp#963
efiop wants to merge 3 commits intomainfrom
ruslan/serv-579-fix-flaky-test_graceful_shutdown-and-test_worker_env_vars

Conversation

@efiop
Copy link
Copy Markdown
Contributor

@efiop efiop commented Apr 8, 2026

No description provided.

@linear
Copy link
Copy Markdown

linear bot commented Apr 8, 2026

@efiop
Copy link
Copy Markdown
Contributor Author

efiop commented Apr 9, 2026

@claude review

@claude
Copy link
Copy Markdown

claude bot commented Apr 9, 2026

Claude finished @efiop's task in 1m 30s —— View job


Code Review

  • Gather context
  • Read changed file and diff
  • Analyze changes
  • Post review feedback

This is a solid improvement for fixing the flaky tests. The old approach was inherently racy: it waited fixed sleep times (2s or 60s), then used file system writes (/data/teardown/{uuid}.txt) + an HTTP endpoint to read results, creating multiple failure points. The new approach uses log polling, which is much more robust.

What changed

GracefulShutdownApp (app-side):

  • Removed the /latest-request-id endpoint and the file-based IPC mechanism (/data/teardown/)
  • teardown() now emits a deterministic log marker: graceful-shutdown-marker:{token}:{request_id}:{t/f}
  • Renamed uuidtoken (clearer intent)

graceful_shutdown helper (test-side):

  • Replaced fixed time.sleep(60) / time.sleep(2) + file read with a polling loop against the REST logs API
  • Polls up to 120s (graceful) or 20s (kill) in 2s intervals
  • Identifies the correct teardown result via the token + request_id marker

Issues / nits

1. log_since lookback window

projects/fal/tests/e2e/test_apps.py:1903-1905

log_since = (
    datetime.now(timezone.utc).replace(tzinfo=None) - timedelta(seconds=5)
).isoformat()

The 5-second lookback is computed after the stop/kill command is issued. This should be fine since teardown takes ≥1s (wait_time ≥ 1), but if the API filters logs by timestamp (not streaming), a very fast teardown on a kill could theoretically appear before log_since. Consider computing log_since before submitting the request, not after:

log_since = (datetime.now(timezone.utc).replace(tzinfo=None)).isoformat()
# ... stop_runner / kill_runner ...
marker_prefix = ...

This would fully eliminate the timing concern. Fix this →

2. Stale endpoint/field naming

projects/fal/tests/e2e/test_apps.py:1815-1817

@fal.endpoint("/set-uuid")
async def set_uuid(self, input: SetUUIDInput) -> str:
    self.token = input.uuid

The endpoint (/set-uuid), method (set_uuid), and input field (input.uuid) still use uuid while the stored attribute is now token. This is not a bug (the API contract is preserved), but the mismatch makes the code slightly confusing. Low priority nit.

3. Redundant base_url in httpx.Client

projects/fal/tests/e2e/test_apps.py:1909-1915

with httpx.Client(
    base_url=rest_client.base_url,
    ...
) as client:
    response = client.get(rest_client.base_url + f"/logs/?since={log_since}")

Since the client.get() call always uses an absolute URL (prepending rest_client.base_url manually), the base_url= kwarg in the constructor is never used. This is consistent with the existing pattern in test_traceback_logs so it's not introduced by this PR — just a minor pre-existing smell.

Summary

The core approach is correct and a clear improvement over the fixed-sleep + file-IPC strategy. The only change worth considering before merging is moving log_since before the stop/kill call to avoid any edge-case timing issues. Everything else is nits.

@efiop efiop force-pushed the ruslan/serv-579-fix-flaky-test_graceful_shutdown-and-test_worker_env_vars branch from 84d8707 to 2330578 Compare April 11, 2026 14:33
cursoragent and others added 2 commits April 11, 2026 19:10
Co-authored-by: Ruslan Kuprieiev <kupruser@gmail.com>
Co-authored-by: Ruslan Kuprieiev <kupruser@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants