fix(beat): make it dependant from API service #10603
✅ Conflict Markers Resolved: All conflict markers have been successfully resolved in this pull request.
✅ All necessary
🔒 Container Security Scan
Image:
📊 Vulnerability Summary
4 package(s) affected
Codecov Report
❌ Patch coverage is
Additional details and impacted files
| Metric | master | #10603 |
|---|---|---|
| Coverage | 93.60% | 93.60% |
| Files | 227 | 227 |
| Lines | 31906 | 31906 |
| Hits | 29867 | 29867 |
| Misses | 2039 | 2039 |
docker-compose.yml
Outdated
      neo4j:
        condition: service_healthy
    healthcheck:
      test: ["CMD-SHELL", "wget -q -O /dev/null http://localhost:${DJANGO_PORT:-8080}/api/v1/ || exit 1"]
Changed to 127.0.0.1. localhost can resolve to ::1 (IPv6) in some containers and if gunicorn only listens on IPv4 the healthcheck would fail. Also matches the mcp-server healthcheck.
docker-compose-dev.yml
Outdated
      neo4j:
        condition: service_healthy
    healthcheck:
      test: ["CMD-SHELL", "wget -q -O /dev/null http://localhost:${DJANGO_PORT:-8080}/api/v1/ || exit 1"]
Changed to 127.0.0.1. localhost can resolve to ::1 (IPv6) in some containers and if gunicorn only listens on IPv4 the healthcheck would fail. Also matches the mcp-server healthcheck.
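The probe URL discussed in these review threads combines two details: the explicit IPv4 loopback address and the `${DJANGO_PORT:-8080}` default. A minimal shell sketch, assuming a hypothetical helper named `healthcheck_url` that is not part of this PR, shows how that default expansion behaves:

```shell
#!/bin/sh
# Hypothetical helper for illustration only; it is not part of this PR.
healthcheck_url() {
    # ${DJANGO_PORT:-8080} expands to 8080 when DJANGO_PORT is unset or empty.
    # 127.0.0.1 forces IPv4, avoiding a possible localhost -> ::1 resolution.
    printf 'http://127.0.0.1:%s/api/v1/\n' "${DJANGO_PORT:-8080}"
}
```

With `DJANGO_PORT` unset, the helper prints the URL on port 8080. With `DJANGO_PORT=9000`, it prints the same path on port 9000.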
Pull request overview
This PR aims to eliminate cold-start race conditions where worker-beat (and worker) query django_celery_beat_* tables before the API finishes running migrations, replacing a fixed sleep delay with a Compose-level readiness dependency on the API container healthcheck.
Changes:
- Add an `api`/`api-dev` container healthcheck that probes `/api/v1/`.
- Update `worker` and `worker-beat` (and dev equivalents) to `depends_on` the API service being `healthy`.
- Remove the hardcoded `sleep 15` from `start_worker_beat()` in the API image entrypoint.
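The difference between the removed fixed delay and a readiness check can be sketched in shell. This is an illustrative sketch only, not code from this PR: `wait_until` is a hypothetical helper, and the attempt count and delay are arbitrary. In the PR itself the waiting is done by Compose through `depends_on` and the API healthcheck, not by a shell loop.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): retry a probe command until it
# succeeds or the attempt budget runs out, instead of sleeping a fixed time.
# Usage: wait_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Run the probe; stop as soon as it reports success.
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    # The probe never succeeded within the budget.
    return 1
}
```

For example, `wait_until 30 2 wget -q -O /dev/null http://127.0.0.1:8080/api/v1/` would poll the same endpoint the healthcheck probes, up to 30 times with 2 seconds between attempts. Unlike `sleep 15`, the loop returns as soon as the API answers and keeps waiting when migrations take longer than expected.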
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| docker-compose.yml | Adds API healthcheck and gates worker/worker-beat startup on API health. |
| docker-compose-dev.yml | Same as production compose, for local dev stack. |
| api/docker-entrypoint.sh | Removes the sleep timing hack before starting Celery beat. |
…rom workers, add changelog
Removed the direct
…ners in Helm worker deployments
Aligned
Context
Relates to #10179
On a cold start, `worker-beat` boots in parallel with `api` and queries `django_celery_beat_*` tables before `api` finishes running migrations. This produces a `psycopg2.errors.UndefinedTable` / `django.db.utils.ProgrammingError` traceback in `worker-beat` and a matching `ERROR` in the `postgres` logs on every fresh deployment. The current mitigation in `api/docker-entrypoint.sh` is `sleep 15` inside `start_worker_beat()`, a timing hack that fails on slower hosts (CI runners, cold image pulls, busy disks) where migrations exceed 15 seconds. Reproduced on cold start: in the original logs, the tables were created at T+18s, but `worker-beat` queried the DB at T+15s.
Description
Replace the `sleep` with a real readiness signal: `worker-beat` (and `worker`) now wait on an `api` healthcheck that only goes green once gunicorn binds, which the entrypoint guarantees happens after `apply_migrations` completes.
Checklist
Community Checklist
SDK/CLI
UI
API
License
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.