refactor: link video jobs to logs by internal id#2711
Conversation
The video job flow resolved its log row by matching on requestId
(log.findFirst({ requestId })), which can return the wrong record if
request IDs ever collide and forces a separate query everywhere a job
needs its log.
Persist the log's internal id on video_job.logId when the job is
finalized (set atomically with resultLoggedAt), and use it for every
internal job<->log lookup: the API status route now reads job.logId
directly (no query), the gateway content routes use job.logId, and the
log->job reverse lookup matches on video_job.logId. Backfills existing
rows from the current requestId linkage.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
💤 Files with no reviewable changes (1)
🚧 Files skipped from review as they are similar to previous changes (2)
WalkthroughAdds a Changesvideo_job logId column and lookup removal
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0371d77562
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| .update(tables.videoJob) | ||
| .set({ | ||
| resultLoggedAt: now, | ||
| logId, |
There was a problem hiding this comment.
Insert the log before assigning its FK
When a terminal video job is finalized, this update writes a freshly generated logId before tx.insert(tables.log) creates that row. The migration adds a non-deferrable FK from video_job.log_id to log.id, so Postgres checks this statement immediately and rejects every first-time finalization with a foreign-key violation; the transaction never reaches the log insert, so completed/failed video jobs are not logged and the rest of finalization cannot run. Insert the log first or assign logId in a second update after the insert.
Useful? React with 👍 / 👎.
| @@ -0,0 +1,4 @@ | |||
| ALTER TABLE "video_job" ADD COLUMN "log_id" text;--> statement-breakpoint | |||
| UPDATE "video_job" SET "log_id" = "log"."id" FROM "log" WHERE "log"."request_id" = "video_job"."request_id";--> statement-breakpoint | |||
There was a problem hiding this comment.
Scope the log backfill to the job project
This backfill matches only on request_id, but video request IDs are accepted from the caller's x-request-id header and are only indexed, not unique, so the same value can exist in multiple projects or orgs. In that case this UPDATE can attach a video_job to an unrelated log; after this change the API and gateway trust job.logId for signed content URLs and log-to-job lookup, so migrated historical jobs can serve or fetch the wrong log's video content. Include at least project_id (and ideally organization_id) in the join.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/db/migrations/1781645725_pink_morlocks.sql (1)
2-2: 💤 Low valueConsider scoping the backfill by
project_idfor accuracy.The current backfill joins only on
request_id, which could link a job to the wrong log if request IDs happen to collide across different projects. Both tables haveproject_id—adding it to the join would make the backfill more precise.Suggested improvement
-UPDATE "video_job" SET "log_id" = "log"."id" FROM "log" WHERE "log"."request_id" = "video_job"."request_id";--> statement-breakpoint +UPDATE "video_job" SET "log_id" = "log"."id" FROM "log" WHERE "log"."request_id" = "video_job"."request_id" AND "log"."project_id" = "video_job"."project_id";--> statement-breakpoint🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/db/migrations/1781645725_pink_morlocks.sql` at line 2, The UPDATE statement in the migration joins the video_job and log tables only on request_id, which could incorrectly match rows if request IDs collide across different projects. Modify the WHERE clause in the UPDATE statement to add an additional AND condition that also matches on project_id from both tables, ensuring the backfill accurately scopes the join to both request_id and project_id.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/db/migrations/1781645725_pink_morlocks.sql`:
- Line 2: The UPDATE statement in the migration joins the video_job and log
tables only on request_id, which could incorrectly match rows if request IDs
collide across different projects. Modify the WHERE clause in the UPDATE
statement to add an additional AND condition that also matches on project_id
from both tables, ensuring the backfill accurately scopes the join to both
request_id and project_id.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: ce017fb7-1381-41af-b238-d5daf31b7766
📒 Files selected for processing (7)
apps/api/src/routes/video.tsapps/gateway/src/videos/videos.tsapps/worker/src/services/video-jobs.tspackages/db/migrations/1781645725_pink_morlocks.sqlpackages/db/migrations/meta/1781645725_snapshot.jsonpackages/db/migrations/meta/_journal.jsonpackages/db/src/schema.ts
Address PR review: - The FK video_job.log_id -> log.id is non-deferrable, so writing logId in the claim update before the log row exists raised an immediate foreign-key violation on first finalization. Insert the log first, then set logId in a second update within the same transaction (the claim update still guards concurrency via resultLoggedAt). - Scope the historical backfill by project_id and organization_id in addition to request_id, since request IDs are caller-supplied and only indexed, not unique across projects/orgs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reset migrations to origin/main per repo procedure; regenerated below.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bf59c77772
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| @@ -0,0 +1,4 @@ | |||
| ALTER TABLE "video_job" ADD COLUMN "log_id" text;--> statement-breakpoint | |||
| UPDATE "video_job" SET "log_id" = "log"."id" FROM "log" WHERE "log"."request_id" = "video_job"."request_id" AND "log"."project_id" = "video_job"."project_id" AND "log"."organization_id" = "video_job"."organization_id";--> statement-breakpoint | |||
There was a problem hiding this comment.
Make the backfill ignore ambiguous request IDs
Fresh evidence: this revised backfill now scopes by project and organization, but it still updates from every log row sharing the client-supplied request_id. Since x-request-id is accepted from callers and log.request_id is only indexed, a single project can have multiple matching logs for the same value (for example a retry, or a chat request and a video request reusing the header); PostgreSQL UPDATE ... FROM can then choose an arbitrary log.id. After this change the gateway resolves video content solely through video_job.log_id, so an ambiguous historical backfill can attach the job to the wrong log and make the real video log URL 404 or mark downloads on an unrelated log.
Useful? React with 👍 / 👎.
Summary
Addresses the CodeRabbit concern on
apps/api/src/routes/video.ts: the video status route resolved its log row vialog.findFirst({ where: { requestId } }), an unscoped match onrequestIdthat could return the wrong record. The video job flow did this in several places (getVideoLogIdByRequestId(job.requestId)in the gateway and worker, plus aprojectId + requestIdreverse lookup), each re-querying the log table to recover an id it could have stored.This switches the whole video job flow to a persisted internal id:
video_job.logId(FK →log.id,ON DELETE SET NULL) + index.finalizeVideoJob): setlogIdon the job in the same atomic update that stampsresultLoggedAt, so the link is written exactly when the log is created.job.logIddirectly — no extra query, and already access-checked by organization.job.logId; the log→job reverse lookup matches onvideo_job.logIdinstead ofprojectId + requestId.getVideoLogIdByRequestIdhelper from both the gateway and worker.The migration backfills
logIdfor existing rows from the currentrequestIdlinkage (the only link that exists pre-migration).Testing
pnpm format+pnpm build(17/17) pass.apps/gateway/src/videos/videos.spec.ts: the 15 failing cases fail identically on the base commit (they need live provider credentials); this change is test-neutral.🤖 Generated with Claude Code
Summary by CodeRabbit