| title | Resume after crash | |
|---|---|---|
| description | Recover a drover run interrupted by Ctrl+C or process failure using DBOS durable checkpoints. | |
| product | drover-orchestrator | |
| audience | platform-operator | |
| doc_type | tutorial | |
| topics | | |
| surface | repo-docs | |
Drover Orchestrator checkpoints workflow state so an interrupted drover run can continue without redoing finished tasks. This lesson uses a small epic, interrupts mid-run, then resumes.
- Completed First parallel epic (or equivalent: `drover init`, tasks in backlog)
- An agent CLI installed (Configure agents)
- Optional: `DBOS_SYSTEM_DATABASE_URL` for PostgreSQL-backed durability (Set up PostgreSQL)

    cd my-drover-webpage # or your test repo
    drover epic add "Crash recovery demo"
    drover add "Task A — base change" --epic epic-xxx
    drover add "Task B — depends on A" --epic epic-xxx --blocked-by task-aaa
    drover add "Task C — depends on A" --epic epic-xxx --blocked-by task-aaa

Replace `epic-xxx` and `task-aaa` with the IDs from the command output.

    export DROVER_AGENT_TYPE=claude
    drover run --workers 2 --epic epic-xxx --verbose

Wait until at least one task completes and a second task is in progress (or queued).
Press Ctrl+C in the terminal running `drover run`.
Check partial progress:

    drover status --tree

You should see a mix of `completed`, `in_progress`, `ready`, and `blocked` states.
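The state check above can be scripted when you want to interrupt at a repeatable point. The following is a minimal sketch, not part of drover: `count_state` is a hypothetical helper, and it assumes each task appears on its own line of `drover status --tree` output with its state name spelled exactly as listed above. Verify that against your actual output before relying on it.

```shell
# Hypothetical helper (not a drover command): count the lines on stdin
# that contain the given state name as a whole word.
# Assumption: `drover status --tree` prints one task per line and
# includes the literal state name (completed, in_progress, ready, blocked).
count_state() {
  # grep -c prints the number of matching lines; it exits non-zero when
  # that number is 0, so `|| true` keeps the helper usable under `set -e`.
  grep -c -w -- "$1" || true
}

# Example usage (requires drover, so it is left commented out):
#   drover status --tree | count_state completed
#   drover status --tree | count_state in_progress
```

Once the `completed` count is at least 1 and another task is `in_progress`, the run is in a good position for the Ctrl+C step.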
Re-run the orchestrator on the same epic:

    drover run --workers 2 --epic epic-xxx

Or use the convenience command (same effect: it re-enters the durable run loop):

    drover resume

- Completed tasks are not re-executed from scratch.
- In-progress tasks may be retried according to claim/stall timeouts (`DROVER_CLAIM_TIMEOUT`, `DROVER_STALL_TIMEOUT` in configuration reference).
- When blockers clear, ready tasks run, often two in parallel if `--workers 2`.
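Because retries of in-progress tasks depend on the claim and stall timeouts, it can help to set both explicitly for a single resume. A minimal sketch follows; `resume_with_timeouts` is a hypothetical wrapper, and the values it receives are passed through untouched because this page does not state their format or units (check the configuration reference).

```shell
# Hypothetical wrapper (not part of drover): run `drover resume` with
# explicit claim/stall timeouts without changing the calling shell.
# The parentheses make the function body a subshell, so the exports
# below disappear when the function returns.
resume_with_timeouts() (
  # ${N:?msg} aborts with an error if the argument is missing or empty.
  export DROVER_CLAIM_TIMEOUT="${1:?usage: resume_with_timeouts CLAIM STALL}"
  export DROVER_STALL_TIMEOUT="${2:?usage: resume_with_timeouts CLAIM STALL}"
  drover resume
)

# Example usage (placeholder values; format and units are assumptions):
#   resume_with_timeouts 300 120
```

Scoping the variables to one invocation keeps an experiment with shorter timeouts from leaking into later runs in the same terminal.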

    drover status

All tasks should reach `completed`. Confirm Git history:

    git log --oneline -5

| Storage | Interrupt recovery |
|---|---|
| Default SQLite DBOS | Works on same machine; `.drover/` must be intact |
| `DBOS_SYSTEM_DATABASE_URL` | Survives orchestrator process death; shared across hosts if DB is central |
For CI runners that disappear after each job, use PostgreSQL so checkpoints outlive the runner. Add a persistent `DROVER_DATABASE_URL` only if you need shared task state across machines.
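Before resuming, it can be useful to confirm which checkpoint store the run will find. The sketch below encodes the storage table under simple assumptions: `durability_backend` is a hypothetical helper, a set `DBOS_SYSTEM_DATABASE_URL` is taken to mean PostgreSQL, and the SQLite default is detected only by the presence of a `.drover/` directory in the current directory, which does not prove its contents are intact.

```shell
# Hypothetical pre-flight check (not a drover command): report which
# durability backend a resume in the current directory would rely on.
durability_backend() {
  if [ -n "${DBOS_SYSTEM_DATABASE_URL:-}" ]; then
    # PostgreSQL-backed: checkpoints are stored in the configured database.
    echo "postgres"
  elif [ -d ".drover" ]; then
    # Default SQLite store: resumable only from this same working copy.
    echo "sqlite"
  else
    # Neither found: there is likely nothing to resume from here.
    echo "missing"
  fi
}

# Example usage:
#   cd my-drover-webpage && durability_backend
```

On an ephemeral CI runner, a result of `sqlite` or `missing` at the start of a job suggests the previous job's checkpoints are not available to resume from.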
- Durable workflows spec
- CLI reference: `drover resume`, `drover retry`
- Set up PostgreSQL for production