Why
A significant portion of integration recovery time was spent repeatedly mapping known failure signatures to likely causes.
Problem
Known Cloud/DC failure signatures are not documented in a single operator-focused runbook.
Impact
- Slower mean time to diagnosis for recurring failures
- Higher cognitive load for maintainers and future agents
Suggested approach
- Add a concise CI triage runbook section that maps:
  - failure signature
  - likely root cause
  - first checks / next commands
- Cover known patterns observed in recent runs (e.g., Jira DC JQL backend errors, transient version-delete 405 behavior).
- Link from contributor/agent docs so this is used proactively.
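
The signature-to-cause mapping could also be kept in a machine-readable form so that agents and scripts can consult it directly. The following is a minimal sketch of that idea, not an existing tool in this repository: the names `TriageEntry`, `RUNBOOK`, and `triage` are hypothetical, the regular expressions are illustrative guesses rather than real log signatures, and the cause and check fields are placeholders to be filled in from observed runs.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TriageEntry:
    """One runbook row: signature -> likely cause -> first checks."""
    name: str
    pattern: re.Pattern
    likely_cause: str
    first_checks: tuple


# Illustrative entries only. The patterns are assumptions, and the cause and
# check text is deliberately left as placeholders for the real runbook content.
RUNBOOK = (
    TriageEntry(
        name="jira-dc-jql-backend-error",
        pattern=re.compile(r"JQL.*backend error", re.IGNORECASE),
        likely_cause="(placeholder) copy from the runbook entry",
        first_checks=("(placeholder) first check or command",),
    ),
    TriageEntry(
        name="version-delete-405",
        pattern=re.compile(r"DELETE\b.*\bversion\b.*\b405\b", re.IGNORECASE),
        likely_cause="(placeholder) copy from the runbook entry",
        first_checks=("(placeholder) first check or command",),
    ),
)


def triage(log_line: str) -> Optional[TriageEntry]:
    """Return the first runbook entry whose pattern matches, or None."""
    for entry in RUNBOOK:
        if entry.pattern.search(log_line):
            return entry
    return None
```

Calling `triage(line)` on each line of a failed job log would return the first matching entry, or `None` for unknown failures, which are then candidates for new runbook rows.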
Related areas
- AGENTS.md
- Tests/Integration/README.md
- CI workflow docs under .github/