Summary
Operators need proactive notification when token usage approaches budget limits, not just when requests start failing. Webhook alerts at configurable thresholds enable teams to respond before users are impacted.
Problem Statement
Silent budget exhaustion means teams only discover the problem when users receive 429 errors. Without alerts at 75% or 90% usage, there is no opportunity for proactive intervention.
Proposed Solution
Add an alerts option to the tokenFirewall middleware configuration:
app.use(tokenFirewall({
  alerts: {
    webhookUrl: process.env.ALERT_WEBHOOK_URL,
    thresholds: [
      { percent: 75, label: 'warning' },
      { percent: 90, label: 'critical' },
    ],
    cooldownMinutes: 60,
    secret: process.env.ALERT_SIGNING_SECRET,
  },
}));
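One way the thresholds and cooldownMinutes settings could interact is sketched below. This is an illustration only, not an existing tokenFirewall API: createAlertEvaluator is a hypothetical helper, and it assumes that the cooldown is tracked per threshold label and that only the most severe crossed threshold fires.

```javascript
// Hypothetical helper (not part of tokenFirewall): decides whether an alert
// should fire for the current usage, honoring a per-label cooldown.
function createAlertEvaluator({ thresholds, cooldownMinutes }) {
  const lastFired = new Map(); // label -> timestamp (ms) of the last alert
  const cooldownMs = cooldownMinutes * 60 * 1000;
  // Check the highest threshold first so only the most severe one fires.
  const sorted = [...thresholds].sort((a, b) => b.percent - a.percent);

  return function evaluate(used, limit, now = Date.now()) {
    const percentUsed = (used / limit) * 100;
    const crossed = sorted.find((t) => percentUsed >= t.percent);
    if (!crossed) return null; // below every threshold

    const last = lastFired.get(crossed.label);
    if (last !== undefined && now - last < cooldownMs) {
      return null; // same label fired too recently (assumed semantics)
    }
    lastFired.set(crossed.label, now);
    return crossed; // caller would build and send the webhook payload
  };
}

// Usage with the configuration proposed above.
const evaluate = createAlertEvaluator({
  thresholds: [
    { percent: 75, label: 'warning' },
    { percent: 90, label: 'critical' },
  ],
  cooldownMinutes: 60,
});
const alert = evaluate(75430, 100000);
if (alert) console.log(`would send "${alert.label}" alert`);
```

Tracking the cooldown per label means a jump from warning to critical is still reported immediately, while repeated warning-level checks stay quiet for the cooldown window. Whether that is the desired behavior is a decision this proposal leaves open.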
Webhook Payload
{
  "event": "token_budget_alert",
  "label": "warning",
  "threshold_percent": 75,
  "used": 75430,
  "limit": 100000,
  "period": "2026-06",
  "timestamp": "2026-06-28T03:00:00Z"
}
Sign each payload with HMAC-SHA256 using the configured secret, and send the signature in the X-TokenFirewall-Signature header. On a non-2xx response, retry delivery up to 3 times with exponential backoff.
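A minimal sketch of signing and delivery follows, under several assumptions the proposal does not pin down: the signature is hex-encoded, "up to 3 times" means one initial attempt plus three retries, the backoff starts at 1 second and doubles, and network errors are treated like non-2xx responses. The names signPayload, verifySignature, backoffDelayMs, and deliverAlert are hypothetical, and the default transport relies on the global fetch available in Node.js 18 and later.

```javascript
const crypto = require('crypto');

// Hex-encoded HMAC-SHA256 of the raw body (hex encoding is an assumption).
function signPayload(body, secret) {
  return crypto.createHmac('sha256', secret).update(body).digest('hex');
}

// Receiver side: recompute the signature and compare in constant time.
function verifySignature(body, secret, signature) {
  const expected = Buffer.from(signPayload(body, secret), 'hex');
  const received = Buffer.from(String(signature), 'hex');
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}

// Delay before retry N (1-based): 1s, 2s, 4s with the assumed 1s base.
function backoffDelayMs(retry, baseMs = 1000) {
  return baseMs * 2 ** (retry - 1);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Sends the alert; resolves to true on a 2xx response, false once the
// initial attempt and all retries have failed.
async function deliverAlert(url, payload, secret, options = {}) {
  const { send = fetch, sleep = defaultSleep, maxRetries = 3 } = options;
  const body = JSON.stringify(payload);
  const headers = {
    'Content-Type': 'application/json',
    'X-TokenFirewall-Signature': signPayload(body, secret),
  };
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    if (attempt > 0) await sleep(backoffDelayMs(attempt));
    try {
      const res = await send(url, { method: 'POST', headers, body });
      if (res.ok) return true; // any 2xx status
    } catch (err) {
      // Assumption: a network failure is retried like a non-2xx response.
    }
  }
  return false;
}
```

The signature is computed over the exact serialized body that is sent, so a receiver should verify against the raw request bytes rather than a re-serialized JSON object.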
Acceptance Criteria