
Feature/per model rate limit budgets #93

Open
Franck-Sorel wants to merge 5 commits into main from feature/per-model-rate-limit-budgets


Conversation

@Franck-Sorel
Collaborator

No description provided.

Allow configuring individual monthly budgets per AI model instead of
a single global budget per billing plan.
Changes:
- Add modelBudgets.overrides.<model> to set custom budgets for specific models
- Fall back to monthlyBudgetUsd for models without an override
- Add modelBudgetMicroUsd helper for budget resolution
- Update BackendTrafficPolicy to use per-model budgets
- Update documentation with per-model budget examples
Example:
  rateLimitBudgeting:
    plans:
      free:
        monthlyBudgetUsd: 30
        modelBudgets:
          overrides:
            gpt-5-mini: 10
            gemini-2.5-pro: 50
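
The fallback described above can be sketched outside the chart. This is a minimal Python illustration, not the chart's actual helper (which presumably lives in the Helm templates). It assumes that budgets are given in USD and that "micro USD" means millionths of a dollar, as the name modelBudgetMicroUsd suggests. The function name `model_budget_micro_usd` and the model name `some-other-model` are hypothetical.

```python
from decimal import Decimal

# Assumption: 1 USD = 1,000,000 micro-USD, inferred from the helper's name.
MICRO_USD_PER_USD = 1_000_000


def model_budget_micro_usd(plan: dict, model: str) -> int:
    """Resolve the monthly budget for `model` under `plan`, in micro-USD.

    Uses modelBudgets.overrides.<model> when present, otherwise falls back
    to the plan-wide monthlyBudgetUsd.
    """
    overrides = (plan.get("modelBudgets") or {}).get("overrides") or {}
    budget_usd = overrides.get(model, plan["monthlyBudgetUsd"])
    # Decimal avoids float rounding for fractional budgets such as 0.5.
    return int(Decimal(str(budget_usd)) * MICRO_USD_PER_USD)


# The "free" plan from the example values above.
free_plan = {
    "monthlyBudgetUsd": 30,
    "modelBudgets": {
        "overrides": {
            "gpt-5-mini": 10,
            "gemini-2.5-pro": 50,
        }
    },
}

for name in ("gpt-5-mini", "gemini-2.5-pro", "some-other-model"):
    print(name, model_budget_micro_usd(free_plan, name))
```

Under these assumptions, gpt-5-mini and gemini-2.5-pro resolve to their overrides, while any model without an override resolves to the plan-wide 30 USD budget.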
Franck-Sorel force-pushed the feature/per-model-rate-limit-budgets branch from c0eb625 to f646f14 on March 26, 2026 09:22
Remove the requests-per-minute fallback rule that was acting as a burst
guard. The monthly cost-based budget rule is now the only rate limit
control.
Changes:
- Remove rateLimitFallback from values.yaml
- Remove fallback rule from backendtrafficpolicy.yaml
- Update documentation to remove fallback references


Development

Successfully merging this pull request may close these issues.

Make ratelimit budget configurable per models in the values.yaml.

1 participant