Budgets
Budgets protect Cloud-managed AI spend at the organization, environment, caller API key, provider key, team, or member level. In AISIX, budget enforcement is a Cloud-managed workflow: AISIX Cloud owns the budget rules and usage totals, and the managed gateway enforces the returned allow or deny decision on live traffic.
This guide explains what budgets can protect, how enforcement affects proxy traffic, and what a caller sees when AISIX rejects a request for budget reasons.
Budget enforcement is supported only through AISIX Cloud managed budget checks. Self-hosted gateways do not support budget enforcement.
Budget Configuration
Configure budget policy in AISIX Cloud. Unlike rate limits, budgets are not created through the Admin API and do not have a local self-hosted enforcement engine.
Organization and environment budgets are created from the Budgets view. Caller API key and provider key budgets use the same enforcement rules, but are edited from the corresponding resource views.
For each budget, choose:
- A budget target, such as an organization, environment, caller API key, provider key, team, member, or each member inside a team.
- A USD spending limit.
- A period: day, week, or month.
- An enforcement mode. Blocking budgets can reject traffic after the limit is reached. Warn-only budgets keep traffic flowing and surface the over-budget state in Cloud.
Budget Targets
Choose the narrowest budget target that matches the spend you want to control:
| Target | Use when |
|---|---|
| Organization | The whole account needs a top-level spending cap. |
| Environment | One deployment environment needs its own cap. |
| Caller API key | One application or tenant needs its own cap. |
| Provider key | One upstream credential or provider account needs a cap. |
| Team | All caller API keys bound to a team should share a cap. |
| Member | All caller API keys bound to one member should share a cap. |
| Each member in a team | Every member inside a team should get an individual cap. |
Caller API key budgets follow the downstream key that authenticates the application or tenant. Provider key budgets follow the upstream credential that AISIX uses to call the model provider.
Team and member budgets depend on the caller API key identity projected to the managed gateway. If a team or member budget does not behave as expected, check the API key's team or member binding in AISIX Cloud.
Enforcement Path
A common setup is to place a blocking monthly budget on one caller API key. The application continues to use the same proxy API and caller API key; it does not call a separate budget API.
As the application sends requests, the managed control plane tracks usage for the matching budget target. Before AISIX sends a provider request, the managed gateway asks the control plane whether the caller may continue. If the budget still has room, AISIX continues through the normal request path. A single allowed request can push the budget over the limit once its usage is recorded. Later checks against a blocking budget then reject matching traffic before the provider call and return HTTP 429.
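The check-then-record ordering described above can be sketched as a small simulation. This is only an illustrative model, not the gateway's implementation: the `Budget` class, the `handle_request` function, the per-request cost, and the exact comparison used for "still has room" are all assumptions made for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Budget:
    """Hypothetical in-memory stand-in for a Cloud-managed blocking budget."""
    limit_usd: Decimal
    spent_usd: Decimal = Decimal("0")

    def allows(self) -> bool:
        # Assumption: the pre-request check only compares usage recorded
        # so far against the limit; it does not know the next request's cost.
        return self.spent_usd < self.limit_usd

    def record(self, cost_usd: Decimal) -> None:
        # Usage is added after the provider call has been made.
        self.spent_usd += cost_usd


def handle_request(budget: Budget, cost_usd: Decimal) -> int:
    """Return the HTTP status a caller would see in this simplified model."""
    if not budget.allows():
        return 429  # rejected before any provider call
    budget.record(cost_usd)
    return 200


budget = Budget(limit_usd=Decimal("1.00"))
statuses = [handle_request(budget, Decimal("0.80")) for _ in range(3)]
print(statuses, budget.spent_usd)  # [200, 200, 429] 1.60
```

The second request is allowed because recorded spend is still below the limit when it is checked, so the recorded total ends above the limit. Only the third request is rejected.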
Budget Rejection Response
For OpenAI-compatible requests, the response uses the OpenAI-style error envelope:
{
  "error": {
    "message": "api key budget 'production-chat' exceeded ($1.00/month). Resets 2026-06-01 00:00 UTC.",
    "type": "billing_error",
    "code": "budget_exceeded",
    "scope": "api_key",
    "scope_ref": "api-key-uuid-1",
    "limit_usd": "1.00",
    "spent_usd": "2.00",
    "period": "month",
    "period_resets_at": "2026-06-01T00:00:00Z",
    "retry_after_seconds": 259200
  }
}
For Anthropic-style Messages requests, the response keeps the Anthropic error format and does not include the extra budget fields.
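On the client side, an application might detect a budget rejection in the OpenAI-style envelope by checking the error code rather than relying on the status alone. The following is a minimal sketch: the `parse_budget_rejection` helper is hypothetical, and it assumes the field names shown in the sample response are present. Anthropic-style responses would not carry the extra fields, so the sketch reads them defensively.

```python
import json
from typing import Optional


def parse_budget_rejection(status: int, body: str) -> Optional[dict]:
    """Return budget details if the response is a budget rejection, else None."""
    if status != 429:
        return None
    try:
        error = json.loads(body).get("error", {})
    except (json.JSONDecodeError, AttributeError):
        # Body is not JSON, or the top level is not an object.
        return None
    if not isinstance(error, dict) or error.get("code") != "budget_exceeded":
        return None
    # Use .get() so missing structured fields become None instead of raising.
    return {
        "scope": error.get("scope"),
        "scope_ref": error.get("scope_ref"),
        "limit_usd": error.get("limit_usd"),
        "spent_usd": error.get("spent_usd"),
        "period_resets_at": error.get("period_resets_at"),
        "retry_after_seconds": error.get("retry_after_seconds"),
    }


sample_body = json.dumps({
    "error": {
        "message": "api key budget 'production-chat' exceeded ($1.00/month).",
        "type": "billing_error",
        "code": "budget_exceeded",
        "scope": "api_key",
        "scope_ref": "api-key-uuid-1",
        "limit_usd": "1.00",
        "spent_usd": "2.00",
        "period": "month",
        "period_resets_at": "2026-06-01T00:00:00Z",
        "retry_after_seconds": 259200,
    }
})

details = parse_budget_rejection(429, sample_body)
print(details["scope"], details["retry_after_seconds"])
```

Because a blocking budget stays exhausted until the period resets, a caller would typically surface `period_resets_at` to an operator instead of retrying in a tight loop.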
Availability and Caching
Managed budget checks use a short decision cache so repeated requests do not always require a control-plane round trip. Fresh cached decisions are reused for 5 seconds.
If the control plane is unreachable, AISIX can reuse a stale cached decision up to the ceiling set by AISIX_DP_BUDGET_STALE_MAX_SECONDS. This is a process environment variable, and the default is 600 seconds. If there is no cached decision and the control plane is unreachable, AISIX denies the request.
Within the stale ceiling, sticky mode keeps using the last cached decision: previously denied keys stay denied, and previously allowed keys continue. After the stale ceiling expires, AISIX applies the outage behavior returned by the control plane:
- Sticky mode refuses traffic because the cached decision is too old.
- Fail-open mode allows traffic.
- Fail-closed mode refuses traffic.
This behavior applies only to managed budget checks.
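The outage rules above can be summarized as a small decision function. This is a sketch under stated assumptions, not the gateway's code: `CachedDecision`, `decide_during_outage`, and the mode strings `"sticky"`, `"fail_open"`, and `"fail_closed"` are hypothetical names, and the sketch assumes that a cached decision within the stale ceiling is reused regardless of mode.

```python
from dataclasses import dataclass
from typing import Optional

FRESH_TTL_SECONDS = 5            # fresh decisions are reused for 5 seconds
DEFAULT_STALE_MAX_SECONDS = 600  # default of AISIX_DP_BUDGET_STALE_MAX_SECONDS


@dataclass
class CachedDecision:
    allowed: bool        # last allow/deny answer from the control plane
    age_seconds: float   # time since that answer was received
    outage_mode: str     # hypothetical labels: sticky, fail_open, fail_closed


def needs_control_plane(cached: Optional[CachedDecision]) -> bool:
    """True when there is no fresh decision and a round trip is needed."""
    return cached is None or cached.age_seconds > FRESH_TTL_SECONDS


def decide_during_outage(
    cached: Optional[CachedDecision],
    stale_max_seconds: float = DEFAULT_STALE_MAX_SECONDS,
) -> bool:
    """Return True to allow the request while the control plane is unreachable."""
    if cached is None:
        return False  # no cached decision: deny
    if cached.age_seconds <= stale_max_seconds:
        return cached.allowed  # within the ceiling: reuse the last decision
    # Past the ceiling: sticky and fail-closed refuse, fail-open allows.
    return cached.outage_mode == "fail_open"


print(decide_during_outage(CachedDecision(True, 120, "sticky")))     # True
print(decide_during_outage(CachedDecision(True, 601, "sticky")))     # False
print(decide_during_outage(CachedDecision(False, 601, "fail_open"))) # True
```

The last call shows the consequence of fail-open in this model: once the stale ceiling has passed, a key that was previously denied is allowed again until the control plane is reachable.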
Budget Usage Metrics
AISIX Cloud tracks budget usage from managed traffic. When the managed budget response includes totals, AISIX also records budget gauges with the caller API key identity. Labels include the API key ID and, when available, the projected team and member IDs.
If the decision does not include budget totals, AISIX clears the budget gauges for that key identity.
Review Budget Rejections
When a managed deployment returns budget_exceeded, check the error code and any structured budget fields first. The denial came from the managed budget-check response, not from the Admin API.
Then check the AISIX Cloud budget configuration for the returned budget target. If the returned scope looks wrong, inspect the caller API key binding that reached the managed gateway.
Next Steps
You have now seen where managed budget decisions come from and how AISIX enforces them on the request path. Next, continue with Rate Limits to configure request, token, and concurrency limits.