Parameters
See plugin common configurations for configuration options available to all plugins.
In API7 Enterprise (from 3.8.17), configure one of the following parameter sets, but not both:
- rules
- Any combination of limit and time_window and/or instances
rules is not available in APISIX yet.
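The "one set, but not both" constraint above can be checked before a configuration is submitted. The following is a minimal sketch, not the plugin's own validation logic; the function name parameter_set and the sample values are hypothetical, and it does not check whether limit and time_window are supplied as a pair.

```python
def parameter_set(conf):
    """Return which parameter set a plugin configuration uses.

    Raises ValueError if `rules` is mixed with `limit`, `time_window`,
    or `instances`, or if neither set is present.
    """
    uses_rules = "rules" in conf
    uses_basic = any(k in conf for k in ("limit", "time_window", "instances"))
    if uses_rules and uses_basic:
        raise ValueError("configure either rules or limit/time_window/instances, not both")
    if not (uses_rules or uses_basic):
        raise ValueError("configure rules, or limit and time_window and/or instances")
    return "rules" if uses_rules else "basic"


# Hypothetical configurations, for illustration only.
print(parameter_set({"limit": 300, "time_window": 60}))
print(parameter_set({"rules": [{"count": 100, "time_window": 60, "key": "remote_addr"}]}))
```

A configuration that contains both rules and limit raises ValueError in this sketch, which mirrors the restriction stated above.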
limit
integer | string
valid value:
greater than 0
The maximum number of tokens that can be consumed within a given time interval.
In API7 Enterprise (from 3.8.17), this parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($). In APISIX, only the integer type is supported.
time_window
integer | string
valid value:
greater than 0
The time interval, in seconds, to which the rate limiting limit applies.
In API7 Enterprise (from 3.8.17), this parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($). In APISIX, only the integer type is supported.
show_limit_quota_header
boolean
default:
true
If true, include X-AI-RateLimit-Limit-* to show the total quota, X-AI-RateLimit-Remaining-* to show the remaining quota in the response header, and X-AI-RateLimit-Reset-* to show the number of seconds left for the counter to reset, where * is the instance name.
limit_strategy
string
default:
total_tokens
valid value:
total_tokens, prompt_tokens, or completion_tokens
Type of token to which rate limiting is applied. total_tokens, prompt_tokens, and completion_tokens values are returned in each model response, where total_tokens is the sum of prompt_tokens and completion_tokens.
instances
array[object]
LLM instance rate limiting configurations.
name
string
required
Name of the LLM service instance.
limit
integer | string
required
valid value:
greater than 0
The maximum number of tokens that can be consumed within a given time interval.
In API7 Enterprise (from 3.8.17), this parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($). In APISIX, only the integer type is supported.
time_window
integer | string
required
valid value:
greater than 0
The time interval, in seconds, to which the rate limiting limit applies.
In API7 Enterprise (from 3.8.17), this parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($). In APISIX, only the integer type is supported.
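As an illustration of the instances parameter, the sketch below builds a per-instance configuration and derives the quota header names that show_limit_quota_header would add for each instance. The instance names and numbers are hypothetical, and the helper functions are not part of the plugin; the header naming follows the X-AI-RateLimit-*-<instance name> pattern described above.

```python
# Hypothetical per-instance limits: 300 tokens per 60 s and 100 tokens per 30 s.
instances = [
    {"name": "openai-instance", "limit": 300, "time_window": 60},
    {"name": "deepseek-instance", "limit": 100, "time_window": 30},
]


def missing_instance_fields(instance):
    """List the required fields (name, limit, time_window) absent from an instance entry."""
    return [f for f in ("name", "limit", "time_window") if f not in instance]


def quota_header_names(instance_name):
    """Header names expected for one instance when show_limit_quota_header is true."""
    return [
        f"X-AI-RateLimit-{kind}-{instance_name}"
        for kind in ("Limit", "Remaining", "Reset")
    ]


for inst in instances:
    assert missing_instance_fields(inst) == []
    print(quota_header_names(inst["name"]))
```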
rejected_code
integer
default:
503
valid value:
between 200 and 599 inclusive
The HTTP status code returned when a request exceeding the quota is rejected.
rejected_msg
string
valid value:
any non-empty string
The response body returned when a request exceeding the quota is rejected.
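To make limit_strategy, rejected_code, and rejected_msg concrete, here is a small sketch. It assumes the token counts arrive in a usage-style object with the three fields named above (the object shape and sample numbers are assumptions); the helper names are hypothetical, and the code only models which count is charged against the quota, not how the plugin tracks its counters.

```python
VALID_STRATEGIES = ("total_tokens", "prompt_tokens", "completion_tokens")

# Hypothetical plugin configuration using the non-rules parameter set.
conf = {
    "limit": 300,
    "time_window": 60,
    "limit_strategy": "prompt_tokens",
    "rejected_code": 429,
    "rejected_msg": "Token quota exceeded. Please retry later.",
}


def tokens_counted(usage, limit_strategy="total_tokens"):
    """Pick the token count that the configured strategy charges against the quota."""
    if limit_strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown limit_strategy: {limit_strategy!r}")
    return usage[limit_strategy]


def rejection_settings_ok(conf):
    """Check rejected_code is within 200..599 and rejected_msg, if set, is non-empty."""
    code = conf.get("rejected_code", 503)  # 503 is the documented default
    msg = conf.get("rejected_msg")
    return 200 <= code <= 599 and (msg is None or msg != "")


# Assumed shape of the token counts in a model response.
usage = {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}
print(tokens_counted(usage, conf["limit_strategy"]))
print(rejection_settings_ok(conf))
```

With prompt_tokens as the strategy, only the 12 prompt tokens in this sample would count toward the limit of 300; with the default total_tokens, all 42 would.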
rules
array[object]
An array of rate-limiting rules to be applied simultaneously.
Available in API7 Enterprise from 3.8.17. Not available in APISIX yet.
count
integer | string
required
valid value:
greater than 0
The maximum number of requests allowed within a given time interval.
This parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($).
time_window
integer | string
required
valid value:
greater than 0
The time interval, in seconds, to which the rate limiting count applies.
This parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($).
key
string
required
The key to count requests by. If the configured key does not exist, the rule will not be executed.
If the key_type is var, the key is interpreted as a variable. The variable does not need to be prefixed by a dollar sign ($). See built-in variables for available variables.
If the key_type is var_combination, the key is interpreted as a combination of variables. All variables should be prefixed by dollar signs ($). For example, to configure the key to use a combination of two request headers custom-a and custom-b, the key should be configured as $http_custom_a $http_custom_b.
If the key_type is constant, the key is interpreted as a constant value.
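The difference between the var and var_combination key formats can be sketched as follows. The helpers are hypothetical and simply apply the header-to-variable naming shown in the example above (lowercase, hyphens replaced by underscores, http_ prefix); the count and time_window values in the sample rule are illustrative.

```python
def header_variable(header_name):
    """Map a request header name to its variable name, e.g. custom-a -> http_custom_a."""
    return "http_" + header_name.lower().replace("-", "_")


def var_key(header_name):
    """Key for key_type var: a single variable, written without the dollar sign."""
    return header_variable(header_name)


def var_combination_key(*header_names):
    """Key for key_type var_combination: space-separated variables, each prefixed with $."""
    return " ".join("$" + header_variable(h) for h in header_names)


# Hypothetical rule entry combining the two request headers from the example above.
rule = {
    "count": 100,
    "time_window": 60,
    "key_type": "var_combination",
    "key": var_combination_key("custom-a", "custom-b"),
}

print(var_key("custom-a"))
print(rule["key"])
```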