Parameters
See plugin common configurations for configuration options available to all plugins.
In API7 Enterprise (from 3.8.17), you should configure one of the following parameter sets, but not both:
- rules
- Any combination of limit and time_window and/or instances
rules is not available in APISIX yet.
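As a rough illustration of the either/or rule above, the following Python sketch reports which parameter set a configuration uses and rejects a mix of both. The helper name check_parameter_sets, the dict representation of the configuration, and the choice to reject an empty configuration are assumptions made for this example; they are not part of the plugin.

```python
def check_parameter_sets(conf: dict) -> str:
    """Return "rules" or "basic" depending on which parameter set is used.

    Hypothetical helper: raises ValueError when `rules` is combined with
    `limit`, `time_window`, or `instances`, or when neither set is present.
    """
    uses_rules = "rules" in conf
    uses_basic = any(k in conf for k in ("limit", "time_window", "instances"))
    if uses_rules and uses_basic:
        raise ValueError("configure rules or limit/time_window/instances, not both")
    if not uses_rules and not uses_basic:
        raise ValueError("no rate limiting parameters configured")
    return "rules" if uses_rules else "basic"


# Example values are illustrative only.
print(check_parameter_sets({"limit": 300, "time_window": 60}))
```

A check like this is only a client-side convenience; the gateway's own schema validation remains the authority on what is accepted.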
limit
integer | string
valid values:
greater than 0
The maximum number of tokens that can be consumed within a given time interval.
In API7 Enterprise (from 3.8.17), this parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($). In APISIX, only the integer type is supported.
time_window
integer | string
valid values:
greater than 0
The time interval, in seconds, to which the rate limiting limit applies.
In API7 Enterprise (from 3.8.17), this parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($). In APISIX, only the integer type is supported.
show_limit_quota_header
boolean
default:
true
If true, includes the rate limiting response headers. Specifically, if rules is not set, the headers are:
- X-AI-RateLimit-Limit shows the total quota.
- X-AI-RateLimit-Remaining shows the remaining quota.
- X-AI-RateLimit-Reset shows the number of seconds until the counter resets.
When rules is set, a prefix (followed by a hyphen) is inserted after X-AI-. See rules.header_prefix for details.
limit_strategy
string
default:
total_tokens
valid values:
total_tokens, prompt_tokens, or completion_tokens
The type of token to which rate limiting is applied.
The total_tokens, prompt_tokens, and completion_tokens values are returned in each model response, where total_tokens is the sum of prompt_tokens and completion_tokens.
instances
array[object]
LLM instance rate limiting configurations.
name
string
required
Name of the LLM service instance.
limit
integer | string
required
valid values:
greater than 0
The maximum number of tokens that can be consumed within a given time interval.
In API7 Enterprise (from 3.8.17), this parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($). In APISIX, only the integer type is supported.
time_window
integer | string
required
valid values:
greater than 0
The time interval, in seconds, to which the rate limiting limit applies.
In API7 Enterprise (from 3.8.17), this parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($). In APISIX, only the integer type is supported.
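To make the relationship between limit_strategy and the token counts concrete, here is a minimal Python sketch. It assumes the model response exposes its token counts as a mapping with the keys prompt_tokens, completion_tokens, and total_tokens; the function name tokens_to_count, the instance name, and all numeric values are hypothetical and chosen only for illustration.

```python
VALID_STRATEGIES = ("total_tokens", "prompt_tokens", "completion_tokens")


def tokens_to_count(usage: dict, limit_strategy: str = "total_tokens") -> int:
    """Pick the token count that would be charged against the quota.

    Hypothetical helper; `usage` is assumed to hold the three counts
    reported with a model response.
    """
    if limit_strategy not in VALID_STRATEGIES:
        raise ValueError(f"unsupported limit_strategy: {limit_strategy!r}")
    return usage[limit_strategy]


# Illustrative configuration combining the top-level quota with a
# per-instance quota (names and numbers are made up).
conf = {
    "limit": 300,
    "time_window": 60,
    "limit_strategy": "completion_tokens",
    "instances": [
        {"name": "example-instance", "limit": 100, "time_window": 60},
    ],
}

# Assumed shape of the usage data; total_tokens = prompt + completion.
usage = {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}

print(tokens_to_count(usage, conf["limit_strategy"]))
```

With completion_tokens selected, only the 30 completion tokens in this example would count toward the quota; with the default total_tokens, all 42 would.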
rejected_code
integer
default:
503
valid values:
between 200 and 599 inclusive
The HTTP status code returned when a request exceeding the quota is rejected.
rejected_msg
string
valid values:
any non-empty string
The response body returned when a request exceeding the quota is rejected.
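The constraints on rejected_code and rejected_msg can be sketched as a small Python check. The function name validate_rejection is hypothetical, and returning None when rejected_msg is unset is an assumption of this sketch, since no default message is documented here.

```python
def validate_rejection(conf: dict) -> tuple:
    """Return (rejected_code, rejected_msg) after checking the documented ranges.

    Hypothetical helper: rejected_code defaults to 503 and must lie in
    200..599 inclusive; rejected_msg, when present, must be a non-empty string.
    """
    code = conf.get("rejected_code", 503)
    if not isinstance(code, int) or not 200 <= code <= 599:
        raise ValueError("rejected_code must be an integer between 200 and 599")
    msg = conf.get("rejected_msg")
    if msg is not None and (not isinstance(msg, str) or msg == ""):
        raise ValueError("rejected_msg must be a non-empty string")
    return code, msg


# Illustrative values only.
print(validate_rejection({"rejected_code": 429, "rejected_msg": "Token quota exceeded"}))
```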
policy
string
required
valid values:
local, redis, redis-cluster, or redis-sentinel
The policy for the rate limiting counter. Available in API7 Enterprise from version 3.8.19. When upgrading from an earlier version, set this to local; existing configurations will continue to function. The option is not available in APISIX yet.
- Set to local to store the counter in memory locally.
- Set to redis to store the counter on a Redis instance.
- Set to redis-cluster to store the counter in a Redis cluster.
- Set to redis-sentinel to store the counter on the Redis primary node managed by Redis Sentinel, which ensures high availability by automatically promoting a replica to primary in case of failure. Redis Sentinel provides high availability for Redis when not using Redis Cluster.
redis_host
string
The address of the Redis node. Required when policy is redis.
redis_port
integer
default:
6379
valid values:
greater than or equal to 1
The port of the Redis node when policy is redis.
redis_username
string
The username for Redis if Redis ACL is used. If you use the legacy authentication method requirepass, configure only the redis_password. Used when policy is redis.
redis_password
string
The password of the Redis node when policy is redis or redis-cluster.
redis_database
integer
default:
0
valid values:
greater than or equal to 0
The database number in Redis when policy is redis or redis-sentinel.
redis_ssl
boolean
default:
false
If true, use SSL to connect to Redis when policy is redis.
redis_ssl_verify
boolean
default:
false
If true, verify the server SSL certificate when policy is redis.
redis_timeout
integer
default:
1000
valid values:
greater than or equal to 1
The Redis timeout value in milliseconds when policy is redis or redis-cluster.
redis_cluster_nodes
array[string]
The list of Redis cluster nodes with at least two addresses. Required when policy is redis-cluster.
redis_cluster_name
string
The name of the Redis cluster. Required when policy is redis-cluster.
redis_cluster_ssl
boolean
default:
false
If true, use SSL to connect to Redis cluster when policy is redis-cluster.
redis_cluster_ssl_verify
boolean
default:
false
If true, verify the server SSL certificate when policy is redis-cluster.
redis_sentinels
array[object]
An array of Redis Sentinel nodes (host and port). Required when policy is redis-sentinel.
redis_master_name
string
The name of the Redis master group that Sentinels are monitoring. Required when policy is redis-sentinel.
redis_role
string
default:
master
valid values:
master or slave
The Redis node role to connect to. Configurable when policy is redis-sentinel. Set to master to connect to the current Redis master, and set to slave to connect to a Redis replica.
redis_connect_timeout
integer
default:
1000
valid values:
greater than or equal to 1
Timeout in milliseconds for establishing a connection to a Redis node. Configurable when policy is redis-sentinel.
redis_read_timeout
integer
default:
1000
valid values:
greater than or equal to 1
Timeout in milliseconds for reading data from a Redis node. Configurable when policy is redis-sentinel.
redis_keepalive_timeout
integer
default:
60000
valid values:
greater than or equal to 1
Time in milliseconds that an idle Redis connection is kept alive in the connection pool before being closed. Configurable when policy is redis-sentinel.
sentinel_username
string
Username used to authenticate with the Redis Sentinel instance. Configurable when policy is redis-sentinel.
sentinel_password
string
Password used to authenticate with the Redis Sentinel instance. Configurable when policy is redis-sentinel.
allow_degradation
boolean
default:
false
If true, allow the gateway to continue handling requests without the plugin when the plugin or its dependencies become unavailable.
Available in API7 Enterprise from version 3.8.19. Not available in APISIX yet.
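The policy-dependent requirements described above can be summarized in a short Python sketch that lists which required parameters are missing for a given policy. The helper name missing_fields is hypothetical, the host and port keys inside redis_sentinels are assumed from the "(host and port)" description, and the address and master name are placeholder values.

```python
# Parameters documented as "Required when policy is ..." for each policy.
REQUIRED_BY_POLICY = {
    "local": (),
    "redis": ("redis_host",),
    "redis-cluster": ("redis_cluster_nodes", "redis_cluster_name"),
    "redis-sentinel": ("redis_sentinels", "redis_master_name"),
}


def missing_fields(conf: dict) -> list:
    """Return the required parameters that are absent for conf["policy"].

    Hypothetical helper, not part of the plugin.
    """
    policy = conf.get("policy")
    if policy not in REQUIRED_BY_POLICY:
        raise ValueError(f"unknown policy: {policy!r}")
    return [name for name in REQUIRED_BY_POLICY[policy] if name not in conf]


# Illustrative Sentinel-backed configuration (placeholder values).
sentinel_conf = {
    "policy": "redis-sentinel",
    "redis_sentinels": [{"host": "127.0.0.1", "port": 26379}],  # assumed key names
    "redis_master_name": "mymaster",
    "redis_role": "master",
}

print(missing_fields(sentinel_conf))
```

The sketch checks presence only; value constraints such as the minimum of two addresses in redis_cluster_nodes are left out to keep it short.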
rules
array[object]
An array of rate-limiting rules to be applied simultaneously.
Available in API7 Enterprise from 3.8.17. Not yet available in APISIX.
count
integer | string
required
valid values:
greater than 0
The maximum number of requests allowed within a given time interval.
This parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($).
time_window
integer | string
required
valid values:
greater than 0
The time interval, in seconds, to which the rate limiting count applies.
This parameter also supports the string data type and allows the use of built-in variables prefixed with a dollar sign ($).
key
string
required
The key to count requests by. If the configured key does not exist, the rule will not be executed.
- If the key_type is var, the key is interpreted as a variable. The variable does not need to be prefixed by a dollar sign ($). See built-in variables for available variables.
- If the key_type is var_combination, the key is interpreted as a combination of variables. All variables should be prefixed by dollar signs ($). For example, to configure the key to use a combination of two request headers custom-a and custom-b, the key should be configured as $http_custom_a $http_custom_b.
- If the key_type is constant, the key is interpreted as a constant value.
header_prefix
string
Prefix for all rate limiting response headers. Available in API7 Enterprise from version 3.8.19. Not yet available in APISIX.
When configured, the prefix is inserted after X-AI- in the header name. For example, with header_prefix set to test, the headers become X-AI-Test-RateLimit-Limit, X-AI-Test-RateLimit-Remaining, and X-AI-Test-RateLimit-Reset.
When not configured, the index of the rule in the rules array is used as the prefix. For example, headers for the first rule will be X-AI-1-RateLimit-Limit, X-AI-1-RateLimit-Remaining, and X-AI-1-RateLimit-Reset.
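The header naming described above can be sketched in Python as follows. The function name quota_header_names and the sample rules are hypothetical. Capitalizing only the first letter of the prefix is an assumption inferred from the single test/Test example; how longer or multi-word prefixes are cased is not specified here. The rule index is taken to start at 1, matching the first-rule example.

```python
def quota_header_names(header_prefix=None, rule_index=1):
    """Build the three quota header names for one rule.

    Hypothetical helper: uses header_prefix when set, otherwise the
    1-based index of the rule in the rules array.
    """
    prefix = header_prefix if header_prefix else str(rule_index)
    # Assumption: only the first character is upper-cased ("test" -> "Test").
    prefix = prefix[:1].upper() + prefix[1:]
    base = f"X-AI-{prefix}-RateLimit"
    return [f"{base}-Limit", f"{base}-Remaining", f"{base}-Reset"]


# Illustrative rules; counts, windows, and keys are made-up values.
rules = [
    {"count": 100, "time_window": 60, "key_type": "var",
     "key": "remote_addr", "header_prefix": "test"},
    {"count": 1000, "time_window": 3600, "key_type": "var_combination",
     "key": "$http_custom_a $http_custom_b"},
]

for index, rule in enumerate(rules, start=1):
    print(quota_header_names(rule.get("header_prefix"), index))
```

Because HTTP header names are case-insensitive, clients reading these headers should not depend on the exact casing.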