Parameters

See plugin common configurations for configuration options available to all plugins.

  • limit

    integer


    valid values:

    greater than 0


    The maximum number of tokens that can be consumed within a given time interval. At least one of limit and instances.limit should be configured.

  • time_window

    integer


    valid values:

    greater than 0


    The time interval, in seconds, over which the rate limit applies. At least one of time_window and instances.time_window should be configured.

  • show_limit_quota_header

    boolean


    default: true


    If true, the response includes the headers X-AI-RateLimit-Limit-* (the total quota), X-AI-RateLimit-Remaining-* (the remaining quota), and X-AI-RateLimit-Reset-* (the number of seconds until the counter resets), where * is the instance name.

  • limit_strategy

    string


    default: total_tokens


    valid values:

    total_tokens, prompt_tokens, or completion_tokens


    The type of token that rate limiting is applied to. The total_tokens, prompt_tokens, and completion_tokens values are returned in each model response, where total_tokens is the sum of prompt_tokens and completion_tokens.

  • instances

    array[object]


    LLM instance rate limiting configurations.

    • name

      string


      required


      Name of the LLM service instance.

    • limit

      integer


      required


      valid values:

      greater than 0


      The maximum number of tokens that can be consumed within a given time interval.

    • time_window

      integer


      required


      valid values:

      greater than 0


      The time interval, in seconds, over which the rate limit applies.

  • rejected_code

    integer


    default: 503


    valid values:

    between 200 and 599 inclusive


    The HTTP status code returned when a request exceeding the quota is rejected.

  • rejected_msg

    string


    valid values:

    any non-empty string


    The response body returned when a request exceeding the quota is rejected.
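
The constraints listed above can be sketched as a small client-side check. This is a minimal illustration in Python, not the gateway's own schema validation: the helper name validate_conf, the error wording, and the example values (including the instance name example-instance) are assumptions made for this sketch. Only the parameter names, defaults, and valid ranges come from the list above.

```python
# Hypothetical helper: mirrors the documented constraints for illustration only.
VALID_STRATEGIES = {"total_tokens", "prompt_tokens", "completion_tokens"}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_conf(conf):
    """Return a list of problems found in a plugin configuration dict."""
    errors = []
    instances = conf.get("instances", [])

    for key in ("limit", "time_window"):
        if key in conf:
            if not (_is_int(conf[key]) and conf[key] > 0):
                errors.append(f"{key} must be an integer greater than 0")
        elif not instances:
            # At least one of the top-level and per-instance values should be set.
            errors.append(f"configure {key} or instances.{key}")

    for i, inst in enumerate(instances):
        if not isinstance(inst.get("name"), str):
            errors.append(f"instances[{i}].name is required and must be a string")
        for key in ("limit", "time_window"):
            value = inst.get(key)
            if not (_is_int(value) and value > 0):
                errors.append(
                    f"instances[{i}].{key} is required and must be an integer greater than 0"
                )

    # Defaults below are the documented ones: total_tokens and 503.
    if conf.get("limit_strategy", "total_tokens") not in VALID_STRATEGIES:
        errors.append("limit_strategy must be total_tokens, prompt_tokens, or completion_tokens")

    code = conf.get("rejected_code", 503)
    if not (_is_int(code) and 200 <= code <= 599):
        errors.append("rejected_code must be between 200 and 599 inclusive")

    if "rejected_msg" in conf:
        msg = conf["rejected_msg"]
        if not (isinstance(msg, str) and msg):
            errors.append("rejected_msg must be a non-empty string")

    if "show_limit_quota_header" in conf and not isinstance(conf["show_limit_quota_header"], bool):
        errors.append("show_limit_quota_header must be a boolean")

    return errors


# Example values are illustrative, not recommended settings.
example = {
    "limit": 300,
    "time_window": 60,
    "limit_strategy": "total_tokens",
    "instances": [{"name": "example-instance", "limit": 100, "time_window": 30}],
    "rejected_code": 429,
    "rejected_msg": "Token quota exceeded",
}
print(validate_conf(example))
```

A check like this can catch out-of-range values before a configuration is submitted. The gateway's own validation remains authoritative and may enforce rules not covered here.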


Copyright © APISEVEN PTE. LTD 2019 – 2025. Apache, Apache APISIX, APISIX, and associated open source project names are trademarks of the Apache Software Foundation