
Parameters

See plugin common configurations for configuration options available to all plugins.

  • fallback_strategy

    string


    default: instance_health_and_rate_limiting


    valid values:

    instance_health_and_rate_limiting


    Fallback strategy. When set, the plugin checks whether the selected instance's tokens have been exhausted when a request is forwarded. If so, it forwards the request to the next instance regardless of instance priority. When not set, the plugin does not forward the request to lower-priority instances when the tokens of the high-priority instance are exhausted.
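
The fallback behavior described above can be illustrated with a toy model. This is only a sketch of the documented behavior, not the plugin's implementation: the function name `pick_instance`, the `exhausted` set, and the assumption that a larger `priority` value is preferred are all hypothetical.

```python
from typing import Optional


def pick_instance(instances: list[dict], exhausted: set[str],
                  fallback_strategy: Optional[str]) -> Optional[str]:
    """Toy model of the documented fallback behavior (not the plugin's code)."""
    # Assumption: a larger priority value means the instance is preferred.
    ordered = sorted(instances, key=lambda i: i.get("priority", 0), reverse=True)
    if not ordered:
        return None
    preferred = ordered[0]
    if preferred["name"] not in exhausted:
        return preferred["name"]
    if fallback_strategy != "instance_health_and_rate_limiting":
        # Not set: the request is not forwarded to lower-priority instances.
        return None
    # Set: move on to the next instance that still has tokens available.
    for inst in ordered[1:]:
        if inst["name"] not in exhausted:
            return inst["name"]
    return None


# Hypothetical instance names used for illustration only.
instances = [
    {"name": "primary", "priority": 1},
    {"name": "backup", "priority": 0},
]
print(pick_instance(instances, {"primary"}, "instance_health_and_rate_limiting"))
print(pick_instance(instances, {"primary"}, None))
```

With the strategy set, the exhausted `primary` instance is skipped in favor of `backup`; without it, the sketch returns `None` to represent a request that is not forwarded further.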

  • balancer

    object


    Load balancing configurations.

    • algorithm

      string


      default: roundrobin


      valid values:

      roundrobin or chash


      Load balancing algorithm. When set to roundrobin, the weighted round robin algorithm is used. When set to chash, the consistent hashing algorithm is used.

    • hash_on

      string


      valid values:

      vars, headers, cookie, consumer, or vars_combinations


      Used when algorithm is chash. Supports hashing on built-in variables, headers, cookie, consumer, or a combination of built-in variables.

    • key

      string


      Used when algorithm is chash. When hash_on is set to headers or cookie, key is required. When hash_on is set to consumer, key is not required because the consumer name is used as the key automatically.
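
As an illustration of the balancer rules above, the following sketch builds a consistent-hashing configuration and checks it against the documented constraints. The helper `validate_balancer` and the header name `X-Session-Id` are hypothetical; the gateway performs its own schema validation, which may differ.

```python
import json

ALGORITHMS = ("roundrobin", "chash")
HASH_ON_VALUES = ("vars", "headers", "cookie", "consumer", "vars_combinations")


def validate_balancer(balancer: dict) -> list[str]:
    """Return a list of problems, based only on the rules described above."""
    problems = []
    algorithm = balancer.get("algorithm", "roundrobin")  # documented default
    if algorithm not in ALGORITHMS:
        problems.append(f"unknown algorithm: {algorithm}")
    if algorithm == "chash":
        hash_on = balancer.get("hash_on")
        if hash_on is not None and hash_on not in HASH_ON_VALUES:
            problems.append(f"unknown hash_on: {hash_on}")
        # key is required for headers and cookie, but not for consumer.
        if hash_on in ("headers", "cookie") and not balancer.get("key"):
            problems.append("key is required when hash_on is headers or cookie")
    return problems


# Hypothetical header name; pick whatever identifies a client in your setup.
balancer = {"algorithm": "chash", "hash_on": "headers", "key": "X-Session-Id"}
print(json.dumps({"balancer": balancer}, indent=2))
print(validate_balancer(balancer))
```

Hashing on a stable request attribute keeps requests from the same client on the same LLM instance, which is the usual reason to prefer chash over roundrobin.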

  • instances

    array[object]


    required


    LLM instance configurations.

    • name

      string


      required


      Name of the LLM service instance.

    • provider

      string


      required


      valid values:

      openai, deepseek, openai-compatible


      LLM service provider. When set to openai, the plugin will proxy the request to api.openai.com. When set to deepseek, the plugin will proxy the request to api.deepseek.com. When set to openai-compatible, the plugin will proxy the request to the custom endpoint configured in override.

    • priority

      integer


      default: 0


      Priority of the LLM instance in load balancing. priority takes precedence over weight.

    • weight

      integer


      required


      default: 0


      valid values:

      greater than or equal to 0


      Weight of the LLM instance in load balancing.

    • auth

      object


      required


      Authentication configurations.

      • header

        object


        Authentication headers. At least one of header and query must be configured.

      • query

        object


        Authentication query parameters. At least one of header and query must be configured.

    • options

      object


      Model configurations.

      In addition to model, you can configure additional parameters and they will be forwarded to the upstream LLM service in the request body. For instance, if you are working with OpenAI or DeepSeek, you can configure additional parameters such as max_tokens, temperature, top_p, and stream. See your LLM provider's API documentation for more available options.

      • model

        string


        Name of the LLM model, such as gpt-4 or gpt-3.5. See your LLM provider's API documentation for more available models.
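
To show how the instance fields fit together, here is a sketch of an `instances` array with two instances sharing the top priority and one lower-priority backup. The instance names, weights, and placeholder API keys are hypothetical, and `traffic_share` is an illustrative helper that assumes a larger `priority` value is preferred and that weights split traffic proportionally within a priority group.

```python
import json
import os

instances = [
    {
        "name": "openai-instance",  # hypothetical name
        "provider": "openai",
        "priority": 1,
        "weight": 8,
        "auth": {"header": {
            "Authorization": "Bearer " + os.environ.get("OPENAI_API_KEY", "<your-openai-key>"),
        }},
        # Extra options are forwarded to the LLM service in the request body.
        "options": {"model": "gpt-4", "max_tokens": 512, "temperature": 0.7},
    },
    {
        "name": "deepseek-instance",  # hypothetical name
        "provider": "deepseek",
        "priority": 1,
        "weight": 2,
        "auth": {"header": {
            "Authorization": "Bearer " + os.environ.get("DEEPSEEK_API_KEY", "<your-deepseek-key>"),
        }},
        "options": {"model": "deepseek-chat"},
    },
    {
        "name": "backup-instance",  # hypothetical name
        "provider": "openai",
        "priority": 0,
        "weight": 1,
        "auth": {"header": {"Authorization": "Bearer <your-backup-key>"}},
        "options": {"model": "gpt-4"},
    },
]


def traffic_share(instances: list[dict]) -> dict[str, float]:
    """Approximate share of traffic per instance in the top priority group."""
    # Assumption: priority is evaluated before weight, larger value first.
    top = max(i.get("priority", 0) for i in instances)
    group = [i for i in instances if i.get("priority", 0) == top]
    total = sum(i["weight"] for i in group)
    if total == 0:
        return {i["name"]: 0.0 for i in group}
    return {i["name"]: i["weight"] / total for i in group}


print(json.dumps({"instances": instances}, indent=2))
print(traffic_share(instances))
```

Under these assumptions the backup instance receives no traffic while the higher-priority group is available, and the two top-priority instances split requests roughly 80/20.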

  • logging

    object


    Logging configurations.

    • summaries

      boolean


      default: false


      If true, log the requested LLM model, the request duration, and the request and response token counts.

    • payloads

      boolean


      default: false


      If true, log the request and response payloads.

    • override

      object


      Override settings.

      • endpoint

        string


        LLM provider endpoint to replace the default endpoint with. If not configured, the plugin uses the default OpenAI endpoint https://api.openai.com/v1/chat/completions.

    • checks

      object


      Health check configurations.

      Note that OpenAI and DeepSeek currently do not provide an official health check endpoint. Other LLM services that you configure with the openai-compatible provider may offer health check endpoints.

      • active

        object


        required


        Active health check configurations.

        • type

          string


          default: http


          valid values:

          http, https, or tcp


          Type of health check connection.

        • timeout

          number


          default: 1


          Health check timeout in seconds.

        • concurrency

          integer


          default: 10


          Number of upstream nodes to be checked at the same time.

        • host

          string


          HTTP host.

        • port

          integer


          valid values:

          between 1 and 65535 inclusive


          HTTP port.

        • http_path

          string


          default: /




          Path for HTTP probing requests.

        • https_verify_certificate

          boolean


          default: true


          If true, verify the node's TLS certificate.

        • healthy

          object


          Configurations for checking healthy nodes.

          • interval

            integer


            default: 1


            Time interval of checking healthy nodes, in seconds.

          • http_statuses

            array[integer]


            default: [200,302]


            valid values:

            status code between 200 and 599 inclusive


            An array of HTTP status codes that defines a healthy node.

          • successes

            integer


            default: 2


            valid values:

            between 1 and 254 inclusive


            Number of successful probes to define a healthy node.

        • unhealthy

          object


          Configurations for checking unhealthy nodes.

          • interval

            integer


            default: 1


            Time interval of checking unhealthy nodes, in seconds.

          • http_statuses

            array[integer]


            default: [429,404,500,501,502,503,504,505]


            valid values:

            status code between 200 and 599 inclusive


            An array of HTTP status codes that defines an unhealthy node.

          • http_failures

            integer


            default: 5


            valid values:

            between 1 and 254 inclusive


            Number of HTTP failures to define an unhealthy node.

          • timeout

            integer


            default: 3


            valid values:

            between 1 and 254 inclusive


            Number of probe timeouts to define an unhealthy node.
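
The threshold parameters above (successes and http_failures) can be pictured as counters that flip a node's state. The sketch below is a simplified model, not the gateway's health checker: the host `llm.example.internal`, the path `/healthz`, the class `NodeHealth`, and the assumption that counters are consecutive and reset on an opposite result are all hypothetical.

```python
checks = {
    "active": {
        "type": "https",
        "timeout": 1,
        "host": "llm.example.internal",  # hypothetical host
        "port": 443,
        "http_path": "/healthz",         # hypothetical health endpoint
        "https_verify_certificate": True,
        "healthy": {"interval": 1, "http_statuses": [200, 302], "successes": 2},
        "unhealthy": {
            "interval": 1,
            "http_statuses": [429, 404, 500, 501, 502, 503, 504, 505],
            "http_failures": 5,
            "timeout": 3,
        },
    }
}


class NodeHealth:
    """Simplified model of how probe results could flip a node's state."""

    def __init__(self, active: dict):
        self.ok_statuses = set(active["healthy"]["http_statuses"])
        self.bad_statuses = set(active["unhealthy"]["http_statuses"])
        self.successes_needed = active["healthy"]["successes"]
        self.failures_needed = active["unhealthy"]["http_failures"]
        self.healthy = True
        self.successes = 0
        self.failures = 0

    def record(self, status: int) -> bool:
        # Assumption: counters are consecutive and reset on the opposite result.
        if status in self.ok_statuses:
            self.successes += 1
            self.failures = 0
            if self.successes >= self.successes_needed:
                self.healthy = True
        elif status in self.bad_statuses:
            self.failures += 1
            self.successes = 0
            if self.failures >= self.failures_needed:
                self.healthy = False
        return self.healthy


node = NodeHealth(checks["active"])
for status in [503, 503, 503, 503, 503, 200, 200]:
    print(status, "healthy" if node.record(status) else "unhealthy")
```

In this model the node is marked unhealthy on the fifth failing probe and returns to healthy after two successful probes, matching the documented defaults for http_failures and successes.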

  • timeout

    integer


    default: 30000


    valid values:

    greater than or equal to 1


    Request timeout in milliseconds when requesting the LLM service.

  • keepalive

    boolean


    default: true


    If true, keep the connection alive when requesting the LLM service.

  • keepalive_timeout

    integer


    default: 60000


    valid values:

    greater than or equal to 1000


    Keepalive timeout in milliseconds when connecting to the LLM service.

  • keepalive_pool

    integer


    default: 30


    Keepalive connection pool size when connecting to the LLM service.

  • ssl_verify

    boolean


    default: true


    If true, verify the LLM service's certificate.
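
The connection-level parameters above can be summarized in a short sketch that fills in the documented defaults and checks the documented minimums. The helper `resolve_connection_settings` is hypothetical and only mirrors this reference; the gateway applies its own defaults and validation.

```python
import json

# Defaults as listed in the parameter reference above.
CONNECTION_DEFAULTS = {
    "timeout": 30000,            # milliseconds
    "keepalive": True,
    "keepalive_timeout": 60000,  # milliseconds
    "keepalive_pool": 30,
    "ssl_verify": True,
}


def resolve_connection_settings(conf: dict) -> dict:
    """Merge user settings over the documented defaults and check minimums."""
    overrides = {k: v for k, v in conf.items() if k in CONNECTION_DEFAULTS}
    merged = {**CONNECTION_DEFAULTS, **overrides}
    if merged["timeout"] < 1:
        raise ValueError("timeout must be greater than or equal to 1")
    if merged["keepalive_timeout"] < 1000:
        raise ValueError("keepalive_timeout must be greater than or equal to 1000")
    return merged


# Example: shorten the request timeout and keep everything else at defaults.
plugin_conf = {"timeout": 10000}
print(json.dumps(resolve_connection_settings(plugin_conf), indent=2))
```

LLM responses can take a long time to complete, so a request timeout shorter than the default is worth testing against realistic prompts before it is applied widely.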

