
Parameters

See plugin common configurations for configuration options available to all plugins.

  • provider

    string


    required


    valid values:

    openai, deepseek, azure-openai, aimlapi, gemini, vertex-ai, anthropic, openrouter, bedrock, openai-compatible


    LLM service provider.

    When set to openai, the plugin will proxy requests to https://api.openai.com/v1/chat/completions.

    When set to deepseek, the plugin will proxy requests to https://api.deepseek.com/chat/completions.

    When set to gemini (available from APISIX 3.15.0 and Enterprise 3.9.2), the plugin will proxy requests to https://generativelanguage.googleapis.com/v1beta/openai/chat/completions. If you are proxying requests to an embedding model, you should configure the embedding model endpoint in the override.

    When set to vertex-ai (available from APISIX 3.15.0 and Enterprise 3.9.2), the plugin proxies requests to Google Cloud Vertex AI. For chat completions, the plugin will proxy requests to https://{region}-aiplatform.googleapis.com/v1beta1/projects/{project_id}/locations/{region}/endpoints/openapi/chat/completions. For embeddings, the plugin will proxy requests to https://{region}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{region}/publishers/google/models/{model}:predict. These require configuring provider_conf with project_id and region. Alternatively, you can configure override for a custom endpoint.

    When set to anthropic (available from APISIX 3.15.0 and Enterprise 3.9.2), the plugin will proxy requests to https://api.anthropic.com/v1/chat/completions.

    When set to openrouter (available from APISIX 3.15.0 and Enterprise 3.9.2), the plugin will proxy requests to https://openrouter.ai/api/v1/chat/completions.

    When set to bedrock (available from Enterprise 3.9.12, not available in APISIX yet), the plugin proxies requests to AWS Bedrock using the Converse API. This requires configuring auth.aws with IAM credentials and provider_conf.region with the AWS region. Both non-streaming and streaming responses are supported; the streaming API (ConverseStream) is used when stream is set to true in the request body.

    When set to aimlapi (available from APISIX 3.14.0 and Enterprise 3.8.17), the plugin uses the OpenAI-compatible driver and proxies the request to https://api.aimlapi.com/v1/chat/completions.

    When set to openai-compatible, the plugin proxies requests to the custom endpoint configured in override.

    When set to azure-openai, the plugin also proxies requests to the custom endpoint configured in override and additionally removes the model parameter from user requests.
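The Vertex AI URL templates described above can be sketched as a small helper. This is only an illustration of how the {region}, {project_id}, and {model} placeholders are filled in, not the plugin's actual implementation; the function name `vertex_ai_endpoints` and the sample values are hypothetical.

```python
def vertex_ai_endpoints(project_id: str, region: str, model: str) -> dict:
    """Fill in the Vertex AI URL templates from the provider description.

    Hypothetical helper for illustration only; the plugin builds these
    URLs internally from provider_conf.project_id and provider_conf.region.
    """
    base = f"https://{region}-aiplatform.googleapis.com"
    return {
        # Chat completions use the OpenAI-compatible endpoint under v1beta1.
        "chat": (
            f"{base}/v1beta1/projects/{project_id}/locations/{region}"
            "/endpoints/openapi/chat/completions"
        ),
        # Embeddings use the per-model :predict endpoint under v1.
        "embeddings": (
            f"{base}/v1/projects/{project_id}/locations/{region}"
            f"/publishers/google/models/{model}:predict"
        ),
    }


if __name__ == "__main__":
    # Sample values are placeholders, not real resources.
    urls = vertex_ai_endpoints("my-project", "us-central1", "example-embedding-model")
    print(urls["chat"])
    print(urls["embeddings"])
```

Note that the region appears twice in each URL: once in the hostname and once in the locations path segment.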

  • auth

    object


    required


    Authentication configurations.

    • header

      object


      Authentication headers.

    • query

      object


      Authentication query parameters.

    • gcp

      object


      GCP service account authentication for Vertex AI. Available in API7 Enterprise from 3.9.2 and not in APISIX.

      • service_account_json

        string


        GCP service account JSON content used for authentication. This can be configured using this parameter or by setting the GCP_SERVICE_ACCOUNT environment variable.

      • max_ttl

        integer


        Maximum TTL for GCP access token caching, in seconds.

      • expire_early_secs

        integer


        default: 60


        Number of seconds to expire the access token before its actual expiration time. This prevents edge cases where tokens expire during active requests.

    • aws

      object


      AWS IAM credentials for SigV4 signing. Required when provider is bedrock (for Bedrock, auth.aws is sufficient and auth.header/auth.query are not required). Available in API7 Enterprise from version 3.9.12. Not available in APISIX yet.

      • access_key_id

        string


        required


        AWS IAM access key ID.

      • secret_access_key

        string


        required


        AWS IAM secret access key.

      • session_token

        string


        AWS session token for temporary credentials (e.g. from STS AssumeRole).

  • options

    object


    Model configurations.

    In addition to model, you can configure additional parameters and they will be forwarded to the upstream LLM service in the request body. For instance, if you are working with OpenAI, you can configure additional parameters such as temperature, top_p, and stream. See your LLM provider's API documentation for more available options.

    • model

      string


      Name of the LLM model, such as gpt-4 or gpt-3.5. See your LLM provider's API documentation for more available models.

  • provider_conf

    object


    Provider-specific configuration. Required when provider is vertex-ai or bedrock.

    Available in API7 Enterprise from 3.9.2 and not in APISIX.

    • project_id

      string


      Google Cloud Project ID. Required when provider is vertex-ai.

    • region

      string


      required


      Cloud region. For vertex-ai, this is the GCP region. For bedrock, this is the AWS region (e.g. us-east-1).

  • override

    object


    Override settings.

    • endpoint

      string


      LLM provider endpoint. Required when provider is openai-compatible.

    • llm_options

      object


      Provider-aware LLM option overrides. Available in API7 Enterprise from version 3.9.10. Not available in APISIX yet.

      • max_tokens

        integer


        Maximum number of output tokens. The gateway automatically maps this to the correct field name for the target provider (e.g. max_completion_tokens for OpenAI Chat, max_output_tokens for OpenAI Responses API). This value always overwrites the value sent by the client.

    • request_body

      object


      Per target-protocol request body overrides. Keys are target protocol names (openai-chat, openai-responses, openai-embeddings, anthropic-messages); values are partial request bodies that are deep-merged into the outgoing body (objects merged recursively, arrays and scalars replaced wholesale). Available in API7 Enterprise from version 3.9.10. Not available in APISIX yet.

    • request_body_force_override

      boolean


      default: false


      When false (default), client request body fields take priority and request_body override values only fill in missing fields. When true, request_body override values forcefully overwrite client fields. Available in API7 Enterprise from version 3.9.10. Not available in APISIX yet.
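The merge behavior described for request_body and request_body_force_override can be sketched as follows. This is a minimal model of the documented rules (objects merged recursively, arrays and scalars replaced wholesale, client fields winning unless the force flag is set), not the gateway's code. The function names are hypothetical, and the handling of edge cases such as null values or mismatched types is an assumption.

```python
from copy import deepcopy


def deep_merge(base: dict, patch: dict) -> dict:
    """Return a copy of base with patch merged in.

    Objects (dicts) are merged recursively; arrays and scalars from
    patch replace the value in base wholesale.
    """
    result = deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


def apply_request_body_override(client_body: dict, override: dict,
                                force: bool = False) -> dict:
    """Model request_body_force_override (hypothetical helper).

    force=False: client fields take priority, override fills the gaps.
    force=True:  override values overwrite client fields.
    """
    if force:
        return deep_merge(client_body, override)
    return deep_merge(override, client_body)


if __name__ == "__main__":
    client = {"model": "gpt-4", "temperature": 0.2, "metadata": {"user": "a"}}
    override = {"temperature": 0.7, "metadata": {"team": "x"}, "stop": ["\n"]}
    print(apply_request_body_override(client, override))
    print(apply_request_body_override(client, override, force=True))
```

In both modes the nested metadata object ends up with both keys, because objects are merged rather than replaced; only the conflicting temperature field differs between the two results.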

  • logging

    object


    Logging configurations. These configurations apply to access logs and logs sent to logging plugins, and do not affect the error log.

    • summaries

      boolean


      default: false


      If true, log the requested LLM model, the request duration, and the request and response token counts.

    • payloads

      boolean


      default: false


      If true, log the request and response payloads.

  • timeout

    integer


    default: 30000


    valid values:

    between 1 and 600000 inclusive


    Request timeout in milliseconds when requesting the LLM service.

  • keepalive

    boolean


    default: true


    If true, keep the connection alive when requesting the LLM service.

  • keepalive_timeout

    integer


    default: 60000


    valid values:

    greater than or equal to 1000


    Keepalive timeout in milliseconds when requesting the LLM service.

  • keepalive_pool

    integer


    default: 30


    valid values:

    greater than or equal to 1


    Keepalive pool size for connections to the LLM service.

  • ssl_verify

    boolean


    default: true


    If true, verify the LLM service's certificate.

  • max_stream_duration_ms

    integer


    Maximum wall-clock duration (in milliseconds) for a streaming AI response. If the upstream keeps sending data past this deadline, the connection is closed. Unset means no cap. Use this to protect the gateway from upstream bugs that produce tokens indefinitely. Available in API7 Enterprise from version 3.9.10. Not available in APISIX yet.

  • max_response_bytes

    integer


    Maximum total bytes read from the upstream for a single AI response (streaming or non-streaming). If exceeded, the connection is closed. Unset means no cap. Available in API7 Enterprise from version 3.9.10. Not available in APISIX yet.

  • streaming_flush_interval_ms

    integer


    default: 10


    Background flush interval in milliseconds for streaming responses. A positive value starts a background thread that flushes output periodically to bound client latency when upstreams burst multiple tokens at once. Set to 0 to flush each chunk synchronously inline. Available in API7 Enterprise from version 3.9.13. Not available in APISIX yet.
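To show how the parameters above fit together, here is a hedged sketch of a plugin configuration expressed as a Python dictionary, along with a small checker for some of the documented constraints. The checker is a hypothetical helper covering only the rules stated in this reference (allowed providers, numeric ranges, and the requirements for openai-compatible and bedrock); it is not the plugin's schema validator. The API key is a placeholder, and the Authorization header format is an assumption based on common provider conventions.

```python
PROVIDERS = {
    "openai", "deepseek", "azure-openai", "aimlapi", "gemini", "vertex-ai",
    "anthropic", "openrouter", "bedrock", "openai-compatible",
}

# Example configuration; the key is a placeholder, not a real credential.
EXAMPLE_CONF = {
    "provider": "openai",
    "auth": {"header": {"Authorization": "Bearer <your-api-key>"}},
    "options": {"model": "gpt-4", "temperature": 0.7},
    "logging": {"summaries": True, "payloads": False},
    "timeout": 30000,
    "keepalive": True,
    "keepalive_timeout": 60000,
    "keepalive_pool": 30,
    "ssl_verify": True,
}


def check_conf(conf: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    errors = []
    provider = conf.get("provider")
    if provider not in PROVIDERS:
        errors.append(f"provider: unsupported value {provider!r}")
    auth = conf.get("auth")
    if not isinstance(auth, dict):
        errors.append("auth: required object")
        auth = {}

    # Numeric ranges, using the documented defaults when a field is unset.
    timeout = conf.get("timeout", 30000)
    if not (isinstance(timeout, int) and 1 <= timeout <= 600000):
        errors.append("timeout: must be between 1 and 600000 inclusive")
    if conf.get("keepalive_timeout", 60000) < 1000:
        errors.append("keepalive_timeout: must be >= 1000")
    if conf.get("keepalive_pool", 30) < 1:
        errors.append("keepalive_pool: must be >= 1")

    # Provider-specific requirements stated in the parameter descriptions.
    override = conf.get("override") or {}
    provider_conf = conf.get("provider_conf") or {}
    if provider == "openai-compatible" and not override.get("endpoint"):
        errors.append("override.endpoint: required for openai-compatible")
    if provider == "bedrock":
        if not provider_conf.get("region"):
            errors.append("provider_conf.region: required for bedrock")
        if not isinstance(auth.get("aws"), dict):
            errors.append("auth.aws: required for bedrock")
    return errors


if __name__ == "__main__":
    print(check_conf(EXAMPLE_CONF))
    print(check_conf({"provider": "openai-compatible", "auth": {}, "timeout": 0}))
```

Any extra keys under options, such as temperature here, are forwarded to the upstream LLM service in the request body, as noted in the options description.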


Copyright © APISEVEN PTE. LTD 2019 – 2026. Apache, Apache APISIX, APISIX, and associated open source project names are trademarks of the Apache Software Foundation