
Integrate Azure OpenAI Service

Azure OpenAI Service provides access to OpenAI models through Azure's enterprise infrastructure. This guide shows how to route traffic to Azure OpenAI through API7 Gateway using the ai-proxy plugin.

Prerequisites

  • Install Docker.

  • Install cURL to send requests to the services for validation.

  • Have a running API7 Enterprise Gateway instance.

  • Have an Azure account with an Azure OpenAI Service resource deployed.

  • Obtain the Admin API key. Save it to an environment variable:

    export ADMIN_API_KEY=your-admin-api-key   # replace with your API key
  • Obtain the ID of the service you want to configure. Save it to an environment variable:

    export SERVICE_ID=your-service-id         # replace with your service ID

Configure Azure Authentication

Obtain the API key and endpoint from your Azure OpenAI resource. Save the key to an environment variable:

export AZ_OPENAI_API_KEY=your-azure-api-key   # replace with your API key

Azure OpenAI uses the api-key header for authentication (not the Authorization header used by OpenAI).
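As a quick sketch, the two authentication styles differ only in the header they carry. Both key values below are placeholders, not real credentials:

```shell
# Sketch: the only authentication difference is the header name.
# Both keys below are placeholder values for illustration.
OPENAI_API_KEY="sk-placeholder"
AZ_OPENAI_API_KEY="azure-placeholder"

openai_auth="Authorization: Bearer ${OPENAI_API_KEY}"   # OpenAI style
azure_auth="api-key: ${AZ_OPENAI_API_KEY}"              # Azure OpenAI style

echo "$azure_auth"
```

The ai-proxy plugin configuration below sends the Azure-style header for you; you only supply the key.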

Configure the AI Proxy for Azure

Create a route with the ai-proxy plugin:

curl "http://127.0.0.1:7080/apisix/admin/routes?gateway_group_id=default" -X PUT \
  -H "X-API-KEY: $ADMIN_API_KEY" \
  -d '{
    "id": "azure-openai-route",
    "service_id": "'"$SERVICE_ID"'",
    "paths": ["/azure-openai"],
    "plugins": {
      "ai-proxy": {
        "provider": "azure-openai",
        "auth": {
          "header": {
            "api-key": "'"$AZ_OPENAI_API_KEY"'"
          }
        },
        "options": {
          "model": "gpt-4"
        },
        "override": {
          "endpoint": "https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT/chat/completions?api-version=2024-10-21"
        }
      }
    }
  }'

❶ Set the provider to azure-openai.

❷ Attach the Azure API key using the api-key header.

❸ Specify the full Azure OpenAI endpoint, including your resource name, deployment name, and API version.
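The endpoint URL has three variable parts. As a sketch, it can be assembled from environment variables (the resource and deployment names below are placeholders for illustration):

```shell
# Assemble the Azure OpenAI chat-completions endpoint from its parts.
# Resource and deployment names here are placeholders, not real values.
AZ_RESOURCE="my-resource"
AZ_DEPLOYMENT="gpt-4"
AZ_API_VERSION="2024-10-21"

AZ_ENDPOINT="https://${AZ_RESOURCE}.openai.azure.com/openai/deployments/${AZ_DEPLOYMENT}/chat/completions?api-version=${AZ_API_VERSION}"
echo "$AZ_ENDPOINT"
```

The assembled URL is the value that goes into the plugin's override.endpoint field.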

Multi-Model Routing Across Deployments

Use ai-proxy-multi to route traffic across multiple Azure regions or deployment versions for high availability:

curl "http://127.0.0.1:7080/apisix/admin/routes?gateway_group_id=default" -X PUT \
  -H "X-API-KEY: $ADMIN_API_KEY" \
  -d '{
    "id": "azure-multi-route",
    "service_id": "'"$SERVICE_ID"'",
    "paths": ["/azure-openai"],
    "plugins": {
      "ai-proxy-multi": {
        "instances": [
          {
            "name": "east-us",
            "provider": "azure-openai",
            "auth": { "header": { "api-key": "'"$AZ_OPENAI_API_KEY"'" } },
            "options": { "model": "gpt-4" },
            "override": { "endpoint": "https://my-resource-eastus.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2024-10-21" },
            "weight": 1,
            "priority": 1
          },
          {
            "name": "west-europe",
            "provider": "azure-openai",
            "auth": { "header": { "api-key": "'"$AZ_OPENAI_API_KEY_EU"'" } },
            "options": { "model": "gpt-4" },
            "override": { "endpoint": "https://my-resource-westeu.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2024-10-21" },
            "weight": 1,
            "priority": 0
          }
        ]
      }
    }
  }'

❶ Define the primary instance in the East US region.

❷ Set a higher priority (1) so this instance is selected first; instances with higher priority values take precedence.

❸ Give the West Europe instance a lower priority (0) so it serves only as a fallback when East US is unavailable. Note that it authenticates with a separate key, exported as AZ_OPENAI_API_KEY_EU.

For more routing strategies, see Multi-LLM Routing and Fallback.
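The failover behavior above reduces to a simple rule: traffic goes to the primary instance while it is healthy, and shifts to the fallback only when it is not. A minimal sketch of that decision (a simplification for illustration, not the plugin's actual selection logic):

```shell
# Simplified failover rule (illustration only, not the plugin's code):
# serve from the primary instance while it is healthy, else the fallback.
pick_instance() {
  primary_healthy=$1   # "yes" or "no"
  if [ "$primary_healthy" = "yes" ]; then
    echo "east-us"       # primary instance
  else
    echo "west-europe"   # fallback instance
  fi
}

pick_instance yes
pick_instance no
```

In practice the gateway also distributes requests by weight among instances that share the same priority tier.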

Validate the Configuration

Send a chat completion request:

curl "http://127.0.0.1:9080/azure-openai" -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are an AI assistant that helps people find information." },
      { "role": "user", "content": "Write a 50-word introduction for API gateways." }
    ]
  }'

You should receive a response similar to the following:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "API gateways serve as the central entry point for managing, securing, and routing API traffic in modern architectures. They handle authentication, rate limiting, load balancing, and protocol translation, enabling organizations to expose services reliably while maintaining security and observability across distributed systems."
      },
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 45,
    "total_tokens": 73
  }
}

To enable streaming responses, set "stream": true in the request body. Because NGINX buffers proxied responses by default, also use the proxy-buffering plugin to disable proxy_buffering; otherwise server-sent events (SSE) are held back until the buffer flushes instead of being delivered as they arrive.
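A streamed response arrives as SSE lines prefixed with data: and terminated by a data: [DONE] sentinel. A minimal sketch of consuming such a stream (the sample events piped in below are illustrative, not real API output):

```shell
# Extract the JSON payload from each SSE line, stopping at the [DONE] sentinel.
# The sample events piped in below are illustrative, not real API output.
parse_sse() {
  while IFS= read -r line; do
    case "$line" in
      "data: [DONE]") break ;;
      "data: "*) printf '%s\n' "${line#data: }" ;;
    esac
  done
}

printf 'data: {"delta":"API"}\ndata: {"delta":" gateways"}\ndata: [DONE]\n' | parse_sse
```

A real client would additionally parse each JSON payload and concatenate the content deltas.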

Next Steps

You have learned how to route traffic to Azure OpenAI through API7 Gateway. See Azure OpenAI Service REST API reference to learn more.
