Route Traffic to DeepSeek Models
DeepSeek provides high-performance language models at competitive pricing through an OpenAI-compatible API. API7 Gateway includes a dedicated DeepSeek driver, so you do not need to use the generic openai-compatible provider. This guide shows how to route traffic to DeepSeek through API7 Gateway using the ai-proxy plugin.
Prerequisites
- Install Docker.
- Install cURL to send requests to the services for validation.
- Have a running API7 Enterprise Gateway instance.
- Obtain the Admin API key and save it to an environment variable:

  export ADMIN_API_KEY=your-admin-api-key   # replace with your API key

- Obtain the ID of the service you want to configure and save it to an environment variable:

  export SERVICE_ID=your-service-id   # replace with your service ID
Obtain a DeepSeek API Key
Create an account at platform.deepseek.com and generate an API key. Save the key to an environment variable:
export DEEPSEEK_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxx # replace with your API key
Configure the AI Proxy for DeepSeek
Create a route with the ai-proxy plugin:
- Admin API
- ADC
curl "http://127.0.0.1:7080/apisix/admin/routes?gateway_group_id=default" -X PUT \
  -H "X-API-KEY: $ADMIN_API_KEY" \
  -d '{
    "id": "deepseek-route",
    "service_id": "'"$SERVICE_ID"'",
    "paths": ["/deepseek"],
    "plugins": {
      "ai-proxy": {
        "provider": "deepseek",
        "auth": {
          "header": {
            "Authorization": "Bearer '"$DEEPSEEK_API_KEY"'"
          }
        },
        "options": {
          "model": "deepseek-chat"
        }
      }
    }
  }'
❶ Set the provider to deepseek. This uses the dedicated DeepSeek driver.
❷ Attach the DeepSeek API key in the Authorization header.
❸ Set the model to deepseek-chat. Other available models include deepseek-reasoner.
services:
  - name: DeepSeek Service
    routes:
      - uris:
          - /deepseek
        name: deepseek-route
        plugins:
          ai-proxy:
            provider: deepseek
            auth:
              header:
                Authorization: "Bearer sk-xxxxxxxxxxxxxxxxxxxxxxxx"
            options:
              model: deepseek-chat
❶ Set the provider to deepseek. This uses the dedicated DeepSeek driver.
❷ Attach the DeepSeek API key in the Authorization header.
❸ Set the model to deepseek-chat. Other available models include deepseek-reasoner.
Synchronize the configuration to API7 Gateway:
adc sync -f adc.yaml
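Conceptually, the dedicated DeepSeek driver handles the upstream endpoint, injects the configured model, and attaches the API key, so clients only need to send messages. The following Python sketch illustrates that transformation; the function name and endpoint wiring are illustrative assumptions, not the plugin's actual code, and it assumes the configured model takes precedence:

```python
import json

# Illustrative sketch of what the ai-proxy plugin does for provider "deepseek":
# fill in the upstream endpoint, inject the configured model, and attach the
# Authorization header, so the client request stays provider-agnostic.
DEEPSEEK_CHAT_URL = "https://api.deepseek.com/chat/completions"  # assumed upstream

def build_upstream_request(client_body: dict, options: dict, api_key: str):
    body = dict(client_body)
    body["model"] = options["model"]  # sketch assumes the configured model wins
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return DEEPSEEK_CHAT_URL, headers, json.dumps(body)

url, headers, payload = build_upstream_request(
    {"messages": [{"role": "user", "content": "Hello"}]},
    {"model": "deepseek-chat"},
    "sk-test",
)
print(json.loads(payload)["model"])  # deepseek-chat
```

This is why the validation request later in this guide omits the model field entirely: the route supplies it.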
Multi-Model Routing with DeepSeek
DeepSeek models offer significantly lower per-token pricing than many alternatives. Use ai-proxy-multi to route most traffic to DeepSeek while keeping a premium provider as fallback:
- Admin API
- ADC
curl "http://127.0.0.1:7080/apisix/admin/routes?gateway_group_id=default" -X PUT \
  -H "X-API-KEY: $ADMIN_API_KEY" \
  -d '{
    "id": "deepseek-multi-route",
    "service_id": "'"$SERVICE_ID"'",
    "paths": ["/deepseek"],
    "plugins": {
      "ai-proxy-multi": {
        "fallback_strategy": ["http_429", "http_5xx"],
        "instances": [
          {
            "name": "deepseek-primary",
            "provider": "deepseek",
            "auth": { "header": { "Authorization": "Bearer '"$DEEPSEEK_API_KEY"'" } },
            "options": { "model": "deepseek-chat" },
            "weight": 1,
            "priority": 1
          },
          {
            "name": "openai-fallback",
            "provider": "openai",
            "auth": { "header": { "Authorization": "Bearer '"$OPENAI_API_KEY"'" } },
            "options": { "model": "gpt-4o-mini" },
            "weight": 1,
            "priority": 0
          }
        ]
      }
    }
  }'
❶ fallback_strategy enables automatic failover on HTTP 429 (rate limited) or 5xx (server error).
❷ Set DeepSeek as the primary instance with highest priority.
❸ Set OpenAI as the fallback with a lower priority value. Traffic routes here only when DeepSeek is unavailable.
services:
  - name: DeepSeek with Fallback
    routes:
      - uris:
          - /deepseek
        name: deepseek-multi-route
        plugins:
          ai-proxy-multi:
            fallback_strategy:
              - http_429
              - http_5xx
            instances:
              - name: deepseek-primary
                provider: deepseek
                auth:
                  header:
                    Authorization: "Bearer sk-xxxxxxxxxxxxxxxxxxxxxxxx"
                options:
                  model: deepseek-chat
                weight: 1
                priority: 1
              - name: openai-fallback
                provider: openai
                auth:
                  header:
                    Authorization: "Bearer sk-proj-xxxxxxxxxxxxxxxxxxxxxxxx"
                options:
                  model: gpt-4o-mini
                weight: 1
                priority: 0
❶ fallback_strategy enables automatic failover on HTTP 429 (rate limited) or 5xx (server error).
❷ Set DeepSeek as the primary instance with highest priority.
❸ Set OpenAI as the fallback with a lower priority value. Traffic routes here only when DeepSeek is unavailable.
Synchronize the configuration to API7 Gateway:
adc sync -f adc.yaml
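The failover behavior configured above can be sketched as a priority-ordered walk over instances, advancing only on statuses covered by fallback_strategy. This is a simplified model for intuition, not the gateway's actual implementation; the instance names and the send callback are illustrative:

```python
# Simplified model of ai-proxy-multi failover: try instances from highest
# priority down; move to the next instance only when the current one returns
# a status covered by fallback_strategy (429 or any 5xx here).
def is_fallback_status(status: int) -> bool:
    return status == 429 or 500 <= status <= 599

def route_request(instances, send):
    """instances: dicts with 'name' and 'priority'; send(name) -> (status, body)."""
    status, body = None, None
    for inst in sorted(instances, key=lambda i: i["priority"], reverse=True):
        status, body = send(inst["name"])
        if not is_fallback_status(status):
            return inst["name"], status, body
    return None, status, body  # every instance failed

instances = [
    {"name": "deepseek-primary", "priority": 1},
    {"name": "openai-fallback", "priority": 0},
]

# Simulate DeepSeek being rate limited: traffic falls back to OpenAI.
responses = {"deepseek-primary": (429, ""), "openai-fallback": (200, "ok")}
chosen, status, _ = route_request(instances, lambda name: responses[name])
print(chosen)  # openai-fallback
```

When both instances are healthy, the same walk returns deepseek-primary without ever contacting the fallback.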
For more routing strategies, see Multi-LLM Routing and Fallback.
Validate the Configuration
Send a chat completion request:
curl "http://127.0.0.1:9080/deepseek" -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "What is the Fibonacci sequence?" }
    ]
  }'
You should receive a response similar to the following:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The Fibonacci sequence is a series of numbers where each number is the sum of the two preceding ones, typically starting with 0 and 1: 0, 1, 1, 2, 3, 5, 8, 13, 21, ..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 48,
    "total_tokens": 58
  }
}
To enable streaming responses, set "stream": true in the request body. Streaming responses are delivered as server-sent events (SSE); use the proxy-buffering plugin to disable NGINX proxy_buffering so that SSE chunks are not held back by the gateway before reaching the client.
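With streaming enabled, the gateway relays DeepSeek's OpenAI-style SSE chunks: data: lines carrying JSON deltas, terminated by data: [DONE]. A minimal client-side sketch of assembling the streamed text (the sample chunks below are illustrative, not a captured response):

```python
import json

def collect_stream(lines):
    """Assemble assistant text from OpenAI-style SSE 'data:' lines."""
    parts = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank lines and SSE comments/keep-alives
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))  # role-only deltas carry no text
    return "".join(parts)

sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hello"}}]}',
    'data: {"choices":[{"delta":{"content":", world"}}]}',
    "data: [DONE]",
]
print(collect_stream(sample))  # Hello, world
```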
Next Steps
You have learned how to route traffic to DeepSeek through API7 Gateway. See the DeepSeek API documentation for more details about available models.
- Multi-LLM Routing and Fallback — Combine DeepSeek with other providers for cost optimization.
- Token Rate Limiting — Control spending with token budgets.