
ai-proxy

The ai-proxy plugin simplifies access to LLM providers and models by transforming client requests into the request format expected by the configured provider.

The plugin currently supports transforming requests into the format required by OpenAI. Support for the request formats of more LLM providers is planned.
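
For example, with the plugin configured to use the gpt-4 model and a few model options, a client request that carries only a list of messages would be transformed into a standard OpenAI chat completions request, roughly of the following shape (values are illustrative):

{
  "model": "gpt-4",
  "messages": [
    { "role": "user", "content": "What is 1+1?" }
  ],
  "max_tokens": 512,
  "temperature": 1.0
}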

note

The full feature set of this plugin is currently only available in Enterprise and will be made available in open-source APISIX in the 3.12.0 release.

Examples

The following examples use OpenAI as the LLM provider. Before proceeding, create an OpenAI account and an API key. You can optionally save the key to an environment variable:

export OPENAI_API_KEY=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   # replace with your API key

If you are working with other LLM providers, please refer to the provider's documentation to obtain an API key.
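
The examples also assume that APISIX is running locally and that the Admin API key is saved to an environment variable:

export ADMIN_API_KEY=edd1c9f034335f136f87ad84b625c8f1   # replace with your Admin API key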

Proxy to OpenAI

The following example demonstrates how you can configure the API key, model, and other parameters in the ai-proxy plugin and configure the plugin on a route to proxy user prompts to OpenAI.

Create a route and configure the ai-proxy plugin as follows:

curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"id": "ai-proxy-route",
"uri": "/anything",
"methods": ["POST"],
"plugins": {
"ai-proxy": {
"auth": {
"header": {
"Authorization": "Bearer '"$OPENAI_API_KEY"'"
}
},
"model": {
"provider": "openai",
"name": "gpt-4",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"passthrough": false
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"httpbin.org": 1
}
}
}'

❶ Attach OpenAI API key in the header.

❷ Specify the name of the model.

❸ Specify the maximum number of tokens in the generated response.

❹ Specify the temperature used for sampling, which controls the randomness of the output.

❺ Do not relay the LLM response to the upstream service; it is returned directly to the client.

❻ Since passthrough is set to false, the upstream is never reached, so this node can be set to any arbitrary value.
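
Optionally, verify that the route was created by querying the Admin API:

curl "http://127.0.0.1:9180/apisix/admin/routes/ai-proxy-route" \
  -H "X-API-KEY: ${ADMIN_API_KEY}"

The response should echo back the route configuration, including the ai-proxy plugin.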

Send a POST request to the route with a system prompt and a sample user question in the request body. Note that there is no need to attach an API key; the plugin injects the configured credentials before forwarding the request to OpenAI:

curl "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-H "Host: api.openai.com:443" \
-d '{
"messages": [
{ "role": "system", "content": "You are a mathematician" },
{ "role": "user", "content": "What is 1+1?" }
]
}'

You should receive a response similar to the following:

{
  ...,
  "model": "gpt-4-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "1+1 equals 2.",
        "refusal": null
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  ...
}

Proxy to Azure OpenAI

The following example demonstrates how you can configure the ai-proxy plugin to proxy requests to other LLM services, such as Azure OpenAI.

Obtain the Azure OpenAI API key and save it to an environment variable:

export AZ_OPENAI_API_KEY=57cha9ee8e8a89a12c0aha174f180f4   # replace with your API key

Create a route and configure the ai-proxy plugin as follows:

curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"id": "ai-proxy-route",
"uri": "/anything*",
"methods": ["POST"],
"plugins": {
"ai-proxy": {
"auth": {
"header": {
"api-key": "'"$AZ_OPENAI_API_KEY"'"
}
},
"model": {
"provider": "openai",
"name": "gpt-4",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"override": {
"endpoint": "https://api7-auzre-openai.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2024-02-15-preview"
}
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"httpbin.org": 1
}
}
}'

❶ Update with your Azure OpenAI API key.

❷ Override the default OpenAI URL with the Azure OpenAI URL.
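
In general, an Azure OpenAI chat completions endpoint follows the pattern below; the resource name, deployment name, and API version all come from your own Azure OpenAI deployment:

https://{resource-name}.openai.azure.com/openai/deployments/{deployment-name}/chat/completions?api-version={api-version}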

Send a POST request to the route with a sample question in the request body:

curl "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{
"role": "system",
"content": "You are an AI assistant that helps people find information."
},
{
"role": "user",
"content": "Write me a 50-word introduction for Apache APISIX."
}
],
"max_tokens": 800,
"temperature": 0.7,
"frequency_penalty": 0,
"presence_penalty": 0,
"top_p": 0.95,
"stop": null
}'

You should receive a response similar to the following:

{
  "choices": [
    {
      ...,
      "message": {
        "content": "Apache APISIX is a modern, cloud-native API gateway built to handle high-performance and low-latency use cases. It offers a wide range of features, including load balancing, rate limiting, authentication, and dynamic routing, making it an ideal choice for microservices and cloud-native architectures.",
        "role": "assistant"
      }
    }
  ],
  ...
}

Forward LLM Response to Upstream Service

The following example demonstrates how you can configure the ai-proxy plugin to proxy requests to OpenAI and forward the model's responses to an upstream service.

Create a route and configure the ai-proxy plugin as follows:

curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"id": "ai-proxy-route",
"uri": "/anything",
"methods": ["POST"],
"plugins": {
"ai-proxy": {
"auth": {
"header": {
"Authorization": "Bearer '"$OPENAI_API_KEY"'"
}
},
"model": {
"provider": "openai",
"name": "gpt-4"
},
"passthrough": true
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"httpbin.org": 1
}
}
}'

❶ Relay the LLM response to the upstream service.
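
httpbin.org serves as the upstream here because it echoes back any request it receives, which makes the forwarded LLM response easy to inspect. In practice, you would point the upstream nodes at your own service instead, for example (hypothetical address):

"upstream": {
  "type": "roundrobin",
  "nodes": {
    "192.168.1.100:8080": 1
  }
}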

Send a POST request to the route with a system prompt and a sample user question in the request body:

curl "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-H "Host: api.openai.com:443" \
-d '{
"messages": [
{ "role": "system", "content": "You are a mathematician" },
{ "role": "user", "content": "What is 1+1?" }
]
}'

You should receive a response similar to the following, showing that the model's response was forwarded to the upstream service. Because httpbin.org echoes the request it receives, the LLM response appears in both the data and json fields:

{
  "args": {},
  "data": "{\n \"id\": \"chatcmpl-AeeWFN1jRjXcYhouHr0lK523fGu9h\",\n \"object\": \"chat.completion\",\n \"created\": 1734252460,\n \"model\": \"gpt-4-0613\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"1+1 is 2.\",\n \"refusal\": null\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 23,\n \"completion_tokens\": 7,\n \"total_tokens\": 30,\n \"prompt_tokens_details\": {\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\": {\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\": 0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"system_fingerprint\": null\n}\n",
  "files": {},
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Content-Length": "736",
    "Content-Type": "application/json",
    "Host": "api.openai.com",
    "User-Agent": "curl/8.6.0",
    "X-Amzn-Trace-Id": "Root=1-675e96d2-773710411aec134e3f4dda43",
    "X-Forwarded-Host": "api.openai.com"
  },
  "json": {
    "choices": [
      {
        "finish_reason": "stop",
        "index": 0,
        "logprobs": null,
        "message": {
          "content": "1+1 is 2.",
          "refusal": null,
          "role": "assistant"
        }
      }
    ],
    "created": 1734252460,
    "id": "chatcmpl-AeeWFN1jRjXcYhouHr0lK523fGu9h",
    "model": "gpt-4-0613",
    "object": "chat.completion",
    "system_fingerprint": null,
    "usage": {
      "completion_tokens": 7,
      "completion_tokens_details": {
        "accepted_prediction_tokens": 0,
        "audio_tokens": 0,
        "reasoning_tokens": 0,
        "rejected_prediction_tokens": 0
      },
      "prompt_tokens": 23,
      "prompt_tokens_details": {
        "audio_tokens": 0,
        "cached_tokens": 0
      },
      "total_tokens": 30
    }
  },
  "method": "POST",
  "origin": "192.168.65.1, xx.xx.xx.xx",
  "url": "http://api.openai.com/anything"
}
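
Since httpbin.org returns the parsed request body in the json field, you can, for example, extract the token usage reported by the model with jq (assuming jq is installed):

curl -s "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -H "Host: api.openai.com:443" \
  -d '{
    "messages": [
      { "role": "user", "content": "What is 1+1?" }
    ]
  }' | jq '.json.usage'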
