ai-proxy
The ai-proxy plugin simplifies access to LLM and embedding models by transforming plugin configurations into the designated request format. The plugin supports integration with OpenAI, DeepSeek, and other OpenAI-compatible APIs.
The updated plugin usage in this doc is currently only available in API7 Enterprise and will become available in APISIX in the 3.12.0 release.
Examples
The examples below demonstrate how you can configure ai-proxy for different scenarios.
Proxy to OpenAI
The following example demonstrates how you can configure the API key, model, and other parameters in the ai-proxy plugin, and configure the plugin on a route to proxy user prompts to OpenAI.
Obtain the OpenAI API key and save it to an environment variable:
export OPENAI_API_KEY=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 # replace with your API key
Create a route and configure the ai-proxy plugin as follows:
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"id": "ai-proxy-route",
"uri": "/anything",
"methods": ["POST"],
"plugins": {
"ai-proxy": {
"provider": "openai",
"auth": {
"header": {
"Authorization": "Bearer '"$OPENAI_API_KEY"'"
}
},
"options":{
"model": "gpt-4"
}
}
}
}'
❶ Specify the provider to be openai.
❷ Attach the OpenAI API key in the Authorization header as a Bearer token.
❸ Specify the name of the model.
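Optionally, verify that the route was created with the expected plugin configuration by querying the Admin API (a standard GET on the route ID; adjust the address if your deployment differs):
curl "http://127.0.0.1:9180/apisix/admin/routes/ai-proxy-route" \
  -H "X-API-KEY: ${ADMIN_API_KEY}"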
Send a POST request to the route with a system prompt and a sample user question in the request body:
curl "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-H "Host: api.openai.com" \
-d '{
"messages": [
{ "role": "system", "content": "You are a mathematician" },
{ "role": "user", "content": "What is 1+1?" }
]
}'
You should receive a response similar to the following:
{
...,
"model": "gpt-4-0613",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "1+1 equals 2.",
"refusal": null
},
"logprobs": null,
"finish_reason": "stop"
}
],
...
}
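Chat completions can also be streamed. As a sketch, assuming your plugin version forwards the stream field to OpenAI, set stream to true in the request body and disable curl's output buffering with -N to watch server-sent events arrive token by token:
curl "http://127.0.0.1:9080/anything" -X POST -N \
  -H "Content-Type: application/json" \
  -H "Host: api.openai.com" \
  -d '{
    "stream": true,
    "messages": [
      { "role": "user", "content": "What is 1+1?" }
    ]
  }'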
Proxy to DeepSeek
The following example demonstrates how you can configure the ai-proxy plugin to proxy requests to DeepSeek.
Obtain the DeepSeek API key and save it to an environment variable:
export DEEPSEEK_API_KEY=sk-5e99f3e26abc40e75d80009a90e66 # replace with your API key
Create a route and configure the ai-proxy plugin as follows:
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"id": "ai-proxy-route",
"uri": "/anything",
"methods": ["POST"],
"plugins": {
"ai-proxy": {
"provider": "deepseek",
"auth": {
"header": {
"Authorization": "Bearer '"$DEEPSEEK_API_KEY"'"
}
},
"options": {
"model": "deepseek-chat"
}
}
}
}'
❶ Specify the provider to be deepseek, so that the plugin will proxy requests to https://api.deepseek.com/chat/completions.
❷ Attach the DeepSeek API key in the Authorization header as a Bearer token.
❸ Specify the name of the model.
Send a POST request to the route with a sample question in the request body:
curl "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{
"role": "system",
"content": "You are an AI assistant that helps people find information."
},
{
"role": "user",
"content": "Write me a 50-word introduction for Apache APISIX."
}
]
}'
You should receive a response similar to the following:
{
...
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Apache APISIX is a dynamic, real-time, high-performance API gateway and cloud-native platform. It provides rich traffic management features like load balancing, dynamic upstream, canary release, circuit breaking, authentication, observability, and more. Designed for microservices and serverless architectures, APISIX ensures scalability, security, and seamless integration with modern DevOps workflows."
},
"logprobs": null,
"finish_reason": "stop"
}
],
...
}
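If you only want the generated text rather than the full JSON envelope, you can extract it on the client side. A minimal sketch using jq (assuming jq is installed):
curl -s "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "Write me a 50-word introduction for Apache APISIX." }
    ]
  }' | jq -r '.choices[0].message.content'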
Proxy to Azure OpenAI
The following example demonstrates how you can configure the ai-proxy plugin to proxy requests to other LLM services, such as Azure OpenAI.
Obtain the Azure OpenAI API key and save it to an environment variable:
export AZ_OPENAI_API_KEY=57cha9ee8e8a89a12c0aha174f180f4 # replace with your API key
Create a route and configure the ai-proxy plugin as follows:
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"id": "ai-proxy-route",
"uri": "/anything",
"methods": ["POST"],
"plugins": {
"ai-proxy": {
"provider": "openai-compatible",
"auth": {
"header": {
"api-key": "'"$AZ_OPENAI_API_KEY"'"
}
},
"options":{
"model": "gpt-4"
},
"override": {
"endpoint": "https://api7-auzre-openai.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2024-02-15-preview"
}
}
}
}'
❶ Specify the provider to be openai-compatible, so that the plugin will proxy requests to the custom endpoint in override.
❷ Attach the Azure OpenAI API key in the api-key header.
❸ Specify the name of the model.
❹ Override with the Azure OpenAI endpoint.
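The Azure OpenAI endpoint follows a fixed pattern, so you can compose it from your resource name, deployment name, and API version. A small sketch (the values below mirror this example; substitute your own):
export AZ_RESOURCE=api7-auzre-openai        # your Azure OpenAI resource name
export AZ_DEPLOYMENT=gpt-4                  # your model deployment name
export AZ_API_VERSION=2024-02-15-preview    # the API version to target

echo "https://${AZ_RESOURCE}.openai.azure.com/openai/deployments/${AZ_DEPLOYMENT}/chat/completions?api-version=${AZ_API_VERSION}"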
Send a POST request to the route with a sample question in the request body:
curl "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{
"role": "system",
"content": "You are an AI assistant that helps people find information."
},
{
"role": "user",
"content": "Write me a 50-word introduction for Apache APISIX."
}
],
"max_tokens": 800,
"temperature": 0.7,
"frequency_penalty": 0,
"presence_penalty": 0,
"top_p": 0.95,
"stop": null
}'
You should receive a response similar to the following:
{
"choices": [
{
...,
"message": {
"content": "Apache APISIX is a modern, cloud-native API gateway built to handle high-performance and low-latency use cases. It offers a wide range of features, including load balancing, rate limiting, authentication, and dynamic routing, making it an ideal choice for microservices and cloud-native architectures.",
"role": "assistant"
}
}
],
...
}
Proxy to Embedding Models
The following example demonstrates how you can configure the ai-proxy plugin to proxy requests to embedding models. This example uses the OpenAI embedding model endpoint.
Obtain the OpenAI API key and save it to an environment variable:
export OPENAI_API_KEY=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 # replace with your API key
Create a route and configure the ai-proxy plugin as follows:
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"id": "ai-proxy-route",
"uri": "/embeddings",
"methods": ["POST"],
"plugins": {
"ai-proxy": {
"provider": "openai",
"auth": {
"header": {
"Authorization": "Bearer '"$OPENAI_API_KEY"'"
}
},
"options":{
"model": "text-embedding-3-small",
"encoding_format": "float"
},
"override": {
"endpoint": "https://api.openai.com/v1/embeddings"
}
}
}
}'
❶ Specify the provider to be openai, so that the plugin would by default proxy requests to https://api.openai.com/v1/chat/completions.
❷ Attach the OpenAI API key in the Authorization header as a Bearer token.
❸ Specify the name of the embedding model.
❹ Add an additional parameter encoding_format to configure the returned embedding vector to be a list of floating point numbers.
❺ Override the default endpoint with the embeddings API endpoint.
Send a POST request to the route with an input string:
curl "http://127.0.0.1:9080/embeddings" -X POST \
-H "Content-Type: application/json" \
-d '{
"input": "hello world"
}'
You should receive a response similar to the following:
{
"object": "list",
"data": [
{
"object": "embedding",
"index": 0,
"embedding": [
-0.0067144386,
-0.039197803,
0.034177095,
0.028763203,
-0.024785956,
-0.04201061,
...
]
}
],
"model": "text-embedding-3-small",
"usage": {
"prompt_tokens": 2,
"total_tokens": 2
}
}
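To quickly sanity-check the result, you can count the dimensions of the returned vector with jq; text-embedding-3-small produces 1536-dimensional embeddings by default:
curl -s "http://127.0.0.1:9080/embeddings" -X POST \
  -H "Content-Type: application/json" \
  -d '{ "input": "hello world" }' | jq '.data[0].embedding | length'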