ai-proxy
The ai-proxy plugin simplifies access to LLM providers and models by transforming plugin configurations into the designated request format.
The plugin currently supports transforming plugin configurations into the request format required by OpenAI; transformation into the request formats of more LLM providers will be supported soon.
The full set of plugin features is currently only available in Enterprise and will be made available in open-source APISIX in the 3.12.0 release.
Examples
The following examples use OpenAI as the upstream LLM service provider. Before proceeding, create an OpenAI account and an API key. You can optionally save the key to an environment variable as follows:
export OPENAI_API_KEY=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 # replace with your API key
If you are working with other LLM providers, please refer to the provider's documentation to obtain an API key.
Proxy to OpenAI
The following example demonstrates how you can configure the API key, model, and other parameters in the ai-proxy plugin on a route, so that user prompts sent to the route are proxied to OpenAI.
Create a route and configure the ai-proxy plugin as follows:
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"id": "ai-proxy-route",
"uri": "/anything",
"methods": ["POST"],
"plugins": {
"ai-proxy": {
"auth": {
"header": {
"Authorization": "Bearer '"$OPENAI_API_KEY"'"
}
},
"model": {
"provider": "openai",
"name": "gpt-4",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"passthrough": false
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"httpbin.org": 1
}
}
}'
❶ Attach the OpenAI API key in the Authorization header.
❷ Specify the name of the model.
❸ Specify the maximum number of tokens in the generated response.
❹ Specify the temperature used for sampling, which controls the randomness of the output.
❺ Do not relay the response from the LLM to the upstream service.
❻ Since passthrough is set to false, this upstream node can be set to any arbitrary value.
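You can optionally verify that the route was created by querying the Admin API:
curl "http://127.0.0.1:9180/apisix/admin/routes/ai-proxy-route" \
  -H "X-API-KEY: ${ADMIN_API_KEY}"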
Send a POST request to the route with a system prompt and a sample user question in the request body:
curl "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-H "Host: api.openai.com:443" \
-d '{
"messages": [
{ "role": "system", "content": "You are a mathematician" },
{ "role": "user", "content": "What is 1+1?" }
]
}'
You should receive a response similar to the following:
{
...,
"model": "gpt-4-0613",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "1+1 equals 2.",
"refusal": null
},
"logprobs": null,
"finish_reason": "stop"
}
],
...
}
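Since the proxied response follows the OpenAI chat completion format, you can extract just the assistant's answer with a JSON tool such as jq (assuming jq is installed), for example:
curl -s "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -H "Host: api.openai.com:443" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are a mathematician" },
      { "role": "user", "content": "What is 1+1?" }
    ]
  }' | jq -r '.choices[0].message.content'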
Proxy to Azure OpenAI
The following example demonstrates how you can configure the ai-proxy plugin to proxy requests to other LLM services, such as Azure OpenAI.
Obtain the Azure OpenAI API key and save it to an environment variable:
export AZ_OPENAI_API_KEY=57cha9ee8e8a89a12c0aha174f180f4 # replace with your API key
Create a route and configure the ai-proxy plugin as follows:
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"id": "ai-proxy-route",
"uri": "/anything*",
"methods": ["POST"],
"plugins": {
"ai-proxy": {
"auth": {
"header": {
"api-key": "'"$AZ_OPENAI_API_KEY"'"
}
},
"model": {
"provider": "openai",
"name": "gpt-4",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
},
"override": {
"endpoint": "https://api7-auzre-openai.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2024-02-15-preview"
}
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"httpbin.org": 1
}
}
}'
❶ Update with your Azure OpenAI API key.
❷ Override the default OpenAI URL with the Azure OpenAI URL.
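Before sending requests through APISIX, you can optionally sanity-check the key against the Azure OpenAI endpoint directly. The deployment URL below is the example endpoint from the plugin configuration; replace it with your own:
curl "https://api7-auzre-openai.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2024-02-15-preview" \
  -H "Content-Type: application/json" \
  -H "api-key: $AZ_OPENAI_API_KEY" \
  -d '{
    "messages": [
      { "role": "user", "content": "Say hello." }
    ]
  }'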
Send a POST request to the route with a sample question in the request body:
curl "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [
{
"role": "system",
"content": "You are an AI assistant that helps people find information."
},
{
"role": "user",
"content": "Write me a 50-word introduction for Apache APISIX."
}
],
"max_tokens": 800,
"temperature": 0.7,
"frequency_penalty": 0,
"presence_penalty": 0,
"top_p": 0.95,
"stop": null
}'
You should receive a response similar to the following:
{
"choices": [
{
...,
"message": {
"content": "Apache APISIX is a modern, cloud-native API gateway built to handle high-performance and low-latency use cases. It offers a wide range of features, including load balancing, rate limiting, authentication, and dynamic routing, making it an ideal choice for microservices and cloud-native architectures.",
"role": "assistant"
}
}
],
...
}
Forward LLM Response to Upstream Service
The following example demonstrates how you can configure the ai-proxy plugin to proxy requests to the default OpenAI model and forward the model responses to an upstream service.
Create a route and configure the ai-proxy plugin as follows:
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"id": "ai-proxy-route",
"uri": "/anything",
"methods": ["POST"],
"plugins": {
"ai-proxy": {
"auth": {
"header": {
"Authorization": "Bearer '"$OPENAI_API_KEY"'"
}
},
"model": {
"provider": "openai",
"name": "gpt-4"
},
"passthrough": true
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"httpbin.org": 1
}
}
}'
❶ Set passthrough to true to relay the response from the LLM to the upstream service.
Send a POST request to the route with a system prompt and a sample user question in the request body:
curl "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-H "Host: api.openai.com:443" \
-d '{
"messages": [
{ "role": "system", "content": "You are a mathematician" },
{ "role": "user", "content": "What is 1+1?" }
]
}'
You should receive a response similar to the following, showing the model response is forwarded to the upstream service:
{
"args": {},
"data": "{\n \"id\": \"chatcmpl-AeeWFN1jRjXcYhouHr0lK523fGu9h\",\n \"object\": \"chat.completion\",\n \"created\": 1734252460,\n \"model\": \"gpt-4-0613\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"1+1 is 2.\",\n \"refusal\": null\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"prompt_tokens\": 23,\n \"completion_tokens\": 7,\n \"total_tokens\": 30,\n \"prompt_tokens_details\": {\n \"cached_tokens\": 0,\n \"audio_tokens\": 0\n },\n \"completion_tokens_details\": {\n \"reasoning_tokens\": 0,\n \"audio_tokens\": 0,\n \"accepted_prediction_tokens\": 0,\n \"rejected_prediction_tokens\": 0\n }\n },\n \"system_fingerprint\": null\n}\n",
"files": {},
"form": {},
"headers": {
"Accept": "*/*",
"Content-Length": "736",
"Content-Type": "application/json",
"Host": "api.openai.com",
"User-Agent": "curl/8.6.0",
"X-Amzn-Trace-Id": "Root=1-675e96d2-773710411aec134e3f4dda43",
"X-Forwarded-Host": "api.openai.com"
},
"json": {
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "1+1 is 2.",
"refusal": null,
"role": "assistant"
}
}
],
"created": 1734252460,
"id": "chatcmpl-AeeWFN1jRjXcYhouHr0lK523fGu9h",
"model": "gpt-4-0613",
"object": "chat.completion",
"system_fingerprint": null,
"usage": {
"completion_tokens": 7,
"completion_tokens_details": {
"accepted_prediction_tokens": 0,
"audio_tokens": 0,
"reasoning_tokens": 0,
"rejected_prediction_tokens": 0
},
"prompt_tokens": 23,
"prompt_tokens_details": {
"audio_tokens": 0,
"cached_tokens": 0
},
"total_tokens": 30
}
},
"method": "POST",
"origin": "192.168.65.1, xx.xx.xx.xx",
"url": "http://api.openai.com/anything"
}
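Because httpbin.org echoes back the request body it receives, the forwarded LLM response appears under the json field of the echo. Assuming jq is installed, you can extract the assistant's answer from it as follows:
curl -s "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -H "Host: api.openai.com:443" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are a mathematician" },
      { "role": "user", "content": "What is 1+1?" }
    ]
  }' | jq -r '.json.choices[0].message.content'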