Configure Prompt Decorators
When working with large language models (LLMs) for specialized content generation, it is common practice to pre-configure prompts as “rules of engagement” that keep the model within the desired guidelines and safety standards in subsequent interactions.
In this document, you will learn how to configure prompt decorators in APISIX using the ai-prompt-decorator plugin to prepend and append additional messages to the user-defined message. While this document uses OpenAI as the sample upstream service, the procedure can be adapted to work with other LLM service providers.
Prerequisite(s)
- Install Docker.
- Install cURL to send requests to the services for validation.
- Follow the Getting Started Tutorial to start a new APISIX instance in Docker.
Obtain an OpenAI API Key
Create an OpenAI account and an API key before proceeding. You can optionally save the key to an environment variable as follows:
export OPENAI_API_KEY=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 # replace with your API key
Create a Route
In this example, you will prepend a system prompt to instruct the model to answer briefly and conceptually, and append a user prompt to instruct the model to end the answer with a simple analogy.
Create a route to the chat completion endpoint with pre-configured prompt templates:
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${ADMIN_API_KEY}" \
  -d '{
    "id": "ai-prompt-decorator-route",
    "uri": "/v1/chat/completions",
    "plugins": {
      "proxy-rewrite": {
        "headers": {
          "set": {
            "Authorization": "Bearer '"$OPENAI_API_KEY"'"
          }
        }
      },
      "ai-prompt-decorator": {
        "prepend": [
          {
            "role": "system",
            "content": "Answer briefly and conceptually."
          }
        ],
        "append": [
          {
            "role": "user",
            "content": "End the answer with a simple analogy."
          }
        ]
      }
    },
    "upstream": {
      "type": "roundrobin",
      "nodes": {
        "api.openai.com:443": 1
      },
      "scheme": "https"
    }
  }'
❶ Configure the OpenAI API key in the Authorization header set by the proxy-rewrite plugin. Alternatively, you can attach the API key to every client request if you do not wish to configure the key in APISIX.
❷ In prepend, add a system message to set the behavior of the assistant.
❸ In append, add an additional user message after the user-defined prompt.
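To make the ordering concrete, the following Python sketch models the message list that the upstream would receive under this configuration. It only illustrates the behavior described above and is not the plugin's actual implementation; the `decorate` function and the sample client question are hypothetical names and values chosen for illustration.

```python
from typing import Dict, List

Message = Dict[str, str]


def decorate(messages: List[Message],
             prepend: List[Message],
             append: List[Message]) -> List[Message]:
    """Return a new list: prepend entries, then the client's messages, then append entries."""
    return [*prepend, *messages, *append]


# Decorator messages copied from the route configuration above.
prepend = [{"role": "system", "content": "Answer briefly and conceptually."}]
append = [{"role": "user", "content": "End the answer with a simple analogy."}]

# A sample client message (illustrative).
client_messages = [{"role": "user", "content": "What is mTLS authentication?"}]

decorated = decorate(client_messages, prepend, append)
for message in decorated:
    print(f'{message["role"]}: {message["content"]}')
```

The client's own message is left unchanged; it ends up between the system instruction and the appended user instruction.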
Verify
Send a POST request to the route specifying the model and a sample message in the request body:
curl "http://127.0.0.1:9080/v1/chat/completions" -X POST \
  -H "Content-Type: application/json" \
  -H "Host: api.openai.com:443" \
  -d '{
    "model": "gpt-4",
    "messages": [{ "role": "user", "content": "What is mTLS authentication?" }]
  }'
You should receive a response similar to the following:
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Mutual TLS (mTLS) authentication is a security protocol that ensures both the client and server authenticate each other's identity before establishing a connection. This mutual authentication is achieved through the exchange and verification of digital certificates, which are cryptographically signed credentials proving each party's identity. In contrast to standard TLS, where only the server is authenticated, mTLS adds an additional layer of trust by verifying the client as well, providing enhanced security for sensitive communications.\n\nThink of mTLS as a secret handshake between two friends meeting at a club. Both must know the handshake to get in, ensuring they recognize and trust each other before entering.",
        "role": "assistant"
      }
    }
  ],
  "created": 1723193502,
  "id": "chatcmpl-9uFdWDlwKif6biCt9DpG0xgedEamg",
  "model": "gpt-4o-2024-05-13",
  "object": "chat.completion",
  "system_fingerprint": "fp_abc28019ad",
  "usage": {
    "completion_tokens": 124,
    "prompt_tokens": 31,
    "total_tokens": 155
  }
}
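If you want to check the result programmatically, the fields of interest can be pulled out of the response body. The Python sketch below uses a trimmed copy of the sample response, with the answer text shortened for illustration; the `summarize` helper is a hypothetical name, not part of APISIX or the OpenAI SDK. Because the decorator messages are added before the request is forwarded, the reported prompt_tokens value should cover them as well as the client's question.

```python
import json

# Trimmed copy of the sample response; the answer text is shortened for illustration.
sample_response = """
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Mutual TLS (mTLS) authentication is ... before entering.",
        "role": "assistant"
      }
    }
  ],
  "model": "gpt-4o-2024-05-13",
  "usage": {"completion_tokens": 124, "prompt_tokens": 31, "total_tokens": 155}
}
"""


def summarize(raw: str) -> dict:
    """Extract the assistant's answer, finish reason, and token usage from a response body."""
    body = json.loads(raw)
    choice = body["choices"][0]
    return {
        "answer": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": body["usage"]["prompt_tokens"],
        "total_tokens": body["usage"]["total_tokens"],
    }


summary = summarize(sample_response)
print(summary["finish_reason"], summary["prompt_tokens"], summary["total_tokens"])
```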
Next Steps
You have now learned how to configure prompt decorators in APISIX when integrating with LLM service providers.
If you would like to integrate with OpenAI's streaming API, you can use the proxy-buffering plugin to disable NGINX's proxy_buffering directive so that server-sent events (SSE) are not buffered.
In addition, you can integrate more capabilities that APISIX offers, such as rate limiting and caching, to improve system availability and user experience.