Configure Prompt Decorators
When working with large language models (LLMs) for specialized content generation, it is common practice to pre-engineer prompts as the "rules of engagement" that shape how the model operates within the desired guidelines and safety standards in subsequent interactions.
In this document, you will learn how to configure prompt decorators in APISIX using the `ai-prompt-decorator` plugin to prepend and append additional messages to the user-defined message. While the document uses OpenAI as the sample upstream service, the procedure can easily be adapted to other LLM service providers.
Prerequisite(s)
- Install Docker.
- Install cURL to send requests to the services for validation.
- Follow the Getting Started Tutorial to start a new APISIX instance in Docker.
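After starting APISIX, you can optionally confirm that the gateway is reachable. For an unmatched path, APISIX returns a 404 response with an identifying `Server: APISIX` header:

curl -I "http://127.0.0.1:9080"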
Obtain an OpenAI API Key
Create an OpenAI account and an API key before proceeding. You can optionally save the key to an environment variable, as follows:
export OPENAI_API_KEY=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 # replace with your API key
Create a Route
In this example, you will prepend a system prompt to instruct the model to answer briefly and conceptually, and append a user prompt to instruct the model to end the answer with a simple analogy.
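Conceptually, the plugin rewrites the `messages` array of the request body before it is proxied upstream. With the decorators used in this example, a request carrying a single user message would be forwarded to the model roughly as follows (an illustrative sketch, not actual plugin output):

{
  "messages": [
    { "role": "system", "content": "Answer briefly and conceptually." },
    { "role": "user", "content": "What is mTLS authentication?" },
    { "role": "user", "content": "End the answer with a simple analogy." }
  ]
}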
Create a route to the chat completion endpoint with the pre-configured prompt decorators:
Using the Admin API:
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"id": "ai-prompt-decorator-route",
"uri": "/anything",
"plugins": {
"ai-proxy": {
"provider": "openai",
"auth": {
"header": {
"Authorization": "Bearer '"$OPENAI_API_KEY"'"
}
},
"options": {
"model": "gpt-4"
}
},
"ai-prompt-decorator": {
"prepend":[
{
"role": "system",
"content": "Answer briefly and conceptually."
}
],
"append":[
{
"role": "user",
"content": "End the answer with a simple analogy."
}
]
}
}
}'
Here, `prepend` adds a system message that sets the behavior of the assistant, and `append` adds an additional user message after the user-defined prompt.
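Optionally, you can confirm that the route was created by querying the Admin API:

curl "http://127.0.0.1:9180/apisix/admin/routes/ai-prompt-decorator-route" \
  -H "X-API-KEY: ${ADMIN_API_KEY}"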
If you are running the APISIX Ingress Controller, you can configure the same route declaratively. Using the Gateway API:
apiVersion: apisix.apache.org/v1alpha1
kind: PluginConfig
metadata:
  namespace: ingress-apisix
  name: ai-prompt-decor-plugin-config
spec:
  plugins:
    - name: ai-proxy
      config:
        provider: openai
        auth:
          header:
            Authorization: "Bearer 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"  # replace with your API key
        options:
          model: gpt-4
    - name: ai-prompt-decorator
      config:
        prepend:
          - role: system
            content: Answer briefly and conceptually.
        append:
          - role: user
            content: End the answer with a simple analogy.
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  namespace: ingress-apisix
  name: ai-prompt-decorator-route
spec:
  parentRefs:
    - name: apisix
  rules:
    - matches:
        - path:
            type: Exact
            value: /anything
      filters:
        - type: ExtensionRef
          extensionRef:
            group: apisix.apache.org
            kind: PluginConfig
            name: ai-prompt-decor-plugin-config
Or, using the APISIX CRD:

apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
  namespace: ingress-apisix
  name: ai-prompt-decorator-route
spec:
  ingressClassName: apisix
  http:
    - name: ai-prompt-decorator-route
      match:
        paths:
          - /anything
      plugins:
        - name: ai-proxy
          enable: true
          config:
            provider: openai
            auth:
              header:
                Authorization: "Bearer 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"  # replace with your API key
            options:
              model: gpt-4
        - name: ai-prompt-decorator
          enable: true
          config:
            prepend:
              - role: system
                content: Answer briefly and conceptually.
            append:
              - role: user
                content: End the answer with a simple analogy.
As before, `prepend` adds a system message that sets the behavior of the assistant, and `append` adds an additional user message after the user-defined prompt.
Apply the configuration to your cluster:
kubectl apply -f prompt-decorator-route.yaml
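To verify that the resources were admitted, you can list them with kubectl (resource kinds depend on which variant you applied):

# Gateway API variant
kubectl -n ingress-apisix get httproute ai-prompt-decorator-route

# APISIX CRD variant
kubectl -n ingress-apisix get apisixroute ai-prompt-decorator-route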
Verify
Send a POST request to the route with a sample message in the request body:
curl "http://127.0.0.1:9080/anything" -X POST \
-H "Content-Type: application/json" \
-d '{
"messages": [{ "role": "user", "content": "What is mTLS authentication?" }]
}'
You should receive a response similar to the following:
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Mutual TLS (mTLS) authentication is a security protocol that ensures both the client and server authenticate each other's identity before establishing a connection. This mutual authentication is achieved through the exchange and verification of digital certificates, which are cryptographically signed credentials proving each party's identity. In contrast to standard TLS, where only the server is authenticated, mTLS adds an additional layer of trust by verifying the client as well, providing enhanced security for sensitive communications.\n\nThink of mTLS as a secret handshake between two friends meeting at a club. Both must know the handshake to get in, ensuring they recognize and trust each other before entering.",
        "role": "assistant"
      }
    }
  ],
  "created": 1723193502,
  "id": "chatcmpl-9uFdWDlwKif6biCt9DpG0xgedEamg",
  "model": "gpt-4o-2024-05-13",
  "object": "chat.completion",
  "system_fingerprint": "fp_abc28019ad",
  "usage": {
    "completion_tokens": 124,
    "prompt_tokens": 31,
    "total_tokens": 155
  }
}
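To extract only the generated text, you can pipe the response through `jq`, assuming it is installed:

curl -s "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{ "role": "user", "content": "What is mTLS authentication?" }]
  }' | jq -r '.choices[0].message.content'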
Next Steps
You have now learned how to configure prompt decorators in APISIX when integrating with LLM service providers.
If you would like to integrate with OpenAI's streaming API, you can use the `proxy-buffering` plugin to disable NGINX's `proxy_buffering` directive and prevent server-sent events (SSE) from being buffered.
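As a sketch, once buffering is disabled, a streaming request sets the standard OpenAI `stream` flag in the request body:

curl "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "stream": true,
    "messages": [{ "role": "user", "content": "What is mTLS authentication?" }]
  }'

The response should then arrive as a stream of SSE chunks rather than a single JSON object.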
In addition, you can integrate more capabilities that APISIX offers, such as rate limiting and caching, to improve system availability and user experience.
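For example, a minimal rate-limiting sketch could add the `limit-count` plugin to the route's `plugins` object alongside `ai-proxy` and `ai-prompt-decorator` (the limit values here are illustrative):

"limit-count": {
  "count": 30,
  "time_window": 60,
  "key": "remote_addr",
  "rejected_code": 429
}

This caps each client IP at 30 requests per minute and returns a `429` status code once the quota is exceeded.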