Version: 3.13.0

Configure Prompt Decorators

When working with large language models (LLMs) for specialized content generation, it is common practice to pre-engineer and pre-configure prompts as "rules of engagement" that keep the model operating within the desired guidelines and safety standards in subsequent interactions.

In this document, you will learn how to configure prompt decorators in APISIX using the ai-prompt-decorator plugin, which prepends and appends additional messages to the user-defined messages. Although this document uses OpenAI as the sample upstream service, the procedure can be adapted to work with other LLM service providers.

Prerequisite(s)

Install Docker.
Install cURL to send requests to the services for validation.
Follow the Getting Started Tutorial to start a new APISIX instance in Docker.

Obtain an OpenAI API Key

Create an OpenAI account and an API key before proceeding. You can optionally save the key to an environment variable as follows:

export OPENAI_API_KEY=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   # replace with your API key

Create a Route

In this example, you will prepend a system prompt to instruct the model to answer briefly and conceptually, and append a user prompt to instruct the model to end the answer with a simple analogy.

Create a route to the chat completion endpoint with pre-configured prompt templates. The command assumes that the ADMIN_API_KEY environment variable holds your APISIX Admin API key:

curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${ADMIN_API_KEY}" \
  -d '{
    "id": "ai-prompt-decorator-route",
    "uri": "/v1/chat/completions",
    "plugins": {
      "proxy-rewrite": {
        "headers": {
          "set": {
            "Authorization": "Bearer '"$OPENAI_API_KEY"'"
          }
        }
      },
      "ai-prompt-decorator": {
        "prepend":[
          {
            "role": "system",
            "content": "Answer briefly and conceptually."
          }
        ],
        "append":[
          {
            "role": "user",
            "content": "End the answer with a simple analogy."
          }
        ]
      }
    },
    "upstream": {
      "type": "roundrobin",
      "nodes": {
        "api.openai.com:443": 1
      },
      "scheme": "https"
    }
  }'

❶ proxy-rewrite: set the Authorization header to the OpenAI API key. Alternatively, you can attach the API key to every client request if you do not wish to configure the key in APISIX.

❷ prepend: add a system message before the user-defined prompt to set the behavior of the assistant.

❸ append: add an additional user message after the user-defined prompt.
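
To make the effect of the two decorator lists concrete, the following minimal sketch models the message list that the upstream should receive for a sample user message. It is an illustration in plain Python, not the plugin's implementation; the `decorate` helper is a hypothetical name introduced here, and the only assumption is that prepended messages go before the client's messages and appended messages go after them.

```python
import json


def decorate(messages, prepend=None, append=None):
    """Return a new list: prepended messages, client messages, appended messages."""
    # Hypothetical helper for illustration; the client's list is not modified.
    return list(prepend or []) + list(messages) + list(append or [])


# Decorator configuration copied from the route above.
PREPEND = [{"role": "system", "content": "Answer briefly and conceptually."}]
APPEND = [{"role": "user", "content": "End the answer with a simple analogy."}]

# What a client might send to the route.
client_messages = [{"role": "user", "content": "What is mTLS authentication?"}]

decorated = decorate(client_messages, PREPEND, APPEND)
print(json.dumps(decorated, indent=2))
```

The printed list contains three messages in order: the system instruction, the client's question, and the appended user instruction.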

Verify

Send a POST request to the route specifying the model and a sample message in the request body:

curl "http://127.0.0.1:9080/v1/chat/completions" -X POST \
  -H "Content-Type: application/json" \
  -H "Host: api.openai.com:443" \
  -d '{
    "model": "gpt-4",
    "messages": [{ "role": "user", "content": "What is mTLS authentication?" }]
  }'
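
If you prefer to send the request from a script, the following sketch builds the same request with Python's standard library. The function names `build_request` and `send` are hypothetical, and the optional `api_key` argument only applies if you chose not to configure the key in proxy-rewrite; treat it as a starting point rather than a reference client.

```python
import json
import urllib.request

# Same address and path as the curl command above.
APISIX_URL = "http://127.0.0.1:9080/v1/chat/completions"


def build_request(prompt, model="gpt-4", api_key=None):
    """Build (but do not send) a chat completion request for the route."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    headers = {
        "Content-Type": "application/json",
        "Host": "api.openai.com:443",  # mirrors the Host header of the curl command
    }
    if api_key:
        # Only needed when the key is NOT set by the proxy-rewrite plugin.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        APISIX_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


req = build_request("What is mTLS authentication?")
print(req.get_method(), req.full_url)
```

Calling `send(req)` against a running APISIX instance returns the response as a Python dictionary.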

You should receive a response similar to the following:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Mutual TLS (mTLS) authentication is a security protocol that ensures both the client and server authenticate each other's identity before establishing a connection. This mutual authentication is achieved through the exchange and verification of digital certificates, which are cryptographically signed credentials proving each party's identity. In contrast to standard TLS, where only the server is authenticated, mTLS adds an additional layer of trust by verifying the client as well, providing enhanced security for sensitive communications.\n\nThink of mTLS as a secret handshake between two friends meeting at a club. Both must know the handshake to get in, ensuring they recognize and trust each other before entering.",
        "role": "assistant"
      }
    }
  ],
  "created": 1723193502,
  "id": "chatcmpl-9uFdWDlwKif6biCt9DpG0xgedEamg",
  "model": "gpt-4o-2024-05-13",
  "object": "chat.completion",
  "system_fingerprint": "fp_abc28019ad",
  "usage": {
    "completion_tokens": 124,
    "prompt_tokens": 31,
    "total_tokens": 155
  }
}
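
To check a response programmatically, a small sketch like the following extracts the assistant's reply and the token usage. The `summarize` helper is a hypothetical name, and the sample is an abbreviated copy of the response above with the message content shortened.

```python
import json


def summarize(response):
    """Return the first assistant reply and the token usage of a chat completion."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
    }


# Abbreviated copy of the sample response; the content is shortened on purpose.
sample = json.loads("""
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {"role": "assistant", "content": "Mutual TLS (mTLS) ..."}
    }
  ],
  "usage": {"completion_tokens": 124, "prompt_tokens": 31, "total_tokens": 155}
}
""")

summary = summarize(sample)
print(summary["finish_reason"], summary["prompt_tokens"], summary["completion_tokens"])
```

Because `prompt_tokens` counts what the upstream received, it should reflect the prepended and appended messages in addition to the client's own message.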

Next Steps

You have now learned how to configure prompt decorators in APISIX when integrating with LLM service providers.

If you would like to integrate with OpenAI's streaming API, you can use the proxy-buffering plugin to disable NGINX's proxy_buffering directive so that server-sent events (SSE) are not buffered.
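
On the client side, a streamed response (requested with `"stream": true` in the request body) arrives as a sequence of `data:` lines that ends with `data: [DONE]`. The following sketch shows one way to reassemble the text from such lines. The `iter_stream_text` helper is a hypothetical name, and the sample events are hand-written for illustration, not captured output.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (already decoded to str)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data field
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample events (illustrative only).
sample_stream = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Mutual "}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "TLS"}}]}',
    "",
    "data: [DONE]",
]

text = "".join(iter_stream_text(sample_stream))
print(text)
```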

In addition, you can integrate more capabilities that APISIX offers, such as rate limiting and caching, to improve system availability and user experience.
