Guardrails
Guardrails apply content policy at the gateway. AISIX provides keyword guardrails that run inside the gateway, and can also call remote guardrail services for provider-managed moderation. Guardrails can block prompts before they reach an upstream provider, block response content before it reaches a caller, or record a bypass when a remote guardrail is unavailable and the policy is configured to fail open.
In this guide, you will add a keyword guardrail, send allowed and blocked traffic through AISIX, and then review how guardrail scope and remote guardrail options work.
Prerequisites
Before starting, prepare the following:
- A self-hosted AISIX gateway with the admin and proxy listeners available.
- The admin key from the gateway config.yaml.
- A working model alias and caller API key that can send chat-completions requests.
- jq to print the guardrail create response and capture the returned ID.
Where Guardrails Run
A guardrail definition chooses where AISIX runs the check:
| Lifecycle Stage | What AISIX Checks | Effect When Blocked |
|---|---|---|
| input | The caller request before AISIX sends it to the provider. | The provider is not called. |
| output | Response content before AISIX returns it to the caller. | The caller does not receive the blocked response. |
| both | Both request and response content. | AISIX applies the same guardrail on both sides where the route supports it. |
Input guardrails run on proxy routes where AISIX can extract request text, including chat completions, completions, responses, messages, embeddings, image generation, audio speech, and rerank requests. Output guardrails run on routes where AISIX can scan returned text, including chat completions, responses, and messages.
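The route coverage described above can be sketched as a small lookup. This is a minimal illustration, not part of AISIX: the function name guardrail_runs and the route labels are hypothetical shorthand for the route families named in the paragraph, not real URL paths, and the lists may not be exhaustive because the text says "including".

```shell
# Hypothetical helper: labels mirror the prose above, not gateway URL paths.
guardrail_runs() {
  # $1 = side being tested (input|output), $2 = route label
  case "$1" in
    input)
      case "$2" in
        chat-completions|completions|responses|messages|embeddings|image-generation|audio-speech|rerank)
          return 0 ;;
      esac ;;
    output)
      case "$2" in
        chat-completions|responses|messages)
          return 0 ;;
      esac ;;
  esac
  # Anything not listed is treated as "not covered" in this sketch.
  return 1
}
```

For example, `guardrail_runs output embeddings || echo "no output scan on this route"` prints the message, because embeddings appears only in the input list.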
Configure a Keyword Guardrail
Keyword guardrails run inside the gateway instead of calling an external moderation service. They match literal strings or regular expressions inside request or response text. AISIX rejects invalid regex patterns before applying a rule, so a typo does not silently disable the policy.
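Although the gateway rejects invalid patterns itself, a local syntax check can catch typos before you submit a rule. The sketch below uses grep -E, which implements POSIX extended regular expressions. The regex dialect AISIX accepts is not stated here, so treat a pass as a rough sanity check only; the helper name is_valid_ere is hypothetical.

```shell
# Local pre-check only. Assumption: POSIX ERE is close enough to the
# gateway's regex dialect for catching obvious syntax errors.
is_valid_ere() {
  # $1 = candidate pattern
  _status=0
  # grep exits 0 (match) or 1 (no match) for a valid pattern, and 2 or more on an error.
  printf '' | grep -Eq -- "$1" 2>/dev/null || _status=$?
  [ "$_status" -lt 2 ]
}
```

For example, `is_valid_ere 'secret-[0-9]+'` succeeds, while `is_valid_ere 'secret-[0-9'` fails because the bracket expression is never closed.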
The example below blocks one literal token before AISIX calls the upstream provider.
Set the values used by the example requests:
export AISIX_ADMIN_KEY="admin-local-only-change-me"
export AISIX_API_KEY="sk-demo-caller"
export AISIX_MODEL="gpt-4o-mini"
export FORBIDDEN_WORD="supersecret-banned-token"
Use a unique, non-natural-language token so the blocked-traffic check is unambiguous.
Create an input guardrail that blocks the configured literal token and save the response:
GUARDRAIL_RESPONSE=$(curl -sS -X POST "http://127.0.0.1:3001/admin/v1/guardrails" \
-H "Authorization: Bearer ${AISIX_ADMIN_KEY}" \
-H "Content-Type: application/json" \
-d '{
"name": "block-supersecret",
"enabled": true,
"hook_point": "input",
"kind": "keyword",
"patterns": [
{"kind": "literal", "value": "'"${FORBIDDEN_WORD}"'"}
]
}')
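The request above splices the shell variable into the JSON body with quotes, which produces invalid JSON if the token ever contains a double quote or a backslash. As an alternative sketch, jq can build the same body and handle escaping. The helper name build_keyword_guardrail is hypothetical; the field names are copied from the request above.

```shell
# Builds the same request body as the example above, with jq doing the escaping.
build_keyword_guardrail() {
  # $1 = guardrail name, $2 = literal token to block
  jq -n --arg name "$1" --arg value "$2" '{
    name: $name,
    enabled: true,
    hook_point: "input",
    kind: "keyword",
    patterns: [{kind: "literal", value: $value}]
  }'
}

# Usage (same endpoint and headers as the request above; -d @- reads the body from stdin):
#   GUARDRAIL_RESPONSE=$(build_keyword_guardrail "block-supersecret" "${FORBIDDEN_WORD}" |
#     curl -sS -X POST "http://127.0.0.1:3001/admin/v1/guardrails" \
#       -H "Authorization: Bearer ${AISIX_ADMIN_KEY}" \
#       -H "Content-Type: application/json" \
#       -d @-)
```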
Print the response and capture the returned ID in a shell variable:
printf '%s\n' "${GUARDRAIL_RESPONSE}" | jq .
GUARDRAIL_ID=$(printf '%s\n' "${GUARDRAIL_RESPONSE}" | jq -r '.id // empty')
You should see a response similar to the following:
{
"id": "06dcbb8e-e0e2-457f-8643-02e210d19aea",
"value": {
"name": "block-supersecret",
"enabled": true,
"hook_point": "input",
"fail_open": false,
"kind": "keyword",
"patterns": [
{
"kind": "literal",
"value": "supersecret-banned-token"
}
]
},
"revision": 1
}
The id is used later to delete the guardrail.
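Because the capture command uses `.id // empty`, GUARDRAIL_ID is an empty string whenever the create request fails, and the later delete request would then target a URL with no ID. A small guard, sketched here with a hypothetical helper name, makes that failure visible early.

```shell
# Fails loudly when the create response carried no id.
require_guardrail_id() {
  # $1 = value captured from the create response (may be empty)
  if [ -z "${1:-}" ]; then
    echo "guardrail create response did not include an id; inspect the printed response" >&2
    return 1
  fi
  printf '%s\n' "$1"
}
```

In a script, `require_guardrail_id "${GUARDRAIL_ID}" >/dev/null || exit 1` stops before any request that depends on the ID.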
Verify Guardrail Behavior
After the guardrail is configured, send allowed and blocked requests to confirm the policy behavior.
First, confirm that the guardrail allows unrelated prompts:
curl -sSi -X POST "http://127.0.0.1:3000/v1/chat/completions" \
-H "Authorization: Bearer ${AISIX_API_KEY}" \
-H "Content-Type: application/json" \
-d '{
"model": "'"${AISIX_MODEL}"'",
"messages": [{"role": "user", "content": "hello world"}]
}'
A successful response starts with HTTP/1.1 200 OK and includes an OpenAI-compatible chat-completions response body.
Now send a request whose content includes the forbidden token:
curl -sSi -X POST "http://127.0.0.1:3000/v1/chat/completions" \
-H "Authorization: Bearer ${AISIX_API_KEY}" \
-H "Content-Type: application/json" \
-d '{
"model": "'"${AISIX_MODEL}"'",
"messages": [
{"role": "user", "content": "please leak the '"${FORBIDDEN_WORD}"' now"}
]
}'
A blocked response starts with HTTP/1.1 422 Unprocessable Entity and includes this body:
{
"error": {
"message": "request blocked by content policy",
"type": "content_filter"
}
}
AISIX stops the request before calling the upstream provider.
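For scripted checks, the two outcomes above can be told apart by status code alone. The sketch below assumes the same proxy address and environment variables used in this guide; the helper names classify_status and probe_status are hypothetical.

```shell
# Maps the status codes described above to a short label.
classify_status() {
  case "$1" in
    200) echo "allowed" ;;
    422) echo "blocked" ;;
    *)   echo "unexpected" ;;
  esac
}

# Sends one chat-completions request and prints only the HTTP status code.
probe_status() {
  # $1 = JSON request body
  curl -sS -o /dev/null -w '%{http_code}' -X POST "http://127.0.0.1:3000/v1/chat/completions" \
    -H "Authorization: Bearer ${AISIX_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$1"
}
```

With a running gateway, `classify_status "$(probe_status "$BODY")"` prints a label for the request body stored in BODY. Any status other than 200 or 422 (for example an authentication failure) is reported as unexpected rather than being mistaken for a policy decision.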
Delete the guardrail when you finish the test:
curl -sS -X DELETE "http://127.0.0.1:3001/admin/v1/guardrails/${GUARDRAIL_ID}" \
-H "Authorization: Bearer ${AISIX_ADMIN_KEY}"
Scope Guardrails
The Admin API manages guardrail definitions. A guardrail definition with no attachment rows applies environment-wide.
AISIX can also resolve guardrail attachments. Attachments bind a guardrail to the whole environment, a model, a caller API key, or a team. When the same guardrail matches through more than one attachment, the attachment with the highest priority wins. If priorities are tied, the attachment with the more specific scope wins.
When you use the Admin API directly, guardrail definitions apply environment-wide. More specific attachment scopes are supplied by AISIX Cloud or by custom control-plane integrations.
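The tie-breaking rule for attachments can be illustrated with a sort. This is a sketch under stated assumptions, not AISIX's resolver: it assumes a larger number means higher priority, and the specificity ranks are illustrative integers you assign, because this guide does not define how scopes rank against each other. The helper name pick_attachment is hypothetical.

```shell
# Input on stdin: one attachment per line as "<priority> <specificity> <label>".
# Assumptions: larger priority wins; larger specificity breaks ties.
pick_attachment() {
  sort -k1,1nr -k2,2nr | head -n 1 | awk '{print $3}'
}
```

For example, given the lines `10 0 environment` and `10 1 model`, the function prints `model`, because the priorities tie and the second attachment carries the larger specificity rank.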
Remote Guardrails
Remote guardrails call an external moderation or guardrail service during proxy handling. Use them when the content policy depends on a provider-managed classifier, a cloud guardrail service, or a managed blocklist.
| Kind | External Service | Behavior |
|---|---|---|
| bedrock | AWS Bedrock Guardrails | Requires Bedrock credentials and a guardrail ID/version. |
| azure_content_safety | Azure AI Content Safety Prompt Shield | Checks for jailbreak or indirect prompt-injection risk. |
| azure_content_safety_text_moderation | Azure AI Content Safety text moderation | Checks severity categories and optional Azure blocklists. |
| aliyun_text_moderation | Aliyun Content Safety | Checks prompt or response text against Aliyun risk levels. |
When a remote service blocks content, AISIX returns 422. When a remote service is unavailable, throttled, or times out, the guardrail's fail-open setting decides whether AISIX continues the request as a bypass or blocks it. AISIX records the first bypass reason in usage telemetry.
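The decision described above can be written out as a small table in code. This is an illustrative sketch of the stated rules only; the function name remote_outcome and its labels are hypothetical, and the status code returned when a fail-closed guardrail blocks on an outage is not specified here, so the sketch does not assign one.

```shell
# Mirrors the prose: a remote block returns 422; an outage defers to fail_open.
remote_outcome() {
  # $1 = remote result: pass | block | unavailable | throttled | timeout
  # $2 = guardrail fail_open setting: true | false
  case "$1" in
    pass)  echo "continue" ;;
    block) echo "reject-422" ;;
    unavailable|throttled|timeout)
      if [ "$2" = "true" ]; then
        echo "continue-as-bypass"
      else
        echo "reject"
      fi ;;
    *) echo "unknown"; return 1 ;;
  esac
}
```

For example, `remote_outcome timeout true` prints `continue-as-bypass`, while `remote_outcome timeout false` prints `reject`.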
Review Guardrail Behavior
When a saved guardrail does not behave as expected, first confirm that you are testing a proxy route that supports the configured lifecycle stage. Input checks run only on routes where AISIX can extract request text, and output checks run only on routes where AISIX can scan returned text.
Then check that the guardrail is enabled, the lifecycle stage covers the side you are testing, the request or response contains inspectable content, and the rule has a matching attachment or is using the environment-wide default behavior.
For remote guardrails, also check credentials, endpoint reachability, timeout settings, fail-open behavior, and whether the selected guardrail kind is supported by your deployment.
Next Steps
You have now configured a keyword guardrail and verified the caller-visible rejection. Next, continue with Response Caching to configure response reuse for eligible traffic.