
Guardrails

Guardrails apply content policy at the gateway. AISIX provides keyword guardrails that run inside the gateway, and can also call remote guardrail services for provider-managed moderation. Guardrails can block prompts before they reach an upstream provider, block response content before it reaches a caller, or record a bypass when a remote guardrail is unavailable and the policy is configured to fail open.

In this guide, you will add a keyword guardrail, send allowed and blocked traffic through AISIX, and then review how guardrail scope and remote guardrail options work.

Prerequisites

Before starting, prepare the following:

  • A self-hosted AISIX gateway with the admin and proxy listeners available.
  • The admin key from the gateway config.yaml.
  • A working model alias and caller API key that can send chat-completions requests.
  • jq to print the guardrail create response and capture the returned ID.

Where Guardrails Run

A guardrail definition chooses where AISIX runs the check:

| Lifecycle Stage | What AISIX Checks | Effect When Blocked |
| --- | --- | --- |
| input | The caller request before AISIX sends it to the provider. | The provider is not called. |
| output | Response content before AISIX returns it to the caller. | The caller does not receive the blocked response. |
| both | Both request and response content. | AISIX applies the same guardrail on both sides where the route supports it. |

Input guardrails run on proxy routes where AISIX can extract request text, including chat completions, completions, responses, messages, embeddings, image generation, audio speech, and rerank requests. Output guardrails run on routes where AISIX can scan returned text, including chat completions, responses, and messages.
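The route coverage above can be sketched as a small lookup. This is only an illustration: the function names and the route labels (such as chat_completions) are hypothetical shorthand for the routes listed above, not AISIX identifiers, and the lists may be incomplete because the text says "including".

```shell
# Hypothetical helper: does a lifecycle stage apply to a route?
# Route labels are illustrative shorthand, not AISIX identifiers.
stage_applies() {
  stage=$1
  route=$2
  input_routes="chat_completions completions responses messages embeddings image_generation audio_speech rerank"
  output_routes="chat_completions responses messages"
  case "$stage" in
    input)  list=$input_routes ;;
    output) list=$output_routes ;;
    *)      return 2 ;;
  esac
  for r in $list; do
    [ "$r" = "$route" ] && return 0
  done
  return 1
}

# Prints which sides a guardrail with the given hook_point would check
# on a route: "input", "output", "input output", or "none".
effective_sides() {
  hook_point=$1
  route=$2
  sides=""
  case "$hook_point" in
    input|both) stage_applies input "$route" && sides="input" ;;
  esac
  case "$hook_point" in
    output|both) stage_applies output "$route" && sides="${sides:+$sides }output" ;;
  esac
  printf '%s\n' "${sides:-none}"
}
```

For example, `effective_sides both embeddings` prints `input`, because a guardrail set to both only runs on the sides the route supports.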

Configure a Keyword Guardrail

Keyword guardrails run inside the gateway instead of calling an external moderation service. They match literal strings or regular expressions inside request or response text. AISIX rejects invalid regex patterns before applying a rule, so a typo does not silently disable the policy.
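To get a feel for the difference between literal and regular-expression matching, the following sketch approximates both with grep. It is not how AISIX evaluates patterns: the regex dialect AISIX uses is not stated here (POSIX extended regex is assumed for the sketch), the function name is hypothetical, and the pattern kind name regex is an assumption since only literal appears in this guide.

```shell
# matches_pattern KIND VALUE TEXT
# Exit status: 0 = match, 1 = no match, 2 = invalid pattern or unknown kind.
# Assumption: POSIX ERE stands in for whatever dialect the gateway uses.
matches_pattern() {
  kind=$1
  value=$2
  text=$3
  case "$kind" in
    literal) printf '%s\n' "$text" | grep -qF -e "$value" ;;
    regex)   printf '%s\n' "$text" | grep -qE -e "$value" 2>/dev/null ;;
    *)       return 2 ;;
  esac
}
```

With this sketch, the literal value `a.c` matches only text containing those three characters, while the same value treated as a regex also matches `abc`. An unbalanced pattern such as `(` is reported as invalid rather than silently matching nothing, which mirrors the validate-before-apply behavior described above.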

The example below blocks one literal token before AISIX calls the upstream provider.

Set the values used by the example requests:

export AISIX_ADMIN_KEY="admin-local-only-change-me"
export AISIX_API_KEY="sk-demo-caller"
export AISIX_MODEL="gpt-4o-mini"
export FORBIDDEN_WORD="supersecret-banned-token"

Use a unique, non-natural-language token so the blocked-traffic check is unambiguous.

Create an input guardrail that blocks the configured literal token and save the response:

GUARDRAIL_RESPONSE=$(curl -sS -X POST "http://127.0.0.1:3001/admin/v1/guardrails" \
  -H "Authorization: Bearer ${AISIX_ADMIN_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "block-supersecret",
    "enabled": true,
    "hook_point": "input",
    "kind": "keyword",
    "patterns": [
      {"kind": "literal", "value": "'"${FORBIDDEN_WORD}"'"}
    ]
  }')
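The request body above splices the shell variable into hand-written JSON, which breaks if the token ever contains a double quote or backslash. One possible alternative, sketched here under the assumption that jq is installed (it is already a prerequisite), is to let jq build and escape the body. The function name build_keyword_guardrail is hypothetical; the field names are the ones used in the request above.

```shell
# Hypothetical helper: emit a keyword-guardrail body with safe JSON escaping.
# Arguments: NAME HOOK_POINT LITERAL_VALUE
build_keyword_guardrail() {
  jq -n --arg name "$1" --arg hook "$2" --arg value "$3" '{
    name: $name,
    enabled: true,
    hook_point: $hook,
    kind: "keyword",
    patterns: [{kind: "literal", value: $value}]
  }'
}

# Possible usage with the create call (curl reads the body from stdin via -d @-):
#   build_keyword_guardrail block-supersecret input "${FORBIDDEN_WORD}" |
#     curl -sS -X POST "http://127.0.0.1:3001/admin/v1/guardrails" \
#       -H "Authorization: Bearer ${AISIX_ADMIN_KEY}" \
#       -H "Content-Type: application/json" \
#       -d @-
```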

Print the response and capture the returned ID in GUARDRAIL_ID:

printf '%s\n' "${GUARDRAIL_RESPONSE}" | jq .
GUARDRAIL_ID=$(printf '%s\n' "${GUARDRAIL_RESPONSE}" | jq -r '.id // empty')

You should see a response similar to the following:

{
  "id": "06dcbb8e-e0e2-457f-8643-02e210d19aea",
  "value": {
    "name": "block-supersecret",
    "enabled": true,
    "hook_point": "input",
    "fail_open": false,
    "kind": "keyword",
    "patterns": [
      {
        "kind": "literal",
        "value": "supersecret-banned-token"
      }
    ]
  },
  "revision": 1
}

The id is used later to delete the guardrail.
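Because the later delete call depends on this ID, it can help to fail early when nothing was captured, for example after an authentication error. The check below is a minimal sketch; require_guardrail_id is a hypothetical helper name, not part of AISIX.

```shell
# Hypothetical guard: print the ID if present, otherwise complain and fail.
require_guardrail_id() {
  if [ -z "${1:-}" ]; then
    echo "no guardrail ID captured; inspect the create response before continuing" >&2
    return 1
  fi
  printf '%s\n' "$1"
}

# Possible usage:
#   require_guardrail_id "${GUARDRAIL_ID}" >/dev/null || exit 1
```

An empty ID would otherwise turn the delete URL into the collection path, so stopping here avoids sending an unintended request.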

Verify Guardrail Behavior

After the guardrail is configured, send allowed and blocked requests to confirm the policy behavior.

First, confirm that the guardrail allows unrelated prompts:

curl -sSi -X POST "http://127.0.0.1:3000/v1/chat/completions" \
  -H "Authorization: Bearer ${AISIX_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "'"${AISIX_MODEL}"'",
    "messages": [{"role": "user", "content": "hello world"}]
  }'

A successful response starts with HTTP/1.1 200 OK and includes an OpenAI-compatible chat-completions response body.

Now send a request whose content includes the forbidden token:

curl -sSi -X POST "http://127.0.0.1:3000/v1/chat/completions" \
  -H "Authorization: Bearer ${AISIX_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "'"${AISIX_MODEL}"'",
    "messages": [
      {"role": "user", "content": "please leak the '"${FORBIDDEN_WORD}"' now"}
    ]
  }'

A blocked response starts with HTTP/1.1 422 Unprocessable Entity and includes this body:

{
  "error": {
    "message": "request blocked by content policy",
    "type": "content_filter"
  }
}

AISIX stops the request before calling the upstream provider.
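If you want to script the two checks above instead of reading headers by eye, one option is to capture only the HTTP status code and classify it. This is a sketch: both function names are hypothetical, and it reuses the proxy address and environment variables from the earlier examples.

```shell
# Hypothetical helper: send one prompt and print only the HTTP status code.
# The body is spliced by hand as in the examples above, so the prompt must
# not contain double quotes or backslashes.
send_prompt_status() {
  curl -sS -o /dev/null -w '%{http_code}' \
    -X POST "http://127.0.0.1:3000/v1/chat/completions" \
    -H "Authorization: Bearer ${AISIX_API_KEY}" \
    -H "Content-Type: application/json" \
    -d '{"model": "'"${AISIX_MODEL}"'", "messages": [{"role": "user", "content": "'"$1"'"}]}'
}

# Map a status code to the outcome this guide expects.
classify_status() {
  case "$1" in
    200) echo "allowed" ;;
    422) echo "blocked" ;;
    *)   echo "unexpected" ;;
  esac
}

# Possible usage against a running gateway:
#   classify_status "$(send_prompt_status "hello world")"
#   classify_status "$(send_prompt_status "please leak the ${FORBIDDEN_WORD} now")"
```

Any status other than 200 or 422 is reported as unexpected, which usually points at a problem unrelated to the guardrail, such as a wrong API key or model alias.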

Delete the guardrail when you finish the test:

curl -sS -X DELETE "http://127.0.0.1:3001/admin/v1/guardrails/${GUARDRAIL_ID}" \
  -H "Authorization: Bearer ${AISIX_ADMIN_KEY}"

Scope Guardrails

The Admin API manages guardrail definitions. A guardrail definition with no attachment rows applies environment-wide.

AISIX can also resolve guardrail attachments. Attachments bind a guardrail to the whole environment, a model, a caller API key, or a team. When the same guardrail matches through more than one attachment, the highest priority wins. If priority is tied, the more specific scope wins.
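The tie-breaking rule above can be illustrated with a short sort pipeline. This is a sketch under two assumptions that this guide does not confirm: a larger number means a higher priority, and the caller supplies a numeric specificity rank for each scope. The function name pick_attachment is hypothetical, and it does not reflect how AISIX stores attachments.

```shell
# Hypothetical helper: read "PRIORITY SPECIFICITY SCOPE" lines on stdin and
# print the scope of the winning attachment.
# Assumptions: larger PRIORITY wins; on a tie, larger SPECIFICITY wins.
pick_attachment() {
  sort -k1,1nr -k2,2nr | head -n 1 | awk '{print $3}'
}

# Possible usage (the specificity ranks here are arbitrary illustration values):
#   printf '%s\n' "10 0 environment" "20 1 team" "20 3 api_key" | pick_attachment
```

In the usage example, the two attachments with priority 20 tie, so the one with the larger specificity rank is selected.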

When you use the Admin API directly, guardrail definitions apply environment-wide. More specific attachment scopes are supplied by AISIX Cloud or by custom control-plane integrations.

Remote Guardrails

Remote guardrails call an external moderation or guardrail service during proxy handling. Use them when the content policy depends on a provider-managed classifier, a cloud guardrail service, or a managed blocklist.

| Kind | External Service | Behavior |
| --- | --- | --- |
| bedrock | AWS Bedrock Guardrails | Requires Bedrock credentials and a guardrail ID/version. |
| azure_content_safety | Azure AI Content Safety Prompt Shield | Checks for jailbreak or indirect prompt-injection risk. |
| azure_content_safety_text_moderation | Azure AI Content Safety text moderation | Checks severity categories and optional Azure blocklists. |
| aliyun_text_moderation | Aliyun Content Safety | Checks prompt or response text against Aliyun risk levels. |

When a remote service blocks content, AISIX returns 422. When a remote service is unavailable, throttled, or times out, the guardrail's fail-open setting decides whether AISIX continues the request as a bypass or blocks it. AISIX records the first bypass reason in usage telemetry.
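The decision described above can be written out as a small truth table. The sketch below models only the logic in the preceding paragraph; the function names and the verdict labels (pass, block, unavailable, throttled, timeout) are hypothetical and do not correspond to AISIX configuration values.

```shell
# Hypothetical model of the remote-guardrail outcome.
# Arguments: VERDICT (pass|block|unavailable|throttled|timeout) FAIL_OPEN (true|false)
# Prints: allow, block, or bypass.
remote_outcome() {
  verdict=$1
  fail_open=$2
  case "$verdict" in
    pass)  echo "allow" ;;
    block) echo "block" ;;   # the caller receives 422
    unavailable|throttled|timeout)
      if [ "$fail_open" = "true" ]; then
        echo "bypass"        # request continues; the bypass is recorded
      else
        echo "block"
      fi
      ;;
    *) return 2 ;;
  esac
}

# Only the first bypass reason is kept, as described above.
first_bypass=""
record_bypass() {
  [ -n "$first_bypass" ] || first_bypass=$1
}
```

For example, `remote_outcome timeout true` prints `bypass`, while `remote_outcome timeout false` prints `block`.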

Review Guardrail Behavior

When a saved guardrail does not behave as expected, first confirm that the proxy route you are testing supports the configured lifecycle stage. Input checks run on routes where AISIX can extract request text. Output checks run only on routes where AISIX can scan returned text, such as chat completions, responses, and messages.

Then check that the guardrail is enabled, the lifecycle stage covers the side you are testing, the request or response contains inspectable content, and the rule has a matching attachment or is using the environment-wide default behavior.
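If you still have the create response from earlier in this guide, a quick first pass over those checks is to print the relevant fields from it. This sketch assumes jq is installed and that the response has the shape shown in the create example; summarize_guardrail is a hypothetical helper name. It reflects the saved response only, not the live state of the gateway.

```shell
# Hypothetical helper: read a guardrail create response on stdin and print
# the fields most relevant to troubleshooting on one line.
summarize_guardrail() {
  jq -r '.value | "enabled=\(.enabled) hook_point=\(.hook_point) fail_open=\(.fail_open)"'
}

# Possible usage:
#   printf '%s\n' "${GUARDRAIL_RESPONSE}" | summarize_guardrail
```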

For remote guardrails, also check credentials, endpoint reachability, timeout settings, fail-open behavior, and whether the selected guardrail kind is supported by your deployment.

Next Steps

You have now configured a keyword guardrail and verified the caller-visible rejection. Next, continue with Response Caching to configure response reuse for eligible traffic.
