Alibaba Cloud AI Guardrails
Alibaba Cloud AI Guardrails keep risk-level moderation policy in Alibaba Cloud, and AISIX applies the moderation decision to gateway traffic. Requests can be checked before they reach the upstream model, and responses can be checked before they reach callers.
In this guide, you will create an Alibaba Cloud AI Guardrails resource, send one allowed request, and send one blocked request that AISIX rejects before it reaches the upstream model.
Prerequisites
Before starting, prepare the following:
- A self-hosted AISIX gateway with the admin and proxy listeners available.
- The admin key from the gateway config.yaml.
- A working model alias and caller API key that can send chat-completions requests.
- Alibaba Cloud AI Guardrails activated in the region you want to use.
- An Alibaba Cloud AccessKey ID and AccessKey secret allowed to call the TextModerationPlusAPI action.
Create the Alibaba Cloud Guardrail Resource
Create an Alibaba Cloud AI Guardrails resource in AISIX:
curl -sS -X POST "http://127.0.0.1:3001/admin/v1/guardrails" \
  -H "Authorization: Bearer YOUR_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "aliyun-review",
    "enabled": true,
    "hook_point": "both",
    "fail_open": false,
    "kind": "aliyun_text_moderation",
    "region": "cn-shanghai",
    "access_key_id": "YOUR_ALIBABA_CLOUD_ACCESS_KEY_ID",
    "access_key_secret": "YOUR_ALIBABA_CLOUD_ACCESS_KEY_SECRET",
    "risk_level_threshold": "medium",
    "timeout_ms": 3000
  }'
- Set fail_open to false when strict input enforcement is more important than availability. If Alibaba Cloud AI Guardrails fails or times out, AISIX blocks the request.
- Use risk_level_threshold to decide which Alibaba Cloud risk levels block traffic. AISIX blocks returned levels at or above the configured threshold.
- Use timeout_ms to bound how long the request waits for the guardrail decision.
For output checks, output_fail_open controls the same tradeoff. It defaults to false, so AISIX does not release unscanned model output during an Alibaba Cloud outage.
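The fail-open tradeoff can be summarized in a few lines of shell. This is only an illustrative sketch, not AISIX code: the function name on_guardrail_failure is hypothetical, and the "allow" branch assumes that a true setting lets unscanned traffic through, which this guide implies but does not state outright.

```shell
# Decide what happens to traffic when the guardrail call fails or times out.
# $1 is the fail_open (or output_fail_open) setting: "true" or "false".
on_guardrail_failure() {
  if [ "$1" = "true" ]; then
    echo "allow"   # assumption: fail-open releases unscanned traffic
  else
    echo "block"   # documented behavior when the setting is false
  fi
}

on_guardrail_failure false
on_guardrail_failure true
```

With both settings left at false, a guardrail outage turns into blocked traffic rather than unscanned traffic.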
Copy the returned guardrail ID if you want to inspect, update, or delete the resource later.
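If you prefer not to paste credentials into the command line, the request body from this section could be generated from environment variables instead. The sketch below is one possible approach: the helper name guardrail_payload and the variable names ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET are assumptions, and it assumes the values contain no characters that need JSON escaping.

```shell
# Print the guardrail request body, reading credentials from the environment.
# Assumption: the credential values need no JSON escaping.
# The :? expansions stop the script with a message if a variable is unset.
guardrail_payload() {
  cat <<EOF
{
  "name": "aliyun-review",
  "enabled": true,
  "hook_point": "both",
  "fail_open": false,
  "kind": "aliyun_text_moderation",
  "region": "cn-shanghai",
  "access_key_id": "${ALIBABA_CLOUD_ACCESS_KEY_ID:?set ALIBABA_CLOUD_ACCESS_KEY_ID}",
  "access_key_secret": "${ALIBABA_CLOUD_ACCESS_KEY_SECRET:?set ALIBABA_CLOUD_ACCESS_KEY_SECRET}",
  "risk_level_threshold": "medium",
  "timeout_ms": 3000
}
EOF
}

# Usage (not run here): pipe the body to curl, which reads it from stdin via -d @-
#   guardrail_payload | curl -sS -X POST "http://127.0.0.1:3001/admin/v1/guardrails" \
#     -H "Authorization: Bearer YOUR_ADMIN_KEY" \
#     -H "Content-Type: application/json" -d @-
```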
Verify Allowed Traffic
Send a benign request through AISIX:
curl -sSi -X POST "http://127.0.0.1:3000/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_CALLER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-prod",
    "messages": [{"role": "user", "content": "What is the capital of France?"}]
  }'
A successful response starts with HTTP/1.1 200 OK and returns an OpenAI-compatible chat-completions body.
Verify Blocked Traffic
For a repeatable check, use content that your Alibaba Cloud AI Guardrails configuration classifies at or above the configured AISIX threshold.
Send a request containing that content:
curl -sSi -X POST "http://127.0.0.1:3000/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_CALLER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-prod",
    "messages": [{
      "role": "user",
      "content": "YOUR_POLICY_VIOLATING_TEXT"
    }]
  }'
A blocked response starts with HTTP/1.1 422 Unprocessable Entity and includes an OpenAI-compatible error:
{
  "error": {
    "message": "request blocked by content policy (guardrail 'aliyun-review')",
    "type": "content_filter"
  }
}
AISIX blocks the request before dispatching to the upstream model when Alibaba Cloud AI Guardrails returns a blocking risk level.
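To script the two checks above, the output of curl -sSi can be classified by its status line and error type. The following is a rough sketch under stated assumptions: classify_response is a hypothetical helper, it matches the error type with a simple text pattern rather than a JSON parser, and it treats every response other than the two shown in this guide as "other".

```shell
# Read `curl -sSi` output on stdin and print allowed, blocked, or other.
classify_response() {
  response=$(cat)
  # The status code is the second field of the first line, e.g. "HTTP/1.1 422 ...".
  status=$(printf '%s\n' "$response" | awk 'NR==1 {print $2}')
  case "$status" in
    200) echo "allowed" ;;
    422)
      # Only count a 422 as a guardrail block when the error type matches.
      if printf '%s\n' "$response" | grep -q '"type"[[:space:]]*:[[:space:]]*"content_filter"'; then
        echo "blocked"
      else
        echo "other"
      fi
      ;;
    *) echo "other" ;;
  esac
}

# Example with a canned response instead of a live gateway call.
printf 'HTTP/1.1 422 Unprocessable Entity\r\n\r\n{"error": {"type": "content_filter"}}\n' \
  | classify_response
```

Against a live gateway, the same function can read from either curl command in this guide by appending `| classify_response`.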
Tune Risk Threshold
Alibaba Cloud returns a risk level for each moderation decision. Alibaba Cloud determines the level from its moderation result and configured score thresholds. AISIX does not recalculate it.
AISIX can block the low, medium, or high levels:
| Threshold | Effect |
|---|---|
| high | Blocks only high-risk verdicts. This is the default. |
| medium | Blocks medium-risk and high-risk verdicts. |
| low | Blocks low-risk, medium-risk, and high-risk verdicts. |
Use a stricter threshold, such as low, when the application should reject more uncertain content. Use a less strict threshold, such as high, when you want AISIX to block only the strongest matches.
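The "at or above the threshold" rule can be expressed as a rank comparison. This is a minimal sketch of that comparison, not AISIX's implementation: the function names risk_rank and should_block are hypothetical, and treating any level other than low, medium, or high as non-blocking is an assumption.

```shell
# Map a risk level to a rank; any other value ranks 0 (assumed non-blocking).
risk_rank() {
  case "$1" in
    low) echo 1 ;;
    medium) echo 2 ;;
    high) echo 3 ;;
    *) echo 0 ;;
  esac
}

# Succeeds (exit 0) when the returned level is at or above the threshold.
# $1 is the level returned by Alibaba Cloud, $2 is risk_level_threshold.
should_block() {
  level_rank=$(risk_rank "$1")
  threshold_rank=$(risk_rank "$2")
  [ "$threshold_rank" -gt 0 ] && [ "$level_rank" -ge "$threshold_rank" ]
}

# Show the effect of a medium threshold on each level.
for level in low medium high; do
  if should_block "$level" medium; then
    echo "$level: blocked"
  else
    echo "$level: allowed"
  fi
done
```

With the threshold set to medium, the loop reports low as allowed and both medium and high as blocked, matching the table above.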
Next Steps
You have now enforced Alibaba Cloud AI Guardrails through AISIX. Review Where Guardrails Run if you need to change whether the guardrail checks requests, responses, or both.