Skip to main content

Version: latest

Enforce AI Guardrails and Protect PII

This guide shows how to implement layered AI safety controls with API7 AI Gateway using ai-prompt-guard, ai-aws-content-moderation, and ai-request-rewrite.

Overview

Guardrails are most effective when enforced at the gateway layer, where policies are centralized and applied consistently across applications. A practical defense-in-depth model uses three layers:

  1. Prompt filtering to block prompt injection and disallowed instructions before model invocation.
  2. Content moderation to detect harmful content categories and reject high-risk requests.
  3. PII redaction to mask sensitive data before requests are sent to LLM providers.

Prerequisites

  • Install Docker.
  • Install cURL to send requests to the services for validation.
  • Have a running API7 Enterprise Gateway instance. See the Getting Started Guide for setup instructions.

Prompt Protection

Use ai-prompt-guard to apply PCRE-based allow and deny patterns. In this example, the plugin checks user messages only (match_all_roles: false) and scans only the latest message (match_all_conversation_history: false). When a deny pattern matches, the request is rejected with HTTP 400.

curl "http://127.0.0.1:7080/apisix/admin/routes?gateway_group_id=default" -X PUT \
-H "X-API-KEY: $ADMIN_API_KEY" \
-d '{
"id": "ai-guardrails-prompt-protection",
"service_id": "$SERVICE_ID",
"paths": ["/ai/chat"],
"plugins": {
"ai-prompt-guard": {
"allow_patterns": ["(?i)^(what|how|why|explain|summarize|translate)\\b"],
"deny_patterns": ["(?i)(ignore\\s+all\\s+previous\\s+instructions|reveal\\s+system\\s+prompt|bypass\\s+guardrails)"],
"match_all_roles": false,
"match_all_conversation_history": false
},
"ai-proxy": {
"provider": "openai",
"auth": { "header": { "Authorization": "Bearer '"$OPENAI_API_KEY"'" } },
"options": { "model": "gpt-4o" }
}
}
}'

allow_patterns defines accepted prompt shape using PCRE syntax.

deny_patterns blocks known injection and policy-bypass phrases.

match_all_roles: false and match_all_conversation_history: false scope matching to the latest user message.

For the full configuration reference, see ai-prompt-guard.

Content Moderation

Content moderation adds a second layer of filtering for harmful or abusive text.

AWS Comprehend Integration

Use ai-aws-content-moderation to score six moderation categories with 0-1 thresholds. Requests above configured thresholds can be blocked with a configurable rejection status code and message.

curl "http://127.0.0.1:7080/apisix/admin/routes?gateway_group_id=default" -X PUT \
-H "X-API-KEY: $ADMIN_API_KEY" \
-d '{
"id": "ai-guardrails-content-moderation",
"service_id": "$SERVICE_ID",
"paths": ["/ai/chat"],
"plugins": {
"ai-aws-content-moderation": {
"comprehend": {
"access_key_id": "'"$AWS_ACCESS_KEY_ID"'",
"secret_access_key": "'"$AWS_SECRET_ACCESS_KEY"'",
"region": "us-east-1"
},
"moderation_categories": {
"PROFANITY": 0.5,
"HATE_SPEECH": 0.5,
"INSULT": 0.5,
"HARASSMENT_OR_ABUSE": 0.5,
"SEXUAL": 0.5,
"VIOLENCE_OR_THREAT": 0.5
},
"moderation_threshold": 0.5
},
"ai-proxy": {
"provider": "openai",
"auth": { "header": { "Authorization": "Bearer '"$OPENAI_API_KEY"'" } },
"options": { "model": "gpt-4o" }
}
}
}'

comprehend provides AWS credentials for the Comprehend API. access_key_id, secret_access_key, and region are required.

❷ Configure per-category thresholds for PROFANITY, HATE_SPEECH, INSULT, HARASSMENT_OR_ABUSE, SEXUAL, and VIOLENCE_OR_THREAT.

moderation_threshold defines the global block threshold applied to moderation scores.

For the full configuration reference, see ai-aws-content-moderation.

Custom Moderation Services

If you use a custom moderation stack, you can call a dedicated moderation model with ai-request-rewrite and reject or sanitize content before forwarding to the primary LLM route. This approach is useful when you need custom taxonomy, language coverage, or organization-specific policies.

PII Redaction

PII protection helps prevent accidental exposure of names, phone numbers, account identifiers, and other sensitive fields to external LLM providers. Gateway-side redaction also helps support compliance controls aligned with GDPR, HIPAA, and SOC 2.

Request-Side PII Masking

Use ai-request-rewrite to send incoming prompts to a separate model that detects and masks PII before the request reaches the primary LLM.

curl "http://127.0.0.1:7080/apisix/admin/routes?gateway_group_id=default" -X PUT \
-H "X-API-KEY: $ADMIN_API_KEY" \
-d '{
"id": "ai-guardrails-pii-request",
"service_id": "$SERVICE_ID",
"paths": ["/ai/chat"],
"plugins": {
"ai-request-rewrite": {
"provider": "openai",
"auth": {
"header": {
"Authorization": "Bearer '"$OPENAI_API_KEY"'"
}
},
"options": {
"model": "gpt-4o"
},
"prompt": "Detect and redact PII in the incoming user text. Replace emails with [REDACTED_EMAIL], phone numbers with [REDACTED_PHONE], payment card numbers with [REDACTED_CARD], and government identifiers with [REDACTED_ID]. Return only sanitized text."
},
"ai-proxy": {
"provider": "openai",
"auth": { "header": { "Authorization": "Bearer '"$OPENAI_API_KEY"'" } },
"options": { "model": "gpt-4o" }
}
}
}'

provider sets the model backend used for rewrite decisions.

auth configures credentials for the rewrite model call.

options.model selects the rewrite model; prompt defines masking instructions.

Response-Side PII Filtering

The same ai-request-rewrite pattern can be applied to sanitize model output before it is returned to clients. Use a response-focused rewrite prompt to mask generated PII (for example, names, phone numbers, and IDs that appear in model responses).

{
"ai-request-rewrite": {
"provider": "openai",
"auth": {
"header": {
"Authorization": "Bearer YOUR_API_KEY"
}
},
"options": {
"model": "gpt-4o"
},
"prompt": "Review generated text and mask any detected PII before returning it to clients."
}
}

For the full configuration reference, see ai-request-rewrite.

Combining Guardrails

In most deployments, combine the three controls on the same route. A practical execution order is:

  1. Start with ai-prompt-guard to reject obvious prompt injection attempts early.
  2. Apply ai-request-rewrite to sanitize request content and remove PII.
  3. Use ai-aws-content-moderation to score and block harmful content.
  4. Finally, ai-proxy forwards approved traffic to the target LLM.
note

Plugin execution order is determined by plugin priority (phase and internal ordering), not by the order they appear in the configuration. The order listed above reflects the expected runtime behavior based on each plugin's assigned priority.

curl "http://127.0.0.1:7080/apisix/admin/routes?gateway_group_id=default" -X PUT \
-H "X-API-KEY: $ADMIN_API_KEY" \
-d '{
"id": "ai-guardrails-combined",
"service_id": "$SERVICE_ID",
"paths": ["/ai/chat"],
"plugins": {
"ai-prompt-guard": {
"allow_patterns": ["(?i)^.{1,4000}$"],
"deny_patterns": ["(?i)(ignore\\s+all\\s+previous\\s+instructions|reveal\\s+system\\s+prompt|developer\\s+mode)"],
"match_all_roles": false,
"match_all_conversation_history": false
},
"ai-request-rewrite": {
"provider": "openai",
"auth": {
"header": {
"Authorization": "Bearer '"$OPENAI_API_KEY"'"
}
},
"options": {
"model": "gpt-4o"
},
"prompt": "Redact PII from user content before forwarding to the target model."
},
"ai-aws-content-moderation": {
"comprehend": {
"access_key_id": "'"$AWS_ACCESS_KEY_ID"'",
"secret_access_key": "'"$AWS_SECRET_ACCESS_KEY"'",
"region": "us-east-1"
},
"moderation_categories": {
"PROFANITY": 0.5,
"HATE_SPEECH": 0.5,
"INSULT": 0.5,
"HARASSMENT_OR_ABUSE": 0.5,
"SEXUAL": 0.5,
"VIOLENCE_OR_THREAT": 0.5
},
"moderation_threshold": 0.5
},
"ai-proxy": {
"provider": "openai",
"auth": { "header": { "Authorization": "Bearer '"$OPENAI_API_KEY"'" } },
"options": { "model": "gpt-4o" }
}
}
}'

ai-prompt-guard performs first-pass prompt filtering and rejects deny-pattern matches with HTTP 400.

ai-request-rewrite sanitizes prompt content and masks PII before model invocation.

ai-aws-content-moderation provides AWS Comprehend credentials in the comprehend block and enforces toxicity and abuse thresholds before forwarding traffic.

Next Steps

API7.ai Logo

The digital world is connected by APIs,
API7.ai exists to make APIs more efficient, reliable, and secure.

Sign up for API7 newsletter

Product

API7 Gateway

SOC2 Type IIISO 27001HIPAAGDPRRed Herring

Copyright © APISEVEN PTE. LTD 2019 – 2026. Apache, Apache APISIX, APISIX, and associated open source project names are trademarks of the Apache Software Foundation