Rerank
Rerank requests reorder candidate documents for a query before an application uses those documents in search, retrieval, or RAG workflows.
AISIX AI Gateway exposes POST /v1/rerank so that rerank traffic uses the same caller API keys, model aliases, upstream credentials, and request-side policy as other gateway traffic.
In this guide, you will send a rerank request through AISIX and review the provider requirement for this endpoint.
Prerequisites
Before starting, prepare the following:
- A running AISIX gateway that can serve proxy requests.
- A caller API key that can access the model alias.
- A model alias whose configured provider label is OpenAI, Cohere, or Jina.
Send a Rerank Request
Send the rerank request through the gateway proxy with the AISIX model alias in the request body:
curl -sS -X POST "http://127.0.0.1:3000/v1/rerank" \
  -H "Authorization: Bearer YOUR_CALLER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rerank-prod",
    "query": "gateway docs",
    "documents": ["doc a", "doc b", "doc c"]
  }' \
  -o aisix-rerank-response.json
AISIX resolves the model alias, checks the caller API key, runs supported input policy checks on the query and document text, rewrites only the model field to the upstream model ID, and forwards the request to the upstream rerank endpoint.
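For callers that prefer a script over curl, the same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official AISIX client: the function names `build_rerank_request` and `send_rerank` are hypothetical, and the base URL, API key, and alias are the placeholders from the curl example above.

```python
import json
import urllib.request


def build_rerank_request(base_url, api_key, model_alias, query, documents):
    """Build (but do not send) a POST /v1/rerank request for the gateway."""
    body = json.dumps(
        {"model": model_alias, "query": query, "documents": documents}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/rerank",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send_rerank(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # Placeholder values; replace them with your gateway address, key, and alias.
    req = build_rerank_request(
        "http://127.0.0.1:3000",
        "YOUR_CALLER_API_KEY",
        "rerank-prod",
        "gateway docs",
        ["doc a", "doc b", "doc c"],
    )
    print(send_rerank(req))
```

Building the request separately from sending it makes it possible to inspect the URL, headers, and body without a running gateway.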
The response keeps the upstream rerank response shape:
{
  "results": [
    {
      "index": 1,
      "relevance_score": 0.95
    },
    {
      "index": 0,
      "relevance_score": 0.42
    },
    {
      "index": 2,
      "relevance_score": 0.18
    }
  ]
}
Some providers include additional fields such as a response ID, model name, usage, or metadata.
Check that the response contains ranked results:
jq '.results | length' aisix-rerank-response.json
The command should print the number of returned results:
3
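An application usually needs the reranked documents themselves, not only their indices. The sketch below maps results back to the original documents. It assumes that each `index` is a zero-based position in the `documents` array of the request, which matches the sample response above; `rank_documents` is a hypothetical helper name.

```python
def rank_documents(documents, rerank_response):
    """Return (document, score) pairs ordered from most to least relevant.

    Assumes each result's "index" is a zero-based position in `documents`.
    """
    results = sorted(
        rerank_response["results"],
        key=lambda result: result["relevance_score"],
        reverse=True,
    )
    return [(documents[r["index"]], r["relevance_score"]) for r in results]


if __name__ == "__main__":
    documents = ["doc a", "doc b", "doc c"]
    response = {
        "results": [
            {"index": 1, "relevance_score": 0.95},
            {"index": 0, "relevance_score": 0.42},
            {"index": 2, "relevance_score": 0.18},
        ]
    }
    for document, score in rank_documents(documents, response):
        print(f"{score:.2f}  {document}")
```

Sorting explicitly keeps the helper correct even if a provider returns results in a different order.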
Provider Requirement
AISIX accepts rerank requests only when the resolved model is configured with the OpenAI, Cohere, or Jina provider label. These provider paths share the common rerank fields for model, query, and documents. Optional rerank fields are forwarded unchanged and are not normalized across providers.
When the resolved model uses any other provider label, AISIX returns HTTP 400 before sending the request upstream. This prevents a rerank request from reaching a provider route that does not use the expected rerank format.
Voyage AI also exposes a rerank API, but its request and response fields differ from the supported rerank format. AISIX needs a dedicated adapter before treating it as compatible.
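The provider check described above can be pictured as a simple allow-list lookup that runs before any upstream call. The following is an illustrative sketch only, not AISIX source code: the lowercase label spellings, the helper name `rerank_rejection`, and the error body shape are all assumptions.

```python
# Hypothetical label spellings; the real configuration values may differ.
RERANK_PROVIDER_LABELS = frozenset({"openai", "cohere", "jina"})


def rerank_rejection(provider_label):
    """Return None when the provider may serve rerank, else a 400 error dict."""
    if provider_label.strip().lower() in RERANK_PROVIDER_LABELS:
        return None
    return {
        "status": 400,
        "message": f"provider {provider_label!r} is not supported for rerank",
    }


if __name__ == "__main__":
    for label in ("Cohere", "Jina", "voyage"):
        print(label, "->", rerank_rejection(label))
```

Rejecting early means an unsupported provider never receives a request body it cannot parse.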
For Cohere and Jina, set the provider key's base URL to the API root given in the provider's API reference. AISIX appends the rerank path and does not duplicate the API-version segment when the base URL already ends in one.
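One way to picture the base URL handling is a join that adds a version segment only when the base URL does not already end in one. This sketch is an assumption about the general idea, not the gateway's actual algorithm: the `/v<number>` pattern, the `v1` default, the `/rerank` suffix, and the `rerank_url` name are all hypothetical, and `api.example.com` is a placeholder host.

```python
import re

# Matches a trailing version segment such as "/v1" or "/v2" (assumed pattern).
_VERSION_SEGMENT = re.compile(r"/v\d+$")


def rerank_url(base_url, default_version="v1"):
    """Join a provider base URL with the rerank path without repeating the version."""
    base = base_url.rstrip("/")
    if _VERSION_SEGMENT.search(base):
        # The base URL already carries a version segment; append only the path.
        return base + "/rerank"
    return f"{base}/{default_version}/rerank"


if __name__ == "__main__":
    print(rerank_url("https://api.example.com"))
    print(rerank_url("https://api.example.com/v1"))
    print(rerank_url("https://api.example.com/v2/"))
```

If the upstream answers 404, comparing the configured base URL against the provider's documented rerank URL in this way is a quick first check.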
Rerank Behavior
AISIX forwards the request body with only the model field rewritten. It does not add chat-completion fields or translate provider-specific rerank parameters.
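The effect of rewriting only the model field can be illustrated with a small sketch. It is not the gateway's implementation: `rewrite_model` and the upstream model ID `upstream-rerank-model` are hypothetical, and `top_n` stands in for any optional provider-specific field.

```python
import json


def rewrite_model(body, upstream_model_id):
    """Replace the alias in "model" and leave every other field as sent."""
    payload = json.loads(body)
    payload["model"] = upstream_model_id
    return json.dumps(payload).encode("utf-8")


if __name__ == "__main__":
    original = json.dumps(
        {
            "model": "rerank-prod",
            "query": "gateway docs",
            "documents": ["doc a", "doc b", "doc c"],
            "top_n": 2,  # example optional field, forwarded as-is
        }
    ).encode("utf-8")
    print(rewrite_model(original, "upstream-rerank-model").decode("utf-8"))
```

Because optional fields pass through untouched, a parameter that one provider accepts may be rejected by another; the caller is responsible for sending fields the configured provider understands.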
Input guardrails can inspect the query and document text before AISIX calls the provider. Output guardrails do not inspect reranked response content on this path because the response contains ranking results, not generated text.
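To reason about what an input guardrail can see, it may help to list the text a rerank request carries. The sketch below collects the query and document text from a request body. It assumes documents are either plain strings or objects with a `text` field; that object shape and the `inspectable_texts` name are assumptions for illustration, not documented AISIX behavior.

```python
def inspectable_texts(payload):
    """Collect the query and document strings from a rerank request body."""
    texts = []
    query = payload.get("query")
    if isinstance(query, str):
        texts.append(query)
    for document in payload.get("documents") or []:
        if isinstance(document, str):
            texts.append(document)
        elif isinstance(document, dict) and isinstance(document.get("text"), str):
            # Assumed object form: {"text": "..."}; other keys are ignored here.
            texts.append(document["text"])
    return texts


if __name__ == "__main__":
    body = {
        "model": "rerank-prod",
        "query": "gateway docs",
        "documents": ["doc a", {"text": "doc b"}],
    }
    print(inspectable_texts(body))
```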
Successful rerank responses are returned as upstream bytes with the upstream content type. AISIX parses usage only as a best-effort telemetry step. If usage is missing or uses an unrecognized shape, AISIX still returns the upstream response unchanged.
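Best-effort usage parsing can be sketched as a function that returns a value when it recognizes the shape and `None` otherwise, without ever raising. The `usage.total_tokens` shape and the `extract_usage` name are assumptions chosen for illustration; providers report usage in different ways, and the gateway's real parser is not shown here.

```python
import json


def extract_usage(body):
    """Return {"total_tokens": n} when recognizable, else None. Never raises."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        return None  # not JSON: telemetry is skipped, the response is still returned
    if not isinstance(payload, dict):
        return None
    usage = payload.get("usage")
    if not isinstance(usage, dict):
        return None
    total = usage.get("total_tokens")
    if isinstance(total, int) and not isinstance(total, bool):
        return {"total_tokens": total}
    return None


if __name__ == "__main__":
    print(extract_usage(b'{"results": [], "usage": {"total_tokens": 12}}'))
    print(extract_usage(b'{"results": []}'))
```

The caller would record the returned value as telemetry and pass the original response bytes through either way.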
If a guardrail does not block a request that you expected it to block, check whether the configured guardrail can inspect the query or document text. If the request returns HTTP 400 before reaching the provider, check the resolved model's provider label. If the upstream returns HTTP 404, check the provider key's base URL.
Next Steps
You have now seen how AISIX proxies rerank requests and where provider support is intentionally narrow. Next, continue with Provider Passthrough to review the provider-native escape hatch for endpoints AISIX does not model directly.