Embeddings

Embeddings convert text into vectors that applications can use for semantic search, retrieval pipelines, clustering, and similarity checks. AISIX AI Gateway lets embedding clients keep the OpenAI-compatible request and response format while the gateway manages caller authentication, model aliases, upstream credentials, and policy.

In this guide, you will send an embeddings request through AISIX and review the provider behavior that matters for this endpoint.

Prerequisites

Before starting, prepare the following:

A running AISIX gateway that can serve proxy requests.
A caller API key that can access the model alias.
A model alias backed by a provider and model that support embeddings.

Send an Embeddings Request

Send the embeddings request through the gateway proxy with the AISIX model alias in the request body:

curl -sS -X POST "http://127.0.0.1:3000/v1/embeddings" \
  -H "Authorization: Bearer YOUR_CALLER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-prod",
    "input": ["hello", "world"]
  }'

AISIX resolves the model alias, checks the caller API key, rewrites the upstream model ID, and forwards the embeddings request to the provider.
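The same request can be sketched from application code. The following is a minimal illustration using only the Python standard library, assuming the gateway address, caller key placeholder, and model alias from the curl example above; the helper names `build_embeddings_request` and `send` are hypothetical and not part of AISIX.

```python
import json
import urllib.request


def build_embeddings_request(base_url, api_key, model_alias, inputs):
    """Build (but do not send) a POST request for the /v1/embeddings route."""
    # The body carries the AISIX model alias, not the upstream model ID.
    body = json.dumps({"model": model_alias, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the parsed JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example usage (requires a running gateway, so it is left commented out):
# request = build_embeddings_request(
#     "http://127.0.0.1:3000", "YOUR_CALLER_API_KEY",
#     "text-embedding-prod", ["hello", "world"],
# )
# result = send(request)
```

Separating request construction from sending keeps the payload easy to inspect before any network call is made.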

The response keeps the OpenAI-compatible embeddings format:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, 0.0789]
    },
    {
      "object": "embedding",
      "index": 1,
      "embedding": [0.0234, -0.0567, 0.0891]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 2,
    "total_tokens": 2
  }
}

Each embedding value is either a float array or a base64 string, depending on the encoding format the caller requests and on what the upstream returns.
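A client that wants to handle both forms can normalize them to a list of floats. The sketch below assumes the base64 form follows the OpenAI convention of packed little-endian 32-bit floats; confirm this against your provider before relying on it. The function name `decode_embedding` is hypothetical.

```python
import base64
import struct


def decode_embedding(value):
    """Return an embedding as a list of floats, whatever form it arrived in."""
    if isinstance(value, str):
        # Assumption: base64 payload holds little-endian float32 values.
        raw = base64.b64decode(value)
        if len(raw) % 4 != 0:
            raise ValueError("base64 payload is not a whole number of float32 values")
        count = len(raw) // 4
        return list(struct.unpack(f"<{count}f", raw))
    # Already a JSON array of numbers.
    return [float(x) for x in value]
```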

Provider and Request Behavior

The embeddings route is an OpenAI-compatible proxy path. Provider support depends on the resolved adapter and model you configured.

AISIX accepts embedding input as a single string or an array of strings. It preserves the caller's input shape when it forwards the request upstream, so callers do not need separate client-side logic to switch between a single input and a batch input.

Input guardrails can inspect text in those supported input forms before AISIX calls the provider. Token-array inputs are not currently supported by this gateway route.
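Because token-array inputs are not supported on this route, a caller may prefer to reject them before sending anything. The following client-side check is a sketch of that idea, not gateway behavior; `classify_input` is a hypothetical helper name.

```python
def classify_input(value):
    """Return "string" or "array" for supported inputs; raise otherwise."""
    if isinstance(value, str):
        return "string"
    # Note: an empty list passes this check; whether the upstream accepts
    # an empty batch is outside the scope of this sketch.
    if isinstance(value, list) and all(isinstance(item, str) for item in value):
        return "array"
    # Token arrays such as [1, 2, 3] or [[1, 2], [3]] end up here.
    raise TypeError("input must be a string or a list of strings")
```

The caller can pass the value through unchanged once it is classified, since the gateway preserves the input shape when forwarding upstream.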

The gateway records usage when the upstream returns token usage. Embeddings do not use completion tokens, response caching, streaming, or output guardrails on this proxy path.

Embeddings Behavior

If AISIX returns 501, the resolved provider path does not support embeddings. The request is not translated to another provider format; use a model backed by a provider that supports embeddings.
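A client can surface this case explicitly rather than treating it as a generic failure. The sketch below assumes the request is sent with `urllib.request`, which raises `urllib.error.HTTPError` for non-2xx responses; `EmbeddingsNotSupported` and `call_embeddings` are hypothetical names, and `send` stands for any callable that performs the HTTP call.

```python
import urllib.error


class EmbeddingsNotSupported(RuntimeError):
    """Raised when the gateway answers 501 for an embeddings request."""


def call_embeddings(send, request):
    """Run send(request), translating a 501 into a dedicated exception."""
    try:
        return send(request)
    except urllib.error.HTTPError as error:
        if error.code == 501:
            # Retrying will not help; the alias must point at a provider
            # and model that support embeddings.
            raise EmbeddingsNotSupported(
                "resolved provider path does not support embeddings"
            ) from error
        raise
```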

For batch requests, the response should return one embedding entry per input item. If fewer vectors come back, inspect the upstream response and gateway logs for provider-specific batch handling.
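The one-entry-per-input expectation can be checked in client code before the vectors are used. This is a minimal sketch that assumes the OpenAI-compatible response shape shown earlier, where each entry carries an `index` field; `vectors_in_input_order` is a hypothetical helper name.

```python
def vectors_in_input_order(inputs, response):
    """Return embeddings ordered by index, or raise if the count is off."""
    expected = 1 if isinstance(inputs, str) else len(inputs)
    # Sort by the index field rather than trusting the array order.
    entries = sorted(response["data"], key=lambda entry: entry["index"])
    if len(entries) != expected:
        raise ValueError(
            f"expected {expected} embeddings, got {len(entries)}"
        )
    return [entry["embedding"] for entry in entries]
```

When the check fails, the mismatch message gives a starting point for comparing the upstream response against the gateway logs.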

If an input guardrail does not block a request that you expected it to block, check whether the input contains inspectable text. AISIX can scan a single string or an array of strings.

Next Steps

You have now seen how AISIX proxies embeddings requests and where provider support can differ. Next, continue with Rerank when your application needs ranking behavior for retrieved documents.
