Streaming

AISIX AI Gateway can stream proxy responses to clients that expect server-sent events (SSE). Streaming leaves the gateway's responsibilities in place: AISIX still authenticates the caller API key, resolves the model alias, applies supported policies, and forwards the request to the selected upstream provider.

Streaming endpoints preserve the client-facing stream format for each route, but gateway policy can affect delivery. Output guardrails, for example, may hold, buffer, or terminate a stream.

In this guide, you will send an OpenAI-compatible streaming request, then review where streaming behavior differs across endpoint families.

Prerequisites

Before starting, prepare the following:

  • A running AISIX gateway that can serve proxy requests.
  • A caller API key that can access the model alias.
  • A model alias backed by a provider and model that support streaming.

Send a Streaming Request

The request sets model to the AISIX model alias (gpt-4o-prod in this example) and sets stream to true so that the upstream streams the response:

curl -sS -N -X POST "http://127.0.0.1:3000/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_CALLER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-prod",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Stream a short greeting."}
    ]
  }'

The response is an OpenAI-style SSE stream. A direct HTTP client sees data: frames similar to the following:

data: {"id":"***","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}

data: [DONE]

An OpenAI-compatible SDK reads the same stream through its normal streaming API.
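To make the frame format concrete, here is a minimal Python sketch of how a direct SSE consumer might pull text out of data: frames like the ones above. The function name iter_content and the sample lines are illustrative and not part of AISIX. The sketch assumes each frame is a single data: line that carries either a JSON chunk or the [DONE] sentinel.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines until [DONE] is seen."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and any non-data SSE fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Sample frames modeled on the output above; the id value is a placeholder.
sample = [
    'data: {"id":"example","object":"chat.completion.chunk",'
    '"choices":[{"delta":{"content":"Hello"}}]}',
    "",
    "data: [DONE]",
]
greeting = "".join(iter_content(sample))
print(greeting)
```

With a real HTTP client, the same function could be fed the decoded lines of the streaming response body instead of the sample list.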

Choose a Streaming Path

Choose the proxy endpoint that matches the client response format:

| Client format | Proxy path | Behavior |
| --- | --- | --- |
| OpenAI-compatible chat | /v1/chat/completions | Returns OpenAI-style SSE chunks for OpenAI-compatible SDKs and direct SSE consumers. |
| Anthropic Messages | /v1/messages | Returns Anthropic-style SSE events. Anthropic upstreams stream natively; non-Anthropic upstreams stream through translation. |
| OpenAI Responses API | /v1/responses | Streams only when the resolved model provider is openai. Non-OpenAI providers return 400. |

Use the client format as the deciding factor. Do not switch to /v1/messages or /v1/responses only because the upstream provider changes.
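The selection rule can be written down as a small lookup. This is only a sketch of the guidance in this section: the format labels ("openai-chat" and so on) and the function streaming_path are hypothetical names chosen for illustration, and the provider check assumes the caller already knows which provider the alias resolves to.

```python
# Client-format labels are hypothetical; the paths come from this guide.
PROXY_PATHS = {
    "openai-chat": "/v1/chat/completions",
    "anthropic-messages": "/v1/messages",
    "openai-responses": "/v1/responses",
}


def streaming_path(client_format: str, provider: str) -> str:
    """Pick the proxy path from the client format, not from the provider."""
    path = PROXY_PATHS[client_format]
    # Responses API streaming is only available for openai-backed models;
    # per this guide, the gateway answers 400 for other providers.
    if path == "/v1/responses" and provider != "openai":
        raise ValueError("Responses API streaming needs an openai provider")
    return path


print(streaming_path("openai-chat", "anthropic"))
```

Note that the provider only matters for the Responses API check. For the other two formats the path stays the same whichever upstream serves the alias.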

Review Streaming Behavior

Streaming starts after AISIX has accepted the request and selected the target. If a client aborts a stream mid-response, the gateway remains healthy and continues serving later requests.

Streaming chat completions are not cached. Each streaming request dispatches upstream, even when a cache policy exists.

For multi-target models, AISIX can retry or fail over before any stream bytes are sent to the client. Once bytes have reached the client, the selected upstream remains responsible for the stream and AISIX does not switch targets.
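The "fail over only before the first byte" rule can be illustrated with a small simulation. This is not AISIX's implementation. It is a sketch in which each target is a hypothetical callable returning an iterable of chunks, and UpstreamError stands in for any upstream failure.

```python
from typing import Callable, Iterable, Iterator, List


class UpstreamError(Exception):
    """Hypothetical error raised when an upstream target fails."""


def stream_with_failover(
    targets: List[Callable[[], Iterable[str]]],
) -> Iterator[str]:
    """Try targets in order until one yields a first chunk, then commit."""
    last_error = None
    for open_stream in targets:
        try:
            chunks = iter(open_stream())
            first = next(chunks)  # nothing has reached the client yet
        except UpstreamError as exc:
            last_error = exc  # safe to try the next target
            continue
        except StopIteration:
            return  # the upstream produced an empty stream
        yield first
        # Committed: later failures propagate to the caller, no switching.
        yield from chunks
        return
    raise UpstreamError("all targets failed before streaming began") from last_error


def failing() -> Iterator[str]:
    raise UpstreamError("connect failed")
    yield ""  # unreachable; marks this function as a generator


def healthy() -> Iterator[str]:
    yield "Hel"
    yield "lo"


def breaks_midway() -> Iterator[str]:
    yield "par"
    raise UpstreamError("disconnected mid-stream")


print(list(stream_with_failover([failing, healthy])))
```

In the sketch, a target that fails before producing anything is skipped, while a target that fails after its first chunk surfaces the error to the client instead of restarting on another target.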

If output guardrails are enabled, AISIX may hold, scan, or terminate streamed output depending on the endpoint and guardrail policy. For chat-completions and Messages streams, a blocked output can be signaled with a terminal SSE error event instead of normal stream completion. For Responses API streaming, AISIX can buffer the stream for policy inspection before returning it or blocking it.

When the upstream disconnects mid-stream, treat the partial stream as incomplete unless the endpoint-specific client contract says otherwise.
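A client can make these outcomes explicit by tracking how a chat-completions stream ended. The sketch below is illustrative only. The function read_stream is a hypothetical helper, and it assumes that a terminal error arrives as a data: frame whose JSON has a top-level "error" key. The actual error event shape is not specified in this guide, so check Headers and Error Codes before relying on it.

```python
import json
from typing import Iterable, Tuple


def read_stream(lines: Iterable[str]) -> Tuple[str, str]:
    """Return (status, text); status is 'complete', 'error', or 'incomplete'."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return "complete", "".join(parts)
        chunk = json.loads(payload)
        # ASSUMPTION: a terminal error frame carries a top-level "error" key.
        if "error" in chunk:
            return "error", "".join(parts)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    # The stream ended without [DONE] or an error frame: treat as partial.
    return "incomplete", "".join(parts)


hello = 'data: {"choices":[{"delta":{"content":"Hi"}}]}'
print(read_stream([hello, "data: [DONE]"]))
print(read_stream([hello]))
```

Only the "complete" status indicates that the full response was received. Text returned with "error" or "incomplete" is whatever arrived before the stream stopped.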

If chunks do not arrive, confirm that the request includes the streaming flag shown in the example ("stream": true) and that the client reads server-sent events. For Responses API streaming, a 400 response usually means the resolved model provider is not openai.

If a stream ends early, check the upstream provider status, gateway logs, and any configured stream timeout.

Next Steps

You have now seen how streaming behavior differs across AISIX endpoint families. For the main OpenAI-style path, see OpenAI-Compatible API. For Anthropic-style events, see Anthropic-Style Messages API. For stream errors and response headers, see Headers and Error Codes.
