
Production Deployment

Production deployment is the point where a working gateway becomes part of a real traffic path. Before widening traffic, confirm the deployment mode, dependency placement, listener exposure, readiness checks, and shutdown behavior.

Choose the Deployment Mode

Start by deciding where configuration is managed.

In a self-hosted deployment, AISIX runs with a local admin listener. You manage provider keys, model aliases, caller API keys, and policy resources through the Admin API, and AISIX stores those resources in etcd.

In a managed deployment, AISIX Cloud manages resources and projects them into the managed gateway. The local admin listener and local playground are not exposed. The gateway connects back to AISIX Cloud through the managed mTLS path.

Keep the rest of the deployment plan aligned with that choice. A self-hosted deployment needs a private admin path and a reachable etcd configuration store. A managed deployment needs the Cloud certificate bundle, managed connectivity, and live proxy-path checks.

Prepare the Runtime Dependencies

Run the configuration store outside the gateway process. In self-hosted deployments, etcd is part of the gateway control plane, so keep it private to AISIX and the systems that manage gateway configuration.

Plan etcd persistence, backups, and access control before production traffic. Self-hosted etcd stores dynamic resources, including sensitive provider credentials and policy configuration.

Start with the in-memory cache unless multiple gateway instances need shared response caching. If you choose Redis, configure cache.redis.url; the gateway fails at startup when cache.backend is redis and the Redis block is missing or the Redis server is unreachable.
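Because an unreachable Redis blocks startup, it can help to probe the endpoint before launching the gateway. The following is a minimal sketch and not part of AISIX: redis_host_port and redis_reachable are hypothetical helper names, the URL is assumed to look like redis://[user:password@]host[:port][/db], and nc is assumed to be installed.

```shell
# Hypothetical pre-start check; not an AISIX command.
# Print "host port" for a URL shaped like redis://[user:password@]host[:port][/db].
# Assumes no "/" inside the password and no bracketed IPv6 host.
redis_host_port() (
  rest="${1#*://}"          # drop the scheme
  rest="${rest%%/*}"        # drop the optional /db suffix
  hostport="${rest##*@}"    # drop optional credentials
  host="${hostport%%:*}"
  if [ "$hostport" = "$host" ]; then
    port=6379               # Redis default port when the URL names none
  else
    port="${hostport##*:}"
  fi
  printf '%s %s\n' "$host" "$port"
)

# Succeed if a TCP connection to the Redis endpoint opens within 3 seconds.
# This only proves reachability; it does not check credentials or TLS.
redis_reachable() {
  set -- $(redis_host_port "$1")
  nc -z -w 3 "$1" "$2"
}
```

For example, running redis_reachable "redis://cache.internal:6379/0" from the gateway host before startup shows whether the network path is open. The host name here is a placeholder.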

Use the startup config for process-level settings only. Dynamic resources such as provider keys, model aliases, caller API keys, guardrails, cache policies, and observability exporters should come from the configuration store.

Set the Listener Baseline

Expose the proxy listener only to intended callers or to the ingress tier that fronts caller traffic.

For self-hosted deployments, keep the admin listener on loopback, a private subnet, or an admin-only network. Do not expose admin, metrics, or OpenAPI endpoints to the public network.

Metrics endpoints are unauthenticated, whether they are served on the private admin listener or on a dedicated metrics listener. Keep metrics endpoints private and scrape them only from trusted monitoring infrastructure.

In managed deployments, metrics may be served on a dedicated listener instead of the admin listener. Verify the listener that your startup configuration exposes.

Enable listener TLS when proxy or admin traffic leaves local development. Use etcd TLS or mTLS when your configuration store requires encrypted or mutually authenticated transport. In managed deployments, verify the managed certificate bundle and Cloud control-plane URL before investigating higher-level projection issues.
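Certificate problems are easier to catch before the gateway starts than from a failed handshake. As a rough sketch, the helpers below check that certificate files are readable and not about to expire. The function names and any file paths you pass in are hypothetical; the only external tool used is openssl x509 with its -checkend option.

```shell
# Hypothetical pre-flight helpers; the paths you pass in are your own.
# Succeed only if every listed file exists and is readable by this user.
check_cert_paths() (
  rc=0
  for path in "$@"; do
    if [ ! -r "$path" ]; then
      echo "unreadable or missing: $path" >&2
      rc=1
    fi
  done
  exit "$rc"
)

# Succeed if the PEM certificate in $1 stays valid for at least $2 more seconds.
cert_valid_for() {
  openssl x509 -in "$1" -noout -checkend "$2" >/dev/null
}
```

A typical call would be check_cert_paths followed by the certificate, key, and CA bundle paths from your startup configuration, then cert_valid_for on each certificate with a window such as 604800 (seven days).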

Verify before Routing Traffic

Liveness is necessary, but it is not enough. A process can be alive while caller authentication, model resolution, provider credentials, or upstream network access is still broken.

Set the listener URLs for your deployment:

PROXY_URL="https://gateway.example.com"
ADMIN_URL="https://admin.internal.example.com"

Check proxy liveness on the proxy listener:

curl -i "${PROXY_URL}/livez"

For self-hosted deployments, also check admin liveness on the private admin listener:

curl -i "${ADMIN_URL}/livez"
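In a deploy script, a single liveness probe can race the process start. One possible approach, sketched here with hypothetical helper names (retry_until, livez_ok), is to retry the probe a bounded number of times before moving on to the deeper checks.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts, and succeed as soon as the command succeeds.
retry_until() (
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then exit 0; fi
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  exit 1
)

# Probe /livez on the given base URL; -f makes curl fail on HTTP errors.
livez_ok() {
  curl -fsS -o /dev/null --max-time 5 "$1/livez"
}
```

With the variables set earlier, retry_until 30 2 livez_ok "$PROXY_URL" waits up to roughly a minute for the proxy listener. The attempt count and delay are arbitrary starting values.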

In self-hosted deployments, check admin health for model health and configuration freshness. The request expects the admin key in the AISIX_ADMIN_KEY environment variable:

curl -sS "${ADMIN_URL}/admin/v1/health" \
  -H "Authorization: Bearer ${AISIX_ADMIN_KEY}"

Then verify the actual request path:

  • GET /v1/models with the same caller API key the application will use.
  • One real proxy request for each endpoint family you plan to expose.
  • Logs, metrics, headers, usage events, or exporter output that show the smoke-test request.
  • A rejected request with an invalid or unauthorized caller API key, so access control is proven before traffic widens.
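The first and last checks in this list can be scripted together. The sketch below makes several assumptions that you should verify against your deployment: the caller API key is sent as a Bearer token, a rejected key produces HTTP 401 or 403, and the key is exported as CALLER_API_KEY (a hypothetical variable name). All function names are illustrative.

```shell
# Print only the HTTP status code of a GET to $1 using $2 as the key.
# Assumption: caller keys are sent as "Authorization: Bearer <key>".
status_with_key() {
  curl -sS -o /dev/null -w '%{http_code}' --max-time 10 \
    -H "Authorization: Bearer $2" "$1"
}

# True for any 2xx status.
is_success() { case "$1" in 2??) return 0 ;; *) return 1 ;; esac; }

# True for 401 or 403. Assumption: the gateway rejects bad keys with one of these.
is_rejected() { case "$1" in 401|403) return 0 ;; *) return 1 ;; esac; }

# Return 0 only if the real key works and a deliberately invalid key is refused.
smoke_test() {
  models_url="${PROXY_URL}/v1/models"
  good=$(status_with_key "$models_url" "$CALLER_API_KEY") || return 2
  bad=$(status_with_key "$models_url" "invalid-key-for-smoke-test") || return 2
  is_success "$good" || { echo "valid key got HTTP $good" >&2; return 1; }
  is_rejected "$bad" || { echo "invalid key got HTTP $bad" >&2; return 1; }
  echo "request path OK: valid key $good, invalid key $bad"
}
```

This covers only model listing and key rejection. The real proxy request per endpoint family and the operational signals still need their own checks.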

Shutdown Behavior

AISIX handles graceful shutdown on SIGINT and SIGTERM. During shutdown, the server stops accepting new work, coordinates listener shutdown with background tasks, and marks liveness as failing.

During an intentional drain, the failing /livez response lets load balancers or orchestration probes remove the instance from service. Treat GET /livez as a liveness signal only, not as a full readiness or traffic-drain contract.
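To observe this behavior in a staging environment, you could signal one instance and watch its liveness probe flip. This is a sketch under assumptions: the helper names are hypothetical, you know the gateway process ID, and a connection failure after the process exits counts as a failing probe.

```shell
# Hypothetical helper: poll a command up to $1 times, $2 seconds apart,
# and succeed as soon as the command starts failing.
wait_until_failing() (
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if ! "$@"; then exit 0; fi
    sleep "$delay"
    i=$((i + 1))
  done
  exit 1
)

# Probe /livez on the given base URL; fails on HTTP errors and on
# connection errors, so an exited process also reads as "failing".
livez_ok() {
  curl -fsS -o /dev/null --max-time 2 "$1/livez"
}

# Send SIGTERM to PID $1, then wait up to about 30 seconds for /livez
# on base URL $2 to stop reporting healthy.
signal_and_watch() {
  kill -TERM "$1" || return 1
  wait_until_failing 30 1 livez_ok "$2"
}
```

A successful run shows only that liveness failed after the signal. It does not show that in-flight requests completed, so confirm draining separately at the load balancer.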

Before Widening Traffic

Before exposing production traffic broadly, confirm these conditions:

  • The startup config matches the intended deployment mode.
  • The proxy listener is reachable only through the intended network path.
  • The admin listener is private in self-hosted deployments.
  • The configuration store is private and reachable from the gateway.
  • TLS, mTLS, and certificate file paths are valid for the transports that use them.
  • At least one provider key, model alias, and caller API key have been created.
  • The model alias appears through the caller-facing proxy path.
  • A real provider-backed request succeeds through the gateway.
  • Operational signals show the smoke-test request.

If the process is up but real requests fail, treat it as a request-path problem. Check model visibility, configuration propagation, caller API-key access, provider credentials, and upstream connectivity before widening traffic.

Next Steps

You have now seen the production baseline for moving AISIX toward real traffic. Continue with Network and Security for listener and secret handling, TLS and mTLS for transport security, and Health Checks for readiness checks.

