Version: 3.9.x

Deploy for High Availability

Deploying API7 Gateway for high availability (HA) eliminates single points of failure in both the control plane (CP) and data plane (DP). This guide covers the prerequisites, architecture, and step-by-step instructions for deploying a production-grade HA setup.

For a conceptual overview, see High Availability.

This page explains how to deploy an HA topology. After the deployment is in place, use the configure-and-manage guides for ongoing operations.

Before you deploy, plan the surrounding infrastructure that keeps the cluster available during failures:

  • An external PostgreSQL deployment with its own HA or managed failover strategy.
  • A load balancer in front of the control plane and another in front of the data plane.
  • TLS certificates for the Dashboard and DP Manager endpoints, plus mTLS certificates for data plane nodes.
  • A maintenance process that restarts one node at a time and verifies traffic before moving on.

Architecture Overview

A high-availability deployment consists of:

  • Multiple CP nodes (Dashboard + DP Manager) sharing a PostgreSQL database, fronted by a load balancer.
  • Multiple DP nodes in one or more Gateway Groups, fronted by a load balancer.
  • (Optional) A backup gateway node that periodically exports configuration to external storage (AWS S3 or Azure Blob Storage) for data plane resilience during CP outages.

Prerequisites

Deployment Checklist

Complete the following preparation work before you install or scale any HA nodes:

  1. Provision an external PostgreSQL instance or cluster and confirm that automatic failover is handled outside API7 Gateway.
  2. Reserve stable DNS names or virtual IPs for the control plane and data plane load balancers.
  3. Prepare TLS certificates for the Dashboard and DP Manager endpoints. If data plane nodes connect over mTLS, also prepare the CA and node certificates required by your deployment flow.
  4. Confirm that every control plane node uses the same PostgreSQL DSN and that every data plane node connects to the same gateway group through the same DP Manager endpoint.
note

API7 Gateway depends on the availability of PostgreSQL, but it does not configure PostgreSQL replication, promotion, or backups for you. Treat database HA as a prerequisite for control plane HA.

Minimum Hosts

| Component | Minimum Nodes | Notes |
| --- | --- | --- |
| Control Plane | 2 | Each runs Dashboard + DP Manager |
| Data Plane | 2 | Each runs API7 Gateway |
| PostgreSQL | Managed or HA cluster | HA configuration is out of scope; see PostgreSQL HA |

Hardware Requirements

| Component | CPU | Memory | Disk |
| --- | --- | --- | --- |
| Control Plane | 4 Cores | 8 GB | 40 GB |
| Data Plane | 4 Cores | 8 GB | 20 GB |

For detailed requirements, see System Requirements.

Network Ports

Ensure the following ports are accessible between components:

| Service | Port | Protocol | Description |
| --- | --- | --- | --- |
| Dashboard | 7080 / 7443 | HTTP / HTTPS | Dashboard UI and Admin API |
| DP Manager | 7900 / 7943 | HTTP / HTTPS | Data plane management |
| Gateway (HTTP) | 9080 | HTTP | API traffic |
| Gateway (HTTPS) | 9443 | HTTPS | API traffic |
| Gateway Status | 7085 | HTTP | Health check endpoint |
| PostgreSQL | 5432 | TCP | Database |
| Prometheus | 9090 | HTTP | Metrics (optional) |
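Before wiring up load balancers, you can quickly verify TCP reachability of these ports between hosts. The sketch below assumes bash and GNU `timeout`; the hostname `cp.example.internal` is a placeholder for your own control plane address:

```shell
# Minimal TCP reachability sketch; hostnames are placeholders for your environment.
port_open() {
  # Succeeds (exit 0) if a TCP connection to host $1, port $2 opens within 3 seconds.
  timeout 3 bash -c ">/dev/tcp/$1/$2" 2>/dev/null
}

# Example: probe the control plane ports from another host.
for port in 7080 7443 7900 7943; do
  if port_open cp.example.internal "$port"; then
    echo "port $port reachable"
  else
    echo "port $port NOT reachable"
  fi
done
```

Run the same check from a data plane host against ports 7900/7943, and from the load balancer against 9080/9443 and 7085.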

Load Balancers and TLS

For production HA, give each traffic path a stable frontend:

| Endpoint | Typical frontend | Purpose |
| --- | --- | --- |
| Dashboard | Ingress, internal load balancer, or reverse proxy | Operator access to the Dashboard and Admin API |
| DP Manager | Cluster-internal Service or internal load balancer | Stable mTLS address for data plane nodes |
| Data plane | External load balancer or ingress controller | Client API traffic |

On Kubernetes, the control plane chart already creates separate Services for the Dashboard and DP Manager. A common pattern is to keep dashboard_service.type: ClusterIP and expose the Dashboard through an Ingress or internal proxy, while keeping dp_manager_service as a stable internal endpoint. If data plane nodes connect from outside the cluster, expose the DP Manager through an internal load balancer or another stable private frontend.
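As one illustration of this pattern, an Ingress in front of the Dashboard Service could look like the sketch below. All names (namespace, hosts, TLS secret, Service name, ingress class) are hypothetical and must match your own release; verify the actual Service name with `kubectl get svc`:

```yaml
# Hypothetical Ingress for the Dashboard; adjust all names to your environment.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api7-dashboard
  namespace: api7
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - dashboard.example.internal
      secretName: dashboard-tls
  rules:
    - host: dashboard.example.internal
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api7ee3-dashboard   # assumed Service name; confirm with kubectl get svc
                port:
                  number: 7080
```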

Control Plane HA

The API7 Dashboard and DP Manager are stateless applications that store all configuration in PostgreSQL. Deploy multiple instances behind a load balancer for HA.

All control plane replicas must share:

  • The same PostgreSQL database.
  • The same license state and Dashboard configuration.
  • Stable endpoints for operators and data plane nodes.

Scale the control plane by setting replica counts in your Helm values and keeping the service frontends stable:

values.yaml

```yaml
dashboard:
  replicaCount: 2                     # ❶
dp_manager:
  replicaCount: 2                     # ❷

postgresql:
  builtin: false                      # ❸

dashboard_service:
  type: ClusterIP                     # ❹

dp_manager_service:
  type: ClusterIP                     # ❺

dashboard_configuration:
  database:
    dsn: "postgres://api7ee:$DB_PASSWORD@your-pg-ha-endpoint:5432/api7ee"   # ❸

dp_manager_configuration:
  database:
    dsn: "postgres://api7ee:$DB_PASSWORD@your-pg-ha-endpoint:5432/api7ee"   # ❸
```

❶ Deploy at least 2 Dashboard replicas for redundancy.

❷ Deploy at least 2 DP Manager replicas.

❸ Disable the built-in PostgreSQL and point the control plane to your external HA database.

❹ Keep the Dashboard behind a stable frontend such as an Ingress, internal proxy, or LoadBalancer service.

❺ Keep the DP Manager reachable at one stable address for all data plane nodes in the same gateway group.

If Developer Portal remains enabled in the same Helm release, also set developer_portal_configuration.database.dsn to the same PostgreSQL endpoint or disable Developer Portal for that release.
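Keeping Developer Portal enabled, the corresponding values fragment would point at the same database, using the same placeholder credentials as above:

```yaml
developer_portal_configuration:
  database:
    dsn: "postgres://api7ee:$DB_PASSWORD@your-pg-ha-endpoint:5432/api7ee"
```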

Install or upgrade the Helm release:

```shell
helm upgrade --install api7ee3 api7/api7ee3 -f values.yaml -n api7 --create-namespace
```

If you terminate TLS on the Dashboard itself, configure the Dashboard certificate with dashboard.keyCertSecret. If you terminate TLS at an Ingress or load balancer instead, keep the backend ports reachable only on the trusted network.

For stronger placement guarantees on Kubernetes, also consider dashboard.topologySpreadConstraints and dp_manager.topologySpreadConstraints so replicas land on different worker nodes or availability zones.
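A sketch of such constraints is shown below, assuming the chart passes these values through to the pod spec and that your nodes carry the standard `topology.kubernetes.io/zone` label; the pod label selectors are assumptions to verify with `kubectl get pods --show-labels`:

```yaml
dashboard:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app.kubernetes.io/name: api7ee3-dashboard   # assumed pod label
dp_manager:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app.kubernetes.io/name: api7ee3-dp-manager  # assumed pod label
```

`ScheduleAnyway` prefers spreading but still schedules replicas when a zone is unavailable; use `DoNotSchedule` if you would rather block scheduling than co-locate replicas.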

tip

All Dashboard and DP Manager instances must connect to the same PostgreSQL database. PostgreSQL HA (primary-replica, Patroni, or managed services like Amazon RDS, Azure Database, or Google Cloud SQL) is a separate concern and should be configured according to your database provider's documentation.

Data Plane HA

Data plane nodes are stateless — they receive configuration from the control plane and process traffic independently. Deploy multiple nodes behind a load balancer.

For predictable failover behavior, configure the data plane so that:

  • Each node belongs to the same gateway group.
  • The load balancer only sends traffic to healthy nodes.
  • Nodes are distributed across failure domains where possible, such as different hosts, zones, or Kubernetes worker nodes.

Health Check Configuration

Each gateway node exposes a status endpoint for health monitoring:

| Endpoint | Method | Description |
| --- | --- | --- |
| /status | GET | Returns 200 if the gateway is running |
| /status/ready | GET | Returns 200 if the gateway is ready to accept traffic |

The status endpoint listens on port 7085 by default. Configure your load balancer to perform health checks against this endpoint at regular intervals (for example, every 10–30 seconds).

Use /status when you only need to know that the gateway process is alive. Use /status/ready when you want the load balancer to route traffic only to nodes that still have an available DP Manager connection.

note

If your operating model allows data plane nodes to continue serving cached configuration during a brief control plane interruption, health checking /status keeps those nodes in rotation. If you instead want the load balancer to stop routing traffic when DP Manager connectivity is lost, use /status/ready. See Configure Readiness and Liveness Probes for the detailed tradeoff.
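As one illustration of an external load balancer using these endpoints, an HAProxy backend might health-check the status port while forwarding client traffic on 9080. This is a hedged sketch, not a recommended configuration; node addresses, intervals, and thresholds are placeholders:

```
# Hypothetical HAProxy fragment; adjust addresses and tuning to your environment.
backend api7_gateway
    mode http
    balance roundrobin
    option httpchk GET /status/ready
    # Traffic goes to the API port (9080); health checks go to the status port (7085).
    server dp1 10.0.0.11:9080 check port 7085 inter 10s fall 3 rise 2
    server dp2 10.0.0.12:9080 check port 7085 inter 10s fall 3 rise 2
```

Swap `GET /status/ready` for `GET /status` if you want nodes to stay in rotation during a brief control plane interruption, per the note above.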

Deploy Multiple DP Nodes

Add these HA-oriented settings to the values.yaml file you use for the api7/gateway chart:

gateway-values.yaml

```yaml
api7ee:
  status_endpoint:                    # ❶
    enabled: true
    ip: 0.0.0.0
    port: 7085

apisix:
  replicaCount: 3                     # ❷

podDisruptionBudget:                  # ❸
  enabled: true
  minAvailable: 1

affinity:                             # ❹
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - gateway
          topologyKey: kubernetes.io/hostname

gateway:
  readinessProbe:                     # ❺
    httpGet:
      path: /status/ready
      port: 7085
  livenessProbe:                      # ❺
    httpGet:
      path: /status
      port: 7085
```

❶ Expose the status endpoints on port 7085 so Kubernetes and external health checkers can verify node state.

❷ Run at least 2 data plane replicas for HA; start with 3 if you want more headroom during maintenance.

❸ Use a PodDisruptionBudget so voluntary disruptions do not evict all gateway pods at once.

❹ Spread pods across worker nodes to reduce the impact of a single-node failure.

❺ Use separate readiness and liveness checks so the scheduler and load balancers stop sending traffic to nodes that are not ready.

Choose the service exposure model that matches your environment. On managed Kubernetes, setting gateway.type: LoadBalancer is a common way to provision the data plane frontend. If you already have an ingress controller or external L4/L7 load balancer, keep the service type that fits that design and point the frontend at the gateway Service.
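On a managed cluster, that frontend can be provisioned directly from the chart values, for example:

```yaml
gateway:
  type: LoadBalancer
```

Confirm the exact key against your chart version; with another exposure model, leave the service type unchanged and point your existing ingress or L4/L7 frontend at the gateway Service instead.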

For details on health checks and resilience behavior, see Data Plane High Availability.

Data Plane Resilience (Fallback CP)

For environments that require the data plane to survive extended control plane outages, configure Fallback CP. This feature periodically exports all gateway configuration to external storage (AWS S3 or Azure Blob Storage), enabling data plane nodes to fetch configuration from storage when the control plane is unreachable.

For detailed setup instructions, see Data Plane Resilience.

Verification

After deploying the HA setup, verify each component:

  1. Control Plane: Access the Dashboard through the load balancer URL. Confirm that both instances are serving requests by checking access logs on each node.

  2. Data Plane: Send a test request through the DP load balancer:

    curl -i "http://<dp-load-balancer>:9080/"
  3. Health Checks: Verify the status endpoint on each DP node:

    curl -i "http://<dp-node>:7085/status"
    # Expected: HTTP/1.1 200 OK
  4. Failover Test: Stop one CP node and verify the Dashboard remains accessible through the load balancer. Stop one DP node and verify API traffic continues to flow through the remaining nodes.
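The verification steps above can be scripted for repeatable acceptance runs. The sketch below assumes bash and curl; the hostnames are placeholders, and `is_healthy`/`check_endpoint` are hypothetical helpers, not part of API7 Gateway:

```shell
# Smoke-test sketch; replace the example hostnames with your real endpoints.
is_healthy() {
  # A status code of exactly 200 counts as healthy.
  [ "$1" = "200" ]
}

check_endpoint() {
  local name=$1 url=$2
  local code
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url" 2>/dev/null)
  if is_healthy "$code"; then
    echo "$name: OK (200)"
  else
    echo "$name: FAILED (got ${code:-no response})"
  fi
}

check_endpoint "data plane LB"  "http://dp-lb.example.internal:9080/"
check_endpoint "dp node status" "http://dp-node-1.example.internal:7085/status"
```

Run the script before and after stopping a node in each failover test; the data plane LB check should keep passing throughout.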

Operational Runbook

Use the following checks during acceptance testing and routine maintenance.

Control Plane Failover Test

  1. Log in to the Dashboard through the control plane load balancer.
  2. Stop or isolate one CP node.
  3. Refresh the Dashboard and confirm that the session remains usable.
  4. Apply a small configuration change, such as updating a route description, and confirm it succeeds.
  5. Restore the failed node and confirm it rejoins the load balancer pool.

Data Plane Failover Test

  1. Send repeated requests through the data plane load balancer.
  2. Stop or drain one gateway node.
  3. Confirm the load balancer health checks mark the node unhealthy.
  4. Verify client traffic continues through the remaining gateway nodes without a full outage.
  5. Restore the node and confirm it returns to service only after the health check passes.

Rolling Restart Procedure

Perform rolling maintenance one node at a time:

  1. Confirm at least one other CP node and one other DP node are healthy before you restart anything.
  2. Remove a single node from the load balancer pool or drain it gracefully.
  3. Restart that node and wait until it is healthy again.
  4. Verify Dashboard access or gateway traffic before moving to the next node.
  5. Repeat for the remaining nodes.

Never restart all control plane nodes or all data plane nodes at the same time unless you have planned downtime.
