Data Plane High Availability
High availability (HA) for API7 Gateway Data Plane (DP) nodes keeps API traffic flowing when individual nodes fail or are restarted. Use this page when you need to design a multi-node deployment behind a load balancer.
This guide focuses on runtime availability of traffic-serving nodes. For adjacent concerns, see:
- Scale Data Plane for adding capacity.
- Autoscale Data Plane on Kubernetes for HPA-based scaling.
- Data Plane Resilience for fallback behavior during extended control plane outages.
HA Architecture
A typical HA setup for the Data Plane involves deploying multiple DP nodes across different physical servers or availability zones.
- Multi-Node Deployment: Deploying at least two DP nodes ensures that the remaining nodes can take over traffic if any single node fails.
- Stateless Operation: DP nodes operate independently and do not share any state, simplifying the HA architecture.
- Control Plane Independence: DP nodes can continue to handle traffic using the cached configuration even if the Control Plane becomes temporarily unavailable.
Health Checks and Probes
Configure health checks to monitor the status of DP nodes and ensure only healthy nodes receive traffic.
Status API Endpoint
The DP exposes a status API (typically at port 7085) that can be used for health monitoring.
- URL: http://&lt;dp-node-ip&gt;:7085/status
- Response: A successful response (HTTP 200) indicates the DP is healthy and ready to process traffic.
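As a minimal sketch of an external health monitor polling this endpoint, the helper below treats any HTTP 200 from the status API as healthy and any connection failure or non-200 response as unhealthy (the function name and return convention are illustrative, not part of the product):

```python
import urllib.request
import urllib.error

def dp_is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the DP node's status endpoint answers HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/status", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: treat the node as down.
        return False

# Example: dp_is_healthy("http://10.0.0.11:7085")
```

A short timeout matters here: a hung node should be marked unhealthy quickly rather than stalling the monitor.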
Kubernetes Probes
In a Kubernetes environment, use Readiness and Liveness probes to manage the lifecycle of DP pods.
```yaml
# Kubernetes pod probes example
readinessProbe:
  httpGet:
    path: /status/ready
    port: 7085
  initialDelaySeconds: 5
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /status
    port: 7085
  initialDelaySeconds: 15
  periodSeconds: 20
```
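In context, these probes sit inside the container spec of a multi-replica Deployment. The sketch below shows that placement; the image name, labels, and container port are illustrative assumptions, not product defaults:

```yaml
# Sketch: two DP replicas with the readiness probe wired in
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api7-gateway-dp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api7-gateway-dp
  template:
    metadata:
      labels:
        app: api7-gateway-dp
    spec:
      containers:
        - name: gateway
          image: api7/gateway:latest   # assumed image name
          ports:
            - containerPort: 9080      # assumed HTTP listen port
          readinessProbe:
            httpGet:
              path: /status/ready
              port: 7085
            initialDelaySeconds: 5
            periodSeconds: 10
```

With two replicas and a readiness probe, Kubernetes removes an unready pod from Service endpoints while the other replica keeps serving traffic.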
Resilience to Control Plane Failures
API7 Gateway Data Plane nodes are designed to be resilient to Control Plane (CP) unavailability.
- Configuration Caching: When a DP node receives configuration from the CP, it caches it in memory.
- Traffic Processing: If the CP becomes unavailable, the DP continues to process incoming requests using the cached configuration.
- Automatic Reconnection: The DP node will automatically attempt to reconnect to the CP until the connection is restored. Once reconnected, it will synchronize any configuration changes made during the downtime.
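The caching behavior described above can be sketched as a simple "prefer fresh, fall back to cached" lookup. This is an illustrative model, not the actual implementation: `fetch_from_cp` stands in for whatever call retrieves configuration from the CP, and is assumed to raise `ConnectionError` when the CP is unreachable:

```python
def current_config(fetch_from_cp, cache):
    """Return the freshest configuration available, preferring the CP.

    `fetch_from_cp` is a hypothetical callable returning the latest
    configuration; `cache` holds the last configuration this node received.
    """
    try:
        cache["config"] = fetch_from_cp()   # CP reachable: refresh the cache
    except ConnectionError:
        pass                                # CP down: keep the cached copy
    return cache.get("config")
```

The key property is that a CP outage never clears the cache, so the node keeps answering requests with the last known-good configuration until reconnection succeeds.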
Multi-Node DP Deployment
To achieve HA, you should deploy multiple DP nodes behind a load balancer.
- Load Balancer: Distributes traffic across all healthy DP nodes.
- Health Checks: The load balancer should use the status API endpoint to perform regular health checks on each DP node.
- Failover: If a DP node fails, the load balancer will stop sending traffic to it and redistribute the load to the remaining healthy nodes.
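As one concrete way to wire these pieces together, the HAProxy fragment below load-balances two DP nodes and health-checks each one against the status API on port 7085, so a failed node is dropped from rotation automatically. The node addresses and the 9080 traffic port are assumptions for illustration:

```
# Sketch: load balancer with status-API health checks
frontend api_traffic
    bind *:80
    default_backend dp_nodes

backend dp_nodes
    balance roundrobin
    option httpchk GET /status
    # Serve traffic on 9080, but health-check the status API on 7085
    server dp1 10.0.0.11:9080 check port 7085
    server dp2 10.0.0.12:9080 check port 7085
```

Checking the status API rather than the traffic port ensures a node is only considered healthy when the gateway itself reports readiness, not merely when its TCP listener accepts connections.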