Configure Upstream Health Checks
Health checking is a mechanism that determines whether upstream services are healthy or unhealthy based on their responsiveness. With health checks enabled, APISIX forwards requests only to upstream services that are considered healthy.
There are two general approaches to health checking:
- Active checks: APISIX proactively and periodically sends requests to upstream services and determines the health of those based on the responses to these requests.
- Passive checks: APISIX determines the health of upstream services based on how they respond to client requests, without proactively probing.
This guide will show you how to configure both active and passive health checks for your upstream services.
If you are using the APISIX Ingress Controller RC5 with APISIX in standalone mode, there is currently an issue where the Control API does not return health check data. This issue does not occur when APISIX is running in traditional mode with etcd.
Prerequisite(s)
- Install Docker.
- Install cURL to send requests to the services for validation.
- Follow the Getting Started tutorial to start a new APISIX instance in Docker or on Kubernetes.
Start Sample Upstream Services
- Docker
- Kubernetes
Start two NGINX instances as sample upstream services in the same Docker network as APISIX:
DOCKER_NETWORK=apisix-quickstart-net
docker run -d -p 8080:80 --network=${DOCKER_NETWORK} --name nginx1 nginx
docker run -d -p 8081:80 --network=${DOCKER_NETWORK} --name nginx2 nginx
Create a Kubernetes manifest file named nginx.yaml that deploys two NGINX instances as sample upstream services:
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: ingress-apisix
  name: nginx1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx1
  template:
    metadata:
      labels:
        app: nginx1
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  namespace: ingress-apisix
  name: nginx1
spec:
  selector:
    app: nginx1
  ports:
  - port: 8080
    targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: ingress-apisix
  name: nginx2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx2
  template:
    metadata:
      labels:
        app: nginx2
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  namespace: ingress-apisix
  name: nginx2
spec:
  selector:
    app: nginx2
  ports:
  - port: 8081
    targetPort: 80
Apply the configuration to your cluster:
kubectl apply -f nginx.yaml
Expose the NGINX service ports to your local machine by port forwarding:
kubectl port-forward -n ingress-apisix svc/nginx1 8080:8080 &
kubectl port-forward -n ingress-apisix svc/nginx2 8081:8081 &
Verify both NGINX instances are running:
for port in 8080 8081; do
curl -s "http://127.0.0.1:$port" | grep -q "Welcome to nginx" &&
echo "NGINX welcome page available on port $port."
done
You should see the following response:
NGINX welcome page available on port 8080.
NGINX welcome page available on port 8081.
Configure Active Health Checks
Active checks determine the health of upstream services by periodically sending requests, or probes, to the services and seeing how they respond.
This section provides two examples with verification steps that show:
- how changes in upstream statuses can be detected by active checks
- how APISIX forwards client requests to upstream services when all upstream statuses are unhealthy
Example: Status Change in Upstream Services
The following example demonstrates how APISIX active health checks respond as healthy upstream services become partially unavailable, then entirely unavailable, and finally recover.
Create a route to the two services and configure active health checks that probe healthy nodes every 2 seconds and unhealthy nodes every second:
- Admin API
- ADC
- Ingress Controller
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT -d '
{
"id": "example-hc-route",
"uri":"/",
"upstream": {
"type":"roundrobin",
"nodes": {
"nginx1:80": 1,
"nginx2:80": 1
},
"checks": {
"active": {
"type": "http",
"http_path": "/",
"healthy": {
"interval": 2,
"successes": 1
},
"unhealthy": {
"interval": 1,
"timeouts": 3
}
}
}
}
}'
❶ type: the type of active health checks.
❷ http_path: the HTTP request path to actively probe.
❸ healthy.interval: the time interval in seconds for periodically checking healthy nodes.
❹ healthy.successes: the number of successes after which an upstream node is considered healthy.
❺ unhealthy.interval: the time interval in seconds for periodically checking unhealthy nodes.
❻ unhealthy.timeouts: the number of timeouts after which an upstream node is considered unhealthy.
services:
  - name: Nginx Service
    routes:
      - uris:
          - /
        name: example-hc-route
    upstream:
      type: roundrobin
      nodes:
        - host: nginx1
          port: 80
          weight: 1
        - host: nginx2
          port: 80
          weight: 1
      checks:
        active:
          type: http
          http_path: /
          healthy:
            interval: 2
            successes: 1
          unhealthy:
            interval: 1
            timeouts: 3
❶ type: the type of active health checks.
❷ http_path: the HTTP request path to actively probe.
❸ healthy.interval: the time interval in seconds for periodically checking healthy nodes.
❹ healthy.successes: the number of successes after which an upstream node is considered healthy.
❺ unhealthy.interval: the time interval in seconds for periodically checking unhealthy nodes.
❻ unhealthy.timeouts: the number of timeouts after which an upstream node is considered unhealthy.
Synchronize the configuration to APISIX:
adc sync -f adc.yaml
- Gateway API
- APISIX CRD
APISIX Ingress Controller currently does not support using Gateway API to configure upstream health checks.
apiVersion: apisix.apache.org/v2
kind: ApisixUpstream
metadata:
  namespace: ingress-apisix
  name: nginx
spec:
  ingressClassName: apisix
  externalNodes:
  - type: Domain
    name: nginx1
    port: 8080
  - type: Domain
    name: nginx2
    port: 8081
  healthCheck:
    active:
      type: http
      httpPath: /
      healthy:
        interval: 2s
        successes: 1
      unhealthy:
        interval: 1s
        timeout: 3
---
apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
  namespace: ingress-apisix
  name: example-hc-route
spec:
  ingressClassName: apisix
  http:
    - name: nginx-hc
      match:
        paths:
          - /
      upstreams:
      - name: nginx
❶ type: the type of active health checks.
❷ httpPath: the HTTP request path to actively probe.
❸ healthy.interval: the time interval in seconds for periodically checking healthy nodes.
❹ healthy.successes: the number of successes after which an upstream node is considered healthy.
❺ unhealthy.interval: the time interval in seconds for periodically checking unhealthy nodes.
❻ unhealthy.timeout: the number of timeouts after which an upstream node is considered unhealthy.
Apply the configuration to your cluster:
kubectl apply -f active-health-checks.yaml
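To make the thresholds in this configuration concrete, the following Python sketch models how success and timeout counters could drive a node's status. It is a simplified illustration and not APISIX's implementation: the class `NodeHealth` and its `record` method are hypothetical names, and the sketch assumes that a successful probe clears the failure count and a failed probe clears the success count.

```python
from dataclasses import dataclass


@dataclass
class NodeHealth:
    """Toy model of one upstream node under active health checks."""

    healthy_successes: int = 1   # mirrors healthy.successes
    unhealthy_timeouts: int = 3  # mirrors unhealthy.timeouts
    status: str = "healthy"
    successes: int = 0
    timeouts: int = 0

    def record(self, probe_ok: bool) -> str:
        """Record one probe result and return the resulting status."""
        if probe_ok:
            self.successes += 1
            self.timeouts = 0  # assumption: a success clears failure counts
            if self.successes >= self.healthy_successes:
                self.status = "healthy"
        else:
            self.timeouts += 1
            self.successes = 0  # assumption: a failure clears success counts
            if self.timeouts >= self.unhealthy_timeouts:
                self.status = "unhealthy"
        return self.status


# Three timed-out probes followed by one successful probe.
node = NodeHealth()
history = [node.record(ok) for ok in (False, False, False, True)]
print(history)
```

In this model the node stays healthy through the first two timeouts, turns unhealthy on the third, and recovers after a single successful probe because the success threshold is 1.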
Verify
Verify the configuration above to see how APISIX upstream health checks respond in different scenarios:
- when all upstream services are healthy
- when only some services are healthy
- when no service is healthy
- when all services have recovered
- Docker
- Kubernetes
If you started APISIX in Docker with the Getting Started quickstart, the Control API port 9090 is already mapped (-p 9090:9090).
If you are using the Ingress Controller, enable the Control API service and port-forward its port to your local machine.
First export all values (including defaults):
helm get values -n ingress-apisix apisix --all > values.yaml
In the values file, set the following values:
apisix:
enabled: true
control:
enabled: true
Upgrade the release:
helm upgrade -n ingress-apisix apisix apisix/apisix -f values.yaml
Port-forward the Control API port:
kubectl port-forward -n ingress-apisix service/apisix-control 9090:9090 &
Verify Both Upstream Services Are Healthy
Send a request to the route to start health checks:
curl "http://127.0.0.1:9080/"
To see upstream health statuses, send a request to the health check endpoint in Control API:
curl "http://127.0.0.1:9090/v1/healthcheck"
You should see a response similar to the following:
[
{
"name": "/apisix/routes/example-hc-route",
"type": "http",
"nodes": [
{
"port": 80,
"counter": {
"http_failure": 0,
"tcp_failure": 0,
"timeout_failure": 0,
"success": 0
},
"ip": "172.24.0.5",
"status": "healthy"
},
{
"port": 80,
"counter": {
"http_failure": 0,
"tcp_failure": 0,
"timeout_failure": 0,
"success": 0
},
"ip": "172.24.0.4",
"status": "healthy"
}
]
}
]
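When there are many nodes, the raw JSON can be hard to scan. The following Python sketch condenses a report of this shape into one line per node. The helper names `fetch_health` and `summarize` are hypothetical, the URL is the Control API endpoint used in this guide, and the sample data is trimmed from the response shown above so that the example runs without a live APISIX instance.

```python
import json
import urllib.request

CONTROL_API_URL = "http://127.0.0.1:9090/v1/healthcheck"  # endpoint used in this guide


def fetch_health(url: str = CONTROL_API_URL) -> list:
    """Fetch and decode the health check report (requires a running APISIX)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def summarize(report: list) -> dict:
    """Map "<checker name> <ip>:<port>" to the reported node status."""
    summary = {}
    for checker in report:
        for node in checker.get("nodes", []):
            key = f'{checker["name"]} {node["ip"]}:{node["port"]}'
            summary[key] = node["status"]
    return summary


# Sample data trimmed from the response shown above.
sample = json.loads("""
[{"name": "/apisix/routes/example-hc-route", "type": "http", "nodes": [
  {"ip": "172.24.0.5", "port": 80, "status": "healthy"},
  {"ip": "172.24.0.4", "port": 80, "status": "healthy"}]}]
""")

for target, status in summarize(sample).items():
    print(f"{target}: {status}")
```

Against a running instance, `summarize(fetch_health())` would produce the same kind of summary from live data.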
Verify When One Upstream Service Is Unavailable
Make one upstream service temporarily unavailable to verify if APISIX reports one of the upstream services unhealthy:
- Docker:
docker container stop nginx1
- Kubernetes:
kubectl scale deployment nginx1 -n ingress-apisix --replicas=0
Wait for a few seconds and send a request to the health check endpoint:
curl "http://127.0.0.1:9090/v1/healthcheck"
You should see a response similar to the following, showing that one of the upstream nodes has 3 timeout failures and is marked unhealthy:
[
{
"name": "/apisix/routes/example-hc-route",
"type": "http",
"nodes": [
{
"port": 80,
"counter": {
"http_failure": 0,
"tcp_failure": 0,
"timeout_failure": 0,
"success": 0
},
"ip": "172.24.0.5",
"status": "healthy"
},
{
"port": 80,
"counter": {
"http_failure": 0,
"tcp_failure": 0,
"timeout_failure": 3,
"success": 0
},
"ip": "172.24.0.4",
"status": "unhealthy"
}
]
}
]
Send a request to the route to see if APISIX forwards the request to the other healthy node:
curl -i "http://127.0.0.1:9080/"
You should receive an HTTP/1.1 200 OK response.
Verify Both Upstream Services Are Unavailable
Make the other upstream service temporarily unavailable to verify if APISIX reports both upstream services unhealthy:
- Docker:
docker container stop nginx2
- Kubernetes:
kubectl scale deployment nginx2 -n ingress-apisix --replicas=0
Wait for a few seconds and send a request to the health check endpoint:
curl "http://127.0.0.1:9090/v1/healthcheck"
You should see a response similar to the following, showing that both upstream nodes have 3 timeout failures and are marked unhealthy:
[
{
"name": "/apisix/routes/example-hc-route",
"type": "http",
"nodes": [
{
"port": 80,
"counter": {
"http_failure": 0,
"tcp_failure": 0,
"timeout_failure": 3,
"success": 0
},
"ip": "172.24.0.5",
"status": "unhealthy"
},
{
"port": 80,
"counter": {
"http_failure": 0,
"tcp_failure": 0,
"timeout_failure": 3,
"success": 0
},
"ip": "172.24.0.4",
"status": "unhealthy"
}
]
}
]
Send a request to the route:
curl -i "http://127.0.0.1:9080/"
You should receive an HTTP/1.1 502 Bad Gateway response.
Verify Both Upstream Services Are Recovered
Make both services available again to verify if APISIX reports both upstream services healthy:
- Docker:
docker container start nginx1 nginx2
- Kubernetes:
kubectl scale deployment -n ingress-apisix nginx1 nginx2 --replicas=1
Wait for a few seconds and send a request to the health check endpoint:
curl "http://127.0.0.1:9090/v1/healthcheck"
You should see a response showing both upstream nodes are healthy, similar to when both services are healthy at the start.
Example: Forward Requests When Statuses Are Unhealthy
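The following example demonstrates that APISIX still forwards client requests to upstream services when every upstream node is marked unhealthy. With no healthy node to choose from, APISIX falls back to balancing across all configured nodes instead of rejecting the request.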
Create a route to the two services and configure active health checks that run every 2 seconds:
- Admin API
- ADC
- Ingress Controller
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT -d '
{
"id": "example-hc-route",
"uri":"/",
"upstream": {
"type":"roundrobin",
"nodes": {
"nginx1:80": 1,
"nginx2:80": 1
},
"checks": {
"active": {
"type": "http",
"http_path": "/404",
"healthy": {
"interval": 2,
"successes": 1
},
"unhealthy": {
"interval": 1,
"http_failures": 2
}
}
}
}
}'
❶ type: the type of active health checks.
❷ http_path: the HTTP request path to actively probe. For demonstration purposes, this is set to /404, a path that does not exist in the upstream services. Consequently, both services should always be considered unhealthy by the active health checks.
❸ unhealthy.http_failures: the number of HTTP failures after which an upstream node is considered unhealthy.
services:
  - name: Nginx Service
    routes:
      - uris:
          - /
        name: example-hc-route
    upstream:
      type: roundrobin
      nodes:
        - host: nginx1
          port: 80
          weight: 1
        - host: nginx2
          port: 80
          weight: 1
      checks:
        active:
          type: http
          http_path: /404
          healthy:
            interval: 2
            successes: 1
          unhealthy:
            interval: 1
            http_failures: 2
❶ type: the type of active health checks.
❷ http_path: the HTTP request path to actively probe. For demonstration purposes, this is set to /404, a path that does not exist in the upstream services. Consequently, both services should always be considered unhealthy by the active health checks.
❸ unhealthy.http_failures: the number of HTTP failures after which an upstream node is considered unhealthy.
Synchronize the configuration to APISIX:
adc sync -f adc.yaml
- Gateway API
- APISIX CRD
APISIX Ingress Controller currently does not support using Gateway API to configure upstream health checks.
apiVersion: apisix.apache.org/v2
kind: ApisixUpstream
metadata:
  namespace: ingress-apisix
  name: nginx
spec:
  ingressClassName: apisix
  externalNodes:
  - type: Domain
    name: nginx1
    port: 8080
  - type: Domain
    name: nginx2
    port: 8081
  healthCheck:
    active:
      type: http
      httpPath: /404
      healthy:
        interval: 2s
        successes: 1
      unhealthy:
        interval: 1s
        httpFailures: 2
---
apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
  namespace: ingress-apisix
  name: example-hc-route
spec:
  ingressClassName: apisix
  http:
    - name: nginx-hc
      match:
        paths:
          - /
      upstreams:
      - name: nginx
❶ type: the type of active health checks.
❷ httpPath: the HTTP request path to actively probe. For demonstration purposes, this is set to /404, a path that does not exist in the upstream services. Consequently, both services should always be considered unhealthy by the active health checks.
❸ unhealthy.httpFailures: the number of HTTP failures after which an upstream node is considered unhealthy.
Apply the configuration to your cluster:
kubectl apply -f active-health-checks.yaml
Verify
- Docker
- Kubernetes
If you started APISIX in Docker with the Getting Started quickstart, the Control API port 9090 is already mapped (-p 9090:9090).
If you are using the Ingress Controller, enable the Control API service and port-forward its port to your local machine.
First export all values (including defaults):
helm get values -n ingress-apisix apisix --all > values.yaml
In the values file, set the following values:
apisix:
enabled: true
control:
enabled: true
Upgrade the release:
helm upgrade -n ingress-apisix apisix apisix/apisix -f values.yaml
Port-forward the Control API port:
kubectl port-forward -n ingress-apisix service/apisix-control 9090:9090 &
Send a request to the route to start health checks:
curl -i "http://127.0.0.1:9080/"
You should receive an HTTP/1.1 200 OK response.
Send a request to the health check endpoint:
curl "http://127.0.0.1:9090/v1/healthcheck"
You should see a response similar to the following:
[
{
"name": "/apisix/routes/example-hc-route",
"nodes": [
{
"counter": {
"timeout_failure": 0,
"http_failure": 2,
"success": 0,
"tcp_failure": 0
},
"port": 80,
"ip": "172.25.0.5",
"status": "unhealthy"
},
{
"counter": {
"timeout_failure": 0,
"http_failure": 2,
"success": 0,
"tcp_failure": 0
},
"port": 80,
"ip": "172.25.0.4",
"status": "unhealthy"
}
],
"type": "http"
}
]
Send a request to the route to see if APISIX still forwards the request:
curl -i "http://127.0.0.1:9080/"
You should receive an HTTP/1.1 200 OK response. This verifies that APISIX still forwards client requests to upstream services, even though both services are marked as unhealthy.
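The behavior observed in this example is consistent with a simple fallback rule: prefer healthy nodes, and if none remain, use every configured node. The following Python sketch expresses that rule. It models the observable behavior only and is not taken from the APISIX source; the function name `candidate_nodes` and the `HEALTHY_STATES` set are hypothetical.

```python
# Statuses treated as usable for routing (assumption for this sketch).
HEALTHY_STATES = {"healthy", "mostly_healthy"}


def candidate_nodes(nodes: list, statuses: dict) -> list:
    """Return the nodes eligible for load balancing.

    Nodes without a recorded status are treated as healthy. If no node is
    healthy, fall back to every node rather than rejecting the request.
    """
    healthy = [n for n in nodes if statuses.get(n, "healthy") in HEALTHY_STATES]
    return healthy if healthy else list(nodes)


nodes = ["nginx1:80", "nginx2:80"]

# One node down: only the healthy node is eligible.
print(candidate_nodes(nodes, {"nginx1:80": "unhealthy", "nginx2:80": "healthy"}))

# Every node marked unhealthy: all nodes are eligible again.
print(candidate_nodes(nodes, {"nginx1:80": "unhealthy", "nginx2:80": "unhealthy"}))
```

This also explains the difference from the previous example: there the services were actually stopped, so forwarding to them produced a 502, whereas here the services are running and only the probe path fails.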
Configure Passive Health Checks
APISIX requires passive health checks to be used together with active health checks. Passive checks can mark an upstream service as unhealthy based on its responses to client requests, and the active checks then periodically probe the service to detect when it has recovered.
There is a known issue where the health check data displayed through the Control API do not accurately reflect the actual health statuses, so your testing results may differ from the example shown. This issue should be resolved in version 3.15.0. However, the passive health check mechanism itself is functioning correctly and continues to route requests as expected.
Example: Status Change in Upstream Services
Create a route to the two services, and configure both active and passive health checks:
- Admin API
- ADC
- Ingress Controller
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT -d '
{
"id": "example-hc-route",
"uri": "/404",
"upstream": {
"type": "roundrobin",
"nodes": {
"nginx1:80": 1,
"nginx2:80": 1
},
"checks": {
"active": {
"type": "http",
"http_path": "/",
"healthy": {
"interval": 99999,
"successes": 1
},
"unhealthy": {
"interval": 30
}
},
"passive": {
"healthy": {
"http_statuses": [200,201,202,300,301,302],
"successes": 1
},
"unhealthy": {
"http_statuses": [429,404,500,501,502,503,504,505],
"http_failures": 3
}
}
}
}
}'
❶ uri: the URI path that the route matches. For demonstration purposes, this is set to /404, a path that does not exist in the upstream services. Consequently, when a request is made, both upstream services should respond with a 404 status code.
❷ active.healthy.interval: the time interval in seconds for periodically checking healthy nodes.
❸ active.unhealthy.interval: the time interval in seconds for periodically checking unhealthy nodes.
❹ passive.healthy.http_statuses: the response HTTP status codes that are considered healthy.
❺ passive.unhealthy.http_statuses: the response HTTP status codes that are considered unhealthy. Unhealthy responses count toward http_failures.
❻ passive.unhealthy.http_failures: the number of HTTP failures after which an upstream node is considered unhealthy.
services:
  - name: Nginx Service
    routes:
      - uris:
          - /404
        name: example-hc-route
    upstream:
      type: roundrobin
      nodes:
        - host: nginx1
          port: 80
          weight: 1
        - host: nginx2
          port: 80
          weight: 1
      checks:
        active:
          type: http
          http_path: /
          healthy:
            interval: 99999
            successes: 1
          unhealthy:
            interval: 30
        passive:
          healthy:
            http_statuses:
              - 200
              - 201
              - 202
              - 300
              - 301
              - 302
            successes: 1
          unhealthy:
            http_statuses:
              - 429
              - 404
              - 500
              - 501
              - 502
              - 503
              - 504
              - 505
            http_failures: 3
❶ uris: the URI paths that the route matches. For demonstration purposes, this is set to /404, a path that does not exist in the upstream services. Consequently, when a request is made, both upstream services should respond with a 404 status code.
❷ active.healthy.interval: the time interval in seconds for periodically checking healthy nodes.
❸ active.unhealthy.interval: the time interval in seconds for periodically checking unhealthy nodes.
❹ passive.healthy.http_statuses: the response HTTP status codes that are considered healthy.
❺ passive.unhealthy.http_statuses: the response HTTP status codes that are considered unhealthy. Unhealthy responses count toward http_failures.
❻ passive.unhealthy.http_failures: the number of HTTP failures after which an upstream node is considered unhealthy.
Synchronize the configuration to APISIX:
adc sync -f adc.yaml
- Gateway API
- APISIX CRD
APISIX Ingress Controller currently does not support using Gateway API to configure upstream health checks.
apiVersion: apisix.apache.org/v2
kind: ApisixUpstream
metadata:
  namespace: ingress-apisix
  name: nginx
spec:
  ingressClassName: apisix
  externalNodes:
  - type: Domain
    name: nginx1
    port: 8080
  - type: Domain
    name: nginx2
    port: 8081
  healthCheck:
    active:
      type: http
      httpPath: /
      healthy:
        interval: 99999s
        successes: 1
      unhealthy:
        interval: 30s
    passive:
      healthy:
        httpCodes: [200,201,202,300,301,302]
        successes: 1
      unhealthy:
        httpCodes: [429,404,500,501,502,503,504,505]
        httpFailures: 3
---
apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
  namespace: ingress-apisix
  name: example-hc-route
spec:
  ingressClassName: apisix
  http:
    - name: nginx-hc
      match:
        paths:
          - /404
      upstreams:
      - name: nginx
❶ paths: the URI paths that the route matches. For demonstration purposes, this is set to /404, a path that does not exist in the upstream services. Consequently, when a request is made, both upstream services should respond with a 404 status code.
❷ active.healthy.interval: the time interval in seconds for periodically checking healthy nodes.
❸ active.unhealthy.interval: the time interval in seconds for periodically checking unhealthy nodes.
❹ passive.healthy.httpCodes: the response HTTP status codes that are considered healthy.
❺ passive.unhealthy.httpCodes: the response HTTP status codes that are considered unhealthy. Unhealthy responses count toward httpFailures.
❻ passive.unhealthy.httpFailures: the number of HTTP failures after which an upstream node is considered unhealthy.
Apply the configuration to your cluster:
kubectl apply -f passive-health-checks.yaml
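As a rough illustration of how the passive settings apply to proxied responses, the following Python sketch classifies a status code against the two lists configured in this example. The `classify` function and the `PASSIVE` dictionary are hypothetical names, and treating unlisted status codes as having no effect on the counters is an assumption of this sketch.

```python
# Passive settings copied from the configuration in this example.
PASSIVE = {
    "healthy": {"http_statuses": [200, 201, 202, 300, 301, 302], "successes": 1},
    "unhealthy": {
        "http_statuses": [429, 404, 500, 501, 502, 503, 504, 505],
        "http_failures": 3,
    },
}


def classify(status_code: int, passive: dict = PASSIVE) -> str:
    """Classify one proxied response as 'success', 'http_failure', or 'ignored'."""
    if status_code in passive["unhealthy"]["http_statuses"]:
        return "http_failure"
    if status_code in passive["healthy"]["http_statuses"]:
        return "success"
    return "ignored"  # assumption: unlisted codes do not affect the counters


for code in (200, 404, 418):
    print(code, classify(code))
```

With these settings, each 404 response from a node adds one HTTP failure for that node, and three such failures reach the http_failures threshold.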
Verify
- Docker
- Kubernetes
If you started APISIX in Docker with the Getting Started quickstart, the Control API port 9090 is already mapped (-p 9090:9090).
If you are using the Ingress Controller, enable the Control API service and port-forward its port to your local machine.
First export all values (including defaults):
helm get values -n ingress-apisix apisix --all > values.yaml
In the values file, set the following values:
apisix:
enabled: true
control:
enabled: true
Upgrade the release:
helm upgrade -n ingress-apisix apisix apisix/apisix -f values.yaml
Port-forward the Control API port:
kubectl port-forward -n ingress-apisix service/apisix-control 9090:9090 &
Send a request to the route to start health checks:
curl -i "http://127.0.0.1:9080/404"
You should see an HTTP/1.1 404 Not Found response.
Send a request to the health check endpoint:
curl "http://127.0.0.1:9090/v1/healthcheck"
You should see a response similar to the following:
[
{
"name": "/apisix/routes/example-hc-route",
"nodes": [
{
"counter": {
"timeout_failure": 0,
"http_failure": 1,
"success": 0,
"tcp_failure": 0
},
"port": 80,
"ip": "172.25.0.5",
"status": "mostly_healthy"
},
{
"counter": {
"timeout_failure": 0,
"http_failure": 0,
"success": 0,
"tcp_failure": 0
},
"port": 80,
"ip": "172.25.0.4",
"status": "healthy"
}
],
"type": "http"
}
]
❶ http_failure has a count of 1 due to the previous request with a 404 response.
❷ mostly_healthy status means the node is still considered healthy, but APISIX has started to receive unhealthy indications from health checks.
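One way to read these statuses is as a combination of the node's current state and its counters. The following Python sketch is an interpretation of the response above, not APISIX code: the function `display_status` is hypothetical, and the `mostly_unhealthy` value is assumed by symmetry with `mostly_healthy` and does not appear in this guide.

```python
def display_status(is_healthy: bool, counters: dict) -> str:
    """Derive a display status from a node's state and its counters."""
    failures = (
        counters["http_failure"]
        + counters["tcp_failure"]
        + counters["timeout_failure"]
    )
    if is_healthy:
        # Healthy, but some failures have been recorded below the threshold.
        return "mostly_healthy" if failures > 0 else "healthy"
    # Assumption: an unhealthy node with recorded successes is "mostly_unhealthy".
    return "mostly_unhealthy" if counters["success"] > 0 else "unhealthy"


# Counters taken from the two nodes in the response above.
first = {"timeout_failure": 0, "http_failure": 1, "success": 0, "tcp_failure": 0}
second = {"timeout_failure": 0, "http_failure": 0, "success": 0, "tcp_failure": 0}
print(display_status(True, first))
print(display_status(True, second))
```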
Generate consecutive requests to invoke 404 responses:
resp=$(seq 10 | xargs -I{} curl "http://127.0.0.1:9080/404" -o /dev/null -s -w "%{http_code}\n") && \
count=$(echo "$resp" | grep "404" | wc -l) && \
echo "Invoked $count responses with 404 status code."
Send a request to the health check endpoint:
curl "http://127.0.0.1:9090/v1/healthcheck"
You should see a response similar to the following:
[
{
"name": "/apisix/routes/example-hc-route",
"nodes": [
{
"counter": {
"timeout_failure": 0,
"http_failure": 3,
"success": 0,
"tcp_failure": 0
},
"port": 80,
"ip": "172.25.0.4",
"status": "unhealthy"
},
{
"counter": {
"timeout_failure": 0,
"http_failure": 4,
"success": 0,
"tcp_failure": 0
},
"port": 80,
"ip": "172.25.0.5",
"status": "unhealthy"
}
],
"type": "http"
}
]
Wait for at least 30 seconds for active checks to probe the upstream services at / and bring them back as healthy upstream services. Then, send a request to the health check endpoint:
curl "http://127.0.0.1:9090/v1/healthcheck"
You should see a response similar to the following:
[
{
"name": "/apisix/routes/example-hc-route",
"nodes": [
{
"counter": {
"timeout_failure": 0,
"http_failure": 0,
"success": 1,
"tcp_failure": 0
},
"port": 80,
"ip": "172.25.0.4",
"status": "healthy"
},
{
"counter": {
"timeout_failure": 0,
"http_failure": 0,
"success": 1,
"tcp_failure": 0
},
"port": 80,
"ip": "172.25.0.5",
"status": "healthy"
}
],
"type": "http"
}
]
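Instead of waiting a fixed amount of time, a script can poll the health check endpoint until the nodes reach the expected status. The following Python sketch shows one way to do this. The function `wait_for_status` and its parameters are hypothetical, and the demo feeds it canned reports rather than calling a live Control API so that it runs anywhere.

```python
import time
from typing import Callable


def wait_for_status(
    fetch: Callable[[], list],
    expected: str,
    attempts: int = 20,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until every node in the report has the expected status."""
    for _ in range(attempts):
        report = fetch()
        nodes = [n for checker in report for n in checker.get("nodes", [])]
        if nodes and all(n["status"] == expected for n in nodes):
            return True
        sleep(delay)
    return False


# Demo with canned reports instead of a live Control API.
reports = iter([
    [{"nodes": [{"status": "unhealthy"}, {"status": "unhealthy"}]}],
    [{"nodes": [{"status": "healthy"}, {"status": "unhealthy"}]}],
    [{"nodes": [{"status": "healthy"}, {"status": "healthy"}]}],
])
recovered = wait_for_status(lambda: next(reports), "healthy", sleep=lambda _: None)
print(recovered)
```

In real use, `fetch` would be a function that sends a GET request to http://127.0.0.1:9090/v1/healthcheck and returns the decoded JSON, and the default `time.sleep` would space out the polls.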
Disable All Health Checks
You can disable all upstream health checks globally. This is useful in scenarios such as emergency maintenance, where health checks might interfere with routing or fallback behavior.
- Docker
- Kubernetes
To disable all health checks, update your configuration file as follows:
apisix:
  disable_upstream_healthcheck: true
Reload APISIX for configuration changes to take effect:
docker exec apisix-quickstart apisix reload
The Helm chart does not yet include this value. As a temporary workaround (not recommended for production), you can manually update the APISIX ConfigMap:
kubectl edit cm apisix -n ingress-apisix
Update the following section values:
apisix:
  disable_upstream_healthcheck: true
After saving the change, restart the APISIX deployment to apply the change:
kubectl rollout restart deployment apisix -n ingress-apisix
Next Steps
You have now learned how to configure active and passive health checks for upstream services in APISIX. To learn more about the available configuration options for upstream health checks, see the Upstream section of the Admin API reference.
APISIX also offers an api-breaker plugin, which implements circuit breaker functionality based on the health of upstream services and helps improve application resilience. See the api-breaker plugin doc for more details (coming soon).