
Performance Benchmark

This page documents the published benchmark results for API7 Gateway and the methodology behind them, so you can both understand the numbers and reproduce them in your own environment. All test assets — ADC configurations, wrk2 scripts, and deployment manifests — are available in the api7-gateway-performance-benchmark repository.

Methodology

The benchmark uses a fixed set of variables so that results are comparable across test runs and environments:

  • Single gateway node: one API7 Gateway data plane instance, with worker_processes configured to match the vCPU count of the host. To build a clean single-core baseline, start with worker_processes: 1 and scale up once you have confirmed the single-core numbers.
  • Exclude upstream interference for the ceiling number: enable only the mocking plugin to measure API7 Gateway's raw request-processing ceiling. The mocking plugin returns canned responses without forwarding to an upstream. This number is not comparable to real-world throughput.
  • Realistic scenarios forward to an upstream: every other test row uses a real NGINX upstream running nginx/1.25.4, configured with access_log off and a static 200-byte response so that the upstream itself is never the bottleneck.
  • Sample size: each test case runs 5 times, 2 minutes per run, and the reported result is the average of the 5 runs.
  • Test scenarios: 5 plugin combinations × 2 route/consumer scales = 10 test cases:
    1. Only mocking (baseline ceiling)
    2. No plugins
    3. Only limit-count
    4. Only key-auth
    5. Both key-auth and limit-count

Each plugin scenario is run with (a) 1 route and (b) 100 routes, with a matching number of consumers (1 and 100, respectively) when key-auth is enabled, to measure the impact of resource count on route matching and plugin evaluation.
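As an illustration, the mocking-only baseline route could be declared in ADC configuration roughly as follows. The plugin fields follow the APISIX mocking plugin; treat the names and values here as a sketch, not the exact files from the benchmark repository:

```yaml
services:
  - name: benchmark-mock
    routes:
      - name: mock-route
        uris:
          - /mock
        plugins:
          # mocking returns a canned response; nothing is forwarded upstream
          mocking:
            response_status: 200
            content_type: application/json
            response_example: '{"hello":"world"}'
```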

Published results — AWS EKS

These results were obtained in an AWS EKS environment using c5.4xlarge EC2 instances (16 vCPUs, 32 GB RAM) running Amazon Linux 2 (AL2_x86_64) on Kubernetes 1.29. API7 Gateway, the NGINX upstream, and wrk2 each ran on separate nodes in the same VPC to avoid resource contention.

note

The Forward to Upstream column distinguishes the mocking-only ceiling (where API7 Gateway returns canned responses and no request leaves the gateway) from realistic proxy scenarios. The mocking-only row measures the theoretical ceiling of the gateway's request-processing pipeline; it is not comparable to real-world throughput, where the backend and network become additional factors. Use the rows with Forward to Upstream: True when sizing for production.

| Test scenario | Routes / Consumers | Forward to upstream | QPS | P99 (ms) | P95 (ms) |
| --- | --- | --- | --- | --- | --- |
| Only mocking | 1 route, 0 consumers | False | 310,392.07 | 1.16 | 1.08 |
| No plugins | 1 route, 0 consumers | True | 167,019.37 | 2.30 | 2.16 |
| No plugins | 100 routes, 0 consumers | True | 162,753.17 | 2.31 | 2.16 |
| Only limit-count | 1 route, 0 consumers | True | 145,370.10 | 2.43 | 2.24 |
| Only limit-count | 100 routes, 0 consumers | True | 143,108.40 | 2.45 | 2.25 |
| Only key-auth | 1 route, 1 consumer | True | 147,869.49 | 2.41 | 2.22 |
| Only key-auth | 100 routes, 100 consumers | True | 145,070.93 | 2.43 | 2.25 |
| key-auth + limit-count | 1 route, 1 consumer | True | 136,725.47 | 2.43 | 2.26 |
| key-auth + limit-count | 100 routes, 100 consumers | True | 133,782.95 | 2.48 | 2.30 |

Run your own benchmark

To reproduce these numbers or measure your own workload, follow the same methodology used for the published results and apply the optimization guidance below.

Before you start

  • Select an appropriate number of gateway nodes. Use one API7 Gateway node per test. Configure worker_processes to match the vCPU count of the host. Do not run multiple API7 Gateway instances with small worker_processes values in parallel — the results are harder to interpret.
  • Build the single-core baseline first. Set worker_processes: 1 and run the "No plugins" scenario against a local NGINX upstream on the same host. Confirm this number is stable and reproducible before you scale out; per core, it should be of the same order as the published 16-vCPU results above.
  • Exclude upstream interference for the ceiling. Enable only the mocking plugin for the ceiling number. This isolates API7 Gateway's own processing cost.
  • Watch the upstream. During every realistic test, monitor the CPU, memory, and event-loop utilization of the NGINX upstream. If the upstream is saturated, the numbers reflect the upstream's limit, not API7 Gateway's.
  • Run multiple samples and apply statistics. Every test case should run at least 5 times. Report the mean and standard deviation so that noise can be separated from signal.
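The "mean and standard deviation" step can be sketched with a few lines of shell. The sample QPS values below are illustrative; in practice each one comes from the Requests/sec line of a wrk2 run:

```shell
# Five illustrative QPS samples from repeated runs of one test case
samples="167019.37 166512.08 167830.11 166901.45 167244.90"

# Mean and (population) standard deviation via awk
echo "$samples" | awk '{
  n = NF
  for (i = 1; i <= n; i++) sum += $i
  mean = sum / n
  for (i = 1; i <= n; i++) ss += ($i - mean) ^ 2
  printf "mean=%.2f stddev=%.2f\n", mean, sqrt(ss / n)
}'
```

A standard deviation that is small relative to the mean (here well under 1%) indicates the runs are consistent enough to report.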

For a full AWS EKS walkthrough — cluster creation, node labeling, Helm install, NGINX upstream deployment, and wrk2 deployment — see Run Benchmarks on AWS EKS.

Optimization recommendations

Raise the maximum number of open files

Check the system's current maximum number of open file descriptors:

cat /proc/sys/fs/file-nr
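The output is three fields: allocated handles, allocated-but-unused handles, and the system-wide maximum. For scripted checks, the fields can be pulled out with awk (the sample line below is illustrative):

```shell
# Parse a captured /proc/sys/fs/file-nr sample (values are illustrative)
echo "1824 0 1020000" | awk '{printf "allocated=%s free=%s max=%s\n", $1, $2, $3}'
```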

The last number is the system-wide maximum. If it is too small, raise it in /etc/sysctl.conf:

/etc/sysctl.conf
fs.file-max = 1020000
net.ipv4.ip_conntrack_max = 1020000
net.ipv4.netfilter.ip_conntrack_max = 1020000

Reload:

sudo sysctl -p /etc/sysctl.conf

Raise the per-process ulimit

Each incoming connection consumes a file descriptor. Raise ulimit -n to a seven-figure value so that the gateway can accept the connection volume a benchmark generates.

Temporary (current shell only):

ulimit -n 1024000

Permanent (edit /etc/security/limits.conf):

/etc/security/limits.conf
* hard nofile 1024000
* soft nofile 1024000

Disable access logs

Access logs write to disk for every request, which can cap QPS at I/O speeds during high-volume tests. Disable them for benchmarks:

config.yaml
nginx_config:
  error_log_level: error
  worker_processes: auto
  http:
    enable_access_log: false

Setting the error log level to error also reduces log I/O during the run.

Avoid resource contention

Put wrk2, API7 Gateway, and the upstream service on separate machines in the same local network. When running in Kubernetes, use nodeSelector (or taints and tolerations) so that the three pods land on three different nodes; otherwise, CPU and network contention will distort the results.
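For example, each deployment can be pinned to its own labeled node with a nodeSelector. The label key and values here are illustrative; any consistent labeling scheme works:

```yaml
# In the gateway Deployment's pod template (labels are illustrative)
spec:
  nodeSelector:
    benchmark-role: gateway   # the upstream and wrk2 pods use their own labels
```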

Avoid burstable cloud instances

Do not benchmark on burstable cloud instance families (for example, AWS t3/t4g). Burstable instances have a credit-based CPU model that produces non-repeatable results. Use a fixed-performance family like c5.4xlarge instead.

Also note that cloud providers' vCPUs are not always 1:1 with physical cores — many use hyper-threading, which means the actual physical core count can be half of the vCPU count. For the most accurate sizing, consult your cloud provider's instance documentation (for AWS, see AWS Instance CPU options).
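With hyper-threading enabled, the physical core count is simply the vCPU count divided by the threads per core. The values below model a c5.4xlarge:

```shell
# c5.4xlarge: 16 vCPUs, 2 hardware threads per physical core
vcpus=16
threads_per_core=2
echo "physical cores: $((vcpus / threads_per_core))"
```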

Watch for internal errors in the gateway

Before each benchmark run, tail the API7 Gateway error log and verify that it is clean. An internal error loop (for example, an upstream DNS failure that triggers retries) will silently reduce measured throughput. Set the gateway log level to error and fix anything the log reports before measuring.
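A quick way to check is to count error-level lines before and after a run, for example with grep against the NGINX-style error log. The log path varies by deployment, so the snippet below demonstrates the pattern on an illustrative sample line:

```shell
# Real usage (path varies by deployment):
#   grep -c ' \[error\] ' <path-to-error.log>
# Demonstrated on a sample line in NGINX error-log format:
echo '2024/01/01 12:00:00 [error] 123#0: *1 connect() failed while connecting to upstream' \
  | grep -c ' \[error\] '
```

A non-zero count before the run means something needs fixing; a count that grows during the run means the measured throughput is not trustworthy.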

Verify connection capacity with c1000k

If your target is above a few hundred thousand concurrent connections, verify that the host kernel and file descriptor limits can actually sustain them. The c1000k tool is a small probe that can simulate one million concurrent connections:

# On the server node
./server 7000

# On the client node
./client <server-ip> 7000

