Metrics Reference

AISIX exposes Prometheus metrics on GET /metrics. By default, this endpoint is served on the admin listener. For managed gateways, metrics are served on a dedicated listener configured with observability.metrics.prometheus.addr.

The /metrics endpoint is unauthenticated by design. Keep the listener private to your monitoring network.

Metric families are registered lazily on first observation. Immediately after boot, /metrics can return an empty body. Send one request through the proxy, then scrape again for series to appear.
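
Because families are registered lazily, a scraper or smoke test should probably treat an empty body as "no traffic yet" rather than as a failure. The following is a minimal sketch using only the Python standard library. The function name `metric_families` and the sample payload (including the `openai` and `gpt-4o` label values and the HELP text) are illustrative assumptions, not AISIX output. It assumes the endpoint emits standard `# TYPE` lines from the Prometheus text exposition format.

```python
def metric_families(exposition_text: str) -> dict[str, str]:
    """Map each metric family name to its type, based on '# TYPE' lines."""
    families = {}
    for line in exposition_text.splitlines():
        parts = line.split()
        # A type declaration looks like: # TYPE <name> <type>
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            families[parts[2]] = parts[3]
    return families


# Hypothetical payload, shaped like a scrape taken after one proxied request.
sample = """# HELP aisix_requests_total Total proxy requests.
# TYPE aisix_requests_total counter
aisix_requests_total{provider="openai",model="gpt-4o",status="200",outcome="success"} 1
"""

if __name__ == "__main__":
    # An empty body (for example, right after boot) yields no families.
    print(metric_families(""))
    print(metric_families(sample))
```

A readiness check built on this could retry the scrape after sending one request through the proxy, instead of alerting on the first empty result.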

Request and Latency

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| aisix_requests_total | counter | provider, model, status, outcome | Total proxy requests. outcome is success, client_error, upstream_error, or rate_limited. |
| aisix_request_duration_seconds | histogram | provider, model, status | End-to-end proxy request latency. |
| aisix_llm_requests_total | counter | endpoint, inbound_protocol, provider, model, upstream_model, provider_key_id, api_key_id, team_id, user_id, status, outcome | LLM-shaped requests through the proxy. |
| aisix_llm_request_duration_seconds | histogram | endpoint, inbound_protocol, provider, model, upstream_model, provider_key_id, api_key_id, team_id, user_id, status, outcome | End-to-end latency for LLM requests. |
| aisix_llm_api_latency_seconds | histogram | endpoint, inbound_protocol, provider, model, upstream_model, provider_key_id, api_key_id, team_id, user_id | Upstream API latency only, excluding gateway overhead. |
| aisix_llm_time_to_first_token_seconds | histogram | endpoint, inbound_protocol, provider, model, upstream_model, provider_key_id, api_key_id, team_id, user_id | Time from request entry to first generated token chunk on streaming paths. |
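
Prometheus histograms such as the latency metrics above are exposed as cumulative `_bucket` series keyed by an `le` (upper bound) label. As a rough illustration of how a percentile can be estimated from those buckets, the sketch below uses linear interpolation within a bucket, which is the same general approach as PromQL's `histogram_quantile`. The function name and the bucket boundaries and counts in the example are hypothetical; the actual bucket layout used by AISIX is not stated here.

```python
import math


def estimate_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with the +Inf bucket,
    as in the Prometheus exposition format.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Rank falls in the +Inf bucket (or an empty bucket):
                # report the highest known finite bound.
                return prev_bound
            # Linear interpolation inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    # Hypothetical cumulative buckets: (le, count).
    example = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]
    print(estimate_quantile(0.5, example))
    print(estimate_quantile(0.95, example))
```

In practice this calculation is usually left to PromQL on the server side; the sketch is only meant to show what the bucket data represents.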

Usage and Cost

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| aisix_tokens_consumed_total | counter | provider, model | Sum of usage.total_tokens across completed non-streaming calls. |
| aisix_llm_input_tokens_total | counter | endpoint, inbound_protocol, provider, model, upstream_model, provider_key_id, api_key_id, team_id, user_id | Input tokens reported by the upstream. |
| aisix_llm_output_tokens_total | counter | endpoint, inbound_protocol, provider, model, upstream_model, provider_key_id, api_key_id, team_id, user_id | Output tokens reported by the upstream. |
| aisix_llm_total_tokens_total | counter | endpoint, inbound_protocol, provider, model, upstream_model, provider_key_id, api_key_id, team_id, user_id | Total tokens reported by the upstream. |
| aisix_llm_spend_micro_usd_total | counter | endpoint, inbound_protocol, provider, model, upstream_model, provider_key_id, api_key_id, team_id, user_id | Estimated spend in micro-USD (1 USD = 1,000,000 micro-USD). |
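
Since aisix_llm_spend_micro_usd_total is expressed in micro-USD, dashboards and reports usually need to divide by 1,000,000 to display dollars. A small sketch of that conversion follows; the helper name `micro_usd_to_usd` is an assumption for illustration, and `Decimal` is used here only to avoid floating-point rounding in the displayed amount.

```python
from decimal import Decimal

# From the metric description: 1 USD = 1,000,000 micro-USD.
MICRO_USD_PER_USD = 1_000_000


def micro_usd_to_usd(micro_usd: int) -> Decimal:
    """Convert a micro-USD counter value to USD."""
    return Decimal(micro_usd) / MICRO_USD_PER_USD


if __name__ == "__main__":
    # Hypothetical counter value of 2,500,000 micro-USD.
    print(micro_usd_to_usd(2_500_000))
```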

Proxy Health

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| aisix_proxy_requests_total | counter | endpoint, inbound_protocol, provider, model, upstream_model, provider_key_id, api_key_id, team_id, user_id, status, outcome | All proxy requests with full label granularity. |
| aisix_proxy_failed_requests_total | counter | endpoint, inbound_protocol, provider, model, upstream_model, provider_key_id, api_key_id, team_id, user_id, status, outcome | Subset of aisix_proxy_requests_total where outcome is not success. |
| aisix_proxy_request_duration_seconds | histogram | endpoint, inbound_protocol, provider, model, upstream_model, provider_key_id, api_key_id, team_id, user_id, status, outcome | End-to-end latency with full label granularity. |
| aisix_proxy_in_flight_requests | gauge | endpoint, inbound_protocol | Currently active proxy requests. |
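
Because aisix_proxy_failed_requests_total is a subset of aisix_proxy_requests_total, a failure ratio over a scrape interval can be derived from the increase of each counter. The sketch below is one way to do that by hand, assuming two consecutive scrapes of the same label set; the function names are illustrative. It treats a decrease as a counter reset (for example, after a gateway restart), which is the usual convention for Prometheus counters.

```python
def counter_increase(previous: float, current: float) -> float:
    """Increase between two scrapes; a drop means the counter reset to zero."""
    return current if current < previous else current - previous


def failure_ratio(
    total_prev: float, total_curr: float, failed_prev: float, failed_curr: float
) -> float:
    """Fraction of requests in the interval whose outcome was not success."""
    total = counter_increase(total_prev, total_curr)
    failed = counter_increase(failed_prev, failed_curr)
    return failed / total if total > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical scrape values: 50 new requests, 5 of them failed.
    print(failure_ratio(100, 150, 10, 15))
```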

Rate Limits and Budgets

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| aisix_ratelimit_rejections_total | counter | scope | Rate-limit rejections by scope, such as requests or tokens. |
| aisix_ratelimit_remaining_requests | gauge | api_key_id, model | Remaining request quota for the key/model pair. |
| aisix_ratelimit_remaining_tokens | gauge | api_key_id, model | Remaining token quota for the key/model pair. |
| aisix_budget_limit_usd | gauge | api_key_id, team_id, user_id | Budget limit in USD. |
| aisix_budget_spent_usd | gauge | api_key_id, team_id, user_id | Budget spent in USD. |
| aisix_budget_remaining_usd | gauge | api_key_id, team_id, user_id | Budget remaining in USD. |
| aisix_budget_reset_seconds | gauge | api_key_id, team_id, user_id | Seconds until the budget period resets. |
| aisix_budget_details_present | gauge | api_key_id, team_id, user_id | 1 when budget gauges are populated, 0 when cleared. |

Deployment and Routing

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| aisix_deployment_requests_total | counter | provider, model, upstream_model, provider_key_id | Total requests dispatched to a target model. |
| aisix_deployment_success_responses_total | counter | provider, model, upstream_model, provider_key_id | Successful upstream responses from a target model. |
| aisix_deployment_failure_responses_total | counter | provider, model, upstream_model, provider_key_id | Failed upstream responses from a target model. |
| aisix_deployment_state | gauge | provider, model, upstream_model, provider_key_id | Runtime health state: 0 = healthy, 1 = partial failure, 2 = down. |
| aisix_deployment_cooled_down_total | counter | provider, model, upstream_model, provider_key_id | Times a target model entered cooldown. |
| aisix_routing_successful_fallbacks_total | counter | model | Successful failovers to the next routing candidate. |
| aisix_routing_failed_fallbacks_total | counter | model | Failed failovers where no candidate was available. |
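
The numeric values of aisix_deployment_state are easier to work with in alerting scripts when mapped to names. A minimal sketch, using the three values documented in the table; the `DeploymentState` enum and `describe_state` helper are hypothetical names, not part of AISIX.

```python
from enum import IntEnum


class DeploymentState(IntEnum):
    """Values documented for the aisix_deployment_state gauge."""

    HEALTHY = 0
    PARTIAL_FAILURE = 1
    DOWN = 2


def describe_state(sample_value: float) -> str:
    """Translate a gauge sample into a readable state name.

    Raises ValueError for values outside the documented set.
    """
    return DeploymentState(int(sample_value)).name.lower()


if __name__ == "__main__":
    print(describe_state(0.0))
    print(describe_state(2.0))
```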

Guardrails

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| aisix_guardrail_blocks_total | counter | None | Requests rejected by a guardrail on input or output. |
| aisix_guardrail_bypasses_total | counter | reason | Fail-open events where a remote guardrail was unreachable but fail_open allowed the request through. reason values include bedrock_5xx, bedrock_timeout, bedrock_throttled. |

Usage Events and Exporters

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| aisix_usage_events_emitted_total | counter | handler, status_code, inbound_protocol | Usage events successfully queued for delivery. status_code is bucketed as 2xx, 3xx, 4xx, 5xx, or other. handler is the endpoint name, such as chat, embeddings, or messages. |
| aisix_usage_event_drops_total | counter | reason | Usage events dropped because the sink was full or closed. |
| aisix_otlp_fanout_drops_total | counter | exporter, reason | OTLP trace spans dropped during fan-out. |
| aisix_otlp_fanout_failures_total | counter | exporter | OTLP trace span delivery failures. |
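
The status_code label on aisix_usage_events_emitted_total is bucketed rather than exact, which keeps label cardinality low. The sketch below shows one plausible way such bucketing could work. The exact boundaries AISIX applies (for instance, how 1xx codes are treated) are not documented here, so the ranges in this function are an assumption.

```python
def bucket_status_code(status_code: int) -> str:
    """Collapse an HTTP status code into 2xx, 3xx, 4xx, 5xx, or other.

    Assumption: anything outside 200-599 is reported as "other".
    """
    if 200 <= status_code <= 599:
        return f"{status_code // 100}xx"
    return "other"


if __name__ == "__main__":
    for code in (200, 302, 429, 503, 101):
        print(code, bucket_status_code(code))
```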

Cache

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| aisix_redis_failures_total | counter | operation | Redis cache operation failures when the Redis backend is configured. |