Version: 3.15.0

Proxy Vertex AI Requests

Vertex AI provides access to Google's Gemini models through an OpenAI-compatible API.

This guide shows how to integrate APISIX with Vertex AI using the ai-proxy plugin. With provider set to vertex-ai, you can configure your project and region through provider_conf without specifying a custom endpoint.

Prerequisite(s)

Obtain a Vertex AI Service Account Key

Create a service account and JSON key by following the Google Cloud service account documentation. Ensure the service account has permissions to call Vertex AI (for example, Vertex AI User).

Optionally save the service account JSON to an environment variable:

export GCP_SERVICE_ACCOUNT_JSON="$(cat /path/to/service-account.json)"
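Before using the key, it can help to confirm that the environment variable holds a well-formed service account key. The following is a minimal sketch in Python and is not part of APISIX. The helper name `check_service_account_json` is hypothetical, and the field list is an assumption based on the fields normally found in a Google Cloud service account key file.

```python
import json
import os

# Assumption: fields normally present in a Google Cloud service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_json(raw):
    """Return a list of problems found in the key; an empty list means it looks usable."""
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level value is not a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in key]
    # A missing "type" is already reported above; only flag a wrong value here.
    if key.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {key['type']!r}")
    return problems


if __name__ == "__main__":
    raw = os.environ.get("GCP_SERVICE_ACCOUNT_JSON", "")
    for problem in check_service_account_json(raw) or ["no problems found"]:
        print(problem)
```

This check only inspects the shape of the key. It does not verify that the service account has permission to call Vertex AI.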

Create a Route to Vertex AI

Create a route with the ai-proxy plugin as follows:

curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT -d '{
  "id": "vertex-ai-chat",
  "uri": "/anything",
  "plugins": {
    "ai-proxy": {
      "provider": "vertex-ai",
      "provider_conf": {
        "project_id": "evident-xxx",
        "region": "us-central1"
      },
      "auth": {
        "gcp": {
          "service_account_json": "'"$GCP_SERVICE_ACCOUNT_JSON"'"
        }
      },
      "options": {
        "model": "google/gemini-2.5-flash"
      }
    }
  }
}'

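❶ provider and provider_conf: set the provider to vertex-ai and configure your project_id and region.

❷ auth.gcp.service_account_json: replace with your service account JSON, passed as a single string value.

❸ options.model: set a model supported by Vertex AI, for example google/gemini-2.5-flash.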
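The service account key is itself a JSON document containing double quotes and escaped newlines, so it has to be escaped when it is embedded as a string value in the route body. One way to avoid hand-escaping is to build the body programmatically. The sketch below uses only the Python standard library. The function names `build_route` and `put_route` are hypothetical, and the optional `admin_key` argument is an assumption for deployments where the Admin API expects an `X-API-KEY` header.

```python
import json
import urllib.request

ADMIN_URL = "http://127.0.0.1:9180/apisix/admin/routes"


def build_route(service_account_json, project_id, region, model):
    """Build the route body from this guide; the key is passed through as one string."""
    return {
        "id": "vertex-ai-chat",
        "uri": "/anything",
        "plugins": {
            "ai-proxy": {
                "provider": "vertex-ai",
                "provider_conf": {"project_id": project_id, "region": region},
                "auth": {"gcp": {"service_account_json": service_account_json}},
                "options": {"model": model},
            }
        },
    }


def put_route(route, admin_key=None):
    """Send the route to the Admin API and return the raw response body.

    admin_key is an assumption: pass it only if your Admin API requires X-API-KEY.
    """
    headers = {"Content-Type": "application/json"}
    if admin_key:
        headers["X-API-KEY"] = admin_key
    request = urllib.request.Request(
        ADMIN_URL,
        data=json.dumps(route).encode("utf-8"),  # json.dumps escapes the embedded key
        headers=headers,
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


# Example call (requires a running APISIX instance):
#   import os
#   route = build_route(os.environ["GCP_SERVICE_ACCOUNT_JSON"],
#                       "evident-xxx", "us-central1", "google/gemini-2.5-flash")
#   print(put_route(route).decode("utf-8"))
```

Because `json.dumps` serializes the whole body, the quotes and newlines inside the key are escaped automatically, and the value round-trips unchanged when the body is parsed again.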

Verify

Send a POST request to the route with a system prompt and a user question:

curl "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are a mathematician" },
      { "role": "user", "content": "What is 1+1?" }
    ]
  }'

You should receive a response similar to the following:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "1 + 1 = 2\n"
      },
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 8,
    "extra_properties": {
      "google": {
        "traffic_type": "ON_DEMAND"
      }
    },
    "total_tokens": 19,
    "prompt_tokens": 11
  },
  "object": "chat.completion",
  "model": "google/gemini-2.5-flash",
  ...
}
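A client usually needs only the assistant's reply and the token counts from this response. The following is a minimal sketch of extracting them, assuming the response has the shape shown above. The function name `summarize_completion` is hypothetical, and the sample body in the usage section is a trimmed copy of the example response.

```python
import json


def summarize_completion(body):
    """Extract the first choice's text and the token usage from a chat completion body."""
    response = json.loads(body)
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


if __name__ == "__main__":
    # Trimmed copy of the example response from this guide.
    sample = json.dumps({
        "choices": [{
            "message": {"role": "assistant", "content": "1 + 1 = 2\n"},
            "index": 0,
            "finish_reason": "stop",
        }],
        "usage": {"completion_tokens": 8, "total_tokens": 19, "prompt_tokens": 11},
    })
    print(summarize_completion(sample))
```

The usage fields are read with `.get` so that a response without a `usage` object yields `None` values instead of raising an error.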

Next Steps

You have learned how to integrate APISIX with Vertex AI. See the Vertex AI documentation and Gemini models pages for more details.

To stream responses, enable streaming in your request and use the proxy-buffering plugin to disable NGINX proxy_buffering, so that server-sent events (SSE) are delivered as they arrive rather than being buffered.
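On the client side, a streamed response has to be reassembled from individual SSE lines. The sketch below shows one way to do that. It assumes the stream follows the OpenAI streaming convention, in which each event is a `data:` line carrying a JSON chunk with `choices[].delta.content` and the stream ends with `data: [DONE]`. That chunk shape is an assumption based on the API being described as OpenAI-compatible, and the function name `iter_stream_text` is hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style SSE lines (assumed chunk format)."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI convention
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


if __name__ == "__main__":
    # Hand-written sample lines for illustration, not captured from a real stream.
    demo = [
        'data: {"choices":[{"delta":{"role":"assistant"}}]}',
        "",
        'data: {"choices":[{"delta":{"content":"1 + 1"}}]}',
        'data: {"choices":[{"delta":{"content":" = 2"}}]}',
        "data: [DONE]",
    ]
    print("".join(iter_stream_text(demo)))
```

The generator accepts any iterable of text lines, so it can be fed from a decoded HTTP response body read line by line.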
