
Google Vertex AI Upstream

In this guide, you will configure AISIX AI Gateway to route requests to Google Vertex AI, so that callers can reach Vertex-hosted Gemini and partner models through the gateway's OpenAI-compatible API.

This configuration is for Vertex-hosted models that should use AISIX authentication, model allowlists, rate limits, and usage accounting. AISIX authenticates to Vertex with a GCP OAuth2 bearer token and can mint that token from a service-account key.

Prerequisites

Before starting, prepare the following:

  • A gateway with the admin API on :3001 and the proxy API on :3000.
  • The admin key from the gateway config.yaml.
  • The Vertex AI API enabled in the target GCP project.
  • A Vertex region, such as us-central1, and a service account that can invoke the target model.
  • The GCP project ID, Vertex model ID, caller-facing alias, and optional proxy or private endpoint host.

Configure the Vertex Upstream

Create a Vertex provider key, model alias, and caller API key. The provider key stores the GCP project, region, and credential mode; the model selects the Vertex publisher model ID.

Create a Vertex Provider Key

Create the Vertex provider key with the GCP credential settings:

export AISIX_ADMIN_KEY="admin-local-only-change-me"

curl -sS -X POST "http://127.0.0.1:3001/admin/v1/provider_keys" \
  -H "Authorization: Bearer ${AISIX_ADMIN_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "display_name": "vertex-prod",
    "provider": "google-vertex",
    "adapter": "vertex",
    "secret": "{\"project\":\"my-gcp-project\",\"region\":\"us-central1\",\"service_account_json\":{\"type\":\"service_account\",\"private_key\":\"-----BEGIN PRIVATE KEY-----\\nYOUR_SERVICE_ACCOUNT_PRIVATE_KEY\\n-----END PRIVATE KEY-----\\n\",\"client_email\":\"vertex-sa@my-gcp-project.iam.gserviceaccount.com\",\"token_uri\":\"https://oauth2.googleapis.com/token\"}}"
  }'

Provider key secrets follow the credential-handling behavior described in Provider Keys.

provider labels the upstream.

adapter selects Vertex.

secret is a JSON string with project, region, and exactly one credential mode. The example uses service_account_json as a nested object inside the secret. The region drives the <region>-aiplatform.googleapis.com host unless you override api_base.
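As a rough illustration of the host rule described above, the following sketch derives the regional endpoint from a region and prefers an override when one is supplied. The function name vertex_host and its argument order are hypothetical, and the override is treated as a bare host purely for this example; AISIX resolves the host internally, and this is not its implementation.

```shell
# Hypothetical helper: resolve the Vertex host for a region.
# $1 = region (for example us-central1)
# $2 = optional override standing in for api_base (a bare host in this sketch)
vertex_host() {
  region="$1"
  override="${2:-}"
  if [ -n "${override}" ]; then
    # An explicit override wins over the region-derived host.
    printf '%s\n' "${override}"
  else
    printf '%s\n' "${region}-aiplatform.googleapis.com"
  fi
}

vertex_host "us-central1"
vertex_host "us-central1" "vertex.internal.example.com"
```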

Use service_account_json unless you already manage short-lived GCP access tokens yourself. If you use access_token, you are responsible for refreshing it.
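Because secret is JSON encoded inside a JSON string, every double quote and backslash in the service-account key must be escaped once more, which is why the private key's \n sequences appear as \\n in the request. The sketch below shows one way to produce that escaping with POSIX tools. The json_escape helper is hypothetical and handles only backslashes, double quotes, and line breaks between tokens; for real key files, a JSON-aware tool is a safer choice.

```shell
# Hypothetical helper: escape a JSON document so it can sit inside a JSON string.
json_escape() {
  # Drop raw newlines (valid JSON strings never contain them), then escape
  # backslashes first and double quotes second.
  tr -d '\n' | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Placeholder service-account key, shaped like the example in this guide.
SA_JSON=$(cat <<'EOF'
{
  "type": "service_account",
  "private_key": "-----BEGIN PRIVATE KEY-----\nYOUR_SERVICE_ACCOUNT_PRIVATE_KEY\n-----END PRIVATE KEY-----\n",
  "client_email": "vertex-sa@my-gcp-project.iam.gserviceaccount.com",
  "token_uri": "https://oauth2.googleapis.com/token"
}
EOF
)

# Inner document: project, region, and one credential mode.
SECRET=$(printf '{"project":"%s","region":"%s","service_account_json":%s}' \
  "my-gcp-project" "us-central1" "${SA_JSON}")

# Outer request body: the inner document becomes an escaped string value.
ESCAPED=$(printf '%s' "${SECRET}" | json_escape)
printf '{"display_name":"vertex-prod","provider":"google-vertex","adapter":"vertex","secret":"%s"}\n' "${ESCAPED}"
```

The printed body can be passed to curl with -d in place of the hand-escaped literal.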

Save the returned provider key ID for the model resource.

Create a Model

Map a caller-facing alias to the Vertex model ID:

export PROVIDER_KEY_ID="YOUR_PROVIDER_KEY_ID"

curl -sS -X POST "http://127.0.0.1:3001/admin/v1/models" \
  -H "Authorization: Bearer ${AISIX_ADMIN_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "display_name": "gemini-prod",
    "provider": "google-vertex",
    "model_name": "gemini-2.5-flash",
    "provider_key_id": "'"${PROVIDER_KEY_ID}"'"
  }'

provider uses the same label as the provider key.

model_name is the Vertex publisher model ID.

provider_key_id attaches the model to the Vertex credential.

Other supported families include Claude on Vertex, OpenAI-compatible MaaS models such as Llama, and partner models such as Mistral and AI21.

AISIX chooses the Vertex route from model_name:

Model ID Family | How AISIX Sends It to Vertex
Gemini models, such as gemini-* | Uses the Google Gemini publisher route.
Claude models, such as claude-* | Uses the Anthropic publisher route with an Anthropic Messages body.
OpenAI-compatible MaaS models, such as Llama, DeepSeek, Qwen, GPT-OSS, MiniMax, Moonshot, or Z.ai | Uses Vertex's OpenAI-compatible chat-completions route.
Mistral and AI21 models | Uses the partner publisher route with an OpenAI-compatible body.
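To make the prefix-based selection concrete, here is a minimal sketch of how a model ID could be mapped to one of the four routes. Only the gemini-* and claude-* patterns come from this guide. The mistral-*, jamba-*, and meta/llama-* patterns, the route labels, and the function name vertex_route_for are assumptions chosen for illustration, not the gateway's actual matching rules.

```shell
# Hypothetical illustration of prefix-based route selection.
# Prints a route label and returns non-zero for unmatched model IDs.
vertex_route_for() {
  case "$1" in
    gemini-*)           echo "gemini" ;;             # pattern from this guide
    claude-*)           echo "anthropic" ;;          # pattern from this guide
    mistral-*|jamba-*)  echo "partner" ;;            # assumed patterns
    meta/llama-*)       echo "openai-compatible" ;;  # assumed pattern
    *)                  echo "unsupported-publisher"; return 1 ;;
  esac
}

vertex_route_for "gemini-2.5-flash"
vertex_route_for "claude-example-model"
vertex_route_for "unknown-model" || echo "rejected before any provider call"
```

Checking a new model_name against the gateway in a test environment is the reliable way to confirm which route it takes.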

Create a Caller API Key

Choose the plaintext caller API key that the application will send to AISIX, then hash it for the admin resource:

export AISIX_API_KEY="sk-vertex-caller"

CALLER_KEY_HASH=$(printf '%s' "${AISIX_API_KEY}" | shasum -a 256 | awk '{print $1}')
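The command above relies on shasum, which is not installed on every system; GNU coreutils ships sha256sum instead. The sketch below wraps both in a hypothetical sha256_hex helper. It keeps printf '%s' so that no trailing newline is hashed, since a stray newline would produce a different digest from the one the gateway computes for the caller's key.

```shell
# Hypothetical helper: print the lowercase hex SHA-256 digest of the first argument.
sha256_hex() {
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$1" | sha256sum | awk '{print $1}'
  elif command -v shasum >/dev/null 2>&1; then
    printf '%s' "$1" | shasum -a 256 | awk '{print $1}'
  else
    echo "no SHA-256 tool found" >&2
    return 1
  fi
}

CALLER_KEY_HASH=$(sha256_hex "sk-vertex-caller")
# A SHA-256 digest is always 64 hex characters.
printf '%s\n' "${#CALLER_KEY_HASH}"
```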

Create the API key resource with access to the Vertex-backed model alias:

curl -sS -X POST "http://127.0.0.1:3001/admin/v1/apikeys" \
  -H "Authorization: Bearer ${AISIX_ADMIN_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "key_hash": "'"${CALLER_KEY_HASH}"'",
    "allowed_models": ["gemini-prod"]
  }'

allowed_models must match the model alias you created.

Verify the Upstream

Send a chat-completions request through the AISIX proxy. The example uses Gemini, which requires at least one user or assistant turn.

curl -sS -X POST "http://127.0.0.1:3000/v1/chat/completions" \
  -H "Authorization: Bearer ${AISIX_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-prod",
    "messages": [
      {"role": "user", "content": "Say hello from Vertex."}
    ]
  }'

The gateway returns an OpenAI-compatible response with the caller-facing alias:

{
  "object": "chat.completion",
  "model": "gemini-prod",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello from Vertex!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 4,
    "completion_tokens": 4,
    "total_tokens": 8
  }
}

Check Vertex logs, metrics, quota usage, or provider-side request records for the test request. If AISIX returns a token-minting or upstream authentication error, check the service-account key, region, Vertex API enablement, IAM role, and model access.
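For a scripted smoke test, one quick check is that the response reports the caller-facing alias rather than the Vertex model ID. The sketch below does this with grep on a saved response body. The response_uses_alias helper is hypothetical, and the pattern match is a shortcut that assumes the alias contains no regular-expression metacharacters; a JSON parser is more robust.

```shell
# Hypothetical helper: succeed if the response body names the expected alias.
# $1 = response body, $2 = expected alias (interpreted as a regular expression)
response_uses_alias() {
  printf '%s' "$1" | grep -Eq "\"model\"[[:space:]]*:[[:space:]]*\"$2\""
}

# Sample body shaped like the response shown in this guide.
SAMPLE_RESPONSE='{"object": "chat.completion", "model": "gemini-prod", "choices": []}'

if response_uses_alias "${SAMPLE_RESPONSE}" "gemini-prod"; then
  echo "alias preserved"
else
  echo "unexpected model field" >&2
fi
```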

Behavior and Limits

Publisher selection is prefix-based. If model_name does not match a supported prefix, the gateway rejects the request with an unsupported-publisher configuration error before sending anything to Vertex.

The example uses Gemini because it is the main Google publisher path. For partner models, validate the exact model ID, quota, and regional availability in your Vertex project before exposing the alias to callers.

Provider-key request and response overrides can apply on Vertex routes, but they are most directly useful on OpenAI-compatible routes. Gemini's native contents format does not match every OpenAI-style override target.

Next Steps

You have now configured Google Vertex AI as an upstream provider family. Use the same pattern for other Vertex-hosted models by changing the project, region, credential, model ID, and caller-facing alias.
