Product Release · 2026-05-04

Google Vertex AI embeddings join the multi-provider AI surface

Google Vertex AI embedding models are now addressable from the platform's AI configuration layer alongside OpenAI and Cohere. Endpoint-override support covers on-premises Vertex deployments; provider choice is configuration, not a code change.

Vertex AI embedding models (text-embedding-005, text-embedding-large-exp-03-07 and the multilingual variants) now resolve through the platform's AI configuration layer alongside OpenAI and Cohere. The same RAG, vector-search and semantic-deduplication pipelines that talk to OpenAI today can talk to Vertex after a single configuration change; application code is unaffected.
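To illustrate what "a single configuration change" can look like, here is a minimal sketch of configuration-driven provider resolution. It is not the platform's actual API: `EmbeddingConfig`, `StubEmbedder`, `REGISTRY`, `resolve_embedder` and `index_documents` are hypothetical names, and the stub returns toy vectors instead of calling any provider.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EmbeddingConfig:
    """Hypothetical shape of the embedding entry in the AI configuration."""
    provider: str  # "openai", "cohere" or "vertex"
    model: str     # e.g. "text-embedding-005"


@dataclass(frozen=True)
class StubEmbedder:
    """Stand-in for a real provider client; makes no network calls."""
    provider: str
    model: str

    def __call__(self, texts: List[str]) -> List[List[float]]:
        # Toy one-dimensional "embedding": the length of each text.
        return [[float(len(text))] for text in texts]


def _make_factory(provider: str) -> Callable[[str], StubEmbedder]:
    # Bind the provider name now so each registry entry keeps its own value.
    return lambda model: StubEmbedder(provider=provider, model=model)


# Hypothetical registry: provider name -> factory taking a model name.
REGISTRY: Dict[str, Callable[[str], StubEmbedder]] = {
    name: _make_factory(name) for name in ("openai", "cohere", "vertex")
}


def resolve_embedder(config: EmbeddingConfig) -> StubEmbedder:
    """Pick the embedder named by the configuration."""
    if config.provider not in REGISTRY:
        raise ValueError(f"unknown embedding provider: {config.provider!r}")
    return REGISTRY[config.provider](config.model)


def index_documents(docs: List[str], config: EmbeddingConfig) -> Dict[str, List[float]]:
    """Application code: identical for every provider."""
    embed = resolve_embedder(config)
    return dict(zip(docs, embed(docs)))


if __name__ == "__main__":
    before = EmbeddingConfig(provider="openai", model="text-embedding-3-small")
    after = EmbeddingConfig(provider="vertex", model="text-embedding-005")
    for cfg in (before, after):
        print(cfg.provider, index_documents(["hello", "vertex ai"], cfg))
```

In this sketch only the `EmbeddingConfig` value differs between the two runs; `index_documents` never branches on the provider.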

Global model configuration accepts an endpoint override for on-premises Vertex deployments, where Google's managed service runs inside the customer's VPC and the public endpoint is unreachable. The override applies per environment, so dev and prod can point at different Vertex instances without conditional code.
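The per-environment override can be pictured with the following sketch. The `VertexSettings` type, the `endpoint_override` field, the environment names and the internal hostname are all assumptions made for illustration, not the platform's schema. The fallback URL follows the regional pattern used by Vertex AI's public API, `https://{region}-aiplatform.googleapis.com`.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VertexSettings:
    """Hypothetical per-environment Vertex settings."""
    region: str
    endpoint_override: Optional[str] = None  # set for on-premises deployments


def resolve_endpoint(settings_by_env: Mapping[str, VertexSettings], env: str) -> str:
    """Return the base URL to call for the given environment."""
    settings = settings_by_env[env]
    if settings.endpoint_override:
        # An explicit override wins; trim a trailing slash for consistency.
        return settings.endpoint_override.rstrip("/")
    # Otherwise fall back to the public regional endpoint.
    return f"https://{settings.region}-aiplatform.googleapis.com"


# Illustrative configuration: dev uses the public endpoint, prod an
# in-VPC instance (the hostname is a made-up placeholder).
SETTINGS = {
    "dev": VertexSettings(region="us-central1"),
    "prod": VertexSettings(
        region="europe-west4",
        endpoint_override="https://vertex.internal.example/",
    ),
}

if __name__ == "__main__":
    for env in ("dev", "prod"):
        print(env, "->", resolve_endpoint(SETTINGS, env))
```

The calling code passes only the environment name; which Vertex instance it reaches is decided entirely by the settings table.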

Organisations standardising on Google Cloud infrastructure inherit a complete Google-native AI path: Vertex for the LLM, Vertex for embeddings, BigQuery for the OLAP store, Google Meet for the collaboration layer. Provider choice remains a per-axis decision: the embedding model and the chat model can target different providers when the workload calls for it.
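As a rough sketch of the per-axis idea, the configuration can be modelled as one provider/model pair per axis. The `ModelRef` and `AIConfig` types, the `chat` and `embedding` keys, and the chat model names are illustrative assumptions, not the platform's schema.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ModelRef:
    provider: str
    model: str


@dataclass(frozen=True)
class AIConfig:
    """Hypothetical two-axis configuration: chat and embedding."""
    chat: ModelRef
    embedding: ModelRef


def parse_ai_config(raw: Mapping[str, Mapping[str, str]]) -> AIConfig:
    """Build an AIConfig from a plain mapping, one entry per axis."""
    def ref(axis: str) -> ModelRef:
        entry = raw[axis]
        return ModelRef(provider=entry["provider"], model=entry["model"])

    return AIConfig(chat=ref("chat"), embedding=ref("embedding"))


def is_single_provider(config: AIConfig) -> bool:
    """True when both axes resolve to the same provider."""
    return config.chat.provider == config.embedding.provider


# All-Google path: both axes on Vertex (chat model name is illustrative).
all_google = parse_ai_config({
    "chat": {"provider": "vertex", "model": "gemini-1.5-pro"},
    "embedding": {"provider": "vertex", "model": "text-embedding-005"},
})

# Mixed path: chat on one provider, embeddings on another.
mixed = parse_ai_config({
    "chat": {"provider": "openai", "model": "gpt-4o"},
    "embedding": {"provider": "vertex", "model": "text-embedding-005"},
})

if __name__ == "__main__":
    print(is_single_provider(all_google), is_single_provider(mixed))
```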

