Announcing Envoy AI Gateway 1.0 — A Stable, Production-Ready AI Gateway

Erica Hughberg
Envoy AI Gateway Maintainer - Tetrate
Dan Sun
Envoy AI Gateway Maintainer - Bloomberg
Takeshi Yoneda
Envoy AI Gateway Maintainer - Netflix
Aaron Choo
Envoy AI Gateway Maintainer - Bloomberg
Yao Weng
Envoy AI Gateway Maintainer - Bloomberg
Xunzhuo (Bit) Liu
Envoy AI Gateway Maintainer - Tencent
Ignasi Barrera
Founding Engineer - Tetrate
Johnu George
Envoy AI Gateway Maintainer - Nutanix
Gavrish Prabhu
Envoy AI Gateway Maintainer - Nutanix

Today we're thrilled to announce Envoy AI Gateway 1.0 — the first stable, generally available release of the open source AI gateway built on CNCF's Envoy Gateway.

When we shipped v0.1 in February 2025, we closed that post with three words: "Onward to 1.0!" Sixteen months and many releases later, backed by a community of maintainers and adopters across the industry, we're here. 1.0 means you can build on Envoy AI Gateway with confidence: a control-plane API we're committing to keep stable, running on the same battle-tested Envoy foundation that already powers production traffic at the world's largest companies.

🔒 What 1.0 Means: Stability You Can Build On

The headline of 1.0 isn't a single new feature — it's a promise. Our release policy has always said that we'd cut the major v1.0.0 release once we had a first stable control-plane API. That moment is here.

For stable APIs, the commitment is simple and strong:

We will never break the APIs unless there is a critical security issue, and we will always provide a migration path in the release notes if we ever must.

Concretely, that means:

| Guarantee | What it means for you |
|---|---|
| Stable CRDs | The resources you author (AIGatewayRoute, AIServiceBackend, BackendSecurityPolicy, GatewayConfig, MCPRoute, all served at v1beta1) won't break under you. |
| Predictable upgrades | Upgrading the controller won't break a valid, migrated configuration. |
| Documented migrations | Any future change that requires action will ship with a clear, documented upgrade path. |

This is the foundation enterprises have been asking for — the ability to standardize on a single, provider-agnostic AI gateway without betting on a moving target.

🛣️ The Road from 0.1 to 1.0

v0.1 launched with a unified API in front of two providers and the essentials: upstream authorization and token-based rate limiting. 1.0 is a different animal. Here's how far the project has come:

| Capability | v0.1 (Feb 2025) | 1.0 |
|---|---|---|
| AI providers | 2 (OpenAI, AWS Bedrock) | 16, with cross-provider request/response translation |
| API surface | Chat completions | Chat, completions, embeddings, image generation, audio (transcription / translation / speech), and the OpenAI Responses API |
| MCP (Model Context Protocol) | - | A full MCP gateway: server multiplexing, tool routing & filtering, and fine-grained authorization |
| Multimodal | - | Image, audio, and video inputs across supported providers |
| Observability | Basic metrics | OpenTelemetry tracing, OpenInference, GenAI token metrics, separate reasoning-token accounting |
| Multi-tenancy & routing | Token rate limiting | Hostname-based routing, model virtualization, and quota-aware rate limiting |
| Control-plane API | v1alpha1 (experimental) | v1beta1 (stable) |

Sixteen providers are integrated through a single OpenAI-compatible interface, including OpenAI, Azure OpenAI, Google Gemini, Google Vertex AI, AWS Bedrock, Anthropic, Mistral, Cohere, Groq, Together AI, DeepInfra, DeepSeek, Hunyuan, SambaNova, Grok, and the Tetrate Agent Router Service.

✨ What's in 1.0

One API, every provider

Point your application at a single OpenAI-compatible endpoint and let the gateway handle provider-specific translation, authentication, and routing. Switch or mix providers without touching application code — and use model virtualization to keep your app code stable while routing changes underneath:

    backendRefs:
      - name: openai-backend
        modelNameOverride: "gpt-4o"
      - name: anthropic-backend
        modelNameOverride: "claude-opus-4"

This is the key to A/B testing, gradual migrations, multi-provider strategies, and safeguarding against vendor lock-in.
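
For context, here is a rough sketch of how the backendRefs fragment above might sit inside a complete AIGatewayRoute. Only backendRefs, name, and modelNameOverride come from this post. The resource names, the virtual model name chat-default, the x-ai-eg-model header match, and the weight fields are assumptions modeled on the project's earlier published examples, so check them against the v1beta1 API reference before applying anything.

```yaml
# Sketch only: everything outside the backendRefs entries is an assumption
# to verify against the v1beta1 API reference.
apiVersion: aigateway.envoyproxy.io/v1beta1
kind: AIGatewayRoute
metadata:
  name: chat-route                  # hypothetical route name
  namespace: default
spec:
  parentRefs:
    - name: ai-gateway              # hypothetical Gateway resource
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model   # assumed model-routing header
              value: chat-default   # hypothetical virtual model name sent by the app
      backendRefs:
        - name: openai-backend
          modelNameOverride: "gpt-4o"
          weight: 90                # assumed field: most traffic stays on OpenAI
        - name: anthropic-backend
          modelNameOverride: "claude-opus-4"
          weight: 10                # assumed field: a small share goes to Anthropic
```

In a setup like this, the application would keep requesting chat-default, and shifting the split between providers would be a change to the route only.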

Provider authentication, handled for you

BackendSecurityPolicy keeps provider credentials out of your application and centralizes upstream auth at the gateway: API keys, plus cloud-native identity for AWS, Azure, and GCP (including Workload Identity).
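
As a minimal sketch of the API-key case, a policy might reference a Kubernetes Secret and attach to a backend roughly as follows. The resource names and the placeholder key are hypothetical, and the field layout (targetRefs, type, apiKey.secretRef) is an assumption based on the project's earlier published examples rather than something this post specifies. Confirm it against the v1beta1 API reference.

```yaml
# Sketch only: names are placeholders and the field layout is an assumption
# to verify against the v1beta1 API reference.
apiVersion: v1
kind: Secret
metadata:
  name: openai-apikey               # hypothetical secret name
  namespace: default
type: Opaque
stringData:
  apiKey: REPLACE_WITH_PROVIDER_KEY # placeholder, never commit a real key
---
apiVersion: aigateway.envoyproxy.io/v1beta1
kind: BackendSecurityPolicy
metadata:
  name: openai-apikey-policy        # hypothetical policy name
  namespace: default
spec:
  targetRefs:                       # assumed: attaches the policy to a backend
    - group: aigateway.envoyproxy.io
      kind: AIServiceBackend
      name: openai-backend
  type: APIKey
  apiKey:
    secretRef:
      name: openai-apikey
      namespace: default
```

The point of the design is that the application never sees the provider key: it talks to the gateway, and the gateway injects the credential on the upstream request.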

An MCP gateway for the agent era

Aggregate multiple Model Context Protocol servers behind one endpoint, route and filter the tools clients can see, and enforce fine-grained, CEL-based authorization — so tools/list only ever returns what a caller is actually allowed to use.

Enterprise observability, built in

Token-aware metrics, OpenTelemetry tracing with OpenInference compatibility (for evaluation tools like Arize Phoenix), and separate accounting for reasoning tokens give you the cost control and visibility that AI workloads demand.

Standards all the way down

Built on the Kubernetes Gateway API and the Gateway API Inference Extension, Envoy AI Gateway is an additive layer on Envoy Gateway — it expands what Envoy can do for GenAI traffic without changing how you already deploy and operate it.

🌍 Built by a Community, on CNCF Envoy

1.0 is the work of a genuinely cross-industry community. Maintainers come from Tetrate, Bloomberg, Tencent, Netflix, and Nutanix, alongside a growing roster of independent contributors who join our weekly community meetings, file issues, and ship code.

Just as importantly, the project has been shaped by real-world use. Our thanks to the organizations who used, tested, and shared feedback along the way — including LY Corporation, Alan by Comma Soft, and the National Research Platform.

[Figure: Star History Chart]

Visit us on GitHub and star the repo to show your support.

🔭 What's Next, Beyond 1.0

A stable API is a starting line, not a finish line. On the roadmap:

  • A dedicated MCPBackend CRD, decoupling MCP backend configuration from MCPRoute.
  • Deeper MCP authorization across tools, resources, and prompts.
  • Fuller quota-aware routing that automatically steers around rate-limited upstreams.
  • More provider translation paths and expanded multimodal support.

The roadmap is community-driven — and we'd love your help shaping it.

🚀 Get Involved

| Action | Resource | Description |
|---|---|---|
| Try 1.0 today | Download the release | Get the latest release and start exploring |
| | Follow the getting started guide | Step-by-step setup instructions |
| | Explore the examples | Real-world configuration examples |
| Join the community | Weekly community meetings | Add it to your calendar |
| | Slack #envoy-ai-gateway | Join the conversation on Envoy Slack |
| | GitHub Discussions | Share experiences and ask questions |
| Contribute | Report issues | Help us improve by reporting bugs |
| | Request features | Tell us what you need next |
| | Submit code | Build the next release with us |

🙏 Acknowledgments

1.0 belongs to everyone who got us here: the maintainers and contributors who wrote the code and reviews, the early adopters who tested pre-releases and told us what broke, and the broader Gateway API, Envoy, and CNCF communities whose standards we build on. Thank you.

The future of AI infrastructure is open, stable, and community-driven. We can't wait to see what you build on it.

🚀 This is just the beginning.


Envoy AI Gateway 1.0 is available now. For detailed release notes, API changes, and upgrade guidance, visit our release notes page.