Skip to main content

Envoy AI Gateway v1.0.x

The first stable, generally available release. The v1beta1 control-plane API is now covered by our stability guarantee — with 16 LLM providers, an MCP gateway, multimodal and audio endpoints, enterprise observability, and multi-tenant routing, all on the proven Envoy Gateway foundation.

v1.0.0

June 23, 2026
General AvailabilityStable v1beta1 APISemVer Guarantee16 ProvidersMCP GatewayMultimodal & AudioEnterprise ObservabilityMulti-Tenant Routing
Envoy AI Gateway v1.0.0 marks General Availability. With this release the core control-plane API — AIGatewayRoute, AIServiceBackend, BackendSecurityPolicy, GatewayConfig, and MCPRoute, all served at v1beta1 — is declared stable: within the 1.x series we will not make breaking changes to it unless required by a critical security fix, and any such change will ship with a documented migration path. Upgrading from v0.7 requires no changes to your resources. 1.0 brings together everything built since the first release in February 2025: a single OpenAI-compatible API across 16 providers with cross-provider translation, a full Model Context Protocol gateway, multimodal and audio endpoints, enterprise-grade observability, and multi-tenant, quota-aware routing — all as an additive layer on CNCF Envoy Gateway. This is the foundation we set out to build when we said “Onward to 1.0”: a stable base you can run production AI infrastructure on.

🎉 What 1.0 Means

1.0 is not just another feature release — it is a commitment. From the very first release we said the major version would arrive once we had a first stable control-plane API. With v1.0, that moment is here: the core CRDs you author every day are now a stable contract you can build on with confidence.

Practically, General Availability means three things:

  • A stable API. Your AIGatewayRoute, AIServiceBackend, BackendSecurityPolicy, GatewayConfig, and MCPRoute resources — all served at v1beta1 — will not break under you within the 1.x series.
  • Predictable upgrades. Upgrading the controller will not break a valid, migrated configuration, and any change that ever requires action will ship with a documented path.
  • A complete platform. Everything assembled since v0.1 — 16 providers, a full MCP gateway, multimodal and audio endpoints, enterprise observability, and multi-tenant routing — is now production-ready on the proven Envoy Gateway foundation.

Our API stability commitment

For stable releases, we will never break the APIs unless there is a critical security issue, and we will always provide a migration path in the release notes if we ever must. Following Semantic Versioning, the v1beta1 control-plane API will remain backward compatible for the entire 1.x series — breaking changes would only ever land in a future 2.0. See the full support policy.

🧭 The Road to 1.0

When v0.1 shipped in February 2025, Envoy AI Gateway put a unified API in front of two providers. 1.0 is the culmination of everything built since:

16 LLM providers

From 2 at launch to 16 today — all behind one OpenAI-compatible API.

Full endpoint coverage

Chat, completions, embeddings, images, audio, and the Responses API.

MCP gateway

Multiplex, route, and authorize Model Context Protocol servers.

Enterprise observability

OpenTelemetry + OpenInference tracing and GenAI token metrics.

Multi-tenant routing

Hostname scoping, model virtualization, and quota-aware limits.

A stable v1beta1 API

A SemVer-backed contract you can build production on.

✨ The 1.0 Feature Surface

These are the capabilities the stable 1.0 control plane brings together.

A Stable, Versioned Control-Plane API

Core CRDs now covered by the stability guarantee

The control-plane API you build on — AIGatewayRoute, AIServiceBackend, BackendSecurityPolicy, GatewayConfig, and MCPRoute, all served at v1beta1 — is now a stable contract. Within the 1.x series these APIs will not change in a breaking way unless required by a critical security fix, and any such change will ship with a documented migration path. You can standardize on Envoy AI Gateway without tracking a moving target.

Universal LLM Access

One OpenAI-compatible API across 16 providers

Reach OpenAI, Azure OpenAI, Google Gemini, Google Vertex AI, AWS Bedrock, Anthropic, Mistral, Cohere, Groq, Together AI, DeepInfra, DeepSeek, Hunyuan, SambaNova, Grok, and the Tetrate Agent Router Service through a single endpoint. Point your application at the gateway once and switch or mix providers without changing client code.

Cross-provider request/response translation

Translate between provider protocols transparently — Anthropic /v1/messages to OpenAI /v1/chat/completions, and Anthropic Messages to AWS Bedrock Converse and InvokeModel — including streaming, tool use, reasoning/thinking blocks, and images, so clients reach any backend without rewriting their integration.

Model virtualization with modelNameOverride

Expose stable, application-facing model names while the gateway maps them to provider-specific models. Enables A/B testing, gradual migrations, and multi-provider strategies without touching client code, and guards against provider lock-in.

Full Endpoint Coverage

Chat, completions, embeddings, and images

/v1/chat/completions, /v1/completions, /v1/embeddings, and /v1/images/generations are supported across compatible providers, all behind the same auth, rate limiting, and observability.

Audio: transcription, translation, and speech

/v1/audio/transcriptions, /v1/audio/translations, and /v1/audio/speech bring speech-to-text and text-to-speech workloads through the gateway alongside the rest of your AI traffic.

OpenAI Responses API and multimodal inputs

The /v1/responses endpoint is supported, including on Azure OpenAI backends, and chat requests accept image, audio_url, and video_url content parts for compatible backends.

MCP Gateway

Aggregate and route Model Context Protocol servers

Multiplex multiple MCP servers behind one endpoint with MCPRoute, including tool routing and include/exclude filtering, so clients see one unified tool surface.

Fine-grained, CEL-based authorization

Enforce per-tool authorization. tools/list applies the same rules as tools/call, so callers only discover the tools they are allowed to invoke — no leaked tool names, no wasted LLM turns.

Per-backend header forwarding with JWT claim projection

Forward selected request headers and project JWT claims to individual MCP backends, enabling identity-aware access to downstream tool servers.

Traffic Management & Multi-Tenancy

Hostname-based multi-tenant routing

Serve different model sets per hostname from a single Gateway with AIGatewayRoute.spec.hostnames; the /v1/models endpoint scopes its response to the models for the matching host.

Token- and quota-aware rate limiting

Rate limit on model tokens and per QuotaPolicy, with backend rate limit filter injection to enforce quota-based throttling against upstream provider limits.

Provider fallback and InferencePool support

Automatic failover across providers, plus intelligent endpoint selection for self-hosted models via the Gateway API Inference Extension.

Provider Authentication & Compliance

BackendSecurityPolicy for upstream authentication

Centralize provider credentials with API key, AWS, Azure, and GCP cloud-native identity — including GKE Workload Identity via Application Default Credentials — keeping secrets out of application code.

Request/response body redaction

Redact sensitive request and response bodies to meet compliance requirements without losing the rest of your observability.

Enterprise Observability

OpenTelemetry tracing with OpenInference

Full request-lifecycle tracing, compatible with AI evaluation tools like Arize Phoenix, for deep visibility into every LLM and MCP request.

GenAI token metrics and reasoning-token accounting

Prometheus metrics for token usage, time-to-first-token, and inter-token latency, with separate accounting for reasoning tokens so cost attribution stays accurate for thinking models.

🔗 API & Stability

  • The v1beta1 API is now stable: v1.0 does not change the API surface. Instead it elevates the existing v1beta1 CRDs to a stable contract under our support policy: no new apiVersion is introduced and no resource migration is required. New fields added during the 1.x series will remain backward compatible.

⚠️ Breaking Changes

None. v1.0 introduces no breaking changes. The v1beta1 API is unchanged — 1.0 declares it stable rather than altering it — so there is no apiVersion bump and no resource migration. If you are running v0.7, your existing resources work as-is.

🛡️ Support & Compatibility Policy

With 1.0, the project's support policy applies in full:

  • API compatibility. The v1beta1 CRDs are stable for the 1.x series. New fields are added in a backward-compatible way; breaking changes are reserved for a future major version and would ship with a migration path.
  • Controller upgrades. Upgrading the controller will not break a valid configuration. As with every release, upgrade at most two minor versions at a time, following any documented migration steps.
  • Envoy Gateway compatibility. Each release is built on the latest stable Envoy Gateway (and therefore Envoy Proxy); keep Envoy Gateway up to date before upgrading Envoy AI Gateway.
  • End of life. A release is supported until two releases after it, consistent with prior versions.

📖 Upgrade Guidance

Upgrading from v0.7 is a drop-in change — there are no API or resource changes:

  1. Update the Helm chart / controller image to the v1.0.0 release.
  2. Roll out as usual. Your existing v1beta1 resources require no edits.

If you are on an older release, upgrade one or two minor versions at a time and follow the migration steps in each series' release notes (notably the v0.6 promotion of the core CRDs to v1beta1) before moving to 1.0.

📦 Dependencies Versions

Go 1.26.4

Built with Go 1.26.4.

Envoy Gateway v1.8.1

Built on Envoy Gateway v1.8.1 for proven data plane capabilities.

Envoy v1.38

Leveraging Envoy Proxy v1.38.1 for battle-tested networking.

Gateway API v1.5.1

Support for Gateway API v1.5.1 specifications.

Gateway API Inference Extension v1.0.2

Continued integration with Gateway API Inference Extension v1.0.2 for intelligent endpoint selection.

MCP Go SDK v1.6.1

Built on MCP Go SDK v1.6.1 for the latest Model Context Protocol features.

⏩ Patch Releases

🙏 Acknowledgements

1.0 belongs to everyone who got us here. Our deepest thanks to:

  • The maintainers across Tetrate, Bloomberg, Tencent, and Nutanix, and the many independent contributors who shaped the project through code, reviews, and weekly community meetings.
  • The early adopters — including LY Corporation, Alan by Comma Soft, and NRP — who used, tested, and fed back what mattered.
  • The broader Gateway API, Envoy, and CNCF communities whose standards this project is built on.

🔮 What's Next (Beyond 1.0)

A stable API is a starting line, not a finish line. On the roadmap:

  • A dedicated MCPBackend CRD, decoupling MCP backend configuration from MCPRoute.
  • Deeper MCP authorization across tools, resources, and prompts.
  • Fuller quota-aware routing that automatically steers around rate-limited upstreams.
  • More provider translation paths and expanded multimodal support.

The roadmap is community-driven — join us and help shape it.