
Supported API Endpoints

The Envoy AI Gateway provides OpenAI-compatible and Anthropic-compatible API endpoints for routing and managing LLM/AI traffic. This page documents which of those endpoints are currently supported and what each one can do.

Overview

The Envoy AI Gateway acts as a proxy that accepts OpenAI-compatible and Anthropic-compatible requests and routes them to various AI providers. While it maintains compatibility with the OpenAI API specification, it currently supports a subset of the full OpenAI API.

Supported Endpoints

Chat Completions

Endpoint: POST /v1/chat/completions

Status: ✅ Fully Supported

Description: Create a chat completion response for the given conversation.

Features:

  • ✅ Streaming and non-streaming responses
  • ✅ Function calling
  • ✅ Response format specification (including JSON schema)
  • ✅ Temperature, top_p, and other sampling parameters
  • ✅ System and user messages
  • ✅ Model selection via request body or x-ai-eg-model header
  • ✅ Token usage tracking and cost calculation
  • ✅ Provider fallback and load balancing

Supported Providers:

  • OpenAI
  • AWS Bedrock (with automatic translation)
  • Azure OpenAI (with automatic translation)
  • GCP VertexAI (with automatic translation)
  • GCP Anthropic (with automatic translation)
  • Any OpenAI-compatible provider (Groq, Together AI, Mistral, Tetrate Agent Router Service, etc.)

Example:

curl -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ]
  }' \
  $GATEWAY_URL/v1/chat/completions
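
When `stream: true` is set, this endpoint returns Server-Sent Events in the standard OpenAI chunk format. The sketch below assumes that chunk shape; the helper name and the sample lines are illustrative, not part of the gateway itself:

```python
import json

def collect_stream_content(sse_lines):
    """Concatenate content deltas from an OpenAI-style SSE stream.

    Each event line looks like 'data: {...}'; the stream ends with
    'data: [DONE]'. Only the delta content of each chunk's first
    choice is kept.
    """
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        if not chunk.get("choices"):
            continue  # e.g. a final usage-only chunk
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)

# Illustrative sample of what a streamed chat completion looks like.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    'data: [DONE]',
]
print(collect_stream_content(sample))  # Hello, world
```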

Anthropic Messages

Endpoint: POST /anthropic/v1/messages

Status: ✅ Fully Supported

Description: Send a structured list of input messages with text and/or image content, and the model will generate the next message in the conversation.

Features:

  • ✅ Streaming and non-streaming responses
  • ✅ Function calling
  • ✅ Extended thinking
  • ✅ Response format specification (including JSON schema)
  • ✅ Temperature, top_p, and other sampling parameters
  • ✅ System and user messages
  • ✅ Model selection via request body or x-ai-eg-model header
  • ✅ Token usage tracking and cost calculation
  • ✅ Provider fallback and load balancing

Supported Providers:

  • Anthropic
  • GCP Anthropic

Example:

curl -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4",
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ],
    "max_tokens": 100
  }' \
  $GATEWAY_URL/anthropic/v1/messages
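
The extended-thinking feature listed above is enabled through the request body. A minimal payload sketch, assuming the Anthropic `thinking` parameter shape; the model name and token budgets are illustrative values:

```python
import json

# Sketch of an Anthropic Messages request with extended thinking enabled.
# Model name and token budgets are illustrative values.
payload = {
    "model": "claude-sonnet-4",
    "max_tokens": 2048,  # required by the Messages API
    "system": "You are a concise assistant.",
    "messages": [
        {"role": "user", "content": "Hello, how are you?"}
    ],
    # Extended thinking: the model emits thinking blocks before its answer.
    # budget_tokens must be smaller than max_tokens.
    "thinking": {"type": "enabled", "budget_tokens": 1024},
}
body = json.dumps(payload)
print(body)
```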

Completions

Endpoint: POST /v1/completions

Status: ✅ Fully Supported

Description: Create a text completion for the given prompt (legacy endpoint).

Features:

  • ✅ Streaming and non-streaming responses
  • ✅ Model selection via request body or x-ai-eg-model header
  • ✅ Temperature, top_p, and other sampling parameters
  • ✅ Single and batch prompt processing
  • ✅ Token usage tracking and cost calculation
  • ✅ Provider fallback and load balancing
  • ✅ Full metrics support (token usage, request duration, time to first token, inter-token latency)

Supported Providers:

  • OpenAI
  • Any OpenAI-compatible provider that supports completions

Example:

curl -H "Content-Type: application/json" \
  -d '{
    "model": "babbage-002",
    "prompt": "def fib(n):\n if n <= 1:\n return n\n else:\n return fib(n-1) + fib(n-2)",
    "max_tokens": 25,
    "temperature": 0.4,
    "top_p": 0.9
  }' \
  $GATEWAY_URL/v1/completions
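
Token usage tracking means each response carries an OpenAI-style `usage` object, from which cost can be derived. A minimal sketch; the per-million-token rates below are placeholders, not real provider pricing:

```python
def completion_cost(usage, prompt_rate, completion_rate):
    """Compute request cost in dollars from an OpenAI-style usage object.

    Rates are dollars per one million tokens; the values passed below
    are placeholders, not actual provider pricing.
    """
    return (usage["prompt_tokens"] * prompt_rate
            + usage["completion_tokens"] * completion_rate) / 1_000_000

# usage object as returned in a (non-streaming) completion response
usage = {"prompt_tokens": 40, "completion_tokens": 25, "total_tokens": 65}
print(completion_cost(usage, prompt_rate=0.40, completion_rate=1.60))
```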

Embeddings

Endpoint: POST /v1/embeddings

Status: ✅ Fully Supported

Description: Create embeddings for the given input text.

Features:

  • ✅ Single and batch text embedding
  • ✅ Model selection via request body or x-ai-eg-model header
  • ✅ Token usage tracking and cost calculation
  • ✅ Provider fallback and load balancing

Supported Providers:

  • OpenAI
  • Any OpenAI-compatible provider that supports embeddings, including Azure OpenAI.
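
Batch embedding returns one vector per input, which makes similarity comparisons straightforward. A sketch using tiny made-up vectors shaped like a `/v1/embeddings` response (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Shaped like a /v1/embeddings response; the vectors are made up.
response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2, 0.3]},
        {"object": "embedding", "index": 1, "embedding": [0.1, 0.2, 0.25]},
    ],
}
a, b = (item["embedding"] for item in response["data"])
print(round(cosine_similarity(a, b), 3))
```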

Models

Endpoint: GET /v1/models

Description: List available models configured in the AI Gateway.

Features:

  • ✅ Returns models declared in AIGatewayRoute configurations
  • ✅ OpenAI-compatible response format
  • ✅ Model metadata (ID, owned_by, created timestamp)

Example:

curl $GATEWAY_URL/v1/models

Response Format:

{
  "object": "list",
  "data": [
    {
      "id": "gpt-4o-mini",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai"
    }
  ]
}
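
The model listing can also be consumed programmatically. A small sketch that extracts the model IDs from the response format shown above:

```python
import json

# A /v1/models response in the format documented above.
raw = """
{
  "object": "list",
  "data": [
    {
      "id": "gpt-4o-mini",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai"
    }
  ]
}
"""

models = json.loads(raw)
ids = [m["id"] for m in models["data"]]
print(ids)  # ['gpt-4o-mini']
```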

Provider-Endpoint Compatibility Table

The following table summarizes which providers support which endpoints:

| Provider | Chat Completions | Completions | Embeddings | Anthropic Messages | Notes |
|---|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ✅ | ❌ | |
| AWS Bedrock | ✅ | 🚧 | 🚧 | ❌ | Via API translation |
| Azure OpenAI | ✅ | 🚧 | ✅ | ❌ | Via API translation or via OpenAI-compatible API |
| Google Gemini | ⚠️ | ❌ | ❌ | ❌ | Via OpenAI-compatible API |
| Groq | ✅ | ❌ | ❌ | ❌ | Via OpenAI-compatible API |
| Grok | ⚠️ | ❌ | ❌ | ❌ | Via OpenAI-compatible API |
| Together AI | ⚠️ | ⚠️ | ⚠️ | ❌ | Via OpenAI-compatible API |
| Cohere | ⚠️ | ⚠️ | ⚠️ | ❌ | Via OpenAI-compatible API |
| Mistral | ⚠️ | ⚠️ | ⚠️ | ❌ | Via OpenAI-compatible API |
| DeepInfra | ⚠️ | ❌ | ❌ | ❌ | Via OpenAI-compatible API |
| DeepSeek | ⚠️ | ⚠️ | ❌ | ❌ | Via OpenAI-compatible API |
| Hunyuan | ⚠️ | ⚠️ | ⚠️ | ❌ | Via OpenAI-compatible API |
| Tencent LLM Knowledge Engine | ⚠️ | ❌ | ❌ | ❌ | Via OpenAI-compatible API |
| Tetrate Agent Router Service (TARS) | ⚠️ | ⚠️ | ⚠️ | ❌ | Via OpenAI-compatible API |
| Google Vertex AI | ✅ | 🚧 | 🚧 | ❌ | Via OpenAI-compatible API |
| Anthropic on Vertex AI | ✅ | 🚧 | ❌ | ✅ | Via OpenAI-compatible API and Native Anthropic API |
| SambaNova | ⚠️ | ❌ | ❌ | ❌ | Via OpenAI-compatible API |
| Anthropic | ✅ | ❌ | ❌ | ✅ | Via OpenAI-compatible API and Native Anthropic API |
  • ✅ - Supported and tested in Envoy AI Gateway CI
  • ⚠️ - Expected to work based on provider documentation, but not tested in CI
  • ❌ - Not supported according to provider documentation
  • 🚧 - Unimplemented or under active development, planned for a future release

What's Next

To learn more about configuring and using the Envoy AI Gateway with these endpoints: