OpenAI-compatible providers

Code examples on this page have been automatically tested and verified.

Use agentgateway to configure any LLM provider that implements the OpenAI API format. Set the openAI provider type with hostOverride to point to the provider's API host, and add pathOverride if the provider uses a non-standard chat completions path.

Before you begin

Install the agentgateway binary.

You also need an API key for your chosen provider. Local providers such as Ollama do not require one.

Cloud providers

xAI (Grok)

xAI provides OpenAI-compatible endpoints for their Grok models.

cat > /tmp/test-xai.yaml << 'EOF'
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: openAI
    params:
      apiKey: "$XAI_API_KEY"
      hostOverride: "api.x.ai:443"
    backendTLS: {}
EOF
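To try the configuration, start the gateway and send an OpenAI-format request through it. This is a sketch, not verified output: it assumes XAI_API_KEY is exported, the agentgateway binary is on your PATH and accepts a config file via -f, the gateway exposes the standard /v1/chat/completions path on the configured port, and grok-2-latest is only an illustrative model name.

```shell
# Start the gateway in the background with the xAI config
agentgateway -f /tmp/test-xai.yaml &

# Send an OpenAI-format chat completion request through the gateway;
# agentgateway forwards it to api.x.ai with your API key attached
curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-2-latest",
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
```

The same pattern works for every cloud provider on this page; only the config file and model name change.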

Cohere

Cohere provides an OpenAI-compatible endpoint for their models. Cohere uses a custom API path, so pathOverride is required.

cat > /tmp/test-cohere.yaml << 'EOF'
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: openAI
    params:
      model: command-r-plus
      apiKey: "$COHERE_API_KEY"
      hostOverride: "api.cohere.ai:443"
      pathOverride: "/compatibility/v1/chat/completions"
    backendTLS: {}
EOF

Together AI

Together AI provides access to open-source models via OpenAI-compatible endpoints.

cat > /tmp/test-together.yaml << 'EOF'
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: llama-3.2-90b
    provider: openAI
    params:
      model: meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo
      apiKey: "$TOGETHER_API_KEY"
      hostOverride: "api.together.xyz:443"
    backendTLS: {}
EOF
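This entry illustrates model aliasing: the client-facing name differs from params.model. Assuming the gateway rewrites the forwarded model field to params.model, as described in the settings table at the end of this page, a client request might look like the following sketch (it requires TOGETHER_API_KEY and a running gateway).

```shell
# The "model" in the request is matched against the entry's name field
# ("llama-3.2-90b"); agentgateway then forwards the request to Together AI
# with params.model (meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo)
curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.2-90b", "messages": [{"role": "user", "content": "Hi"}]}'
```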

Groq

Groq provides fast inference via OpenAI-compatible endpoints. Groq uses a custom API path, so pathOverride is required.

cat > /tmp/test-groq.yaml << 'EOF'
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: openAI
    params:
      model: llama-3.3-70b-versatile
      apiKey: "$GROQ_API_KEY"
      hostOverride: "api.groq.com:443"
      pathOverride: "/openai/v1/chat/completions"
    backendTLS: {}
EOF

Fireworks AI

Fireworks AI offers fast inference for open-source models via OpenAI-compatible API. Fireworks uses a custom API path, so pathOverride is required.

cat > /tmp/test-fireworks.yaml << 'EOF'
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: openAI
    params:
      model: accounts/fireworks/models/llama-v3p1-70b-instruct
      apiKey: "$FIREWORKS_API_KEY"
      hostOverride: "api.fireworks.ai:443"
      pathOverride: "/inference/v1/chat/completions"
    backendTLS: {}
EOF

DeepSeek

DeepSeek provides access to their reasoning and chat models via OpenAI-compatible API.

cat > /tmp/test-deepseek.yaml << 'EOF'
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: openAI
    params:
      model: deepseek-chat
      apiKey: "$DEEPSEEK_API_KEY"
      hostOverride: "api.deepseek.com:443"
    backendTLS: {}
EOF

Mistral

Mistral La Plateforme provides access to Mistral models via OpenAI-compatible endpoints.

cat > /tmp/test-mistral.yaml << 'EOF'
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: openAI
    params:
      model: mistral-large-latest
      apiKey: "$MISTRAL_API_KEY"
      hostOverride: "api.mistral.ai:443"
    backendTLS: {}
EOF

Perplexity

Perplexity provides OpenAI-compatible chat completion endpoints with built-in web search.

cat > /tmp/test-perplexity.yaml << 'EOF'
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: openAI
    params:
      model: llama-3.1-sonar-large-128k-online
      apiKey: "$PERPLEXITY_API_KEY"
      hostOverride: "api.perplexity.ai:443"
    backendTLS: {}
EOF

Self-hosted solutions

Ollama

Ollama runs models locally and provides an OpenAI-compatible API. For a dedicated setup guide, see Ollama.

cat > /tmp/test-ollama.yaml << 'EOF'
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: openAI
    params:
      hostOverride: "localhost:11434"
EOF
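Because Ollama runs locally over plain HTTP, the end-to-end test needs no API key and no backendTLS. This sketch assumes Ollama is running on its default port, agentgateway accepts a config file via -f, and llama3.2 is only an illustrative model name.

```shell
# Pull a local model first (any model you have pulled works)
ollama pull llama3.2

# Start the gateway with the Ollama config
agentgateway -f /tmp/test-ollama.yaml &

# The request goes to the gateway, which forwards it to Ollama on localhost:11434
curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hi"}]}'
```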

vLLM

vLLM is a high-performance LLM serving engine for self-hosted deployments.

cat > /tmp/test-vllm.yaml << 'EOF'
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: openAI
    params:
      model: meta-llama/Llama-3.1-8B-Instruct
      hostOverride: "localhost:8000"
EOF
ℹ️ If your vLLM server uses HTTPS, add backendTLS: {} to the model configuration and include port 443 in hostOverride.
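To serve the model that the config above points at, you can use vLLM's built-in OpenAI-compatible server. This is a sketch under assumptions: the vllm serve entrypoint ships with recent vLLM releases, a GPU is available, and downloading meta-llama/Llama-3.1-8B-Instruct requires Hugging Face access to that gated model.

```shell
# Start vLLM's OpenAI-compatible server on the port the gateway config
# expects (hostOverride: "localhost:8000")
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000
```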

LM Studio

LM Studio provides a desktop application for running models locally with an OpenAI-compatible API.

cat > /tmp/test-lmstudio.yaml << 'EOF'
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: openAI
    params:
      hostOverride: "localhost:1234"
EOF
Enable the local server in LM Studio: Settings > Local Server > Start Server.

Generic configuration

For any OpenAI-compatible provider, use this template:

# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: openAI
    params:
      apiKey: "$PROVIDER_API_KEY"
      hostOverride: "<provider-host>:<port>"
      pathOverride: "<custom-path>"  # only if non-standard
    backendTLS: {}  # only for HTTPS providers

Review the following settings to understand this configuration.

  • name: The model name to match in incoming requests. Use * to match any model name.
  • provider: Set to openAI for OpenAI-compatible providers.
  • params.model: The model name that the provider expects. If omitted, the model from the client request is passed through.
  • params.apiKey: The provider's API key. Reference environment variables with the $VAR_NAME syntax.
  • params.hostOverride: The provider's API host and port, such as api.example.com:443.
  • params.pathOverride: The request path for providers that use non-standard endpoints, such as /openai/v1/chat/completions. Omit for providers that use the standard /v1/chat/completions path.
  • backendTLS: Enables TLS for the upstream connection. Required for HTTPS providers; omit for local HTTP providers.
