Content-based routing

Route requests to different LLM backends based on request body content, such as the requested model name.

About content-based routing

Content-based routing (also known as body-based routing or intelligent routing) allows you to route requests to different backends based on the content of the request body, not just headers or path. This is particularly useful for LLM applications where you want to route to different providers based on the model field in the request JSON.

For example, you might want to:

Route gpt-4 requests to OpenAI and claude-3 requests to Anthropic
Direct certain models to specific backend endpoints based on cost or performance
Route different model families to dedicated infrastructure

Agentgateway implements content-based routing in two parts: a transformation policy that runs in the PreRouting phase copies values from the request body into headers, and header-based routing rules then select the appropriate backend.

How it works

Content-based routing works in two steps:

Extract body field to header: Use a transformation policy that targets the Gateway and runs in the PreRouting phase to extract a field from the JSON request body (such as model) into a custom header.
Match on header: Use standard header matching in the HTTPRoute to route based on that header value.

This pattern lets you route based on any field in the request body while using the standard Gateway API routing capabilities.
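
The two steps can be modeled outside the gateway to make the data flow concrete. The following Python sketch is only an illustration, not agentgateway's implementation: the function names extract_to_header and select_backend are hypothetical, the rule table mirrors the x-model example used later in this guide, and Python's re module stands in for whatever regex engine the proxy uses.

```python
import json
import re

# Hypothetical rule table: (header name, regex, backend name).
RULES = [
    ("x-model", r"^gpt-.*", "openai-backend"),
    ("x-model", r"^claude-.*", "anthropic-backend"),
]


def extract_to_header(body: bytes, field: str, header: str, headers: dict) -> dict:
    """Step 1: copy one top-level JSON body field into a request header."""
    value = json.loads(body).get(field)
    updated = dict(headers)  # leave the caller's headers untouched
    if value is not None:
        updated[header] = str(value)
    return updated


def select_backend(headers: dict, rules=RULES):
    """Step 2: return the backend of the first rule whose regex matches."""
    for header, pattern, backend in rules:
        if re.search(pattern, headers.get(header, "")):
            return backend
    return None  # no rule matched


body = b'{"model": "gpt-4o", "messages": []}'
headers = extract_to_header(body, "model", "x-model", {"content-type": "application/json"})
print(headers["x-model"], "->", select_backend(headers))
```

The routing step never reads the body. It only sees the header that the extraction step produced, which is why the extraction has to happen first.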

Before you begin

Set up an agentgateway proxy.
Set up API access to each LLM provider that you want to route to.

Route by model name

This example shows how to route requests to different backends based on the model field in the request body.

Create multiple AgentgatewayBackend resources for different models. This example creates backends for OpenAI and Anthropic models.

kubectl apply -f- <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: openai-backend
  namespace: agentgateway-system
spec:
  ai:
    provider:
      openai:
        model: gpt-4o
  policies:
    auth:
      secretRef:
        name: openai-secret
---
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: anthropic-backend
  namespace: agentgateway-system
spec:
  ai:
    provider:
      anthropic:
        model: claude-3-5-sonnet-latest
  policies:
    auth:
      secretRef:
        name: anthropic-secret
EOF

Create an HTTPRoute with multiple rules that match on the x-model header. The transformation policy that you create in the next step extracts the model name from the request body into this header.

kubectl apply -f- <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: content-routing
  namespace: agentgateway-system
spec:
  parentRefs:
    - name: agentgateway-proxy
      namespace: agentgateway-system
  rules:
    # Route GPT models to OpenAI
    - matches:
        - path:
            type: PathPrefix
            value: /v1/chat/completions
          headers:
            - type: RegularExpression
              name: x-model
              value: "^gpt-.*"
      backendRefs:
        - name: openai-backend
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
    # Route Claude models to Anthropic
    - matches:
        - path:
            type: PathPrefix
            value: /v1/chat/completions
          headers:
            - type: RegularExpression
              name: x-model
              value: "^claude-.*"
      backendRefs:
        - name: anthropic-backend
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
EOF
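
Because each rule matches on a regular expression, a model name that fits neither pattern is not handled by this HTTPRoute. One way to check coverage before applying the route is to test candidate model names against the same patterns. The sketch below does that in Python. It assumes Python's re semantics are close enough to the proxy's regex engine for these simple prefixes, and the sample model names and the unmatched helper are illustrative only.

```python
import re

# Patterns copied from the HTTPRoute rules above.
PATTERNS = {
    "openai-backend": r"^gpt-.*",
    "anthropic-backend": r"^claude-.*",
}


def unmatched(model_names):
    """Return the model names that no header rule would match."""
    return [
        name
        for name in model_names
        if not any(re.search(pattern, name) for pattern in PATTERNS.values())
    ]


# Illustrative names; substitute the models your clients actually request.
candidates = ["gpt-4o", "claude-3-5-sonnet-latest", "o1-mini", "GPT-4o"]
print(unmatched(candidates))
```

Any name that this check reports would need its own rule, a broader pattern, or a fallback rule without a header match.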

Create an AgentgatewayPolicy resource to extract the model field from the request body into the x-model header. The transformation uses a CEL expression to parse the JSON body and read the model field. This policy must target the Gateway and set phase: PreRouting so that it runs before route selection.

kubectl apply -f- <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: extract-model
  namespace: agentgateway-system
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: agentgateway-proxy
  traffic:
    phase: PreRouting
    transformation:
      request:
        set:
        - name: "x-model"
          value: 'json(request.body).model'
EOF

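Send a request with gpt-4o in the model field. Verify that the request routes to the OpenAI backend. Use the first command if the gateway address is stored in $INGRESS_GW_ADDRESS, or the second command if the proxy is reachable on localhost:8080 (for example, through a port-forward).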

curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json -d '{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "Say hello"}]
}' | jq -r '.model'

Example output:

gpt-4o-2024-08-06

curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "Say hello"}]
}' | jq -r '.model'

Example output:

gpt-4o-2024-08-06

Send a request with claude-3-5-sonnet-latest in the model field. Verify that the request routes to the Anthropic backend. As before, use the first command for $INGRESS_GW_ADDRESS or the second command for localhost:8080.

curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json -d '{
  "model": "claude-3-5-sonnet-latest",
  "messages": [{"role": "user", "content": "Say hello"}]
}' | jq -r '.model'

Example output:

claude-3-5-sonnet-20241022

curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
  "model": "claude-3-5-sonnet-latest",
  "messages": [{"role": "user", "content": "Say hello"}]
}' | jq -r '.model'

Example output:

claude-3-5-sonnet-20241022

Route by custom field

You can extract any field from the request body for routing decisions, not just the model field.

This example routes on a custom priority field in the request body so that high-priority requests go to dedicated infrastructure.

Create backends for different priority levels.

kubectl apply -f- <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: high-priority-backend
  namespace: agentgateway-system
spec:
  ai:
    provider:
      openai:
        model: gpt-4o
  policies:
    auth:
      secretRef:
        name: openai-secret
---
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: standard-priority-backend
  namespace: agentgateway-system
spec:
  ai:
    provider:
      openai:
        model: gpt-4o-mini
  policies:
    auth:
      secretRef:
        name: openai-secret
EOF

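Create an HTTPRoute with rules that match on the x-priority header. The first rule sends requests with x-priority: high to the high-priority backend. The second rule has no header match, so it catches every other request. The same pattern works for other custom fields, such as user_tier.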

kubectl apply -f- <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: priority-routing
  namespace: agentgateway-system
spec:
  parentRefs:
    - name: agentgateway-proxy
      namespace: agentgateway-system
  rules:
    # Route requests with x-priority: high to the high-priority backend
    - matches:
        - path:
            type: PathPrefix
            value: /v1/chat/completions
          headers:
            - type: Exact
              name: x-priority
              value: "high"
      backendRefs:
        - name: high-priority-backend
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
    # Fallback: route all other requests to the standard-priority backend
    - matches:
        - path:
            type: PathPrefix
            value: /v1/chat/completions
      backendRefs:
        - name: standard-priority-backend
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
EOF

Create an AgentgatewayPolicy to extract the custom field. Use the has() macro to provide a default value if the field is not present. This policy must target the Gateway and set phase: PreRouting so that it runs before route selection.

kubectl apply -f- <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: extract-priority
  namespace: agentgateway-system
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: agentgateway-proxy
  traffic:
    phase: PreRouting
    transformation:
      request:
        set:
        - name: "x-priority"
          value: 'has(json(request.body).priority) ? json(request.body).priority : "standard"'
EOF
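
The has() expression behaves like a lookup with a fallback. As a rough Python analogy (not CEL itself, and with a hypothetical function name priority_header), the value written to x-priority can be thought of as follows. The sketch assumes that priority is a string whenever it is present.

```python
import json


def priority_header(body: bytes) -> str:
    """Mimic: has(json(body).priority) ? json(body).priority : "standard"."""
    parsed = json.loads(body)
    # Treat has() as a presence check on the parsed object (assumption).
    if "priority" in parsed:
        return parsed["priority"]
    return "standard"


print(priority_header(b'{"model": "gpt-4o", "priority": "high"}'))
print(priority_header(b'{"model": "gpt-4o"}'))
```

A value other than high, such as low, passes through unchanged. It does not match the Exact rule on x-priority, so it falls to the rule without a header match.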

Test the routing by sending requests with different priority values.

curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
  "model": "gpt-4o",
  "priority": "high",
  "messages": [{"role": "user", "content": "Urgent request"}]
}' | jq -r '.model'

This request routes to the high-priority backend, which uses gpt-4o.

curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "Normal request"}]
}' | jq -r '.model'

This request has no priority field, so it routes to the standard-priority backend, which uses gpt-4o-mini.

Known limitations

When implementing content-based routing, be aware of these limitations:

⚠️ PreRouting phase required: Content-based routing requires traffic.phase: PreRouting and must target the Gateway (not the HTTPRoute). This way, transformations run before route selection. Without PreRouting, the extracted header arrives too late for route matching.

Performance impact: Extracting fields from the request body adds processing overhead. For high-throughput scenarios, consider using header-based routing when possible.
JSON parsing: The json() CEL function requires valid JSON. Malformed JSON in the request body will cause routing failures.
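
One way to keep malformed bodies from reaching the gateway is to validate the payload on the client before sending it. The following Python sketch shows such a pre-flight check. The helper name routable_field is hypothetical, and the check is a client-side convenience, not part of agentgateway.

```python
import json


def routable_field(raw_body: str, field: str = "model"):
    """Return the routing field if the body is a JSON object containing it, else None."""
    try:
        parsed = json.loads(raw_body)
    except json.JSONDecodeError:
        return None  # malformed JSON: no field can be extracted from this body
    if not isinstance(parsed, dict):
        return None  # valid JSON, but not an object with named fields
    return parsed.get(field)


print(routable_field('{"model": "gpt-4o", "messages": []}'))
print(routable_field('{"model": "gpt-4o", "messages": [}'))
```

A client can refuse to send, or log, any request for which this check returns None instead of relying on the gateway to reject it.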

Cleanup

You can remove the resources that you created in this guide.

kubectl delete httproute content-routing priority-routing -n agentgateway-system
kubectl delete AgentgatewayPolicy extract-model extract-priority -n agentgateway-system
kubectl delete AgentgatewayBackend openai-backend anthropic-backend high-priority-backend standard-priority-backend -n agentgateway-system

Next steps

Learn about transformations for more advanced request manipulation
Set up load balancing across multiple providers
Configure failover for high availability
Use cost tracking to monitor spending per route
