Content-based routing

Verified: Code examples on this page have been automatically tested and verified.

Route requests to different LLM backends based on request body content, such as the requested model name.

About content-based routing

Content-based routing (also known as body-based routing or intelligent routing) allows you to route requests to different backends based on the content of the request body, not just headers or path. This is particularly useful for LLM applications where you want to route to different providers based on the model field in the request JSON.

For example, you might want to:

  • Route gpt-4 requests to OpenAI and claude-3 requests to Anthropic
  • Direct certain models to specific backend endpoints based on cost or performance
  • Route different model families to dedicated infrastructure

Agentgateway implements content-based routing by running a transformation before route selection to extract values from the request body into headers, then using header-based routing rules to select the appropriate backend.

How it works

Content-based routing works in two steps:

  1. Extract body field to header: Use a transformation policy that targets the Gateway and runs in the PreRouting phase to extract a field from the JSON request body (like model) into a custom header
  2. Match on header: Use standard header matching in the HTTPRoute to route based on that header value

This pattern lets you route based on any field in the request body while using the standard Gateway API routing capabilities.
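
The two steps can be modeled in a few lines of Python. This is only an illustrative sketch of the pattern, not how agentgateway is implemented: the `RULES` table, the backend names, and the functions `extract_to_header` and `select_backend` are hypothetical names chosen for this example.

```python
import json
import re

# Hypothetical rule table: (header name, regex on its value, backend name).
RULES = [
    ("x-model", re.compile(r"^gpt-"), "openai"),
    ("x-model", re.compile(r"^claude-"), "anthropic"),
]


def extract_to_header(body: bytes, field: str, header: str) -> dict:
    """Step 1: copy one top-level field of the JSON body into a header."""
    return {header: str(json.loads(body)[field])}


def select_backend(headers: dict, rules=RULES):
    """Step 2: return the backend of the first rule whose header regex matches."""
    for header, pattern, backend in rules:
        if pattern.search(headers.get(header, "")):
            return backend
    return None  # no rule matched


request_body = b'{"model": "claude-3-opus", "messages": []}'
headers = extract_to_header(request_body, field="model", header="x-model")
print(headers, "->", select_backend(headers))
```

Because step 2 only ever sees headers, the same rule table works for any body field once step 1 has copied it into a header.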

Before you begin

  1. Set up an agentgateway proxy.
  2. Set up API access to each LLM provider that you want to route to.

Route by model name

This example shows how to route requests to different backends based on the model field in the request body.

  1. Create multiple AgentgatewayBackend resources for different models. This example creates backends for OpenAI and Anthropic models.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: openai-backend
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          openai:
            model: gpt-4o
      policies:
        auth:
          secretRef:
            name: openai-secret
    ---
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: anthropic-backend
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          anthropic:
            model: claude-3-5-sonnet-latest
      policies:
        auth:
          secretRef:
            name: anthropic-secret
    EOF
  2. Create an HTTPRoute with multiple rules that match on the x-model header. The transformation policy (created in step 3) will extract the model name from the request body into this header.

    kubectl apply -f- <<EOF
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: content-routing
      namespace: agentgateway-system
    spec:
      parentRefs:
        - name: agentgateway-proxy
          namespace: agentgateway-system
      rules:
        # Route GPT models to OpenAI
        - matches:
            - path:
                type: PathPrefix
                value: /v1/chat/completions
              headers:
                - type: RegularExpression
                  name: x-model
                  value: "^gpt-.*"
          backendRefs:
            - name: openai-backend
              namespace: agentgateway-system
              group: agentgateway.dev
              kind: AgentgatewayBackend
        # Route Claude models to Anthropic
        - matches:
            - path:
                type: PathPrefix
                value: /v1/chat/completions
              headers:
                - type: RegularExpression
                  name: x-model
                  value: "^claude-.*"
          backendRefs:
            - name: anthropic-backend
              namespace: agentgateway-system
              group: agentgateway.dev
              kind: AgentgatewayBackend
    EOF
  3. Create an AgentgatewayPolicy resource to extract the model field from the request body into the x-model header. The transformation uses a CEL expression to parse the JSON body and extract the model field. This policy must target the Gateway with phase: PreRouting to run before route selection.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayPolicy
    metadata:
      name: extract-model
      namespace: agentgateway-system
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: Gateway
        name: agentgateway-proxy
      traffic:
        phase: PreRouting
        transformation:
          request:
            set:
            - name: "x-model"
              value: 'json(request.body).model'
    EOF
  4. Send a request with gpt-4o in the model field. Verify that the request routes to the OpenAI backend.

    If the gateway is reachable at the address in $INGRESS_GW_ADDRESS:

    curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json -d '{
      "model": "gpt-4o",
      "messages": [{"role": "user", "content": "Say hello"}]
    }' | jq -r '.model'

    If the gateway is reachable on localhost:8080:

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
      "model": "gpt-4o",
      "messages": [{"role": "user", "content": "Say hello"}]
    }' | jq -r '.model'

    Example output:

    gpt-4o-2024-08-06
  5. Send a request with claude-3-5-sonnet-latest in the model field. Verify that the request routes to the Anthropic backend.

    If the gateway is reachable at the address in $INGRESS_GW_ADDRESS:

    curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json -d '{
      "model": "claude-3-5-sonnet-latest",
      "messages": [{"role": "user", "content": "Say hello"}]
    }' | jq -r '.model'

    If the gateway is reachable on localhost:8080:

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
      "model": "claude-3-5-sonnet-latest",
      "messages": [{"role": "user", "content": "Say hello"}]
    }' | jq -r '.model'

    Example output:

    claude-3-5-sonnet-20241022
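
Before relying on the two header patterns, it can help to check which model names they actually cover. The following sketch assumes that Python's `re` module treats these simple patterns the same way as the gateway's regex engine, which might not hold for more complex expressions. The `matching_backends` helper is hypothetical.

```python
import re

# Patterns copied from the HTTPRoute rules in this example.
RULES = {
    "openai-backend": r"^gpt-.*",
    "anthropic-backend": r"^claude-.*",
}


def matching_backends(model: str) -> list:
    """Return the backends whose header regex matches the model name."""
    return [name for name, pattern in RULES.items() if re.search(pattern, model)]


for model in ["gpt-4o", "claude-3-5-sonnet-latest", "o1-mini"]:
    print(model, "->", matching_backends(model) or "no matching rule")
```

A name such as o1-mini matches neither pattern, so a request for it has no matching rule in this HTTPRoute. Add a rule or widen a pattern for any model family that you want to serve.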

Route by custom field

You can extract any field from the request body for routing decisions, not just the model field.

This example shows routing based on a custom priority field in the request body to route high-priority requests to dedicated infrastructure.

  1. Create backends for different priority levels.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: high-priority-backend
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          openai:
            model: gpt-4o
      policies:
        auth:
          secretRef:
            name: openai-secret
    ---
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: standard-priority-backend
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          openai:
            model: gpt-4o-mini
      policies:
        auth:
          secretRef:
            name: openai-secret
    EOF
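  2. Create an HTTPRoute with rules that match on the x-priority header. The transformation policy (created in step 3) extracts a custom field (like priority or user_tier) from the request body into this header. Requests whose header value is not high fall through to the second rule, which routes to the standard-priority backend.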

    kubectl apply -f- <<EOF
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: priority-routing
      namespace: agentgateway-system
    spec:
      parentRefs:
        - name: agentgateway-proxy
          namespace: agentgateway-system
      rules:
        - matches:
            - path:
                type: PathPrefix
                value: /v1/chat/completions
              headers:
                - type: Exact
                  name: x-priority
                  value: "high"
          backendRefs:
            - name: high-priority-backend
              namespace: agentgateway-system
              group: agentgateway.dev
              kind: AgentgatewayBackend
        # Catch-all rule: any other request on this path goes to the standard backend
        - matches:
            - path:
                type: PathPrefix
                value: /v1/chat/completions
          backendRefs:
            - name: standard-priority-backend
              namespace: agentgateway-system
              group: agentgateway.dev
              kind: AgentgatewayBackend
    EOF
  3. Create an AgentgatewayPolicy to extract the custom field into the x-priority header. Use the has() macro to provide a default value if the field is not present. This policy must target the Gateway with phase: PreRouting to run before route selection.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayPolicy
    metadata:
      name: extract-priority
      namespace: agentgateway-system
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: Gateway
        name: agentgateway-proxy
      traffic:
        phase: PreRouting
        transformation:
          request:
            set:
            - name: "x-priority"
              value: 'has(json(request.body).priority) ? json(request.body).priority : "standard"'
    EOF
  4. Test the routing by sending requests with different priority values.

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
      "model": "gpt-4o",
      "priority": "high",
      "messages": [{"role": "user", "content": "Urgent request"}]
    }' | jq -r '.model'

    This request routes to the high-priority backend, which uses gpt-4o.

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
      "model": "gpt-4o",
      "messages": [{"role": "user", "content": "Normal request"}]
    }' | jq -r '.model'

    This request has no priority field, so x-priority defaults to standard and the request routes to the standard-priority backend, which uses gpt-4o-mini.
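
The default-value logic and the two routing rules can be sketched in Python as follows. This is an approximation for illustration only: `priority_header` mirrors the intent of the CEL expression and `backend_for` mirrors the two HTTPRoute rules. Both function names are hypothetical.

```python
import json


def priority_header(body: bytes) -> str:
    """Mirror has(json(request.body).priority) ? ... : "standard"."""
    parsed = json.loads(body)
    # Key present -> use its value; key absent -> fall back to "standard".
    return parsed["priority"] if "priority" in parsed else "standard"


def backend_for(priority: str) -> str:
    """Exact match on "high"; every other value hits the catch-all rule."""
    if priority == "high":
        return "high-priority-backend"
    return "standard-priority-backend"


for body in [b'{"model": "gpt-4o", "priority": "high"}', b'{"model": "gpt-4o"}']:
    print(backend_for(priority_header(body)))
```

Any value other than high, including a missing field, ends up on the standard-priority backend, because the second rule matches on the path alone.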

Known limitations

When implementing content-based routing, be aware of these limitations:

  • PreRouting phase required: Content-based routing requires traffic.phase: PreRouting and must target the Gateway (not HTTPRoute). This way, transformations run before route selection. Without PreRouting, the extracted header arrives too late for route matching.
  • Performance impact: Extracting fields from the request body adds processing overhead. For high-throughput scenarios, consider using header-based routing when possible.
  • JSON parsing: The json() CEL function requires valid JSON. Malformed JSON in the request body will cause routing failures.
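
One way to reduce JSON-related routing failures is to validate the body on the client side before sending it. The following sketch uses a hypothetical helper, `check_routable`, that is not part of agentgateway. It checks only that the body parses as a JSON object and contains the routing field.

```python
import json


def check_routable(raw_body: str, field: str = "model"):
    """Return (ok, reason) for a request body before it is sent."""
    try:
        parsed = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        return False, f"malformed JSON: {exc.msg}"
    if not isinstance(parsed, dict):
        return False, "body is not a JSON object"
    if field not in parsed:
        return False, f"missing routing field: {field}"
    return True, "ok"


print(check_routable('{"model": "gpt-4o", "messages": []}'))
print(check_routable('{"model": '))
```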

Cleanup

You can remove the resources that you created in this guide.

    kubectl delete httproute content-routing priority-routing -n agentgateway-system
    kubectl delete AgentgatewayPolicy extract-model extract-priority -n agentgateway-system
    kubectl delete AgentgatewayBackend openai-backend anthropic-backend high-priority-backend standard-priority-backend -n agentgateway-system
