Content-based routing
Verified: Code examples on this page have been automatically tested and verified.
Route requests to different LLM backends based on request body content, such as the requested model name.
About content-based routing
Content-based routing (also known as body-based routing or intelligent routing) allows you to route requests to different backends based on the content of the request body, not just headers or path. This is particularly useful for LLM applications where you want to route to different providers based on the model field in the request JSON.
For example, you might want to:
- Route gpt-4 requests to OpenAI and claude-3 requests to Anthropic
- Direct certain models to specific backend endpoints based on cost or performance
- Route different model families to dedicated infrastructure
Agentgateway implements content-based routing by using a transformation policy to extract values from the request body into headers before route selection, then using header-based routing rules to select the appropriate backend.
How it works
Content-based routing works in two steps:
- Extract body field to header: Use a transformation policy that runs before route selection to extract a field from the JSON request body (like model) into a custom header
- Match on header: Use standard header matching in the HTTPRoute to route based on that header value
This pattern lets you route based on any field in the request body while using the standard Gateway API routing capabilities.
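The two steps can be mimicked outside the gateway to see how they fit together. The following Python sketch is only an illustration and is not agentgateway code: the names RULES, extract_header, and select_backend are hypothetical, and the rule table assumes the same header patterns and backend names that the HTTPRoute in this guide uses.

```python
import json
import re

# Hypothetical rule table: (header pattern, backend name), checked in order.
RULES = [
    (re.compile(r"^gpt-.*"), "openai-backend"),
    (re.compile(r"^claude-.*"), "anthropic-backend"),
]


def extract_header(body: bytes, field: str = "model"):
    """Step 1: read one field from the JSON body, as the transformation would."""
    return json.loads(body).get(field)


def select_backend(header_value):
    """Step 2: match the header value against each rule; None means no rule matched."""
    if not isinstance(header_value, str):
        return None
    for pattern, backend in RULES:
        if pattern.match(header_value):
            return backend
    return None


body = b'{"model": "gpt-4o", "messages": []}'
print(select_backend(extract_header(body)))  # openai-backend
```

A model name that matches neither pattern returns None in this sketch, which corresponds to a request that no header rule selects.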
Before you begin
- Set up an agentgateway proxy.
- Set up API access to each LLM provider that you want to route to.
Route by model name
This example shows how to route requests to different backends based on the model field in the request body.
Create multiple AgentgatewayBackend resources for different models. This example creates backends for OpenAI and Anthropic models.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: openai-backend
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          openai:
            model: gpt-4o
      policies:
        auth:
          secretRef:
            name: openai-secret
    ---
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: anthropic-backend
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          anthropic:
            model: claude-3-5-sonnet-latest
      policies:
        auth:
          secretRef:
            name: anthropic-secret
    EOF

Create an HTTPRoute with multiple rules that match on the x-model header. The transformation policy that you create in the next step extracts the model name from the request body into this header.

    kubectl apply -f- <<EOF
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: content-routing
      namespace: agentgateway-system
    spec:
      parentRefs:
      - name: agentgateway-proxy
        namespace: agentgateway-system
      rules:
      # Route GPT models to OpenAI
      - matches:
        - path:
            type: PathPrefix
            value: /v1/chat/completions
          headers:
          - type: RegularExpression
            name: x-model
            value: "^gpt-.*"
        backendRefs:
        - name: openai-backend
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
      # Route Claude models to Anthropic
      - matches:
        - path:
            type: PathPrefix
            value: /v1/chat/completions
          headers:
          - type: RegularExpression
            name: x-model
            value: "^claude-.*"
        backendRefs:
        - name: anthropic-backend
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
    EOF

Create an AgentgatewayPolicy resource to extract the model field from the request body into the x-model header. The transformation uses a CEL expression to parse the JSON body and extract the model field. This policy must target the Gateway with phase: PreRouting to run before route selection.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayPolicy
    metadata:
      name: extract-model
      namespace: agentgateway-system
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: Gateway
        name: agentgateway-proxy
      traffic:
        phase: PreRouting
        transformation:
          request:
            set:
            - name: "x-model"
              value: 'json(request.body).model'
    EOF

Send a request with gpt-4o in the model field. Verify that the request routes to the OpenAI backend.

With the gateway address in $INGRESS_GW_ADDRESS:

    curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json -d '{
      "model": "gpt-4o",
      "messages": [{"role": "user", "content": "Say hello"}]
    }' | jq -r '.model'

With the proxy reachable on localhost:8080:

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
      "model": "gpt-4o",
      "messages": [{"role": "user", "content": "Say hello"}]
    }' | jq -r '.model'

Example output:

    gpt-4o-2024-08-06

Send a request with claude-3-5-sonnet-latest in the model field. Verify that the request routes to the Anthropic backend.

With the gateway address in $INGRESS_GW_ADDRESS:

    curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json -d '{
      "model": "claude-3-5-sonnet-latest",
      "messages": [{"role": "user", "content": "Say hello"}]
    }' | jq -r '.model'

With the proxy reachable on localhost:8080:

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
      "model": "claude-3-5-sonnet-latest",
      "messages": [{"role": "user", "content": "Say hello"}]
    }' | jq -r '.model'

Example output:

    claude-3-5-sonnet-20241022

Route by custom field
You can extract any field from the request body for routing decisions, not just the model field.
This example shows routing based on a custom priority field in the request body to route high-priority requests to dedicated infrastructure.
Create backends for different priority levels.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: high-priority-backend
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          openai:
            model: gpt-4o
      policies:
        auth:
          secretRef:
            name: openai-secret
    ---
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: standard-priority-backend
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          openai:
            model: gpt-4o-mini
      policies:
        auth:
          secretRef:
            name: openai-secret
    EOF

Create an HTTPRoute with rules that match on the x-priority header. The header carries a custom field (like priority or user_tier) that is extracted from the request body.

    kubectl apply -f- <<EOF
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: priority-routing
      namespace: agentgateway-system
    spec:
      parentRefs:
      - name: agentgateway-proxy
        namespace: agentgateway-system
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /v1/chat/completions
          headers:
          - type: Exact
            name: x-priority
            value: "high"
        filters:
        - type: ExtensionRef
          extensionRef:
            group: gateway.kgateway.dev
            kind: AgentgatewayPolicy
            name: extract-priority
        backendRefs:
        - name: high-priority-backend
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
      - matches:
        - path:
            type: PathPrefix
            value: /v1/chat/completions
        filters:
        - type: ExtensionRef
          extensionRef:
            group: gateway.kgateway.dev
            kind: AgentgatewayPolicy
            name: extract-priority
        backendRefs:
        - name: standard-priority-backend
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
    EOF

Create an AgentgatewayPolicy to extract the custom field. Use the has() macro to provide a default value if the field is not present. This policy must target the Gateway with phase: PreRouting to run before route selection.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayPolicy
    metadata:
      name: extract-priority
      namespace: agentgateway-system
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: Gateway
        name: agentgateway-proxy
      traffic:
        phase: PreRouting
        transformation:
          request:
            set:
            - name: "x-priority"
              value: 'has(json(request.body).priority) ? json(request.body).priority : "standard"'
    EOF

Test the routing by sending requests with different priority values.

A request with priority set to high routes to the high-priority backend, which uses gpt-4o.

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
      "model": "gpt-4o",
      "priority": "high",
      "messages": [{"role": "user", "content": "Urgent request"}]
    }' | jq -r '.model'

A request without a priority field routes to the standard-priority backend, which uses gpt-4o-mini.

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json -d '{
      "model": "gpt-4o",
      "messages": [{"role": "user", "content": "Normal request"}]
    }' | jq -r '.model'

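To make the default-value logic concrete, the following Python sketch mirrors the has() expression and the two route rules. It is an approximation for illustration, not gateway code: the function names priority_header and backend_for are hypothetical, and the sketch assumes that the rule with the Exact header match takes precedence whenever the header equals high, with the rule without a header match catching every other value.

```python
import json


def priority_header(body: bytes) -> str:
    """Mirror of: has(json(request.body).priority) ? json(request.body).priority : "standard"."""
    data = json.loads(body)
    return data["priority"] if "priority" in data else "standard"


def backend_for(priority: str) -> str:
    """The first rule needs an exact "high"; the second rule has no header match and catches the rest."""
    if priority == "high":
        return "high-priority-backend"
    return "standard-priority-backend"


for raw in (b'{"priority": "high"}', b'{}', b'{"priority": "low"}'):
    print(raw.decode(), "->", backend_for(priority_header(raw)))
```

Under these assumptions, any value other than the exact string high, including low or High, ends up on the standard-priority backend.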
Known limitations
When implementing content-based routing, be aware of these limitations:
- PreRouting phase required: The transformation policy must set traffic.phase: PreRouting and must target the Gateway (not HTTPRoute). This way, transformations run before route selection. Without PreRouting, the extracted header arrives too late for route matching.
- Performance impact: Extracting fields from the request body adds processing overhead. For high-throughput scenarios, consider using header-based routing when possible.
- JSON parsing: The json() CEL function requires valid JSON. Malformed JSON in the request body will cause routing failures.
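Because a malformed body cannot be routed, clients may want to check the payload before sending it. The following Python sketch shows one way to do that on the client side. The function name validate_payload and the list of required fields are assumptions made for this example and are not part of agentgateway.

```python
import json


def validate_payload(raw: str, required=("model",)):
    """Return a list of problems; an empty list means the body parses and has the routing fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg} at position {err.pos}"]
    if not isinstance(data, dict):
        return ["body must be a JSON object"]
    return [f"missing field: {name}" for name in required if name not in data]


print(validate_payload('{"model": "gpt-4o", "messages": []}'))  # []
print(validate_payload('{"model": "gpt-4o",}'))  # reports the trailing comma as invalid JSON
```

Running the check before each request keeps parse errors on the client, where they are easier to diagnose than a failed route match.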
Cleanup
You can remove the resources that you created in this guide.

    kubectl delete httproute content-routing priority-routing -n agentgateway-system
    kubectl delete AgentgatewayPolicy extract-model extract-priority -n agentgateway-system
    kubectl delete AgentgatewayBackend openai-backend anthropic-backend high-priority-backend standard-priority-backend -n agentgateway-system

Next steps
- Learn about transformations for more advanced request manipulation
- Set up load balancing across multiple providers
- Configure failover for high availability
- Use cost tracking to monitor spending per route