Prompt templates
Use model-level transformations to dynamically customize LLM request parameters based on request context such as headers, user identity, or other runtime information. Agentgateway uses CEL (Common Expression Language) expressions to evaluate and set LLM request fields at runtime.
About LLM transformations
Model-level transformations allow you to dynamically compute LLM request fields using CEL expressions that can reference incoming request headers, existing request fields, and other context. This is useful for enforcing per-user policies, customizing model behavior based on caller identity, and applying conditional request modifications without changing client code.
To learn more about CEL, see the CEL language documentation.
Before you begin
Install the `agentgateway` binary.

Conditionally set max tokens based on user identity
Use a CEL expression in the model-level transformation field to dynamically set max_tokens based on the caller’s identity from a request header. This example gives admin users a higher token limit than regular users.
```sh
cat <<'EOF' > config.yaml
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  models:
  - name: "*"
    provider: openAI
    params:
      apiKey: "$OPENAI_API_KEY"
    transformation:
      max_tokens: "request.headers['x-user-id'] == 'admin' ? 100 : 10"
EOF
```

With this configuration, callers that send the `x-user-id: admin` header can receive up to 100 completion tokens, while all other callers are capped at 10.
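To see how the conditional resolves, the CEL ternary in the configuration can be emulated in plain Python. This is an illustrative stand-in only; agentgateway evaluates real CEL, not Python.

```python
# Emulate the CEL expression from the configuration above:
#   request.headers['x-user-id'] == 'admin' ? 100 : 10
# Illustrative Python stand-in, not the actual CEL engine.
def resolve_max_tokens(headers: dict) -> int:
    return 100 if headers.get("x-user-id") == "admin" else 10

print(resolve_max_tokens({"x-user-id": "admin"}))  # 100
print(resolve_max_tokens({"x-user-id": "alice"}))  # 10
```

A missing header behaves like any non-admin value and falls through to the lower limit.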
Dynamic prompt templates
Dynamic templates use CEL transformations to inject variables from the request context into prompts. This is ideal for personalizing prompts with user identity, adding request metadata, or applying conditional prompt modification based on headers or claims.
Inject user identity from headers
Configure transformations to inject user identity from request headers into the prompt.
```yaml
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
  listeners:
  - routes:
    - backends:
      - ai:
          name: openai
          provider:
            openAI:
              model: gpt-3.5-turbo
      policies:
        backendAuth:
          key: "$OPENAI_API_KEY"
        transformations:
          request:
            body: |
              json(request.body).with(body,
                {
                  "model": body.model,
                  "messages": [{"role": "system", "content": "You are assisting user: " + default(request.headers["x-user-id"], "anonymous")}]
                    + body.messages
                }
              ).toJson()
```

Send a request with an `x-user-id` header and verify that the gateway injects the caller's identity into the prompt.
```sh
curl -s http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-user-id: alice" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Tell me a story"}]
  }' | jq .
```

The response reflects the injected system message: the model answers as if it was told "You are assisting user: alice", even though the client did not send a system message.
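The body transformation can be emulated in Python to show the request body the provider ultimately receives. This is an illustrative stand-in for the CEL `with(...)` expression, using the client body from the curl command; agentgateway itself evaluates real CEL.

```python
import json

# Emulate the CEL body transformation: prepend a system message that
# identifies the caller, keeping the client's own messages intact.
# Illustrative Python stand-in, not the actual CEL engine.
def inject_identity(raw_body: str, headers: dict) -> str:
    body = json.loads(raw_body)
    # Mirrors default(request.headers["x-user-id"], "anonymous") in CEL.
    user = headers.get("x-user-id", "anonymous")
    return json.dumps({
        "model": body["model"],
        "messages": [{"role": "system", "content": "You are assisting user: " + user}]
                    + body["messages"],
    })

client_body = '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Tell me a story"}]}'
out = json.loads(inject_identity(client_body, {"x-user-id": "alice"}))
print(out["messages"][0])
# {'role': 'system', 'content': 'You are assisting user: alice'}
```

If the header is absent, the `default(...)` fallback identifies the caller as `anonymous` instead.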
Available CEL variables
You can use these variables in your CEL transformation expressions.
| Variable | Description | Example |
|---|---|---|
| `request.headers["name"]` | Request header values | `request.headers["x-user-id"]` |
| `request.path` | Request path | `request.path` returns `/` |
| `request.method` | HTTP method | `request.method` returns `POST` |
| `llmRequest.max_tokens` | Original `max_tokens` from the request | `min(llmRequest.max_tokens, 100)` |
| `llmRequest.model` | Requested model name | `llmRequest.model` |
For a complete list of available variables and functions, see the CEL reference documentation.
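For example, the `min(llmRequest.max_tokens, 100)` expression from the table honors the client's requested value only up to a cap. In Python terms (an illustrative stand-in for the CEL evaluation):

```python
# Emulate min(llmRequest.max_tokens, 100): the client's requested
# value is honored only up to the cap. Illustrative stand-in for CEL.
def cap_tokens(requested: int, cap: int = 100) -> int:
    return min(requested, cap)

print(cap_tokens(4096))  # 100
print(cap_tokens(50))    # 50
```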
Common transformation patterns
Cap token usage
Enforce a maximum token limit regardless of what the client requests.
```yaml
llm:
  models:
  - name: "*"
    provider: openAI
    params:
      apiKey: "$OPENAI_API_KEY"
    transformation:
      max_tokens: "min(llmRequest.max_tokens, 1024)"
```

Set temperature based on headers
Allow callers to control creativity through a header while enforcing bounds.
```yaml
llm:
  models:
  - name: "*"
    provider: openAI
    params:
      apiKey: "$OPENAI_API_KEY"
    transformation:
      temperature: "request.headers['x-creativity'] == 'high' ? 0.9 : 0.1"
```

Combine multiple transformations
Apply several field-level transformations in a single configuration.
```yaml
llm:
  models:
  - name: "*"
    provider: openAI
    params:
      apiKey: "$OPENAI_API_KEY"
    transformation:
      max_tokens: "request.headers['x-user-tier'] == 'premium' ? 4096 : 256"
      temperature: "request.headers['x-user-tier'] == 'premium' ? 0.8 : 0.3"
```

Next steps
- Learn about CEL expressions for advanced expression logic.
- Explore transformations for more LLM request transformation examples.
- Set up authentication to use JWT claims in transformations.