Metrics and logs

Review LLM-specific metrics and logs.

ℹ️ To calculate costs from token usage metrics, see the cost tracking guide.

ℹ️ For external logging platforms (also known as prompt logging, request/response logging, or audit trails), such as Langfuse and LangSmith, see the LLM Observability integrations.

Before you begin

Complete an LLM guide, such as one of the LLM provider-specific guides, so that you have already sent a request to the LLM and received a response. You can use that request and response to verify the metrics and logs in this guide.

View LLM metrics

You can access the agentgateway metrics endpoint to view LLM-specific metrics, such as the number of tokens that were consumed by a request or response.

  1. Port-forward the agentgateway proxy on port 15020.
    kubectl port-forward deployment/agentgateway-proxy -n agentgateway-system 15020  
  2. Open the agentgateway metrics endpoint.
  3. Look for the agentgateway_gen_ai_client_token_usage metric. This metric is a histogram and includes important information about the request and the response from the LLM, such as:
    • gen_ai_token_type: Whether this metric is about a request (input) or response (output).
    • gen_ai_operation_name: The name of the operation that was performed.
    • gen_ai_system: The LLM provider that was used for the request/response.
    • gen_ai_request_model: The model that was used for the request.
    • gen_ai_response_model: The model that was used for the response.

For more information, see the Semantic conventions for generative AI metrics in the OpenTelemetry docs.
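To see what a scraped sample of this histogram might look like, the following sketch parses a line in the Prometheus exposition format into its metric name, labels, and value. The sample line and its label values are illustrative, not captured agentgateway output.

```python
import re

# Illustrative exposition-format line for the token usage histogram's _sum
# series; the label values here are hypothetical, not real captured output.
sample = ('agentgateway_gen_ai_client_token_usage_sum{'
          'gen_ai_token_type="input",gen_ai_operation_name="chat",'
          'gen_ai_system="openai",gen_ai_request_model="gpt-3.5-turbo",'
          'gen_ai_response_model="gpt-3.5-turbo-0125"} 68')

# Split the line into metric name, label block, and sample value.
match = re.match(r'(\w+)\{(.*)\}\s+(\S+)', sample)
name, raw_labels, value = match.groups()

# Turn the label block into a dict of label name -> label value.
labels = dict(re.findall(r'(\w+)="([^"]*)"', raw_labels))

print(name)                         # agentgateway_gen_ai_client_token_usage_sum
print(labels['gen_ai_token_type'])  # input
print(float(value))                 # 68.0
```

In a real histogram you would also see `_bucket` and `_count` series alongside the `_sum` series shown here.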

Track per-user metrics

When you set up API key authentication with per-user rate limiting, you can filter token usage metrics by user ID to track spending and usage patterns for each virtual key.

For a complete virtual key setup guide, see Virtual key management.

Example PromQL query for per-user token usage:

# Total tokens (input and output) consumed by each user
sum by (user_id) (
  agentgateway_gen_ai_client_token_usage_sum{gen_ai_token_type=~"input|output"}
)
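The aggregation that this query performs can be sketched in plain Python: group per-series sums by `user_id` and add the input and output token counts together. The user IDs and numbers below are illustrative.

```python
from collections import defaultdict

# Hypothetical per-series values of agentgateway_gen_ai_client_token_usage_sum,
# keyed by (user_id, gen_ai_token_type); the numbers are illustrative.
series = {
    ("alice", "input"): 68.0,
    ("alice", "output"): 298.0,
    ("bob", "input"): 120.0,
    ("bob", "output"): 410.0,
}

# Equivalent of: sum by (user_id) over the input and output token types.
totals = defaultdict(float)
for (user_id, token_type), value in series.items():
    if token_type in ("input", "output"):
        totals[user_id] += value

print(dict(totals))  # {'alice': 366.0, 'bob': 530.0}
```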

View logs

Agentgateway automatically logs information to stdout. When you run agentgateway on your local machine, each request that is sent to agentgateway produces a log entry in your CLI output.

To view the logs:

kubectl logs deployment/agentgateway-proxy -n agentgateway-system

Example for a successful request to the OpenAI LLM:

2025-12-12T21:56:02.809082Z	info	request gateway=agentgateway-system/agentgateway-proxy listener=http route=agentgateway-system/openai endpoint=api.openai.com:443 src.addr=127.0.0.1:60862 http.method=POST http.host=localhost http.path=/openai http.version=HTTP/1.1 http.status=200 protocol=llm gen_ai.operation.name=chat gen_ai.provider.name=openai gen_ai.request.model=gpt-3.5-turbo gen_ai.response.model=gpt-3.5-turbo-0125 gen_ai.usage.input_tokens=68 gen_ai.usage.output_tokens=298 duration=2488ms
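Because each log entry is a flat series of `key=value` pairs, it is straightforward to pull out individual fields programmatically. The following sketch parses the example log entry above into a dict; the splitting logic assumes field values contain no spaces, as in the example.

```python
# The example log entry above, reassembled as a single string.
log = (
    "2025-12-12T21:56:02.809082Z info request "
    "gateway=agentgateway-system/agentgateway-proxy listener=http "
    "route=agentgateway-system/openai endpoint=api.openai.com:443 "
    "src.addr=127.0.0.1:60862 http.method=POST http.host=localhost "
    "http.path=/openai http.version=HTTP/1.1 http.status=200 protocol=llm "
    "gen_ai.operation.name=chat gen_ai.provider.name=openai "
    "gen_ai.request.model=gpt-3.5-turbo gen_ai.response.model=gpt-3.5-turbo-0125 "
    "gen_ai.usage.input_tokens=68 gen_ai.usage.output_tokens=298 duration=2488ms"
)

# Keep only the key=value tokens and load them into a dict,
# splitting on the first "=" so values like HTTP/1.1 stay intact.
fields = dict(
    token.split("=", 1) for token in log.split() if "=" in token
)

print(fields["gen_ai.usage.input_tokens"])   # 68
print(fields["gen_ai.usage.output_tokens"])  # 298
print(fields["http.status"])                 # 200
```

A pipeline like this is handy for spot checks; for production analysis, ship the logs to a log aggregation platform instead.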