OpenAI moderation

The OpenAI Moderation API detects potentially harmful content across categories including hate, harassment, self-harm, sexual content, and violence.
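
To get a feel for what the moderation check returns, you can call the OpenAI Moderation endpoint directly. The following is a minimal sketch, not a description of how agentgateway calls the API internally. It assumes that `OPENAI_API_KEY` is exported in your shell. The helper names `moderate` and `is_flagged` are hypothetical, and the `grep` check is a rough stand-in for a real JSON parser.

```shell
# moderate TEXT: send TEXT to the OpenAI Moderation endpoint and print the raw JSON.
# Assumption: OPENAI_API_KEY is exported in the environment.
# TEXT is inserted verbatim, so it must not contain double quotes or backslashes.
moderate() {
  curl -s https://api.openai.com/v1/moderations \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"omni-moderation-latest\", \"input\": \"$1\"}"
}

# is_flagged: read a moderation response on stdin and succeed if a result
# has "flagged" set to true.
is_flagged() {
  grep -q '"flagged": *true'
}
```

For example, `moderate "some text" | is_flagged && echo "flagged"` prints `flagged` only when the response marks the input as flagged. The response also lists the individual categories, which you can inspect by running `moderate "some text"` on its own.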

Before you begin

  1. Set up an agentgateway proxy.
  2. Set up access to the OpenAI LLM provider.

Block harmful content

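  1. Create an AgentgatewayPolicy that attaches a prompt guard to the openai HTTPRoute. The guard checks each request with OpenAI Moderation using the omni-moderation-latest model, authenticates with the credentials in the openai-secret secret, and returns a custom message when a request is rejected: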

    kubectl apply -f - <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayPolicy
    metadata:
      name: openai-prompt-guard
      namespace: agentgateway-system
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: HTTPRoute
        name: openai
      backend:
        ai:
          promptGuard:
            request:
            - openAIModeration:
                policies:
                  auth:
                    secretRef:
                      name: openai-secret
                model: omni-moderation-latest
              response:
                message: "Content blocked by moderation policy"
    EOF
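
  2. Send a request with content that triggers moderation. Use the command that matches how you reach the gateway.

    If the gateway address is stored in $INGRESS_GW_ADDRESS:

    curl -i "$INGRESS_GW_ADDRESS/openai" \
      -H "content-type: application/json" \
      -d '{
        "model": "gpt-4o-mini",
        "messages": [
          {
            "role": "user",
            "content": "I want to harm myself"
          }
        ]
      }'

    If the gateway is reachable at localhost:8080 (for example, through a port-forward):

    curl -i "localhost:8080/openai" \
      -H "content-type: application/json" \
      -d '{
        "model": "gpt-4o-mini",
        "messages": [
          {
            "role": "user",
            "content": "I want to harm myself"
          }
        ]
      }'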

    Expected response:

    HTTP/1.1 403 Forbidden
    Content blocked by moderation policy
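
If you want to script this check instead of reading the response by eye, the test request can be wrapped in a small helper. This is a minimal sketch under stated assumptions: the function names `is_blocked` and `check_moderation` are hypothetical, and the only signal it relies on is the 403 status shown in the expected response above.

```shell
# is_blocked CODE: succeed when the HTTP status code is 403, the status
# shown in the expected response for a blocked request.
is_blocked() {
  [ "$1" = "403" ]
}

# check_moderation URL: send the test prompt to URL and report whether
# the gateway blocked it. Returns non-zero when the request was not blocked.
check_moderation() {
  # -o /dev/null discards the body; -w prints only the status code.
  http_code=$(curl -s -o /dev/null -w '%{http_code}' "$1" \
    -H "content-type: application/json" \
    -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "I want to harm myself"}]}')
  if is_blocked "$http_code"; then
    echo "blocked (HTTP $http_code)"
  else
    echo "not blocked (HTTP $http_code)"
    return 1
  fi
}
```

Call it with the address you used for the test request, for example `check_moderation "localhost:8080/openai"` or `check_moderation "$INGRESS_GW_ADDRESS/openai"`.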

Cleanup

You can remove the resources that you created in this guide.

    kubectl delete AgentgatewayPolicy openai-prompt-guard -n agentgateway-system