AI agents run with real capabilities: shell access, API keys, database credentials, tool integrations. Users interact with these agents through client apps (Slack bots, web UIs, custom CLIs). Without a mediation layer, every client has a direct channel to a privileged workload, and every agent response is an unfiltered pipe back to the user. The core risk: the agent has more authority than the caller, and the caller can trick it into using that authority.

This post documents a safe default architecture for mediating all agent communication through a gateway, and its trade-offs.

System description

An AI Agent Gateway sits between client applications and AI agents, mediating every request and response. The gateway enforces authorization, inspects payloads in both directions, and logs every interaction. Agents are workloads. The gateway is the control plane. Client apps are untrusted callers.

Architecture choice

There are three common deployment models for mediating agent traffic. The security trade-offs differ for each.

Centralized gateway

A single gateway cluster handles all agent traffic. Every client connects to the gateway; every agent is reachable only through it.

Use this when:

  • You have a manageable number of agents (tens, not thousands)

  • You want a single policy enforcement point and audit log

  • Your team can operate a stateful proxy tier

Main risks: Single point of failure. The gateway is on the critical path for all agent interactions, so an outage blocks all agent access. Scaling the gateway independently of agents requires capacity planning.

Sidecar proxy

Each agent gets a co-located proxy (container sidecar, local process) that enforces policy locally. Clients still connect through a thin ingress layer, but authZ and data inspection happen at the edge of each agent.

Use this when:

  • Agents are distributed across many environments (developer laptops, on-prem VMs, multiple cloud regions)

  • You need per-agent policy enforcement without routing all traffic through a central bottleneck

  • Agent count is high or dynamic (auto-scaling agent pools)

Trade-off: Policy distribution becomes the hard problem. Every sidecar needs a current policy bundle, and stale policies mean stale authZ. You also lose centralized request-level audit unless sidecars ship logs to a collector.

Service mesh

Agents and the gateway are part of a mesh (Istio, Linkerd). mTLS between all participants is automatic. Policy is declared centrally and enforced by mesh-managed proxies.

Use this when:

  • You already operate a service mesh

  • You need mTLS everywhere without managing certificates per agent

  • You want to layer agent-specific policy on top of existing mesh authZ primitives

Trade-off: Mesh infrastructure is operationally heavy. If you don't already have one, adopting a mesh solely for agent governance is overkill.

Common middle ground: Start with a centralized gateway for the first deployment. Add sidecar proxies only for agents that can't route through the central gateway (e.g., on-prem or developer-local agents). Use a mesh only if you already have one.

Golden path

Build this first. Then relax constraints only if you have a specific reason:

Client authenticates → TLS/mTLS to gateway → gateway evaluates authZ policy → inbound data inspection → forward to agent → agent executes and responds → outbound data inspection → deliver response to client

Each step is a gate. A failure at any gate stops the flow and returns an error to the client.
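The gate sequence above can be sketched as a chain of checks, each of which short-circuits the flow on failure. All names here (GateError, handle, the inline rules) are illustrative, not from any specific gateway product:

```python
# Minimal sketch of the golden-path gate chain. Each gate either passes the
# request through or raises GateError, which the gateway maps to an error
# response for the client. Real gateways split these stages differently.

class GateError(Exception):
    def __init__(self, status, reason):
        self.status, self.reason = status, reason

def authenticate(request):
    # Identity comes only from verified credentials, never request fields.
    if "token" not in request:
        raise GateError(401, "missing credentials")
    return {"user_id": request["token"]["sub"]}

def authorize(principal, request):
    # Deny by default: only explicit bindings allow access.
    allowed = {("alice", "code-review-agent")}
    if (principal["user_id"], request["agent"]) not in allowed:
        raise GateError(403, "no policy match")

def inspect_inbound(request):
    if len(request["payload"]) > 64_000:
        raise GateError(413, "payload too large")

def inspect_outbound(response):
    if "AKIA" in response:            # placeholder for a real DLP scan
        raise GateError(502, "DLP block")
    return response

def handle(request, call_agent):
    principal = authenticate(request)           # gate 1: who is calling?
    authorize(principal, request)               # gate 2: may they call this agent?
    inspect_inbound(request)                    # gate 3: is the payload safe?
    response = call_agent(request["payload"])   # forward to the agent
    return inspect_outbound(response)           # gate 4: is the response safe?
```

A failed gate never forwards anything downstream: an unauthenticated request dies at gate 1 without ever touching policy evaluation or the agent.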

Core design

Identity and authentication

Every client app must authenticate to the gateway. The gateway does not trust the client's claim of user identity; it verifies it. Never derive user identity from client-supplied fields (e.g., user_id in the request body) or forwarded headers (e.g., X-User-ID, X-Forwarded-User). Extract identity exclusively from verified credentials (OAuth token, client certificate, platform signature).

  • Web UIs and custom apps: OAuth 2.0 / OIDC tokens validated at the gateway. The gateway extracts user_id, tenant_id, and roles from the verified token

  • Slack and chat integrations: The gateway validates the platform's request signature (e.g., Slack's X-Slack-Signature), then maps the platform user ID to an internal principal

  • mTLS clients: Client certificate CN or SAN provides the identity. The gateway maps the certificate identity to a principal in its policy store
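For the chat-integration case, Slack's published v0 signing scheme (HMAC-SHA256 over "v0:{timestamp}:{body}" with your signing secret, compared against the X-Slack-Signature header) can be verified with the standard library alone. A sketch, with the timestamp check added to reject replays:

```python
import hashlib
import hmac
import time

def verify_slack_signature(signing_secret: str, timestamp: str, body: bytes,
                           signature: str, max_skew: int = 300) -> bool:
    """Validate Slack's X-Slack-Signature header (v0 scheme).

    Rejects stale timestamps to limit replay, then compares an HMAC-SHA256
    over "v0:{timestamp}:{body}" against the presented signature in
    constant time.
    """
    if abs(time.time() - int(timestamp)) > max_skew:
        return False  # replayed or badly skewed request
    basestring = b"v0:" + timestamp.encode() + b":" + body
    expected = "v0=" + hmac.new(signing_secret.encode(), basestring,
                                hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Only after this check passes should the gateway read the platform user ID from the body and map it to an internal principal.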

Agents also authenticate to the gateway. Each agent has a unique identity (client certificate, API key, or service account token) that the gateway uses to verify the agent is who it claims to be. Agent credentials are never exposed to clients.

Authorization policy

The gateway evaluates policy on every request. Policy answers two questions:

  1. Can this user talk to this agent? A binding between user identity or role and agent identity. Users in the engineering group can access code-review-agent; users in finance can access reporting-agent.

  2. Can this user perform this action? Finer-grained control over what the user can ask the agent to do. A read-only role can query the agent but cannot trigger tool executions that modify state.

Store policy centrally and version it. Changes take effect on the next request (no cache that outlives a policy update). Deny by default: if no policy matches, the request is rejected. If the policy store is unreachable, the gateway must fail closed (reject all requests), not fail open.
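The deny-by-default and fail-closed rules above can be sketched in a few lines. The rule shape and the store callable are placeholders for your real versioned policy store:

```python
# Sketch of deny-by-default policy evaluation with fail-closed behavior when
# the policy store is unreachable. Rule format is illustrative.

def evaluate(store, user_roles, agent, action):
    """Return (allowed, matched_rule). Any error reading policy denies."""
    try:
        # e.g., [{"role": "engineering", "agent": "code-review-agent",
        #         "actions": {"query"}}]
        rules = store()
    except Exception:
        # Fail closed: an unreachable policy store must never grant access.
        return False, "policy-store-unreachable"
    for rule in rules:
        if (rule["role"] in user_roles and rule["agent"] == agent
                and action in rule["actions"]):
            return True, rule
    return False, "no-match"  # deny by default
```

Note that both failure paths (no match, store down) produce a deny plus a reason string that belongs in the audit log.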

Data inspection

The gateway inspects payloads flowing in both directions.

Inbound inspection (client → agent):

  • Reject requests containing patterns that indicate prompt injection targeting the agent's tool-use capabilities (e.g., instructions to ignore system prompts, requests to execute commands not permitted by the user's role)

  • Enforce payload size limits to prevent resource exhaustion on the agent
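A minimal inbound check combining both bullets might look like this; the size cap and the two injection patterns are examples to extend, not a complete rule set:

```python
import re

# Illustrative inbound inspection: a payload size cap plus a small pattern
# list for known injection phrasings. Pattern matching catches only the
# obvious cases; it is one gate, not the whole defense.
MAX_PAYLOAD_BYTES = 64_000
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"disregard (your|the) system prompt", re.I),
]

def inspect_inbound(payload: str):
    """Return ('block', reason) or ('pass', None)."""
    if len(payload.encode()) > MAX_PAYLOAD_BYTES:
        return "block", "payload-too-large"
    for pattern in INJECTION_PATTERNS:
        if pattern.search(payload):
            return "block", f"injection-pattern:{pattern.pattern}"
    return "pass", None
```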

Outbound inspection (agent → client):

  • Scan responses for sensitive data patterns: API keys, credentials, PII, internal hostnames, file paths, database connection strings

  • Redact or block responses that match DLP rules before they reach the client

  • Flag responses where the agent attempted to return data outside the scope of the user's query

  • For streaming responses (SSE, WebSocket), buffer the complete response and scan it before delivery. If latency constraints require chunked delivery, accept the partial-leak risk

If the DLP engine is unavailable, the gateway should block responses rather than delivering uninspected content. Accept the availability hit.

Data inspection is not a silver bullet. Pattern-based DLP has false positives and false negatives. The goal is to catch accidental leakage and obvious exfiltration, not to prevent a determined adversary who controls the agent runtime.
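A pattern-based redaction pass on the outbound side can be sketched as follows. The three rules shown cover well-known secret formats; a production DLP engine carries far more, and (as noted above) still misses things:

```python
import re

# Sketch of pattern-based outbound redaction. Rule names feed the audit log;
# patterns are examples, not a full DLP rule set.
DLP_RULES = {
    "aws-access-key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "pg-connection-string": re.compile(r"postgres(ql)?://\S+"),
    "us-ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(response: str):
    """Replace matched secrets with [REDACTED]; return (text, triggered rules)."""
    triggered = []
    for name, pattern in DLP_RULES.items():
        response, count = pattern.subn("[REDACTED]", response)
        if count:
            triggered.append(name)
    return response, triggered
```

The triggered-rule list is what drives the alerting described later: repeated triggers for the same agent are a signal worth paging on.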

Audit logging

The gateway logs every request and response:

  • timestamp, request_id, user_id, tenant_id, agent_id, action

  • AuthZ decision (allow / deny + policy rule that matched)

  • DLP scan result (pass / redacted / blocked + rule that triggered)

  • Truncated request / response payloads (configurable: omit for sensitive workloads, include for compliance-heavy ones)

  • Scrub tool-call arguments from logs when they may contain credentials or secrets (e.g., API keys passed as tool parameters)

Logs are append-only. Ship them to a centralized log store. They are the primary artifact for incident investigation and compliance audits.
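The fields listed above serialize naturally as one JSON line per interaction for an append-only log. A sketch, with field names mirroring the list (adapt to your own schema):

```python
import json
from datetime import datetime, timezone

def audit_record(request_id, user_id, tenant_id, agent_id, action,
                 authz_decision, matched_rule, dlp_result, dlp_rule=None,
                 payload_excerpt=None, max_excerpt=256):
    """Build one append-only audit log line as JSON."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "request_id": request_id,
        "user_id": user_id,
        "tenant_id": tenant_id,
        "agent_id": agent_id,
        "action": action,
        "authz": {"decision": authz_decision, "rule": matched_rule},
        "dlp": {"result": dlp_result, "rule": dlp_rule},
    }
    if payload_excerpt is not None:
        # Truncate; omit entirely for sensitive workloads.
        record["payload_excerpt"] = payload_excerpt[:max_excerpt]
    return json.dumps(record)
```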

Agent-to-agent isolation

Agents must not communicate with each other directly. If agent A needs to invoke agent B, that request goes through the gateway as a new request with agent A's identity, subject to the same authZ and data inspection as any client request. Enforce isolation with network controls: agents accept inbound connections only from the gateway (security group, network policy, firewall rule). Agents should not be routable from client networks or from each other.

Threat model

Baseline assumptions

  • Clients are untrusted: they can craft arbitrary payloads, replay requests, and impersonate other users if authentication is weak

  • Agents are semi-trusted: they run your code and have real capabilities, but their outputs are not trusted for data safety (LLM outputs are non-deterministic and can contain leaked context)

  • Compromise of the gateway is out of scope for this model. The gateway concentrates AuthN, AuthZ, DLP, credential injection, and audit; restrict admin access, version policy changes, and monitor for tampering as you would any Tier-0 control plane

  • Network between gateway and agents is private (VPC, WireGuard, or mTLS), not the open internet

  • Agents do not share conversational state across tenants. If multiple tenants use the same agent pool, the gateway (or the agent runtime) enforces per-tenant context isolation. Without this, an agent that retains state across requests can return one tenant's context in another tenant's response

  • Standard infra controls such as TLS termination, WAF, database AuthN, and OS-level hardening are assumed to be in place. This model focuses on the agent communication pattern

A note on risk: you won’t fix everything

The threat matrix that follows isn’t a checklist where every risk must be fully eliminated. Focus on preventing the worst failures and limiting blast radius. In practice: ship prevention for the High rows first, then add monitoring and response for what you can’t realistically prevent.

Phase 1: Client to gateway

Focus: Preventing unauthorized access to agents and injection of malicious payloads

Asset: Agent access
Threat: AuthZ bypass. Attacker crafts a request to reach an agent they're not permitted to use (e.g., manipulating routing headers or agent IDs)
Baseline control: AuthZ policy evaluation on every request
Mitigations:
  1. Deny by default: reject if no explicit policy match
  2. Routing lockdown: clients address agents by logical name only; the gateway resolves to internal endpoints and rejects requests containing direct agent addresses
Risk: High

Asset: User identity
Threat: Session hijack. Attacker steals an OAuth token or session cookie to impersonate a legitimate user
Baseline control: Token validation at the gateway
Mitigations:
  1. Short-lived tokens: access tokens expire in minutes, not hours
  2. Binding: bind sessions to a client fingerprint (IP, TLS session) where feasible
  3. Revocation: support immediate token revocation via introspection or a short cache TTL
Risk: Medium

Asset: Agent runtime
Threat: Prompt injection via client. User crafts input designed to override the agent's system prompt, causing it to execute unintended tool calls
Baseline control: Input validation at the gateway
Mitigations:
  1. Input scanning: pattern-match known injection templates (e.g., "ignore previous instructions")
  2. Tool governance: see the tool execution row in Phase 2
  3. Agent hardening: agents use structured tool-call interfaces, not free-form command parsing
Risk: High

Asset: Gateway availability
Threat: Request flooding. Attacker sends a high volume of requests to exhaust gateway resources
Baseline control: Rate limiting
Mitigations:
  1. Per-user rate limits: throttle by authenticated identity
  2. Per-agent limits: protect individual agents from traffic spikes
  3. Backpressure: the gateway returns 429 and sheds load before agents are affected
Risk: Medium
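The per-user rate limiting above is commonly implemented as a token bucket keyed by authenticated identity. A sketch (capacity and refill rate are placeholders; the injectable clock exists only to make the behavior testable):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter keyed by authenticated user identity."""

    def __init__(self, capacity=10, refill_per_sec=1.0, clock=time.monotonic):
        self.capacity, self.refill, self.clock = capacity, refill_per_sec, clock
        self.buckets = {}  # user_id -> (tokens, last_seen_time)

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        tokens, last = self.buckets.get(user_id, (self.capacity, now))
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill)
        if tokens < 1:
            self.buckets[user_id] = (tokens, now)
            return False  # caller should map this to HTTP 429
        self.buckets[user_id] = (tokens - 1, now)
        return True
```

A per-agent limit is the same structure keyed by agent_id instead of user_id, applied after authZ so that throttling decisions are attributable.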

Phase 2: Gateway to agent

Focus: Securing the communication channel and preventing credential leakage

Asset

Threat

Baseline Controls

Mitigation Options

Risk

Agent credentials

Credential leak: Agent API keys or service account tokens are exposed in logs, error messages, or configuration

Credentials stored in secrets manager

1. Injection: Gateway injects agent credentials at request time; clients never see them

2. Rotation: Automate credential rotation; detect stale credentials

3. Redaction: Scrub credentials from all log outputs

High

Tool execution

Unauthorized tool call: User or injected prompt triggers a tool invocation (shell, SQL, HTTP) beyond the user's permitted scope, causing data modification or lateral movement

AuthZ policy evaluation

1. Tool allow-list: Gateway maintains per-role allow-lists of permitted tool types and blocks unlisted calls before they reach the agent

2. Structured intents: Agents emit structured tool-call intents (tool name, arguments as typed fields); gateway parses and validates before execution

3. Parameter validation: Gateway enforces argument constraints (e.g., allowed hostnames for HTTP fetch, read-only for SQL)

High

Agent identity

Agent spoofing: Rogue process registers as a legitimate agent and receives user requests

Agent authentication required

1. mTLS: Require client certificates for agent-to-gateway connections

2. Registration: Agents must be registered in the gateway's agent inventory before receiving traffic

3. Health checks: Gateway periodically verifies agent identity and liveness

Medium

Request integrity

Tampering: Man-in-the-middle modifies request payloads between gateway and agent

Private network / VPC

1. mTLS: Encrypt and authenticate all gateway-to-agent traffic

2. Signing: Gateway signs forwarded requests; agent verifies signature before processing

Low

Phase 3: Agent response to client (via the gateway)

Focus: Preventing data exfiltration and ensuring response integrity

Asset: Sensitive data
Threat: Data exfil via response. Agent returns API keys, database credentials, PII, or internal infrastructure details in its natural-language response
Baseline control: Outbound DLP scan
Mitigations:
  1. Pattern matching: scan for known secret formats (AWS keys, connection strings, SSNs)
  2. Redaction: replace matched patterns with [REDACTED] before delivering to the client
  3. Alerting: notify the security team when DLP rules trigger repeatedly for the same agent
Risk: High

Asset: Response scope
Threat: Over-fetching. Agent retrieves and returns data beyond what the user's role permits (e.g., the agent has broad DB access but the user should only see their own records)
Baseline control: AuthZ at the gateway
Mitigations:
  1. Scoped context: the gateway passes the user's permission scope to the agent as part of the request context
  2. Structured responses: where agents return structured data (JSON, SQL results), validate response fields against the user's authorized scope. For free-form text responses this control does not apply; rely on scoped agent credentials instead
  3. Least-privilege agents: run agents with minimal credentials scoped to the task
Risk: Medium

Asset: Audit trail
Threat: Silent exfil. Data leaves through agent tool calls (HTTP requests, file writes) rather than through the response path
Baseline control: Agent in a private network
Mitigations:
  1. Egress control: restrict agent outbound network access to an allowlist
  2. Tool-call logging: log all tool invocations and their arguments
  3. Alerting: alert on unexpected egress destinations or high-volume data tool calls
Risk: High

Asset: Client trust
Threat: Response injection. Agent response contains executable content (scripts, links) that the client renders unsafely
Baseline control: Client-side rendering controls
Mitigations:
  1. Content typing: the gateway sets the response content type to plain text / structured JSON
  2. Sanitization: strip HTML / script tags from natural-language responses
  3. Client hardening: client apps treat agent responses as untrusted content (no eval, no innerHTML)
Risk: Low
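A crude sanitization pass for the response-injection case can be a tag strip plus escaping. This is deliberately coarse, a defense-in-depth layer, not an HTML parser, and clients must still treat the result as untrusted:

```python
import html
import re

# Regex tag-stripping is a blunt instrument; it removes anything shaped like
# a tag, then escapes what remains so leftover angle brackets render inert.
TAG = re.compile(r"<[^>]+>")

def sanitize(text: str) -> str:
    """Strip tag-like spans and escape remaining HTML metacharacters."""
    return html.escape(TAG.sub("", text))
```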

If you use a sidecar or mesh instead

If you deploy sidecars instead of a centralized gateway, the threat profile shifts:

  • Policy staleness becomes a primary risk: Each sidecar enforces a local copy of the policy. If policy distribution lags, users may retain access after revocation or lose access prematurely

  • Audit aggregation is harder: Logs are distributed across sidecars. You need a reliable log shipping pipeline; gaps mean blind spots in incident response

  • Agent-to-agent isolation requires mesh policy: Without a central chokepoint, you rely on mesh-level network policy to prevent direct agent-to-agent communication. Misconfigured mesh policy can silently allow lateral movement

  • DLP consistency is harder to maintain: DLP rules must be distributed to every sidecar. Version drift between sidecars means inconsistent data inspection

  • Sidecar failure mode matters: If a sidecar loses contact with the policy distributor, it must fail closed (block all requests). A sidecar that fails open exposes the agent to unauthenticated traffic until policy is restored

The sidecar model is not less secure. It trades centralized simplicity for distributed resilience, but the operational cost of keeping distributed policy and DLP in sync is real.

Verification checklist

  • Authentication

    • Unauthenticated requests to the gateway return 401 before reaching any agent

    • Slack / chat integration requests are rejected if platform signature validation fails

    • mTLS clients with expired or unknown certificates are rejected at the TLS handshake

  • Authorization

    • A user with no explicit policy binding receives 403 when addressing any agent

    • Removing a user's role binding immediately prevents access on the next request (no stale cache)

    • Agent logical names are resolved by the gateway; direct agent endpoint addresses in client requests are rejected

  • Data inspection (inbound)

    • Oversized payloads return 413 before forwarding

    • Known prompt-injection patterns in request payloads trigger a block or flag (verified with test payloads)

  • Data inspection (outbound)

    • Agent responses containing test secret patterns (e.g., AKIA... AWS key format) are redacted before delivery to the client

    • DLP rule triggers are logged with the request_id, agent_id, and matched rule

  • Agent isolation

    • Agent A cannot send a request to Agent B without that request passing through the gateway's authZ and DLP pipeline

    • Only agents registered in the gateway's agent inventory receive traffic; unregistered agents are rejected

    • Agents cannot reach the gateway's admin API or policy store

  • Credential management

    • Agent credentials (API keys, service account tokens) never appear in gateway logs or client-visible error messages

    • Credential rotation requires zero gateway downtime

  • Audit and detection

    • Every authZ decision (allow and deny) is logged with user_id, agent_id, action, and policy rule

    • DLP scan results (pass, redact, block) are logged for every response

    • Alerts fire when a single user triggers more than N DLP blocks within a time window

Implementation & Review

The full threat model matrix, architectural diagrams, and a printable verification checklist for this pattern are available in the Secure Patterns repository. Use these artifacts to guide your design reviews and internal audits.
