Webhook Delivery Architecture: Securing the Trust Boundary

Webhook delivery requires your platform to make outbound HTTP requests to URLs your customers choose; if the sender doesn't validate destinations, you've built an SSRF proxy into your own control plane. This post documents safe defaults for both sides and their trade-offs.

System description

An event system detects state changes, signs the payload with a per-endpoint secret, and delivers an HTTP POST through an egress proxy that blocks internal network ranges. The customer's server verifies the signature, checks the timestamp, and processes the event idempotently.

Figure: Reference architecture for secure webhook delivery

Architecture choice

There are two common models for webhook dispatch, and the security trade-offs change depending on which one you pick.

Inline dispatch

The application sends the webhook synchronously in the request handling path. The webhook fires before the API response is returned.

Use this when:

You have very few subscribers and low event volume
Latency to customer endpoints is consistently low
You need the simplest possible implementation

Main risks: A slow or unresponsive customer endpoint blocks your request handling. A malicious subscriber that hangs for 30 seconds blocks your API call for 30 seconds, creating a denial-of-service vector. Retry logic in the request path compounds the problem.

Async dispatch (queue-based)

Events are published to a message queue. A separate dispatcher dequeues events, resolves the target endpoint, signs the payload, and sends the request through an egress proxy. Retries happen in the background with exponential backoff.

Use this when:

You have more than a handful of subscribers
You need retry guarantees without blocking the main application
You want to isolate webhook delivery failures from your core API

Trade-off: Adds infrastructure (queue, dispatcher workers, dead letter store). Delivery is eventually consistent; there is a delay between the event and the POST.

Default to async dispatch. Inline dispatch works for low-volume systems but offers no isolation between delivery failures and your core API.

Golden path

Build this first. Then relax constraints only if you have a specific reason:

Event fires → enqueue → resolve endpoint → sign payload (HMAC-SHA256) → POST through egress proxy → customer verifies signature + timestamp → process idempotently → return 2xx

Each step is a gate. A failure at any gate triggers either a retry (sender side) or a rejection (receiver side).

Related patterns:

Webhook receivers should process retries safely. See Designing API Idempotency Keys to Prevent Duplicate Writes for request fingerprints and stored results.
If webhook payloads point to uploaded files, pair this with Pre-signed URLs: The Secure Implementation Guide so storage access is short-lived and scanned before publish.
If a webhook triggers third-party API calls, see OAuth Token Storage: Securing Third-Party Credentials in Multi-Tenant SaaS.

Minimal system context

Event bus (queue): Decouples event production from delivery
Webhook dispatcher (delivery): Dequeues events, resolves endpoints, signs payloads, delivers via egress proxy
Endpoint registry (configuration): Stores customer URLs, signing secrets, and subscriptions
HMAC signer (credential): Signs payloads with per-endpoint secrets using HMAC-SHA256 over {webhook_id}.{timestamp}.{raw_body}. Both sides must operate on the exact same byte sequence; sign the raw bytes, not parsed JSON
Egress proxy (network boundary): Enforces destination restrictions independently of application code
Webhook endpoint (untrusted): Customer-controlled HTTP receiver that verifies signatures before processing
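
A minimal sketch of the HMAC signer described above, using only the Python standard library. The function name `sign_webhook`, the hex encoding of the signature, and the use of a Unix-seconds timestamp are assumptions for illustration; the post only fixes the algorithm (HMAC-SHA256) and the signed content layout.

```python
import hashlib
import hmac
import time
from typing import Dict, Optional


def sign_webhook(secret: bytes, webhook_id: str, raw_body: bytes,
                 timestamp: Optional[int] = None) -> Dict[str, str]:
    """Return the three webhook headers for one delivery attempt."""
    ts = int(time.time()) if timestamp is None else timestamp
    # Signed content is the exact byte sequence {webhook_id}.{timestamp}.{raw_body}.
    # The body is used as raw bytes; it is never parsed or re-serialized here.
    signed = webhook_id.encode() + b"." + str(ts).encode() + b"." + raw_body
    digest = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    return {
        "webhook-id": webhook_id,
        "webhook-timestamp": str(ts),
        "webhook-signature": digest,  # hex encoding is an assumption
    }
```

The dispatcher would send exactly the `raw_body` bytes it signed. Two bodies that parse to the same JSON but differ by a single space produce different signatures, which is why both sides work on bytes.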

Webhook delivery is at-least-once, so the same event can arrive more than once. The receiver stores processed webhook-id values and skips duplicates.

Threat model

Baseline assumptions

Customers are semi-trusted: they have accounts on your platform, but they control the webhook endpoint URL and can set it to anything
The webhook delivery network path is the public internet (HTTPS)
The signing secret is shared only between your platform and the specific customer endpoint
Webhook payloads contain event data that may include business-sensitive information but not raw credentials
The egress proxy and firewall are correctly configured and maintained
Standard infra controls such as TLS, WAF, API authentication, and database authentication are assumed to be in place; this model focuses on the webhook delivery pattern

A note on risk: you won’t fix everything

The tables below aren’t a checklist where every row must be fully eliminated. Focus on preventing the worst failures and limiting blast radius. In practice: ship prevention for the High rows first, then add monitoring and response for what you can’t realistically prevent.

Phase 1: Webhook registration

Focus: Preventing registration of malicious endpoints

Asset	Threat	Baseline Controls	Mitigation Options	Risk
Internal network	SSRF via malicious URL: Customer registers an internal IP or cloud metadata endpoint as their webhook URL	URL validation at registration	1. Block private / loopback / link-local IPs 2. Block internal domains 3. Require HTTPS 4. Egress proxy blocks internal ranges at network level	High
Platform availability	DDoS amplification: Attacker registers many endpoints pointing at a victim, then triggers mass events to flood the target from your infrastructure	Per-account endpoint cap (5-10)	1. Destination dedup: Limit subscriptions per destination URL across accounts 2. Rate limit test / ping endpoints 3. Per-destination delivery throttling	Medium
Signing secret	Secret extraction: Attacker uses API to repeatedly retrieve or enumerate signing secrets	Auth-gated access	1. Role-based secret visibility 2. Audit log every secret access 3. Rate limit secret retrieval API	Medium
Endpoint configuration	Unauthorized registration: Low-privilege user registers or modifies webhook URL to redirect events or extract signing secrets	API authentication	1. Role-based access: Restrict endpoint CRUD to admin roles 2. Re-validate URL on every update 3. Rotate secret on URL change 4. Audit log all endpoint mutations	Medium
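
As an illustration of the URL validation in the SSRF row, here is a sketch of a registration-time check built on Python's `ipaddress` module. The names `BLOCKED_NETWORKS`, `is_blocked_ip`, and `validate_webhook_url` are hypothetical, and the network list is an assumption that covers the ranges named in this post; it is not exhaustive. Hostnames that are not IP literals still need a DNS check, which this sketch leaves to a separate step.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumed block list: the ranges named in this post plus common IPv6 counterparts.
BLOCKED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918 private ranges
    "127.0.0.0/8", "::1/128",                          # loopback
    "169.254.0.0/16", "fe80::/10",                     # link-local, incl. 169.254.169.254
    "fc00::/7",                                        # IPv6 unique local
    "0.0.0.0/8",                                       # "this network"
)]


def is_blocked_ip(address: str) -> bool:
    """True if the IP literal falls in a blocked range. Raises ValueError if not an IP."""
    ip = ipaddress.ip_address(address)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so it cannot bypass the IPv4 rules.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return any(ip in net for net in BLOCKED_NETWORKS)


def validate_webhook_url(url: str) -> str:
    """Return the hostname if the URL passes the static checks, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("webhook URL must use HTTPS")
    host = parts.hostname
    if not host:
        raise ValueError("webhook URL has no host")
    try:
        blocked = is_blocked_ip(host)
    except ValueError:
        blocked = False  # not an IP literal; the resolved addresses are checked separately
    if blocked:
        raise ValueError("webhook URL points at a blocked address")
    return host
```

This check is only the application-layer half. The egress proxy remains the control that holds if this code is bypassed or wrong.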

Phase 2: Webhook delivery

Focus: Securing the outbound request path

Asset	Threat	Baseline Controls	Mitigation Options	Risk
Internal network	DNS rebinding: URL resolves to a public IP at validation but a private IP at delivery time	DNS check at registration	1. Re-resolve DNS at delivery time 2. Egress proxy blocks private ranges 3. Pin resolved IP for the delivery attempt 4. Block HTTP redirects	High
Dispatcher workers	Resource exhaustion: Malicious endpoint responds slowly, tying up worker threads	HTTP timeout (5-15s connect + read)	1. Concurrency cap: Limit concurrent deliveries per destination 2. Circuit breaker on consistently slow endpoints 3. Isolated worker pools per priority tier	Medium
Event data	Interception in transit: Webhook payload captured on the network	HTTPS-only registration (HTTP rejected)	1. Certificate validation: Always verify TLS certificates, never disable cert validation 2. Minimum TLS 1.2 3. Payload encryption: Encrypt sensitive fields inside the signed body for defense-in-depth	Low
Audit trail	Silent failure: Webhooks fail without visibility, causing data loss the customer cannot diagnose	Async retry with backoff	1. Log every delivery attempt with status code, latency, and event ID 2. Surface delivery status to customers via API or dashboard 3. Alert on sustained failures 4. Dead letter after max attempts for controlled replay	Medium
Dispatcher integrity	Retry storm: Persistent failure or malformed event causes unbounded retries, consuming dispatcher capacity	Retry cap (5-8 attempts) with exponential backoff and jitter	1. Dead-letter: Route failed events after max attempts for manual replay 2. Deduplicate events by ID before dispatch 3. Circuit breaker on persistently failing endpoints	Medium
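
The DNS rebinding mitigations (re-resolve at delivery time, then pin the resolved IP) could look roughly like the sketch below. `resolve_and_pin` and its injectable `resolver` parameter are hypothetical names, and using `ipaddress`'s `is_global` property as the definition of "public" is an assumption; a production block list may be stricter.

```python
import ipaddress
import socket
from typing import Callable


def _is_public(address: str) -> bool:
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False  # unparseable addresses are treated as blocked
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    # is_global is False for private, loopback, link-local, and other reserved ranges.
    return ip.is_global


def resolve_and_pin(host: str, port: int = 443,
                    resolver: Callable = socket.getaddrinfo) -> str:
    """Resolve host now and return one IP to connect to for this delivery attempt.

    Raises ValueError if resolution fails or ANY resolved address is non-public.
    """
    try:
        infos = resolver(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        raise ValueError(f"DNS resolution failed for {host}") from exc
    addresses = [info[4][0] for info in infos]
    if not addresses or not all(_is_public(a) for a in addresses):
        raise ValueError(f"{host} resolves to a blocked address")
    return addresses[0]
```

The caller would then connect to the returned IP while still presenting the original hostname for TLS verification, so that no second DNS lookup happens between the check and the connection.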

Phase 3: Webhook verification (receiver side)

Focus: Preventing the customer from processing forged or replayed webhooks

Asset	Threat	Baseline Controls	Mitigation Options	Risk
Business logic	Forged webhook: Attacker sends a fake POST to the customer's endpoint to trigger unauthorized actions (e.g., fake "payment completed" event)	HMAC signature verified on every request; unsigned requests rejected	1. Constant-time comparison: Prevent timing side-channels in signature check 2. Raw bytes: Verify against raw body bytes, not parsed/re-serialized JSON 3. Allow-list source IPs: Restrict inbound to your platform's published egress range	High
Business logic	Replay attack: Attacker captures a legitimate signed webhook and re-sends it later to re-trigger processing	Timestamp in signature	1. Reject timestamps older than 5 minutes 2. Deduplicate on `webhook-id` 3. Store processed IDs with a TTL covering the retry window	Medium
Signing secret	Secret compromise: Attacker obtains the signing secret and can forge arbitrary webhooks	Encrypted secret per endpoint	1. Rotate immediately on suspected compromise 2. Support dual-secret rotation window 3. Role-gated secret visibility	High
Endpoint availability	Volumetric flooding: Attacker sends high volume of requests to the webhook endpoint, saturating network or application capacity	Signature verified before any processing	1. Rate limit by source IP at the edge 2. Async processing: Return 200 quickly, process in a background queue 3. Place endpoint behind a CDN or load balancer with rate limiting 4. Allow-list source IPs: Restrict inbound to your platform's published egress range	Medium
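
A receiver-side sketch that combines the forged-webhook and replay controls: raw-byte HMAC verification, constant-time comparison, and a 5-minute timestamp window. The function name `verify_webhook`, lowercase header keys, and a hex-encoded signature are assumptions. Rejecting timestamps too far in the future as well as the past is also an assumption that goes slightly beyond the table.

```python
import hashlib
import hmac
import time
from typing import Mapping, Optional

TOLERANCE_SECONDS = 300  # the 5-minute window


def verify_webhook(secret: bytes, headers: Mapping[str, str], raw_body: bytes,
                   now: Optional[float] = None) -> bool:
    """Return True only if all headers are present, the signature matches, and the timestamp is fresh."""
    webhook_id = headers.get("webhook-id")
    timestamp = headers.get("webhook-timestamp")
    signature = headers.get("webhook-signature")
    if not (webhook_id and timestamp and signature):
        return False
    try:
        ts = int(timestamp)
    except ValueError:
        return False
    # Rebuild the signed content from the raw request bytes, not from parsed JSON.
    signed = webhook_id.encode() + b"." + timestamp.encode() + b"." + raw_body
    expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time with respect to the contents.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return False
    current = time.time() if now is None else now
    return abs(current - ts) <= TOLERANCE_SECONDS
```

Because the timestamp is part of the signed content, an attacker cannot refresh it on a captured request without invalidating the signature.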

If you use inline dispatch

If you skip the queue and send webhooks synchronously in the request path, the threat profile shifts:

Resource exhaustion moves from Medium to High: a slow endpoint blocks your API, not just a background worker
Retry becomes harder. Retrying in the request path multiplies latency and ties up threads for the duration of every attempt
Isolation disappears. A webhook delivery surge affects your core API latency directly, not an isolated dispatcher pool
Dead-letter recovery is harder to implement without a queue to catch failures

Inline dispatch works for low-volume systems, but accept that every delivery failure hits your API directly instead of a background worker.

FAQs

Why should webhook delivery be asynchronous?

Asynchronous delivery isolates customer endpoint failures from your core API. A slow, failing, or malicious webhook receiver should not block user-facing request handling.

How should webhook receivers prevent replay attacks?

Verify the HMAC signature, check the timestamp window, and store processed event IDs. If the same event arrives again, return the original result or acknowledge it without repeating side effects.
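
The "store processed event IDs" step might be sketched as follows. `ProcessedIds` is a hypothetical in-memory class used only to show the TTL logic; a real receiver would more likely use a shared store with an atomic insert-if-absent operation so that concurrent workers cannot both claim the same ID.

```python
import time
from typing import Dict, Optional


class ProcessedIds:
    """In-memory record of handled webhook-id values with a time-to-live."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds  # should cover the sender's full retry window
        self._expires: Dict[str, float] = {}

    def first_time(self, webhook_id: str, now: Optional[float] = None) -> bool:
        """Record the ID and return True if unseen or expired; return False for a duplicate."""
        current = time.monotonic() if now is None else now
        # Drop entries whose TTL has passed so the store does not grow without bound.
        self._expires = {k: v for k, v in self._expires.items() if v > current}
        if webhook_id in self._expires:
            return False
        self._expires[webhook_id] = current + self.ttl
        return True
```

A handler would call `first_time` after signature verification: on `True` it processes the event, on `False` it acknowledges with a 2xx and does nothing else.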

Verification checklist

Signing (sender)
- Every outgoing webhook includes webhook-id, webhook-timestamp, and webhook-signature headers
- Signature is HMAC-SHA256 over {webhook_id}.{timestamp}.{raw_body}
- Each endpoint has its own signing secret with at least 256 bits of entropy
- Rotation supports two concurrent active secrets
- Secrets never appear in logs or error responses. Dashboard access is auth-gated and audit-logged
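
One possible shape for the dual-secret rotation window: during rotation the sender signs with both active secrets, and the receiver accepts the request if its own secret matches any signature in the header. The space-separated header format and the names `sign_all` and `matches_any` are assumptions for illustration, not something this post prescribes.

```python
import hashlib
import hmac
from typing import Iterable


def sign_all(secrets: Iterable[bytes], signed_content: bytes) -> str:
    """Sign with every active secret and join the hex signatures with spaces."""
    return " ".join(
        hmac.new(secret, signed_content, hashlib.sha256).hexdigest()
        for secret in secrets
    )


def matches_any(secret: bytes, signed_content: bytes, header_value: str) -> bool:
    """True if the receiver's secret matches any signature carried in the header."""
    expected = hmac.new(secret, signed_content, hashlib.sha256).hexdigest().encode()
    # Evaluate every candidate before combining, so all comparisons always run.
    results = [
        hmac.compare_digest(expected, candidate.encode())
        for candidate in header_value.split()
    ]
    return any(results)
```

With this layout a receiver can switch from the old secret to the new one at any point in the window without dropping deliveries.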
SSRF prevention (sender)
- Registration requires HTTPS and rejects any URL that resolves to private, loopback, link-local, or internal ranges (including 169.254.169.254, 127.0.0.1, [::1], and fc00::/7)
- DNS re-resolved before each delivery; if any resolved address falls in a blocked range or resolution fails, delivery is rejected and logged
- HTTP redirects blocked, or every redirect target re-validated against the same rules
- Dispatcher runs behind an egress proxy. Network policy enforces this, not application config
- Resolved IP pinned per delivery attempt to prevent rebinding between resolution and connection
- Dispatcher cannot reach RFC 1918 ranges, loopback, or link-local addresses even if application-layer checks are bypassed
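
For the "HTTP redirects blocked" item, a sketch using Python's standard `urllib`: a redirect handler that declines every redirect, so a 3xx response surfaces as an error instead of being followed. `NoRedirect` and `post_without_redirects` are hypothetical names, and treating a 3xx as a failed delivery is an assumption about the dispatcher's policy.

```python
import urllib.error
import urllib.request
from typing import Dict


class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None declines the redirect; urllib then raises HTTPError for the 3xx.
        return None


# build_opener swaps the default redirect handler for the subclass passed in.
opener = urllib.request.build_opener(NoRedirect)


def post_without_redirects(url: str, body: bytes, headers: Dict[str, str],
                           timeout: float = 10.0) -> int:
    """POST the payload and return the status code; 3xx responses are never followed."""
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # includes 3xx codes, which the caller treats as failures
```

If redirects must be supported, the alternative in the checklist applies: each redirect target goes through the same URL and DNS checks as the original destination.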
Delivery (sender)
- Non-2xx responses, timeouts, and connection errors trigger retry with exponential backoff and jitter
- HTTP client timeout is 15 seconds or less
- Retries capped at 5-8 attempts over 24-48 hours
- Failed deliveries retained for inspection and controlled replay
- Every delivery attempt logged with status code, latency, and event ID
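
A sketch of the retry policy in this list, using the "full jitter" variant of exponential backoff (a uniform random delay between zero and an exponentially growing ceiling). The base delay, the cap, the default of 6 attempts, and the convention that a `None` status means a timeout or connection error are all assumptions chosen for illustration.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 60.0, cap: float = 8 * 3600.0,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to wait after the zero-based `attempt` failed, with full jitter."""
    rng = rng if rng is not None else random.Random()
    ceiling = min(cap, base * (2 ** attempt))
    # Jitter spreads retries out so failing endpoints are not hit in synchronized waves.
    return rng.uniform(0, ceiling)


def should_retry(status: Optional[int], attempt: int, max_attempts: int = 6) -> bool:
    """Decide whether to schedule another attempt after the zero-based `attempt`.

    `status` is the HTTP status code, or None for a timeout or connection error.
    """
    if attempt + 1 >= max_attempts:
        return False  # attempts exhausted; route the event to the dead-letter store
    return status is None or not (200 <= status < 300)
```

When `should_retry` returns `False` for a failed delivery, the event is retained for inspection and controlled replay instead of being dropped.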
Abuse prevention (sender)
- Endpoints per account capped (e.g., 5-10)
- Test / ping endpoints rate-limited
- Per-destination delivery rate capped
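
The per-destination delivery cap can be approximated with a token bucket keyed by destination host, so the limit applies across accounts that point at the same target. `DestinationThrottle` is a hypothetical single-process sketch; a dispatcher with several workers would need the bucket state in a shared store.

```python
import time
from typing import Dict, Optional, Tuple


class DestinationThrottle:
    """Token bucket per destination host: `rate_per_second` refill, `burst` capacity."""

    def __init__(self, rate_per_second: float, burst: float) -> None:
        self.rate = rate_per_second
        self.burst = burst
        self._buckets: Dict[str, Tuple[float, float]] = {}  # host -> (tokens, last update)

    def allow(self, host: str, now: Optional[float] = None) -> bool:
        """Consume one token for `host` if available; otherwise the delivery should wait."""
        current = time.monotonic() if now is None else now
        tokens, last = self._buckets.get(host, (self.burst, current))
        # Refill for the elapsed time, never exceeding the burst capacity.
        tokens = min(self.burst, tokens + (current - last) * self.rate)
        if tokens < 1:
            self._buckets[host] = (tokens, current)
            return False
        self._buckets[host] = (tokens - 1, current)
        return True
```

A delivery that is not allowed would be re-queued with a short delay, not counted as a failed attempt.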
Verification (receiver)
- Every incoming request is signature-verified before any processing
- Constant-time comparison used for signature check
- Verification runs against raw request body bytes, not parsed JSON
- Negative test: a re-serialized (key-reordered) JSON body fails signature verification
- Requests with timestamps older than 5 minutes rejected
- Missing webhook-id, webhook-timestamp, or webhook-signature headers return 401
Idempotency (receiver)
- webhook-id stored and checked on every request; duplicates acknowledged but not re-processed
- Processed event IDs stored with a TTL covering the retry window
Endpoint hardening (receiver)
- Webhook endpoint returns 200 quickly and processes asynchronously
- Response bodies do not expose internal error details
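
To illustrate "returns 200 quickly and processes asynchronously", here is a framework-free sketch that acknowledges a verified request immediately and hands the body to a background worker. `AsyncReceiver` is a hypothetical class, and the in-process `queue.Queue` is an assumption made to keep the example short; events held only in memory are lost on a crash, so a durable queue is the safer choice in practice.

```python
import queue
import threading
from typing import Callable


class AsyncReceiver:
    """Acknowledge verified webhooks at once and process them on a worker thread."""

    def __init__(self, process: Callable[[bytes], None]) -> None:
        self._queue: "queue.Queue[bytes]" = queue.Queue()
        self._process = process
        threading.Thread(target=self._worker, daemon=True).start()

    def handle(self, verified: bool, raw_body: bytes) -> int:
        """Return the HTTP status to send; `verified` is the signature check result."""
        if not verified:
            return 401  # reject before any processing, with no detail in the response
        self._queue.put(raw_body)
        return 200

    def _worker(self) -> None:
        while True:
            body = self._queue.get()
            try:
                self._process(body)
            except Exception:
                pass  # record internally; errors are never echoed back to the sender
            finally:
                self._queue.task_done()

    def drain(self) -> None:
        """Block until every queued event has been processed (useful in tests)."""
        self._queue.join()
```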

Implementation & Review

The full threat model matrix, architectural diagrams, and a printable verification checklist for this pattern are available in the Secure Patterns repository. Use these artifacts to guide your design reviews and internal audits.
