Kubernetes Workload Identity Guide: Secure OIDC Federation

K8s in 30 seconds: A pod is your application running as one or more containers on a Kubernetes cluster. A workload is a higher-level object (like a Deployment or StatefulSet) that manages the lifecycle, scaling, and desired state of those pods. A namespace is an isolation boundary within the cluster that separates teams or environments. A ServiceAccount is the identity object assigned to a pod; it determines which tokens the pod receives.

Workload identity federation lets a pod authenticate to cloud services (AWS, GCP, Azure) using a short-lived Kubernetes ServiceAccount token instead of a static IAM key. If the federation is misconfigured (wrong audience, over-scoped IAM binding, or missing namespace constraint), pods outside the intended workload can assume its cloud role. This post documents a safe default and its trade-offs.

System description

In this pattern, a pod proves its identity to a cloud provider without static keys. Kubernetes issues a short-lived, signed token; the cloud verifies that token against the cluster's public keys and, if the token matches a set of trust rules, exchanges it for temporary cloud credentials. The cluster acts as an OIDC issuer, publishing public keys (JWKS) that let the cloud verify your pod's tokens cryptographically, without a direct connection to the Kubernetes API.

Minimal system context

ServiceAccount (identity): One per workload. The namespace / serviceaccount pair is the identity the cloud side binds to. Avoid sharing ServiceAccounts across different workloads to prevent one compromised app from hijacking the cloud access of another
Projected token volume (credential): A kubelet-managed JWT mounted into the pod filesystem. The token is signed, auto-rotated, and scoped to a specific audience (the aud claim) defined in the pod spec
OIDC issuer (verification): The cluster exposes a /.well-known/openid-configuration endpoint and a JWKS URI that cloud IAM uses to verify token signatures without calling the Kubernetes API
Cloud trust policy (authorization): An IAM policy that specifies which issuer, audience, namespace, and ServiceAccount name are allowed to assume a given cloud role
STS (token exchange): The cloud's Security Token Service (STS) that validates the token against the trust policy and returns temporary credentials

Golden path

Build this first. Then relax constraints only if you have a specific reason:

Dedicated ServiceAccount per workload → projected volume mounts short-lived JWT → explicit audience on token → namespace + SA pinning in trust policy → session duration matches token TTL

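Dedicated ServiceAccount per workload: Never reuse the default ServiceAccount. Every pod in the namespace that does not name its own ServiceAccount runs as default, so a trust policy pinned to it cannot distinguish one workload from another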
Projected volume mounts short-lived JWT: The kubelet handles the token lifecycle. No static Kubernetes Secrets involved
Explicit audience on token: Set the audience field within the projected volume configuration to the exact STS endpoint (e.g., sts.amazonaws.com)
Namespace + SA pinning in trust policy: Verify both the namespace and ServiceAccount name in the cloud IAM policy
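Session duration matches token TTL: Keep the cloud session lifetime as close to the projected token's expirationSeconds as the provider allows. On AWS, a role's maximum session duration cannot be set below 1 hour, so request a shorter session (DurationSeconds, minimum 15 minutes) at exchange time when the token TTL is shorter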
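
As a rough illustration of the first three steps, the sketch below builds a pod manifest as a Python dict and prints it as JSON, which kubectl accepts alongside YAML. The namespace payments, the ServiceAccount payments-api, the image, the volume name, and the 900-second TTL are hypothetical placeholders; the serviceAccountToken fields (audience, expirationSeconds, path) are the standard projected-volume fields.

```python
import json


def pod_with_projected_token(namespace, service_account, audience, ttl_seconds):
    """Build a pod manifest that mounts an audience-scoped, short-lived token."""
    if ttl_seconds < 600:
        # Kubernetes rejects expirationSeconds below 600 (10 minutes).
        raise ValueError("expirationSeconds must be at least 600")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"{service_account}-example", "namespace": namespace},
        "spec": {
            "serviceAccountName": service_account,
            # Skip the default token mount; only the projected token is needed.
            "automountServiceAccountToken": False,
            "containers": [{
                "name": "app",
                "image": "registry.example.com/app:1.0",  # hypothetical image
                "volumeMounts": [{
                    "name": "cloud-token",
                    "mountPath": "/var/run/secrets/tokens",
                    "readOnly": True,
                }],
            }],
            "volumes": [{
                "name": "cloud-token",
                "projected": {"sources": [{
                    "serviceAccountToken": {
                        "audience": audience,
                        "expirationSeconds": ttl_seconds,
                        "path": "cloud-token",
                    }
                }]},
            }],
        },
    }


manifest = pod_with_projected_token(
    "payments", "payments-api", "sts.amazonaws.com", 900
)
print(json.dumps(manifest, indent=2))
```

In a real deployment the same volume definition would normally live in the pod template of a Deployment or StatefulSet rather than in a bare Pod.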

Each step is a gate. If any gate fails (wrong audience, unrecognized issuer, or namespace mismatch), the exchange is denied and no cloud credentials are issued.
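
The gates can be modelled in a few lines. The sketch below is a simplified model of the decision, not any cloud provider's implementation: it assumes the token signature has already been verified against the cluster's JWKS, and TrustRule, evaluate_exchange, and the example issuer URL are hypothetical names introduced for illustration.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustRule:
    issuer: str
    audience: str
    namespace: str
    service_account: str


def evaluate_exchange(claims, rule, now=None):
    """Return (allowed, reason) for already signature-verified token claims."""
    now = time.time() if now is None else now
    expected_sub = f"system:serviceaccount:{rule.namespace}:{rule.service_account}"
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    if claims.get("iss") != rule.issuer:
        return False, "unrecognized issuer"
    if rule.audience not in audiences:
        return False, "wrong audience"
    if claims.get("sub") != expected_sub:
        return False, "namespace or ServiceAccount mismatch"
    if claims.get("exp", 0) <= now:
        return False, "token expired"
    return True, "ok"


rule = TrustRule(
    issuer="https://oidc.example.com/cluster-1",  # hypothetical issuer URL
    audience="sts.amazonaws.com",
    namespace="payments",
    service_account="payments-api",
)
claims = {
    "iss": "https://oidc.example.com/cluster-1",
    "aud": ["sts.amazonaws.com"],
    "sub": "system:serviceaccount:payments:payments-api",
    "exp": 2_000,
}
print(evaluate_exchange(claims, rule, now=1_000))   # (True, 'ok')

# Same ServiceAccount name, different namespace: the exchange is denied.
spoofed = dict(claims, sub="system:serviceaccount:staging:payments-api")
print(evaluate_exchange(spoofed, rule, now=1_000))  # (False, 'namespace or ServiceAccount mismatch')
```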

Related patterns:

If the workload stores customer OAuth credentials, combine workload identity with OAuth Token Storage: Securing Third-Party Credentials in Multi-Tenant SaaS.
If the workload serves files from object storage, see Multi-Tenant File Sharing: Secure Control Plane Architecture for tenant-aware authorization above S3 or GCS.
If the workload is an agent runtime, see AI Agent Gateway: The Authorization Chokepoint.

Security properties of projected ServiceAccount tokens

A projected ServiceAccount token binds:

The issuer (cluster OIDC URL)
The subject (system:serviceaccount:<namespace>:<name>)
The audience (explicit string, e.g., sts.amazonaws.com)
A short expiration (default varies; typically 1h, configurable down to 10m)
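
To see these claims for yourself, you can decode the payload segment of a token. The sketch below uses only the standard library and does not verify the signature, so treat it as a debugging aid rather than a validation step. The sample token is fabricated and unsigned, and its issuer URL, namespace, and ServiceAccount name are hypothetical; inside a pod you would read the token from the projected volume's mount path instead.

```python
import base64
import json


def decode_claims(jwt):
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = jwt.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(obj):
    """Encode a dict the way JWT segments are encoded (helper for the sample)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Fabricated, unsigned sample so the example runs outside a cluster.
sample = ".".join([
    _b64url({"alg": "RS256"}),
    _b64url({
        "iss": "https://oidc.example.com/cluster-1",
        "sub": "system:serviceaccount:payments:payments-api",
        "aud": ["sts.amazonaws.com"],
        "exp": 1700003600,
    }),
    "signature",
])

claims = decode_claims(sample)
for name in ("iss", "sub", "aud", "exp"):
    print(name, "=", claims[name])
```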

It does not guarantee that:

The cloud IAM role is scoped to the minimum permissions the workload needs
Only the intended pod mounts the token (any pod using the same ServiceAccount gets it)
The OIDC issuer endpoint is access-controlled (public by default on most managed clusters)
- While the endpoint is public, it only serves public keys (JWKS); the private keys never leave the cluster control plane

That's why the trust policy must bind to namespace + ServiceAccount name, and the IAM role must be scoped to the narrowest permission set the workload requires.
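
For AWS, a trust policy that follows this rule might look like the sketch below, which builds the policy document as a Python dict. The account ID, issuer host, namespace, and ServiceAccount name are hypothetical placeholders, and aws_trust_policy is a helper invented for this example; other clouds express the same pinning with their own syntax.

```python
import json


def aws_trust_policy(account_id, issuer_host, namespace, service_account,
                     audience="sts.amazonaws.com"):
    """Build an IAM role trust policy pinned to one namespace/ServiceAccount pair.

    issuer_host is the cluster's OIDC issuer URL without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{issuer_host}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                # Exact matches only: no StringLike, no wildcards.
                "StringEquals": {
                    f"{issuer_host}:aud": audience,
                    f"{issuer_host}:sub":
                        f"system:serviceaccount:{namespace}:{service_account}",
                }
            },
        }],
    }


policy = aws_trust_policy(
    account_id="111122223333",                                  # placeholder account
    issuer_host="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",  # placeholder issuer
    namespace="payments",
    service_account="payments-api",
)
print(json.dumps(policy, indent=2))
```

The trust policy only controls who may assume the role. The permissions the role grants are defined separately and still need their own least-privilege review.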

Threat model

Baseline assumptions

The Kubernetes cluster is managed (EKS, GKE, AKS) and the control plane is operated by the cloud provider
RBAC is enforced. Developers cannot create or modify ServiceAccounts in namespaces they don't own
- If this assumption weakens, which is common in orgs without admission control (the cluster's gatekeeper that validates or rejects API requests), every row in the threat table below is affected, especially the cross-namespace confused deputy threat
Network policies (rules controlling pod-to-pod traffic) and pod security standards (restrictions on what pods can do, e.g., running as root) are in place. This model does not cover container escapes or node compromise
Standard infra controls such as TLS, etcd (the cluster's configuration database) encryption, audit logging, and admission control are assumed to be in place. This model focuses on the workload-to-cloud authentication path

A note on risk: you won’t fix everything

This table isn’t a checklist where every row must be fully eliminated. Focus on preventing the worst failures and limiting blast radius. In practice: ship prevention for the High rows first, then add monitoring and response for what you can’t realistically prevent.

Asset	Threat	Baseline Controls	Mitigation Options	Risk
Cloud IAM role	Over-scoped binding: Trust policy uses wildcard or omits namespace condition, allowing any ServiceAccount in the cluster to assume the role	Trust policy exists	1. Namespace + SA pinning: Always include both `namespace` and `serviceaccount` conditions in the trust policy 2. Policy-as-code: Use OPA / Gatekeeper or cloud IAM policy linting in CI to reject wildcard conditions 3. Audit: Periodically enumerate trust policies and flag any that lack namespace constraints	High
Cloud IAM role	Confused deputy (cross-namespace): Attacker creates a ServiceAccount with the same name in a different namespace and assumes the role	RBAC on SA creation	1. Namespace binding: Include the namespace claim in the trust policy condition (not just the SA name) 2. Admission control: Restrict ServiceAccount creation in sensitive namespaces to specific principals 3. Drift detection: Alert on ServiceAccounts created manually (outside of CI/CD pipelines) in production namespaces	High
Projected token	Token theft from pod filesystem: Attacker with exec access reads `/var/run/secrets/tokens/` and replays the token from outside the cluster	Projected volumes	1. Short TTL: Set `expirationSeconds` to the minimum your SDK supports (e.g., 15–30 min) 2. Network restriction: Cloud trust policy can include a source IP / VPC condition where supported 3. Detection: Alert on `AssumeRoleWithWebIdentity` calls from unexpected source IPs	Medium
Token audience	Audience misconfiguration: Token is issued with a broad or default audience, making it valid for unintended cloud trust policies	Projected volumes (the audience field is optional and defaults to the API server's audience when omitted)	1. Explicit pinning: Always set the `audience` field in the projected volume to the exact STS endpoint (e.g., `sts.amazonaws.com`) 2. Validation: Admission webhook rejects pod specs where projected token audience is missing or set to a wildcard	Medium
OIDC issuer	Issuer spoofing: Attacker with `iam:CreateOpenIDConnectProvider` (or equivalent) access registers a rogue OIDC issuer URL in a cloud trust policy, issuing tokens that the cloud accepts	Cloud trust policy specifies issuer URL	1. Issuer inventory: Maintain an allowlist of known OIDC issuer URLs per cloud account 2. Policy review: Flag trust policies that reference unrecognized issuer URLs 3. Access control: Restrict who can create or modify identity provider configurations in cloud IAM	Medium
IAM permissions	Blast radius: The cloud IAM role granted via federation has far more permissions than the workload needs	Role exists per workload	1. Least privilege: Scope IAM policies to specific resources (ARNs, project IDs) and actions 2. Separate roles: One IAM role per workload, not per namespace or per cluster 3. Access Analyzer: Use cloud-native tools (IAM Access Analyzer, Policy Troubleshooter) to identify unused permissions	Low
Credential refresh	Revocation gap: Cached cloud credentials persist after a pod's ServiceAccount binding is changed (e.g., during incident response), maintaining access until the session expires	Standard SDK behavior (auto-refresh based on TTL)	1. Short session duration: Keep the cloud session duration as close to the projected token TTL as the provider allows to shrink the revocation window 2. Pod restart: Terminate affected pods immediately after SA binding changes to force credential re-acquisition 3. Rotation testing: Include SA rebinding and credential expiry in integration tests	Low
Workload availability	OIDC issuer drift: Cluster upgrade or recreation changes the OIDC issuer URL or rotates signing keys, breaking all cloud trust policies silently	Managed OIDC endpoint	1. Pre-flight check: Verify OIDC issuer URL stability before cluster operations 2. Integration test: Run STS exchange test in CI after cluster changes 3. Monitoring: Alert on STS authentication failure rate spikes	Medium
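
The policy-as-code and audit mitigations in the first rows can start as a small lint over trust policy documents. The sketch below assumes AWS-style policy JSON and covers only the common shape (an Allow statement that names sts:AssumeRoleWithWebIdentity explicitly); find_overscoped_statements and the example issuer host are hypothetical, and a production check would also need to handle action wildcards such as sts:*.

```python
def find_overscoped_statements(policy):
    """Return findings for web-identity statements that are not pinned tightly."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    for index, stmt in enumerate(statements):
        actions = stmt.get("Action", [])
        actions = [actions] if isinstance(actions, str) else actions
        if stmt.get("Effect") != "Allow":
            continue
        if "sts:AssumeRoleWithWebIdentity" not in actions:
            continue
        conditions = stmt.get("Condition", {})
        # Only exact-match conditions count as pinning a claim.
        exact = conditions.get("StringEquals", {})
        pinned = {key.rsplit(":", 1)[-1] for key in exact}
        for claim in ("aud", "sub"):
            if claim not in pinned:
                findings.append(
                    f"statement {index}: no exact-match condition on '{claim}'"
                )
        # Flag wildcard characters in any condition value.
        for operator, pairs in conditions.items():
            for key, value in pairs.items():
                values = [value] if isinstance(value, str) else value
                if any("*" in v or "?" in v for v in values):
                    findings.append(
                        f"statement {index}: wildcard in {operator} {key}"
                    )
    return findings


loose = {
    "Statement": [{
        "Effect": "Allow",
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
            "StringEquals": {"oidc.example.com/cluster-1:aud": "sts.amazonaws.com"},
            # Any namespace may use this ServiceAccount name: the confused deputy case.
            "StringLike": {
                "oidc.example.com/cluster-1:sub": "system:serviceaccount:*:payments-api"
            },
        },
    }]
}
for finding in find_overscoped_statements(loose):
    print(finding)
```

Run in CI against every trust policy in the repository, a check like this fails the build before a wildcard binding reaches production.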

If you use static IAM keys instead

If you skip workload identity and store IAM access keys in Kubernetes Secrets:

Key rotation becomes a manual operational burden; missed rotations leave long-lived credentials active indefinitely
A single leaked Secret (etcd backup, misconfigured RBAC, log exposure) grants persistent cloud access with no expiry
There is no namespace or ServiceAccount binding; any pod that mounts the Secret gets the same permissions
Blast radius increases because static keys are typically broader-scoped and harder to audit than per-workload federated roles
Revocation requires generating new keys and redeploying every consumer, rather than simply updating a trust policy

FAQs

Why is workload identity safer than static IAM keys?

Workload identity uses short-lived, audience-scoped tokens instead of long-lived secrets stored in containers, CI variables, or config files. If a pod token leaks, its lifetime and trust policy should limit the blast radius.

What should a cloud trust policy pin for Kubernetes workload identity?

Pin the issuer, audience, namespace, and ServiceAccount name. Avoid broad trust policies that allow any pod in the cluster, or any ServiceAccount in a namespace, to assume the same cloud role.

Verification checklist

Identity binding
- Each workload has a dedicated ServiceAccount (no reuse of the default SA)
- Cloud trust policies include both namespace and serviceaccount conditions (not just SA name)
- Creating a ServiceAccount with the same name in a different namespace does not grant access to another workload's cloud role
Token configuration
- Projected token audience is set to the exact STS endpoint string (e.g., sts.amazonaws.com, not blank or *)
- Projected token expirationSeconds is set explicitly and is less than or equal to 3600s
- Pods that need only the projected token do not also mount the default ServiceAccount token (set automountServiceAccountToken: false to disable the automatic mount)
IAM policy scoping
- IAM Access Analyzer (or equivalent) reports no unused permissions for the workload's role
- No trust policy uses wildcard conditions on namespace, ServiceAccount, or audience claims
- IAM roles are not shared across workloads with different permission requirements
Credential lifecycle
- The application SDK refreshes credentials before the projected token expires (observable in logs or metrics)
- Changing a workload's ServiceAccount and restarting its pods makes the old cloud role inaccessible within one token TTL
- Deleting the trust policy immediately blocks new token exchanges (existing sessions drain within max session duration)
Detection
- Cloud audit logs capture every AssumeRoleWithWebIdentity (or equivalent) call with the source ServiceAccount identity
- Alerts fire on token exchange attempts from unexpected namespaces or ServiceAccount names
- A query of all cloud trust policies returns only known, documented OIDC issuer URLs
- STS authentication failure rate is monitored, and alerts fire on spikes correlated with cluster operations
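
Several of the token configuration checks above can be automated against pod specs, for example in CI or as the core of an admission webhook. The sketch below is one possible shape, assuming the spec is available as a parsed dict; check_pod_spec is a hypothetical helper, and it looks only at pod-level fields (automountServiceAccountToken can also be set on the ServiceAccount object, which this check does not see).

```python
def check_pod_spec(spec, expected_audience, max_ttl=3600):
    """Return a list of problems with a pod spec's token configuration."""
    problems = []
    if spec.get("serviceAccountName", "default") == "default":
        problems.append("pod uses the default ServiceAccount")
    if spec.get("automountServiceAccountToken", True):
        problems.append("automountServiceAccountToken is not set to false")
    # Collect every projected ServiceAccount token source in the spec.
    tokens = [
        source["serviceAccountToken"]
        for volume in spec.get("volumes", [])
        for source in volume.get("projected", {}).get("sources", [])
        if "serviceAccountToken" in source
    ]
    if not tokens:
        problems.append("no projected ServiceAccount token volume")
    for token in tokens:
        audience = token.get("audience")
        if audience != expected_audience:
            problems.append(
                f"audience is {audience!r}, expected {expected_audience!r}"
            )
        ttl = token.get("expirationSeconds")
        if ttl is None or ttl > max_ttl:
            problems.append(
                f"expirationSeconds is {ttl!r}, expected an explicit value <= {max_ttl}"
            )
    return problems


risky_spec = {
    "serviceAccountName": "payments-api",  # hypothetical ServiceAccount
    "volumes": [{
        "name": "cloud-token",
        "projected": {"sources": [{"serviceAccountToken": {"path": "cloud-token"}}]},
    }],
}
for problem in check_pod_spec(risky_spec, "sts.amazonaws.com"):
    print(problem)
```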

Implementation & Review

The full threat model matrix, architectural diagrams, and a printable verification checklist for this pattern are available in the Secure Patterns repository. Use these artifacts to guide your design reviews and internal audits.
