Container and Kubernetes Security — System Design Space

Container and Kubernetes security is not a single manifest flag — it is a chain of controls: from which image enters the cluster, to what a workload is allowed to do at runtime and on the network. Containers share the host kernel, so weak isolation of one pod puts its neighbors and the node itself at risk.

Adjacent chapters cover related ground: application-level risks in OWASP Top 10 in System Design, provenance and artifact signing in Supply Chain Security, and how Kubernetes works in Kubernetes Fundamentals. The focus here is narrow: securing the containers and the cluster themselves — images, admission control, runtime isolation, networking, and detection.

A useful frame is 4C: Cloud → Cluster → Container → Code. Each outer layer defines the trust boundary for the inner one, and a breached outer layer devalues the protections inside it. We walk these layers top down.

Practical value of this chapter

Design in practice

Use this chapter's guidance on securing containers and the cluster across the 4C layers (images, admission control, runtime isolation, networking, and detection) to define architectural security requirements before implementation starts.

Decision quality

Validate solutions against a threat model, security invariants, and how operable the controls are in production, not against compliance checklists alone.

Interview articulation

Frame answers as threat -> control -> residual risk, linking the business scenario to concrete protection mechanisms.

Trade-off framing

Make the trade-offs of securing containers and the cluster explicit: UX friction, latency overhead, operational cost, and compliance constraints.

Context

Kubernetes Fundamentals

This chapter is about securing the cluster; how it works is covered in Kubernetes Fundamentals.

Threat model: what is actually attacked

Shared host kernel

Containers are processes with namespaces and cgroups sharing one host kernel. A kernel vulnerability or a misconfiguration enables a container escape and access to neighboring workloads.

Control: Least process privilege, seccomp/AppArmor, no privileged containers, and a sandbox (gVisor/Kata) for high-risk workloads.

Escalation out of the container

Running as root, holding extra Linux capabilities, or mounting host paths or the container runtime socket (for example, the Docker socket) lets an attacker escalate privileges up to the node.

Control: Non-root user, read-only root filesystem, dropped capabilities, and no host resource mounts.

Compromised image

An image carrying a vulnerability, a backdoor, or a swapped dependency reaches the cluster from an untrusted registry.

Control: Scanning, minimal bases, trusted registries — and signature checks at admission so an untrusted image never reaches a running pod.

Exposed API server

An internet-reachable API server, a control plane with weak RBAC, or anonymous access hands over full control of the cluster.

Control: Private endpoint, strong authentication, least-privilege RBAC, and an API audit log.

The four 4C layers

1. Cloud

Infrastructure and provider: network, IAM, node access, and access to the control plane itself. If this layer is compromised, controls above it no longer help.

2. Cluster

Kubernetes components: API server, etcd, RBAC, admission control, and network policies.

3. Container

Images, the pod runtime profile, capabilities, seccomp/AppArmor, and isolation from the host.

4. Code

The application inside the container: dependency vulnerabilities, secrets, input validation — where code-level threat modeling applies.

The outer layer is the trust boundary for the inner one: a compromised Cloud devalues Code-level protection.

Images: what we trust the cluster with

Control	Why	Failure action
Vulnerability scanning	Find known CVEs in image layers and dependencies, triaged by severity and exploitability.	Block publishing and deploying images with critical vulnerabilities.
Minimal / distroless bases	Fewer packages, shells, and tools mean a smaller attack surface and fewer CVEs to keep patched.	Disallow heavy base images for production workloads without justification.
Signing and provenance	Sigstore/cosign plus SLSA provenance give a verifiable chain of who built the image and from what.	Reject images without a valid signature and build attestation.
Ban the latest tag	A floating `latest` tag breaks reproducibility: the same manifest deploys different code over time.	Require immutable digest references (sha256) instead of floating tags.
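
The digest requirement in the last row can be checked mechanically. The following is a minimal sketch, assuming pod specs are available as plain dictionaries; the function names are illustrative and the registry hostname in the usage note is a placeholder.

```python
import re

# An image reference pinned by an immutable sha256 digest, for example
# "registry.example.com/team/app@sha256:<64 hex characters>".
DIGEST_RE = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image: str) -> bool:
    """Return True if the image reference ends in a sha256 digest."""
    return DIGEST_RE.match(image) is not None


def floating_images(pod_spec: dict) -> list[str]:
    """List images in a pod spec that are referenced by tag or by no tag at all."""
    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    return [c["image"] for c in containers if not is_digest_pinned(c["image"])]
```

For a spec containing `registry.example.com/team/app@sha256:...` and `nginx:latest`, `floating_images` returns only `nginx:latest`. A check like this belongs in CI and again at admission, since a manifest can change between the two.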

Build and supply chain

Generate an SBOM at build time — the image contents are recorded, so a new vulnerability maps to affected workloads in minutes.
Pull images only from trusted registries; mirror and re-scan external images rather than using them directly.
Attach a build attestation (provenance, scan results) and sign it — part of software supply chain security.
Verify signature and provenance at admission: an image without a valid signature from a trusted publisher never enters the cluster.
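
To illustrate why an SBOM shortens the path from a new vulnerability to the affected workloads, here is a minimal sketch. It assumes each SBOM has already been reduced to a flat list of (package, version) pairs per image; real SPDX or CycloneDX documents are richer, and the image and workload names are hypothetical.

```python
from collections import defaultdict


def build_package_index(sboms: dict[str, list[tuple[str, str]]]) -> dict[tuple[str, str], set[str]]:
    """Invert image -> packages into (package, version) -> images."""
    index: dict[tuple[str, str], set[str]] = defaultdict(set)
    for image, packages in sboms.items():
        for name, version in packages:
            index[(name, version)].add(image)
    return dict(index)


def affected_workloads(
    index: dict[tuple[str, str], set[str]],
    workloads: dict[str, str],  # workload name -> image it runs
    package: str,
    bad_versions: set[str],
) -> list[str]:
    """Return workloads whose image contains a vulnerable package version."""
    vulnerable_images: set[str] = set()
    for version in bad_versions:
        vulnerable_images |= index.get((package, version), set())
    return sorted(w for w, image in workloads.items() if image in vulnerable_images)
```

The index is built once per build or scan, so answering "which workloads ship this package version" becomes a dictionary lookup rather than a rescan of every image.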

Admission control

Mechanism	Role	Example
Validating webhook	Accepts or rejects an object without changing it — pure policy enforcement.	Reject a pod with `privileged: true`.
Mutating webhook	Augments or edits an object before it is persisted to etcd.	Inject a mesh sidecar, default `runAsNonRoot`.
Pod Security Admission	The built-in controller applying Pod Security Standards per namespace in enforce / audit / warn modes.	Profiles privileged → baseline → restricted by increasing strictness.
OPA Gatekeeper / Kyverno	Policy-as-code for arbitrary rules not covered by the built-in standards.	Allowed registries, required labels, resource limits, no hostPath.

Pod Security Admission covers the baseline minimum; Gatekeeper/Kyverno handle arbitrary organizational policy.
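
The validating webhook row can be made concrete with a sketch of the decision logic alone. It assumes the request body has already been parsed from an `admission.k8s.io/v1` AdmissionReview; the HTTPS server, TLS setup, and webhook registration are left out, ephemeral containers are not inspected, and the function names are illustrative.

```python
def privileged_containers(pod: dict) -> list[str]:
    """Names of containers and init containers that request privileged mode."""
    spec = pod.get("spec", {})
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    return [
        c.get("name", "<unnamed>")
        for c in containers
        if (c.get("securityContext") or {}).get("privileged") is True
    ]


def review(admission_review: dict) -> dict:
    """Build an AdmissionReview response that rejects privileged pods."""
    request = admission_review["request"]
    offenders = privileged_containers(request.get("object") or {})
    # The response must echo the request uid; "allowed" carries the verdict.
    response = {"uid": request["uid"], "allowed": not offenders}
    if offenders:
        response["status"] = {
            "code": 403,
            "message": "privileged containers are not allowed: " + ", ".join(offenders),
        }
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

Because the function only reads the object and returns a verdict, it stays on the validating side of the table. A mutating webhook would additionally return a patch.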

Pod runtime isolation

runAsNonRoot + runAsUser

The process is not root inside the container, so a compromise yields no superuser rights.

readOnlyRootFilesystem

Root filesystem is read-only; writes go only to explicit volumes, hindering persistence.

drop ALL capabilities

All Linux capabilities are removed; only the genuinely needed ones (usually none) are added back.

seccomp: RuntimeDefault

A syscall filter narrows the kernel attack surface reachable from the container.

allowPrivilegeEscalation: false

Blocks privilege growth via setuid binaries and similar mechanisms.

privileged: false (always)

A privileged container is effectively root on the node — almost always a mistake.
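
Taken together, the settings above form one container-level `securityContext`. The sketch below writes it as a Python dictionary and adds a small checker; the UID 10001 is an arbitrary non-zero value chosen for illustration, and the checker inspects only container-level fields, although some of them can also be set at pod level.

```python
HARDENED_SECURITY_CONTEXT = {
    "runAsNonRoot": True,
    "runAsUser": 10001,  # assumption: any non-zero UID the image supports
    "readOnlyRootFilesystem": True,
    "allowPrivilegeEscalation": False,
    "privileged": False,
    "capabilities": {"drop": ["ALL"]},
    "seccompProfile": {"type": "RuntimeDefault"},
}


def audit_security_context(ctx: dict) -> list[str]:
    """Return findings for a container securityContext; an empty list means hardened."""
    findings = []
    if ctx.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot is not true")
    if ctx.get("runAsUser") == 0:
        findings.append("runAsUser is 0 (root)")
    if ctx.get("readOnlyRootFilesystem") is not True:
        findings.append("readOnlyRootFilesystem is not true")
    if ctx.get("allowPrivilegeEscalation") is not False:
        findings.append("allowPrivilegeEscalation is not false")
    if ctx.get("privileged") is True:
        findings.append("privileged is true")
    if "ALL" not in (ctx.get("capabilities") or {}).get("drop", []):
        findings.append("capabilities do not drop ALL")
    if (ctx.get("seccompProfile") or {}).get("type") not in ("RuntimeDefault", "Localhost"):
        findings.append("seccompProfile is unset or Unconfined")
    return findings
```

An empty `securityContext` produces five findings, which shows how much of this profile depends on explicit configuration rather than defaults.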

Stronger isolation for untrusted code

User namespaces: root inside the container maps to an unprivileged host user, so an escape yields only that unprivileged user unless the attacker escalates further.
Sandboxes (gVisor, Kata Containers): syscall interception or a lightweight micro-VM add a second boundary for untrusted and multi-tenant workloads, at a runtime cost.
Separate nodes and namespaces by trust level: do not co-locate internet-facing front ends with privileged system workloads.
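
One way to apply these three measures consistently is to derive pod-level settings from a declared trust level. The following is a sketch under stated assumptions: the trust level names, the `gvisor` RuntimeClass name, and the `trust-level` node label are hypothetical and must match objects that exist in the cluster, and `hostUsers: false` works only where the Kubernetes version and the container runtime support user namespaces.

```python
import copy

# Hypothetical mapping from trust level to pod spec overrides.
ISOLATION_BY_TRUST = {
    "trusted": {},
    "semi-trusted": {"hostUsers": False},         # run in a user namespace
    "untrusted": {"runtimeClassName": "gvisor"},  # run in a sandboxed runtime
}


def apply_isolation(pod: dict, trust: str) -> dict:
    """Return a copy of the pod with isolation settings for its trust level."""
    hardened = copy.deepcopy(pod)
    spec = hardened.setdefault("spec", {})
    spec.update(ISOLATION_BY_TRUST[trust])
    # Schedule onto nodes labeled for the same trust level.
    spec.setdefault("nodeSelector", {})["trust-level"] = trust
    return hardened
```

In practice this logic would more likely live in a mutating webhook or a policy engine than in application code; the point is that the trust level, not the individual team, selects the isolation mechanism.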

Network, access, and secrets

NetworkPolicy with default-deny: pod-to-pod traffic is allowed explicitly, not open by default.
Least-privilege RBAC: no cluster-admin for workloads, a dedicated ServiceAccount per service.
Keep secrets out of environment variables and out of the image: use an external secret manager or KMS, mount secrets as volumes, and enable encryption at rest for etcd.
mTLS between services via a service mesh: encryption and mutual authentication by default, in the spirit of zero trust.
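
The default-deny item can be sketched as two generated manifests: one policy that selects every pod in a namespace and allows nothing, and one that explicitly opens a single path. The namespace, policy names, and labels are hypothetical, and the policies take effect only if the cluster's network plugin enforces NetworkPolicy.

```python
import json


def default_deny(namespace: str) -> dict:
    """NetworkPolicy that selects all pods and allows no ingress or egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        # An empty podSelector matches every pod in the namespace; listing both
        # policy types with no rules denies traffic in both directions.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_ingress(namespace: str, name: str, to_labels: dict, from_labels: dict, port: int) -> dict:
    """NetworkPolicy that allows TCP ingress on one port between two sets of pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": to_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": from_labels}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML manifests.
    print(json.dumps(default_deny("payments"), indent=2))
```

Denying egress by default also blocks DNS lookups, so a real rollout usually pairs the deny policy with an explicit rule allowing traffic to the cluster DNS service.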

Runtime detection

Runtime detection (Falco and eBPF sensors): anomalous syscalls, a shell spawned in a container, unexpected network connections, and writes to sensitive paths.
API server audit log: who changed what and when in the cluster — the basis for incident analysis and spotting RBAC abuse.
Reconciling the actual pod state against the admitted policy: drift detection catches manual privileged edits that bypass the pipeline.
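
The drift check in the last item amounts to comparing security-relevant fields between the admitted spec (for example, the manifest in version control) and the live object. A minimal sketch, assuming both are available as dictionaries shaped like a pod or pod template; the list of compared fields is illustrative, not exhaustive.

```python
SECURITY_FIELDS = (
    "privileged",
    "runAsNonRoot",
    "readOnlyRootFilesystem",
    "allowPrivilegeEscalation",
)


def container_contexts(pod: dict) -> dict[str, dict]:
    """Map container name to its securityContext (empty dict if unset)."""
    containers = pod.get("spec", {}).get("containers", [])
    return {c["name"]: c.get("securityContext") or {} for c in containers}


def detect_drift(admitted: dict, actual: dict) -> list[str]:
    """Describe security-relevant differences between admitted and live specs."""
    drift = []
    want, have = container_contexts(admitted), container_contexts(actual)
    for name, ctx in have.items():
        if name not in want:
            drift.append(f"{name}: container not present in admitted spec")
            continue
        for field in SECURITY_FIELDS:
            if ctx.get(field) != want[name].get(field):
                drift.append(
                    f"{name}: {field} changed from {want[name].get(field)!r} to {ctx.get(field)!r}"
                )
    return drift
```

Any non-empty result is a candidate alert: either the pipeline was bypassed or the admitted spec is stale, and both are worth investigating.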

Trade-offs and common mistakes

Running workloads as privileged or root "just to make it work" — equivalent to root on the node.

Keeping everything in the default namespace with no network policies and no Pod Security Admission: every pod can reach every other, and there is no boundary to separate trust levels along.

Trusting images without scanning, signing, and provenance checks at admission — the first compromised image reaches a running pod unnoticed.

Putting secrets in environment variables and the image — they leak into logs, process dumps, and image layers.

Leaving the API server internet-reachable with broad or anonymous RBAC.

Not restricting egress traffic, which lets a compromised pod exfiltrate data and pull in further payloads.

Hard isolation costs convenience and sometimes performance (sandboxes) — a deliberate trade-off against the workload's trust level.

References

Source map: NSA/CISA provides the hardening checklist; Kubernetes docs provide the current security-controls model, Pod Security Standards, and Pod Security Admission. Version details change, so before applying this, check your Kubernetes version, enabled admission controllers, runtimeClass/seccomp profiles, and policy-engine capabilities — otherwise the checklist drifts from what the cluster actually enforces.

Related chapters

Kubernetes Fundamentals - explains how pods, the control plane, and the scheduler work — the model every control in this chapter relies on.
OWASP Top 10 in System Design - provides the architecture framework for application-level risks (the Code layer of 4C) living inside the container.
Supply Chain Security - goes deeper on provenance, SBOM, and artifact signing that the image admission checks depend on.
Secrets Management Patterns - covers storing and rotating secrets that must not be baked into images or pod environment variables.