The Kubernetes Pause Container
Engineers working with Kubernetes often focus on application containers and, in the words of Tom Jones, "it's not unusual to overlook a tiny pause container".
Okay, granted, Tom Jones never said that, but I could imagine him belting out this classic at a KubeCon karaoke party, singing the praises of the component that quietly holds everything together in Kubernetes: the Pause container.
If you've ever listed containers on a kubeadm-built Kubernetes node, you've likely seen several instances of an image called "pause" running with the command "/pause", seemingly doing nothing.
For example, on a containerd-based kubeadm installation, we can use the convenient nerdctl tool to take a peek at the running containers on the host system, using the command nerdctl -n k8s.io ps:
# nerdctl -n k8s.io ps | grep pause | sed -E 's/ +/ /g'
1be166813208 registry.k8s.io/pause:3.10 "/pause" 8 minutes ago Up k8s://kube-system/kube-apiserver-control-plane
33642cd24c77 registry.k8s.io/pause:3.10 "/pause" 8 minutes ago Up k8s://kube-system/kube-controller-manager-control-plane
3d0de24a5754 registry.k8s.io/pause:3.10 "/pause" 7 minutes ago Up k8s://kube-system/coredns-6f6b679f8f-p86pc
9d85fdeb1eea registry.k8s.io/pause:3.10 "/pause" 7 minutes ago Up k8s://kube-system/coredns-6f6b679f8f-lgr68
d0717a8a7dbb registry.k8s.io/pause:3.10 "/pause" 8 minutes ago Up k8s://kube-system/kube-scheduler-control-plane
d7624b3f2235 registry.k8s.io/pause:3.10 "/pause" 7 minutes ago Up k8s://kube-system/kube-proxy-fw2z6
fe514cb6fdbb registry.k8s.io/pause:3.10 "/pause" 8 minutes ago Up k8s://kube-system/etcd-control-plane
In this output we can see that there is a pause container associated with each Kubernetes Pod that runs the cluster's infrastructure. For example, we can see a pause container for the kube-scheduler Pod.
These Pause containers don't serve traffic or run any business logic, yet they are critical components for Kubernetes Pods to function.
Let's dive into why the Pause container is so important, what exactly it does for Pod namespaces, and how we can emulate a Kubernetes Pod with one of my favourite tools, Docker, using the same Pause container image that Kubernetes uses!
By the end, you'll have a clearer mental model of how Pods function under the hood and a better understanding of the mechanics and Linux components that make this possible.
What is the Pause Container?
Every Kubernetes Pod has an associated Pause container (sometimes referred to as the infra container or sandbox container). Kubernetes launches this container first when creating a Pod.
It's easy to overlook as it performs no complex tasks itself - it literally just runs an infinite sleep (via the pause binary) in the background.
Why Does the Pause Container Exist?
The Pause container's responsibility is to set up a shared environment for the other containers in the Pod, whether it's a Pod with a single container or multiple containers. Pause acts as the parent container for the Pod, meaning it creates and holds the Linux namespaces that the Pod uses.
What are Namespaces?
Namespaces are the fundamental Linux feature that isolates resources like networking, inter-process communication (IPC), and process IDs (PIDs) for containers. By having one container boot up first (the Pause container) and claim these namespaces, Kubernetes ensures that all other containers in the Pod can join those namespaces and effectively behave as though they're processes on the same host. The Pause container is important to how Pods function, even if it's doing nothing the majority of the time.
In a nutshell, the Pause container:
- Holds the network namespace for the Pod: It grabs the Pod's unique IP address and network stack, allowing other containers to share the same IP and ports.
- Holds the shared IPC namespace (and UTS namespace for hostname) for the Pod: This enables containers to communicate via inter-process mechanisms if needed (e.g. shared memory segments); it also ensures they see the same hostname.
- Optionally holds the PID namespace for the Pod: When a Pod has process namespace sharing enabled, the Pause container takes on the role of PID 1 (the init process) for the Pod, handling zombie processes and serving as the init for all containers within the Pod.
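To make the idea of "holding" a namespace a little more concrete, here is a small Python sketch (an illustration only, not something Kubernetes itself runs). It reads the namespace identifiers that the Linux kernel exposes under /proc/<pid>/ns; two processes share a namespace when those identifiers match. The helper names namespace_ids and shares_namespace are made up for this example, and it assumes a Linux host with /proc mounted.

```python
import os


def namespace_ids(pid="self", kinds=("net", "ipc", "uts", "pid")):
    """Return {kind: identifier} for the namespaces a process belongs to.

    Each identifier looks like "net:[4026531840]". A value of None means the
    link could not be read (not Linux, no such process, or no permission).
    """
    ids = {}
    for kind in kinds:
        try:
            ids[kind] = os.readlink(f"/proc/{pid}/ns/{kind}")
        except OSError:
            ids[kind] = None
    return ids


def shares_namespace(pid_a, pid_b, kind):
    """True when both processes are in the same namespace of the given kind."""
    a = namespace_ids(pid_a, (kind,))[kind]
    b = namespace_ids(pid_b, (kind,))[kind]
    return a is not None and a == b


if __name__ == "__main__":
    # Print the namespaces of the current process.
    for kind, ident in namespace_ids().items():
        print(kind, ident)
```

Run from two containers of the same Pod, the net and ipc identifiers should match, while the pid identifier would be expected to match only when process namespace sharing is enabled.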
The Pause Container's Minimal Implementation
It's worth emphasising that the Pause container itself does not run any application logic - its implementation is minimal (the source code is basically an infinite loop that sits and waits for signals). You can review the code for yourself. Even if you're not proficient in C, it's fairly easy to understand -
/*
Copyright 2016 The Kubernetes Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STRINGIFY(x) #x
#define VERSION_STRING(x) STRINGIFY(x)

#ifndef VERSION
#define VERSION HEAD
#endif

static void sigdown(int signo) {
  psignal(signo, "Shutting down, got signal");
  exit(0);
}

static void sigreap(int signo) {
  while (waitpid(-1, NULL, WNOHANG) > 0)
    ;
}

int main(int argc, char **argv) {
  int i;
  for (i = 1; i < argc; ++i) {
    if (!strcasecmp(argv[i], "-v")) {
      printf("pause.c %s\n", VERSION_STRING(VERSION));
      return 0;
    }
  }

  if (getpid() != 1)
    /* Not an error because pause sees use outside of infra containers. */
    fprintf(stderr, "Warning: pause should be the first process\n");

  if (sigaction(SIGINT, &(struct sigaction){.sa_handler = sigdown}, NULL) < 0)
    return 1;
  if (sigaction(SIGTERM, &(struct sigaction){.sa_handler = sigdown}, NULL) < 0)
    return 2;
  if (sigaction(SIGCHLD, &(struct sigaction){.sa_handler = sigreap,
                                             .sa_flags = SA_NOCLDSTOP},
                NULL) < 0)
    return 3;

  for (;;)
    pause();
  fprintf(stderr, "Error: infinite loop terminated\n");
  return 42;
}
This minimalism is by design: it reduces the chances of the Pause container crashing or exiting. In fact, if the Pause container were to exit, the Pod would be considered dead: Kubernetes would kill its containers and may recreate the entire Pod (subject to the Pod's restart policy). It is the glue holding the entire Pod together.
In day-to-day use you rarely interact with it directly, but it's quietly doing the groundwork so that the real containers (your application, sidecars, etc.) can seamlessly work together and live harmoniously in Kubernetes.
Shared Namespaces: How Pause Enables Pod Containers to Work as One
One of Kubernetes' core concepts is that containers in the same Pod share certain namespaces and therefore share resources like network and IPC. The Pause container is the mechanism Kubernetes uses to achieve this namespace sharing.
Let's break down the key namespaces and how the Pause container facilitates sharing:
Network Namespace Sharing
Network Namespace (Net NS): All containers in a Pod share a single network namespace - they have the same IP address and network interfaces. From the perspective of processes inside the Pod, they appear to be on the same host. This means containers can talk to each other via localhost and don't need to expose ports to the outside world for intra-pod communication.
The Pause container sets up this network namespace and holds it for the Pod's lifetime. Kubernetes (through the container runtime and CNI plugin) assigns the Pod's IP address to the network interface in the Pause container's namespace.
When other containers start, instead of receiving their own network stack, they join the Pause container's network namespace. If one container in the Pod is listening on port 3306, another container in the same Pod can reach it via 127.0.0.1:3306 or localhost:3306 because, to them, it's all the same network space. The pause container essentially acts as the Pod IP holder - it's the reason all containers in a Pod share one IP.
This is vital for Kubernetes Services and DNS to be able to treat the Pod as a single endpoint.
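As a rough illustration of what "the same network space" means, the Python sketch below starts a listener on 127.0.0.1 and connects to it from a second thread of control. The two threads merely stand in for two containers; in a real Pod they would be separate processes that joined the same network namespace. The function names and the message are invented for this example.

```python
import socket
import threading

MESSAGE = b"hello over loopback"


def serve_once(listener):
    """Accept one connection, send a greeting, and close it."""
    conn, _addr = listener.accept()
    with conn:
        conn.sendall(MESSAGE)


def loopback_roundtrip():
    """Listen on 127.0.0.1 and read the greeting as a second 'container'."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]

        server = threading.Thread(target=serve_once, args=(listener,), daemon=True)
        server.start()

        chunks = []
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break  # the server closed the connection
                chunks.append(chunk)
        server.join()
    return b"".join(chunks)


if __name__ == "__main__":
    print(loopback_roundtrip().decode())
```

The client needs nothing but a port number, which is exactly the situation containers in one Pod are in: no service discovery or published ports are required to reach a neighbour.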
Inter-Process Communication Namespace
IPC Namespace: The inter-process communication namespace isolates System V IPC objects (shared memory segments, semaphores, and message queues) and POSIX message queues. Kubernetes makes all containers in a Pod share the same IPC namespace by default (so they can communicate via these IPC mechanisms if needed). The Pause container creates this IPC namespace when it starts and other containers join it. In practice, many applications don't explicitly use IPC between containers, but some sidecar patterns or complex apps might. Even if you're not using it, this shared IPC namespace is there because of the Pause container.
Process ID Namespace
PID Namespace: By default, containers in a Pod have isolated process ID spaces (i.e. each container only sees its own processes). If we run a quick Pod with a single container, that container can only see its own processes -
# kubectl run -it --rm ubuntu --image=ubuntu -- bash
If you don't see a command prompt, try pressing enter.
root@ubuntu:/# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 19:27 pts/0    00:00:00 bash
root        12     1  0 19:27 pts/0    00:00:00 ps -ef
Kubernetes offers an option to share the process namespace (shareProcessNamespace: true) via the Pod spec, which can be useful for debugging or sidecar scenarios. If enabled, all containers in the Pod will see each other's processes. The Pause container will be PID 1 of the Pod.
As an example, we'll generate the YAML for an Ubuntu Pod, add the required shareProcessNamespace: true parameter, run the Pod, exec into it and then inspect the process table -
# kubectl run ubuntu --image=ubuntu --restart=Never --command --dry-run=client -o yaml -- sleep infinity > ubuntu.yaml
# sed -i '/containers:/i\  shareProcessNamespace: true' ubuntu.yaml
# cat ubuntu.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: ubuntu
  name: ubuntu
spec:
  shareProcessNamespace: true
  containers:
  - command:
    - sleep
    - infinity
    image: ubuntu
    name: ubuntu
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Never
status: {}
#
# kubectl apply -f ubuntu.yaml
pod/ubuntu created
# kubectl exec -it ubuntu -- bash
root@ubuntu:/#
root@ubuntu:/#
root@ubuntu:/# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
65535        1     0  0 19:34 ?        00:00:00 /pause
root         7     0  0 19:34 ?        00:00:00 sleep infinity
root        15     0  0 19:34 pts/0    00:00:00 bash
root        23    15  0 19:34 pts/0    00:00:00 ps -ef
With the shareProcessNamespace parameter set, we can see /pause as PID 1 in our Pod.
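Tools like ps get this view from /proc: every process visible in the current PID namespace appears there as a numeric directory. The following Python sketch lists them the same way. It is only an approximation of what ps does, the function name visible_processes is invented for this example, and it assumes a Linux system with /proc mounted.

```python
import os


def visible_processes(proc_root="/proc"):
    """Map PID -> command line for each process visible in this PID namespace."""
    processes = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return processes  # no /proc here (for example, not a Linux system)

    for entry in entries:
        if not entry.isdigit():
            continue  # only the numeric directories represent processes
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # the process exited, or we are not allowed to read it
        # Arguments are NUL-separated; kernel threads have an empty cmdline.
        args = [a.decode(errors="replace") for a in raw.split(b"\0") if a]
        processes[int(entry)] = " ".join(args)
    return processes


if __name__ == "__main__":
    for pid, cmd in sorted(visible_processes().items()):
        print(pid, cmd)
```

In a Pod with process namespace sharing enabled, the entry for PID 1 would be expected to read /pause; without it, PID 1 is the container's own first process.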
Why PID 1 Matters
The first process started in a new PID namespace becomes the init process (PID 1) for that namespace. In our Pod, the Pause container is launched first to set up the namespace, so it naturally becomes PID 1 inside that Pod's PID namespace. Acting as PID 1, the Pause container can reap zombie processes from application containers: when a process is orphaned, it is re-parented to PID 1, which is then responsible for cleaning it up once it terminates.
This is a subtle but important function - it prevents defunct processes from accumulating if your containers spawn child processes. In normal Docker containers that don't share a PID namespace, each container's PID 1 (often the application process or a tiny init system) handles its own zombies. In a shared PID setup, Pause takes on that responsibility for the Pod. Most Pods don't enable PID sharing unless needed, but it's useful to know that this functionality exists and that Pause is what enables it.
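The reaping itself is the waitpid(-1, ..., WNOHANG) loop in the sigreap handler of pause.c shown earlier. As a rough, runnable illustration of the same pattern, here is a Python sketch in which a parent process installs a SIGCHLD handler and collects a child that exits. The names reaped, reap_children and demo are invented for this example, and unlike Pause, this parent is not PID 1; it only reaps its own direct child.

```python
import os
import signal
import time

reaped = []  # PIDs collected by the handler


def reap_children(signum, frame):
    """Collect every child that has already exited, without blocking."""
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left at all
        if pid == 0:
            return  # children exist, but none has exited yet
        reaped.append(pid)


def demo(timeout=5.0):
    """Fork a short-lived child and report whether the handler reaped it."""
    previous = signal.signal(signal.SIGCHLD, reap_children)
    try:
        child = os.fork()
        if child == 0:
            os._exit(0)  # the child exits immediately
        deadline = time.monotonic() + timeout
        while child not in reaped and time.monotonic() < deadline:
            time.sleep(0.01)  # the handler runs when SIGCHLD arrives
        return child in reaped
    finally:
        signal.signal(signal.SIGCHLD, previous)


if __name__ == "__main__":
    print("child reaped:", demo())
```

Without the handler (and without any other wait call), the exited child would linger in the process table as a defunct entry until the parent itself exited.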
Hands-On: Emulating a Kubernetes Pod with Docker
Let's put theory into practice! We'll recreate the Pause container behavior using Docker to understand how it works behind the scenes.
Demonstration: Creating a Multi-Container Pod with MySQL and WordPress
In this example, we'll set up a WordPress site and a MySQL database as if they were in a single Pod. In Kubernetes, you might deploy WordPress and MySQL in a single Pod if, say, you had a sidecar container or tightly coupled app. Though typically, WordPress and MySQL would run in separate Pods for scalability - bear with the example as it is for learning purposes. We're going to include a Pause container to bind them together, just like Kubernetes does with a Pod.
Step 1: Setting Up the Pause Container
Run a Pause container to act as the Pod's infrastructure. We'll use the official k8s.gcr.io/pause image (k8s.gcr.io is the legacy registry name; the same image is published as registry.k8s.io/pause, as seen in the nerdctl output earlier). This container will create the shared namespaces. We also want to access WordPress from our host machine's browser later, so we'll publish port 8080 on the host to port 80 in the Pod's network namespace (note: we publish ports against the Pause container, not on WordPress, because the Pause container owns the network namespace).
Execute the following:
% docker run -d --name pod-pause --ipc=shareable -p 8080:80 k8s.gcr.io/pause:3.10
Unable to find image 'k8s.gcr.io/pause:3.10' locally
3.10: Pulling from pause
75e060e453aa: Pull complete
Digest: sha256:ee6521f290b2168b6e0935a181d4cff9be1ac3f505666ef0e3c98fae8199917a
Status: Downloaded newer image for k8s.gcr.io/pause:3.10
f830b6274ce4d3c9d3c3174848eeb30a62098db9dcd532c301497b9c3a05faf4
This starts a container named pod-pause that is running (and sleeping) in the background. It has now set up a network stack (with an IP, and with host port 8080 forwarded to port 80 inside that namespace, though nothing is listening there yet), an IPC namespace, and it's ready to be the "hub" for our Pod.
Step 2: Adding MySQL to the Pod
Run the MySQL container in the same namespaces as the Pause container. We use Docker flags to join the network (--net), IPC (--ipc), and PID (--pid) namespaces of the pod-pause container. By doing so, the MySQL container will not get its own separate networking or IPC; it will share those of pod-pause (and thus any other container joining pod-pause). We also provide the necessary environment variables for MySQL (the root password, and the name of the database to create for WordPress). For example:
% docker run -d --name pod-mysql \
    --net=container:pod-pause \
    --ipc=container:pod-pause \
    --pid=container:pod-pause \
    -e MYSQL_ROOT_PASSWORD=strongpassword \
    -e MYSQL_DATABASE=wordpress \
    mysql
Unable to find image 'mysql:latest' locally
latest: Pulling from library/mysql
1281dea9bbdc: Pull complete
a8868fca330e: Pull complete
12d652dc2508: Pull complete
b874233f830c: Pull complete
ca07bba4ff38: Pull complete
65a492f1b8dd: Pull complete
00735954ac4b: Pull complete
cffd736c905d: Pull complete
a71c45291463: Pull complete
147b5c0a118e: Pull complete
Digest: sha256:072f96c2f1ebb13f712fd88d0ef98f2ef9a52ad4163ae67b550ed6720b6d642e
Status: Downloaded newer image for mysql:latest
6a0fb154aca3def5aeafdec70b044074b60d2496b206e0f8ff7a4bbd6fbe3f5a
When this executes, Docker doesn't create a new network interface for MySQL. Instead, it places the MySQL process inside the existing network namespace of pod-pause. Now, both pod-pause and pod-mysql share the same network namespace.
MySQL's ports are essentially Pod ports now. Similarly, we joined the IPC namespace, so any shared memory or semaphores could be accessed by WordPress (if it needed to). We also joined the PID namespace: if we look inside the MySQL container and inspect /proc for process ID 1, we can confirm that this is the pause process (I used cat on /proc for this example as the mysql image is minimal and doesn't ship tools like ps) -
% docker exec pod-mysql cat /proc/1/cmdline
/pause
Step 3: Adding WordPress to the Pod
Run the WordPress container in the same namespaces. We'll create an appropriate docker run command for WordPress, joining the same pod-pause container's namespaces. We need to provide WordPress with the database connection info.
Normally, you'd point WORDPRESS_DB_HOST to a hostname like mysql or an IP, but since WordPress will share the network with MySQL, we can simply use 127.0.0.1 (localhost).
Both containers consider localhost to be the Pod's network. We also provide the DB name, user, and password so WordPress can connect to our MySQL database.
docker run -d --name pod-wordpress \
    --net=container:pod-pause \
    --ipc=container:pod-pause \
    --pid=container:pod-pause \
    -e WORDPRESS_DB_HOST=127.0.0.1:3306 \
    -e WORDPRESS_DB_USER=root \
    -e WORDPRESS_DB_PASSWORD=strongpassword \
    -e WORDPRESS_DB_NAME=wordpress \
    wordpress:latest
Now we have three running containers: pod-pause (pause), pod-mysql (MySQL), and pod-wordpress (WordPress). But unlike a normal Docker setup, all three are acting like a single logical host. They share the same network namespace, so WordPress and MySQL communicate over the loopback interface. Indeed, inside the WordPress container, the environment variable WORDPRESS_DB_HOST=127.0.0.1:3306 points to the MySQL container's service, and this works because of the shared namespace.
We didn't have to link containers or use Docker networking - Kubernetes' Pod model (emulated here by the pause container) makes inter-container communication trivial: we just use localhost!
Similarly, if WordPress and MySQL wanted to use a System V shared memory segment or semaphore, they could, thanks to the shared IPC namespace.
Step 4: Verifying the Pod Setup
As we have three different containers running, let's verify the setup. A nice item to check is the hostname. If we list our running containers, we can see the container ID for pod-pause; in my case, at the time of writing, this is f830b6274ce4 -
% docker ps | egrep 'pod-'
ae2a00c6a9ea   wordpress:latest        "docker-entrypoint.s…"   6 minutes ago    Up 6 minutes                                               pod-wordpress
6a0fb154aca3   mysql                   "docker-entrypoint.s…"   11 minutes ago   Up 11 minutes                                              pod-mysql
f830b6274ce4   k8s.gcr.io/pause:3.10   "/pause"                 13 minutes ago   Up 13 minutes   0.0.0.0:8080->80/tcp, [::]:8080->80/tcp   pod-pause
If we check the hostname from the WordPress container, we can confirm that, from the viewpoint of this container, it is running in the pause container's namespace, as the container's hostname matches our pod-pause container ID -
% docker exec -it pod-wordpress hostname
f830b6274ce4
Likewise, if we ran ps -ef from this container, we would see the processes from all three containers, with PID 1 being the pause process -
% docker exec -it pod-wordpress ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
65535        1     0  0 19:56 ?        00:00:00 /pause
999          7     0  0 19:58 ?        00:00:03 mysqld
root       304     0  0 20:03 ?        00:00:00 apache2 -DFOREGROUND
www-data   360   304  0 20:03 ?        00:00:00 apache2 -DFOREGROUND
www-data   361   304  0 20:03 ?        00:00:00 apache2 -DFOREGROUND
How This Relates to Kubernetes
This Docker demo underscores what Kubernetes does with the Pause container: the same thing we did manually with --net=container:... and other flags, but in an automated way. In Kubernetes, when a Pod is scheduled, the kubelet on the node effectively:
- Starts the Pause container (with an image like registry.k8s.io/pause:3.x) to create the namespaces (net, IPC, etc.).
- Starts each app container in the Pod, instructing the container runtime to put them in the same namespaces as the Pause container (under the hood, the container runtime, for example containerd, is told "join this sandbox/pause container's namespaces"). All volume mounts defined at the Pod level are also mounted in, so containers can share files if needed.
- Keeps the Pause container running if any other container crashes or is restarted, so the namespaces (and Pod IP) persist across restarts. Only when the Pod is deleted or evicted does the Pause container go away, releasing the IP and resources.
Kubernetes handles all of this for you; as a user you just create a Pod, and you may not even realise that a pause container was involved!
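To tie that ordering back to the Docker demo, here is a small Python sketch that generates the docker run commands for a "Pod": the pause container first, then each member joining its namespaces. It only builds and prints the commands and does not execute anything. It is an illustration of the manual steps from this article, not of the kubelet, which talks to the container runtime rather than shelling out to docker. The function names and the dictionary layout are assumptions made for this example; the flags are the ones used in the walkthrough.

```python
import shlex


def pause_command(pod, ports=(), image="registry.k8s.io/pause:3.10"):
    """docker run arguments for the container that owns the namespaces."""
    cmd = ["docker", "run", "-d", "--name", f"{pod}-pause", "--ipc=shareable"]
    for host_port, pod_port in ports:
        # Ports are published on the pause container: it owns the network namespace.
        cmd += ["-p", f"{host_port}:{pod_port}"]
    return cmd + [image]


def member_command(pod, name, image, env=None):
    """docker run arguments for a container that joins the pause container."""
    owner = f"container:{pod}-pause"
    cmd = ["docker", "run", "-d", "--name", f"{pod}-{name}",
           f"--net={owner}", f"--ipc={owner}", f"--pid={owner}"]
    for key, value in (env or {}).items():
        cmd += ["-e", f"{key}={value}"]
    return cmd + [image]


def pod_commands(pod, ports, containers):
    """Pause container first, then every member, in the order given."""
    cmds = [pause_command(pod, ports)]
    cmds += [member_command(pod, **container) for container in containers]
    return cmds


if __name__ == "__main__":
    members = [
        {"name": "mysql", "image": "mysql",
         "env": {"MYSQL_ROOT_PASSWORD": "strongpassword",
                 "MYSQL_DATABASE": "wordpress"}},
        {"name": "wordpress", "image": "wordpress:latest",
         "env": {"WORDPRESS_DB_HOST": "127.0.0.1:3306",
                 "WORDPRESS_DB_USER": "root",
                 "WORDPRESS_DB_PASSWORD": "strongpassword",
                 "WORDPRESS_DB_NAME": "wordpress"}},
    ]
    for cmd in pod_commands("pod", [(8080, 80)], members):
        print(shlex.join(cmd))
```

Keeping the pause command separate from the member commands mirrors the point made above: members can be stopped and started again without touching the container that holds the namespaces and published ports.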
Finally, to recap: when we started our pod-pause container, we published port 8080 on our localhost to port 80 in the container. This was a pre-emptive step so that we could access WordPress once it was running in our shared Pod infrastructure.
In fact, if we open http://localhost:8080 in a browser on our local system, we'll be greeted with the WordPress installation page!
Conclusion: The Unsung Hero of Kubernetes
In conclusion, the Pause container might be "boring" - it literally pauses - but it is the unsung hero ensuring your multi-container Pods work seamlessly.
Kubernetes takes care of managing Pause containers so you don't have to, but by peeking behind the curtain we gain insight into how Kubernetes achieves Pod sandboxing.
Next time you deploy an app and marvel at how two containers can magically find each other on localhost, give a gentle nod to the Pause container that makes all of this possible.
It's a prime example of Kubernetes' engineering elegance: simple, stable, and effective. Understanding these fundamentals will help you become a better Kubernetes troubleshooter and architect!