Description
What happened?
Kubelet kills containers after cpuset disappears from their cgroup.
It happens shortly after kubeadm init.
For a short time the static pod containers (etcd, kube-apiserver, ...) are in the Running state.
Then, after 5-130 seconds, one of them becomes Exited. I have never had all of them running longer than that; it happens every time.
For some time cpuset is present here:
/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<UID>.slice/cgroup.controllers
like this:
cpuset cpu io memory hugetlb pids rdma misc
but at some moment it goes away, and the file content becomes:
cpu io memory hugetlb pids rdma misc
And this is the moment when kubelet decides to kill the container.
After that, cpuset randomly disappears and reappears, but the container has already been killed.
I even built a version of kubelet with additional logging to diagnose this:
https://github.com/ptrts/kubernetes/tree/ptrts
It prints the following when the thing happens:
Apr 29 17:08:42 node1 kubelet[29570]: I0429 17:08:42.458201 29570 cgroup_v2_manager_linux.go:60] "(c *cgroupV2impl) Validate" cgroupPath="/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod13347e9bae338e4d2d6fc42a8599239b.slice"
Apr 29 17:08:42 node1 kubelet[29570]: I0429 17:08:42.458208 29570 cgroup_v2_manager_linux.go:165] "ptrts: getSupportedUnifiedControllers" availableRootControllers={"cpu":{},"cpuset":{},"hugetlb":{},"io":{},"memory":{},"misc":{},"pids":{},"rdma":{}}
Apr 29 17:08:42 node1 kubelet[29570]: I0429 17:08:42.458218 29570 cgroup_v2_manager_linux.go:64] "(c *cgroupV2impl) Validate" neededControllers={"cpu":{},"cpuset":{},"hugetlb":{},"memory":{},"pids":{}}
Apr 29 17:08:42 node1 kubelet[29570]: I0429 17:08:42.458227 29570 cgroup_v2_manager_linux.go:173] "ptrts: readUnifiedControllers" path2="/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod13347e9bae338e4d2d6fc42a8599239b.slice/cgroup.controllers"
Apr 29 17:08:42 node1 kubelet[29570]: I0429 17:08:42.458278 29570 cgroup_v2_manager_linux.go:178] "ptrts: readUnifiedControllers" controllersFileContent="cpu io memory hugetlb pids rdma misc\n"
Apr 29 17:08:42 node1 kubelet[29570]: I0429 17:08:42.458286 29570 cgroup_v2_manager_linux.go:180] "ptrts: readUnifiedControllers" controllers=["cpu","io","memory","hugetlb","pids","rdma","misc"]
Apr 29 17:08:42 node1 kubelet[29570]: I0429 17:08:42.458292 29570 cgroup_v2_manager_linux.go:71] "(c *cgroupV2impl) Validate" enabledControllers={"cpu":{},"hugetlb":{},"io":{},"memory":{},"misc":{},"pids":{},"rdma":{}}
Apr 29 17:08:42 node1 kubelet[29570]: I0429 17:08:42.458299 29570 cgroup_v2_manager_linux.go:75] "(c *cgroupV2impl) Validate" difference={"cpuset":{}}
Apr 29 17:08:42 node1 kubelet[29570]: I0429 17:08:42.458307 29570 cgroup_v2_manager_linux.go:87] "(c *cgroupV2impl) Exists" err="cgroup [\"kubepods\" \"burstable\" \"pod13347e9bae338e4d2d6fc42a8599239b\"] has some missing controllers: cpuset"
Apr 29 17:08:42 node1 kubelet[29570]: I0429 17:08:42.458340 29570 kuberuntime_container.go:810] "Killing container with a grace period" pod="kube-system/etcd-node1" podUID="13347e9bae338e4d2d6fc42a8599239b" containerName="etcd" containerID="containerd://80d6835742e3f9d0edf0183711b02ced037da43100336bfce096201a785d3f33" gracePeriod=30
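The kill follows directly from the controller-set check visible in those logs. As a self-contained sketch (my own shell illustration, not the kubelet's Go code) of the difference it computes:

```shell
# My own shell illustration (not the kubelet's Go code) of the check visible
# in the logs above: Validate() computes neededControllers minus
# enabledControllers and fails the cgroup if the difference is non-empty.
needed="cpu cpuset hugetlb memory pids"
enabled="cpu io memory hugetlb pids rdma misc"   # cpuset gone, as in the logs
missing=""
for c in $needed; do
  case " $enabled " in
    *" $c "*) ;;                    # controller is enabled
    *) missing="$missing $c" ;;     # controller is missing
  esac
done
echo "missing controllers:$missing"   # prints: missing controllers: cpuset
```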
Additionally I used this watch:
watch -n0.2 'sudo find /sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/*/cgroup.controllers -exec echo {} \; -exec cat {} \;'
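The watch above only samples every 0.2 s. A finer-grained alternative (a hypothetical helper of my own, not part of the repro scripts; `%3N` needs GNU date) prints a timestamp only when the file content actually changes:

```shell
# Polling helper (my own sketch, not from the repro scripts): print a
# timestamp each time the contents of a file change, to pin down the exact
# moment cpuset vanishes. Arguments: file path and number of polls.
watch_controllers() {
  local file=$1 iterations=$2 prev="__unset__" cur i=0
  while [ "$i" -lt "$iterations" ]; do
    cur=$(cat "$file" 2>/dev/null)
    if [ "$cur" != "$prev" ]; then
      printf '%s %s\n' "$(date +%H:%M:%S.%3N)" "$cur"
      prev=$cur
    fi
    sleep 0.05
    i=$((i + 1))
  done
}
# Example (substitute the real pod slice):
# watch_controllers /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<UID>.slice/cgroup.controllers 2000
```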
Node Memory: 4096
Node CPUs: 2
Node OS: Ubuntu 22.04.5 LTS
kubeadm: kubeadm version: &version.Info{Major:"1", Minor:"32", GitVersion:"v1.32.4", GitCommit:"59526cd4867447956156ae3a602fcbac10a2c335", GitTreeState:"clean", BuildDate:"2025-04-22T16:02:27Z", GoVersion:"go1.23.6", Compiler:"gc", Platform:"linux/amd64"}
kubelet: 1.32.3
The node is created in VirtualBox, with Vagrant, like this:
https://gitlab.com/pavel-taruts/demos/k8s-experiments/-/blob/5cdcd59906d2364954c863bf6e6ab17bc2556286/Vagrantfile
The Vagrantfile has all the sh commands and files that I used to configure the node. There was no other configuration except replacing the kubelet to get the extended logs.
My PC OS is Windows 10.
What did you expect to happen?
The static pods created by kubeadm init should just run without being killed by kubelet.
How can we reproduce it (as minimally and precisely as possible)?
You can either use this Vagrant project
https://gitlab.com/pavel-taruts/demos/k8s-experiments/-/blob/5cdcd59906d2364954c863bf6e6ab17bc2556286
or you can configure Ubuntu jammy with the scripts and configuration files it has.
The scripts are in the Vagrantfile. The files are in ./root/
The Vagrantfile requires you to add vagrant_config.local.yml
with the paths of your ssh keys, like this:
ssh_private_key_path: 'C:\Users\<username>\.ssh\id_ed25519'
ssh_public_key_path: 'C:\Users\<username>\.ssh\id_ed25519.pub'
Then open three ssh sessions into node1.
In the first session run this:
watch sudo crictl ps -a
In the second one run this:
watch -n0.2 'sudo find /sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/*/cgroup.controllers -exec echo {} \; -exec cat {} \;'
In the third ssh terminal window run the script below. It is made as a series of stages. Before running a stage, it shows the stage's description and waits for a key press. Quickly run all stages except the last one, "Get logs". Run "Get logs" only when you see a container become Exited in the watch sudo crictl ps -a terminal window.
#!/bin/bash
read -p "Stop kubelet"
sudo systemctl stop kubelet
read -p "Remove all manifests"
sudo mkdir -p /root/manifests-backup
sudo bash -c 'mv -f /etc/kubernetes/manifests/* /root/manifests-backup/ 2>/dev/null'
read -p "Remove all containers and pods"
sudo crictl rm --all -f 2>/dev/null
sudo crictl rmp --all -f 2>/dev/null
read -p "Stop containerd"
sudo systemctl stop containerd
read -p "Clean containerd data"
sudo rm -rf /var/lib/containerd/*
sudo rm -rf /run/containerd/*
read -p "Clean logs"
sudo journalctl --vacuum-time=1d
sudo rm -rf /var/log/pods/* /var/log/containers/*
read -p "Start containerd"
sudo systemctl start containerd
echo "---- Pods (should be empty):"
sudo crictl pods
read -p "Start kubelet"
sudo systemctl start kubelet
read -p "Bring back etcd"
sudo cp /root/manifests-backup/etcd.yaml /etc/kubernetes/manifests/
read -p "Bring back kube-apiserver"
sudo cp /root/manifests-backup/kube-apiserver.yaml /etc/kubernetes/manifests/
read -p "Containers list snapshot"
echo ---- Containers list snapshot 1
sudo crictl ps -a
read -p "Get logs (PRESS RIGHT AFTER A CONTAINER GETS Exited IN THE watch)"
date +"%H:%M:%S"
echo ---- Containers list snapshot 2
sudo crictl ps -a
rm -rf ./-*.log
rm -rf ./*.log
echo ---- full kubelet logs
sudo journalctl -u kubelet -n 10000 --no-pager > kubelet.log 2>&1
read -p "Finish"
Then search for "Killing container" in kubelet.log
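To pull each kill event together with the Validate lines just before it, a tiny helper (my own, assuming GNU grep's -B context flag) can be run on the collected log:

```shell
# Print every "Killing container" event from a kubelet log together with the
# 8 preceding lines, which contain the Validate/Exists output in my build.
show_kills() {
  grep -n -B 8 "Killing container" "${1:-kubelet.log}"
}
# Usage: show_kills kubelet.log
```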
Anything else we need to know?
No response
Kubernetes version
$ kubectl version
Client Version: v1.32.4
Kustomize Version: v5.5.0
The connection to the server 192.168.56.11:6443 was refused - did you specify the right host or port?
Cloud provider
OS version
# On Linux:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
$ uname -a
Linux node1 5.15.0-135-generic #146-Ubuntu SMP Sat Feb 15 17:06:22 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)