kubelet_pod_start_duration_seconds Metric Incorrectly Reports Double Pod Count #132268

@HirazawaUi

What happened?

The kubelet_pod_start_duration_seconds metric appears to double-count pod starts. In an otherwise empty cluster, a single running pod produced a histogram count of 2 (expected 1). After a second pod was created, the count rose to 4 (expected 2).

# Initial cluster state (1 pod):

kubectl get pods -A
NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE
kube-system   coredns-5b486bf7cd-xxwgx   1/1     Running   0          27s

# Check metric before new pod creation; output shows count 2 (expected 1):

➜  opt curl -k  https://localhost:10250/metrics | grep pod_start_duration_seconds
# HELP kubelet_pod_start_duration_seconds [ALPHA] Duration in seconds from kubelet seeing a pod for the first time to the pod starting to run
# TYPE kubelet_pod_start_duration_seconds histogram
kubelet_pod_start_duration_seconds_bucket{le="0.5"} 1
kubelet_pod_start_duration_seconds_bucket{le="1"} 2
kubelet_pod_start_duration_seconds_bucket{le="2"} 2
kubelet_pod_start_duration_seconds_bucket{le="3"} 2
kubelet_pod_start_duration_seconds_bucket{le="4"} 2
kubelet_pod_start_duration_seconds_bucket{le="5"} 2
kubelet_pod_start_duration_seconds_bucket{le="6"} 2
kubelet_pod_start_duration_seconds_bucket{le="8"} 2
kubelet_pod_start_duration_seconds_bucket{le="10"} 2
kubelet_pod_start_duration_seconds_bucket{le="20"} 2
kubelet_pod_start_duration_seconds_bucket{le="30"} 2
kubelet_pod_start_duration_seconds_bucket{le="45"} 2
kubelet_pod_start_duration_seconds_bucket{le="60"} 2
kubelet_pod_start_duration_seconds_bucket{le="120"} 2
kubelet_pod_start_duration_seconds_bucket{le="180"} 2
kubelet_pod_start_duration_seconds_bucket{le="240"} 2
kubelet_pod_start_duration_seconds_bucket{le="300"} 2
kubelet_pod_start_duration_seconds_bucket{le="360"} 2
kubelet_pod_start_duration_seconds_bucket{le="480"} 2
kubelet_pod_start_duration_seconds_bucket{le="600"} 2
kubelet_pod_start_duration_seconds_bucket{le="900"} 2
kubelet_pod_start_duration_seconds_bucket{le="1200"} 2
kubelet_pod_start_duration_seconds_bucket{le="1800"} 2
kubelet_pod_start_duration_seconds_bucket{le="2700"} 2
kubelet_pod_start_duration_seconds_bucket{le="3600"} 2
kubelet_pod_start_duration_seconds_bucket{le="+Inf"} 2
kubelet_pod_start_duration_seconds_sum 0.52958725
kubelet_pod_start_duration_seconds_count 2

# Create new pod:
➜ kubectl apply -f testpod.yml

# Verify new cluster state (2 pods):
➜  opt kubectl get pods -A
NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE
default       testpod                    1/1     Running   0          19s
kube-system   coredns-5b486bf7cd-xxwgx   1/1     Running   0          90s

# Check metric again, output shows count 4 (expected 2):

➜  opt curl -k  https://localhost:10250/metrics | grep pod_start_duration_seconds
# HELP kubelet_pod_start_duration_seconds [ALPHA] Duration in seconds from kubelet seeing a pod for the first time to the pod starting to run
# TYPE kubelet_pod_start_duration_seconds histogram
kubelet_pod_start_duration_seconds_bucket{le="0.5"} 2
kubelet_pod_start_duration_seconds_bucket{le="1"} 3
kubelet_pod_start_duration_seconds_bucket{le="2"} 3
kubelet_pod_start_duration_seconds_bucket{le="3"} 3
kubelet_pod_start_duration_seconds_bucket{le="4"} 4
kubelet_pod_start_duration_seconds_bucket{le="5"} 4
kubelet_pod_start_duration_seconds_bucket{le="6"} 4
kubelet_pod_start_duration_seconds_bucket{le="8"} 4
kubelet_pod_start_duration_seconds_bucket{le="10"} 4
kubelet_pod_start_duration_seconds_bucket{le="20"} 4
kubelet_pod_start_duration_seconds_bucket{le="30"} 4
kubelet_pod_start_duration_seconds_bucket{le="45"} 4
kubelet_pod_start_duration_seconds_bucket{le="60"} 4
kubelet_pod_start_duration_seconds_bucket{le="120"} 4
kubelet_pod_start_duration_seconds_bucket{le="180"} 4
kubelet_pod_start_duration_seconds_bucket{le="240"} 4
kubelet_pod_start_duration_seconds_bucket{le="300"} 4
kubelet_pod_start_duration_seconds_bucket{le="360"} 4
kubelet_pod_start_duration_seconds_bucket{le="480"} 4
kubelet_pod_start_duration_seconds_bucket{le="600"} 4
kubelet_pod_start_duration_seconds_bucket{le="900"} 4
kubelet_pod_start_duration_seconds_bucket{le="1200"} 4
kubelet_pod_start_duration_seconds_bucket{le="1800"} 4
kubelet_pod_start_duration_seconds_bucket{le="2700"} 4
kubelet_pod_start_duration_seconds_bucket{le="3600"} 4
kubelet_pod_start_duration_seconds_bucket{le="+Inf"} 4
kubelet_pod_start_duration_seconds_sum 4.409899585
kubelet_pod_start_duration_seconds_count 4
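
Histogram buckets in this output are cumulative, so it can help to convert them to per-bucket counts to see where each observation landed. The following is a minimal sketch, assuming `awk` is available. The function name `per_bucket` is hypothetical (it is not part of kubectl or the kubelet), and the sample input is copied from the first six buckets of the second scrape above.

```shell
#!/usr/bin/env bash
# per_bucket: hypothetical helper for illustration only.
# Reads Prometheus text-format lines on stdin and prints, for each
# kubelet_pod_start_duration_seconds bucket, how many observations fell into
# that bucket alone (its cumulative count minus the previous bucket's count).
# Buckets that received no observations are skipped.
per_bucket() {
  awk '
    /^kubelet_pod_start_duration_seconds_bucket/ {
      # $1 looks like: kubelet_pod_start_duration_seconds_bucket{le="0.5"}
      split($1, parts, "\"")    # parts[2] holds the le boundary
      delta = $2 - prev
      if (delta > 0) print "le=" parts[2], delta
      prev = $2
    }'
}

# Sample input: the first buckets from the second scrape in this report.
per_bucket <<'EOF'
kubelet_pod_start_duration_seconds_bucket{le="0.5"} 2
kubelet_pod_start_duration_seconds_bucket{le="1"} 3
kubelet_pod_start_duration_seconds_bucket{le="2"} 3
kubelet_pod_start_duration_seconds_bucket{le="3"} 3
kubelet_pod_start_duration_seconds_bucket{le="4"} 4
kubelet_pod_start_duration_seconds_bucket{le="5"} 4
EOF
```

For the sample this prints `le=0.5 2`, `le=1 1`, and `le=4 1`. The first scrape has one observation in each of the `le="0.5"` and `le="1"` buckets, so creating testpod appears to have added one observation at or below 0.5s and one between 3s and 4s. That is only a reading of the bucket data, not a diagnosis of the cause.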

What did you expect to happen?

The metric kubelet_pod_start_duration_seconds_count should reflect the actual number of pods created.

How can we reproduce it (as minimally and precisely as possible)?

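Create a Kubernetes cluster, deploy any pod, and check kubelet_pod_start_duration_seconds on the kubelet metrics endpoint (for example, with the curl command shown above). The reported count is twice the number of pods started.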
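
The check can be scripted roughly as follows. This is a sketch under several assumptions: a single-node cluster, a kubelet reachable at https://localhost:10250 without extra credentials (as in the transcript above), no kubelet restarts, and no pods deleted since startup. The helper name `start_count` and the `RUN_CHECK` environment variable are hypothetical.

```shell
#!/usr/bin/env bash
# start_count: hypothetical helper for illustration only.
# Prints the value of the histogram's _count sample read from stdin.
start_count() {
  awk '$1 == "kubelet_pod_start_duration_seconds_count" { print $2 }'
}

# Guarded so that sourcing this file does not contact a cluster.
# Set RUN_CHECK=1 (a name chosen for this sketch) to run the comparison.
if [ "${RUN_CHECK:-0}" = "1" ]; then
  # Number of pods across all namespaces, without the header row.
  pods="$(kubectl get pods -A --no-headers | wc -l | tr -d ' ')"
  # -s suppresses curl's progress meter; -k skips TLS verification.
  count="$(curl -sk https://localhost:10250/metrics | start_count)"
  echo "pods=${pods} metric_count=${count}"
fi
```

With the behavior described in this report, the two numbers would be expected to differ by a factor of two (for example, `pods=2 metric_count=4`).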

Anything else we need to know?

No response

Kubernetes version

Master branch

Cloud provider

OS version

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

Metadata

Assignees

No one assigned

Labels

kind/bug: Categorizes issue or PR as related to a bug.
sig/node: Categorizes an issue or PR as relevant to SIG Node.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.
