-
Notifications
You must be signed in to change notification settings - Fork 40.7k
chore(kubelet): add container name for sending Pod Hook fail event #130471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: googs1025 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/triage accepted |
// do not record the message in the event so that secrets won't leak from the server. | ||
m.recordContainerEvent(pod, container, kubeContainerID.ID, v1.EventTypeWarning, events.FailedPostStartHook, "PostStartHook failed") | ||
m.recordContainerEvent(pod, container, kubeContainerID.ID, v1.EventTypeWarning, events.FailedPostStartHook, eventMessage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: recordContainerEvent
can format messages:
m.recordContainerEvent(pod, container, kubeContainerID.ID, v1.EventTypeWarning, events.FailedPostStartHook, eventMessage) | |
m.recordContainerEvent(pod, container, kubeContainerID.ID, v1.EventTypeWarning, events.FailedPostStartHook, "PostStartHook failed for container %s", container.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// do not record the message in the event so that secrets won't leak from the server. | ||
m.recordContainerEvent(pod, containerSpec, containerID.ID, v1.EventTypeWarning, events.FailedPreStopHook, "PreStopHook failed") | ||
m.recordContainerEvent(pod, containerSpec, containerID.ID, v1.EventTypeWarning, events.FailedPreStopHook, eventMessage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same here.
0ed5048
to
ea7fc2d
Compare
@googs1025 can you look at other calls of recordContainerEvent and change them as well if they'd benefit of adding container name to the message. |
Thanks for the feedback. I took a quick look and other events seem to return detailed message info. I understand that the message should contain info that can determine which container it is. 🤔 |
This is similar of #129300, my position: The container information is already present in the Event through the involvedObject.fieldPath field, but kubectl doesn't display it by default. kubernetes/pkg/kubelet/container/ref.go Lines 33 to 45 in 8f330c6
If we want to show container names in events by default, maybe we should implement this in kubectl's output formatting rather than elsewhere. /cc @soltysh @ardaguclu Would like to get some input from kubectl maintainers. |
Why do we print the container name in other events? Normal Started 39m (x3 over 39m) kubelet Started container container-with-hooks1
Normal Created 39m (x3 over 39m) kubelet Created container: container-with-hooks1 We mainly have user(in our group) feedback that when postHook is executed, it is not sure which container failed to execute. |
I'm not sure if my understanding is correct. The sig-node maintainers might have more accurate and comprehensive insights. If you want to retrieve the involvedObject of an event, you can use |
/test pull-kubernetes-e2e-kind-ipv6 |
@SergeyKanzhelev @ffromani can you please take a look? |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
Can you clarify, is this a formatting issue or consistency with other events? @googs1025 please clarify the intent here. I do not know the history so if you have done digging already, it will help if you can summarize the state we are in. |
@SergeyKanzhelev thanks for this. This is not just a formatting issue, but rather a matter of completeness in event reporting. When multiple containers in a Pod define lifecycle hooks (e.g., postStart, preStop), and one of them fails during execution, the current events only show:
There's no indication of which container triggered the failure. This makes troubleshooting significantly harder. We've observed that other container-related events (like Started, Failed, Killing) do include the container name for clarity:
However, lifecycle hook failures are missing this context, leading to inconsistency across similar types of events. The purpose of this PR is to address that gap by including the container name in lifecycle hook failure events, improving both observability and debug. |
/remove-lifecycle stale it seems a nice QoL change to me |
To mitigate any potential errors caused by my earlier assumptions, I conducted some research: In PR #73892, we previously adjusted messages for some container events by adding container names to messages for events like Created, Started, and Stopping. However, these changes did not include container lifecycle hook events. In PR #86139, we refactored the behavior of lifecycle hooks. The current message format for container lifecycle hook events was established in this PR and has remained unchanged since. Before this PR, messages for container lifecycle hook events would log the hook failed meesage. After this PR, we only record messages like: "PostStartHook failed", etc. Furthermore, during PR #86139, we did not discuss whether to add container names to the messages for container lifecycle hook events, this aspect appears to have been overlooked. Additionally, prior to #73892, similar efforts had been made by others. However, @yujuhong rejected the proposal using reasoning similar to mine. At that time, Ref: #66855 (comment) Nevertheless, since #73892 has already been merged, I believe maintaining consistency in container event messages is what we should do. |
@HirazawaUi Thank you so much for taking the time to search for the context. 😄 |
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
We have multiple containers with hooks, one succeeds and the other fails, but we cannot tell from the events which container executed the hook error, which will affect the troubleshooting
We should at least record the container name. I see error info is not added because here https://github.com/kubernetes/kubernetes/pull/86139/files#r665010260
Which issue(s) this PR fixes:
Fixes # some thinking in: #130438
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: