-
Notifications
You must be signed in to change notification settings - Fork 40.7k
[WIP][FG:InPlacePodVerticalScaling] Fix Static CPU management policy alongside InPlacePodVerticalScaling #129719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@Chunxia202410, @hshiina , @ffromani , @AnishShah , @tallclair , @SergeyKanzhelev , @vinaykul , moved old PR here due to company transfer, now i can continue working on this. Will prioritize updating the commit with the recent changes done in InPlacePodVerticalScaling and the proposals shared by Chunxia202410. |
6b6da95
to
cc91639
Compare
Thanks @Chunxia202410 for the contributions, merged your two PRs to check tests . I have some questions about the second PR regarding strategy , thought best to discuss here, will ask later this week, need to do some more tests. Thanks for the API PR seems is one of the options discussed. Will try to update tests tomorrow or by end of this week, to make sure test is covered and passed before the Beta in v1.33 to have a point of reference. We will need to raise the final solution or solutions in Sig-node , after v1.33 is finished and this PR is refactored with the new InPlacePodVerticalScaling KEP ongoing changes . Thanks again for your PRs |
b65f294
to
fc5ddd1
Compare
/retest |
0153b60
to
09b2d95
Compare
/retest |
df313e9
to
c4aaa63
Compare
Use new topology.Allocation struct (a CPU set plus alignment metadata) instead of CPU set, due to rebase. Remove duplicate unecessary SetDefaultCPUSet call as per review comment.
- Revert introduction of API env mustKeepCPUs - Replace mustKeepCPUs with local checkpoint "promised" - Introduce "promised" in CPUManagerCheckpointV3 format - Add logic, refactor with Beta candidate - Fix lint issues - Fail if mustKeepCPUs are not subset of resulted CPUs - Fail if reusableCPUsForResize, mustKeepCPUs are not a subset of aligned CPUs - Fail if mustKeepCPUs are not a subset of reusable CPUs - TODO improve align resize tests, go through testing, corner cases refactor using cpumanager_test.go - TODO improve CPUManagerCheckpointV3 tests - TODO address code review/feedback to try different approach to allocate stepwise instead of once off when resizing - TODO check init-containers - TODO check migration from v2 to v3 CPU Manager checkpoint - TODO check kubectl failure when prohibited can this be done earlier? - TODO update CPU Manager tests to use refactpred cpu_manager_test
@esotsal: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
I'm struggling to find the time to review this important PR. I'll keep pushing, but feel free to pull in more reviewers. |
/retest |
What type of PR is this?
/kind bug
What this PR does / why we need it:
It is needed to resize allocated CPU of a Guaranteed QoS Class with integer CPU requests Pod without restart with Static CPU management policy alongside InPlacePodVerticalScaling.
Current PR retains "promised" upon start of container CPUs
Which issue(s) this PR fixes:
It is mentioned the issue in https://kubernetes.io/blog/2023/05/12/in-place-pod-resize-alpha/#known-issues last bullet.
I wasn't able to find a K8s issue, posting this PR following "I found it, I fix it" ethos after discussion in slack.
Update: Attempt to fix #127262
Special notes for your reviewer:
[Update] Latest 14th April 2025 Demo screencast
Screencast.from.2025-04-14.10-02-45.webm
This PR replace #123319 , needed due to company transfer.
This PR includes also a merge from esotsal#9 , to test proposals and review in one common PR. Thanks @Chunxia202410 for the contributions.
Does this PR introduce a user-facing change?