Skip to content

Remove use of html/template in apiserver to avoid disabling dead code elimination #132177

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

pgimalac
Copy link
Contributor

@pgimalac pgimalac commented Jun 8, 2025

What type of PR is this?

/kind bug

What this PR does / why we need it:

Replace the use of html/template in apiserver with direct string manipulation.

It avoids disabling dead code elimination for binaries importing this code*, in my case this single change makes a binary go from 155MiB to 117MiB (-25%).

The template is quite trivial so I don't feel like it makes the code much more complex.

*Note that this single change doesn't enable dead code elimination in the kube-apiserver binary, a few other changes are needed (see #132177 (comment) for details), in particular fixing or getting rid of https://github.com/google/go-cmp.

Which issue(s) this PR is related to:

#130201: Note the same piece of code, but same issue.

Special notes for your reviewer:

Let me know if you want me to add tests around template rendering.

I believe the calls to Write and Fprintf can't fail, but I still checked the returned errors, let me know if I should remove that to simplify the code.

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 8, 2025
@k8s-ci-robot k8s-ci-robot requested a review from jpbetz June 8, 2025 00:43
@k8s-ci-robot k8s-ci-robot requested a review from sttts June 8, 2025 00:43
@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 8, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @pgimalac. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 8, 2025
@dims
Copy link
Member

dims commented Jun 8, 2025

/ok-to-test

@pgimalac i am curious, how did you find the issue and the fix? thanks!

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 8, 2025
@dims
Copy link
Member

dims commented Jun 8, 2025

/are code-organization

@pgimalac
Copy link
Contributor Author

pgimalac commented Jun 8, 2025

👋 @dims I think what made me aware of the issue in general is this talk https://www.youtube.com/watch?v=EkG177eRcco on the topic and this associated pull request in cobra spf13/cobra#1956.

Over the past few months I've been actively working on enabling the optimization in the binaries from https://github.com/DataDog/datadog-agent, by fixing upstream dependencies when possible, or just removing them / forking them otherwise.
This is the very last change, needed for the last binary.

In order to find for a given binary what is disabling the optimization, you can add -ldflags=-dumpdep to the go build args, so that it generates the symbol dependency graph, and pass it to https://github.com/aarzilli/whydeadcode to get a readable output.

For this specific change, here is the output of the tool:

whydeadcode output
text/template.(*state).evalField reachable from:
	 text/template.(*state).evalFieldChain
	 text/template.(*state).evalCommand
	 text/template.(*state).evalPipeline
	 text/template.(*state).walk
	 text/template.(*Template).execute
	 html/template.(*Template).Execute
	 k8s.io/apiserver/pkg/server/routes.DebugFlags.Index
	 k8s.io/apiserver/pkg/server/routes.DebugFlags.Index-fm
	 k8s.io/apiserver/pkg/server/routes.(*DebugSocket).InstallDebugFlag
	 k8s.io/apiserver/pkg/server.installAPI
	 k8s.io/apiserver/pkg/server.completedConfig.New
	 sigs.k8s.io/custom-metrics-apiserver/pkg/apiserver.CompletedConfig.New
	 github.com/DataDog/datadog-agent/cmd/cluster-agent/custommetrics.RunServer
	 github.com/DataDog/datadog-agent/cmd/cluster-agent/subcommands/start.start.func5
	 github.com/DataDog/datadog-agent/cmd/cluster-agent/subcommands/start.start
	 github.com/DataDog/datadog-agent/cmd/cluster-agent/subcommands/start.start·f
	 github.com/DataDog/datadog-agent/cmd/cluster-agent/subcommands/start.Commands.func1
	 github.com/DataDog/datadog-agent/cmd/cluster-agent/subcommands/start.Commands
	 github.com/DataDog/datadog-agent/cmd/cluster-agent/subcommands/start.Commands·f
	 main.main
	 runtime.main_main·f
	 runtime.main
	 runtime.mainPC
	 runtime.rt0_go
	 main
	 _

We can see that the problematic code is the use of template removed by this PR.
After patching, the output is empty, and I can confirm that the binary is significantly smaller.

@dims
Copy link
Member

dims commented Jun 8, 2025

@pgimalac thank you for the very detailed explanation :)

/priority important-soon
/assign @liggitt @deads2k

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 8, 2025
@aojea
Copy link
Member

aojea commented Jun 8, 2025

Relevant discussion golang/go#72895

Are we not using all those reflection method in other places?

@pgimalac can you add to the description what is the exact effective improvement on the apiserver binary?

From current kind job in this pr

registry.k8s.io/kube-apiserver-amd64 v1.34.0-alpha.0.960_310105bfd4e357 f837585c16f15 104MB

From another kind job in another pr

registry.k8s.io/kube-apiserver-amd64 v1.34.0-alpha.0.962_3fb0cea0496ec4 57d0fcdf48fb4 104MB

I can't see the difference

To clarify, the problem seems to be the reflection method, is just that library uses them , so removing html template is necessary condition but not enough, as other libraries or code maybe using them

@pgimalac
Copy link
Contributor Author

pgimalac commented Jun 8, 2025

I'm not familiar with the binaries generated in this repo so I can't say without looking into it further, but very likely this is not enough for your binaries, eg. I know go-cmp is used in this repo and it disables DCE (related issues: #104821, #130201, google/go-cmp#373).

I can at least say that one of the binaries I work on imports code from k8s.io/apiserver, and this use of template is the only thing disabling the optimization...

I built the binary in cmd/kube-apiserver (just did a go build, I don't know if you have a more complex build mechanism), and I can confirm you at least need to update cobra to 1.9.1. I'll try to check if more changes are needed and list them here.

@pgimalac
Copy link
Contributor Author

pgimalac commented Jun 8, 2025

Well it's not that bad, the changes needed to enable the optimization in kube-apiserver are:

go-cmp is probably the most annoying one, I tried fixing upstream but they weren't interested, so on my side I ended up replacing the dependency with a patched fork... (https://github.com/DataDog/go-cmp/tree/dce-patch/v0.7.0)

Building locally on Darwin arm64, the stripped binary kube-apiserver goes from 119MiB to 100MiB (-16%).

I'd be happy to open PRs for each change, but unless you're open to using a fork of go-cmp, it's not that easy to get rid of (so you wouldn't be able to enable the optimization).

@aojea
Copy link
Member

aojea commented Jun 9, 2025

make WHAT=cmd/kube-apiserver will generate the binary in the _output/ folder

I think is better to merge all changes together and some sort of regression test

@pgimalac
Copy link
Contributor Author

pgimalac commented Jun 9, 2025

@aojea as I mentioned in my previous message, unless you're willing to use a fork of https://github.com/google/go-cmp (or to drop it entirely), the optimization can't be enabled for the kube-apiserver binary...
So to be clear, do you still want me to do a PR with all the other changes ?

Also in general I would argue in favor of doing those changes in separate PRs to keep things clean, eg. updating dependency has value in itself, even without considering the optimization.

@aojea
Copy link
Member

aojea commented Jun 9, 2025

sorry, I was replying from mobile and didn't see the context,

@aojea as I mentioned in my previous message, unless you're willing to use a fork of https://github.com/google/go-cmp (or to drop it entirely), the optimization can't be enabled for the kube-apiserver binary... So to be clear, do you still want me to do a PR with all the other changes ?
Also in general I would argue in favor of doing those changes in separate PRs to keep things clean, eg. updating dependency has value in itself, even without considering the optimization.

forking go-cmp sounds problematic, updating deps that improve things sounds a good thing, although this final decision will be in the area-organization people, @liggitt and @dims best persons to judge

My personal take on this optimization, is that without having a detection mechanism that prevents you to get bitten again by the same problem, using modified code to overcome it seems that the risk you take can not be guaranteed

@pgimalac
Copy link
Contributor Author

pgimalac commented Jun 9, 2025

On my side in the datadog-agent repo we have CI to ensure that the various binaries we generate don't become bigger, so once the optimization is enabled it can't be accidentally disabled again (since that would add back dozens of MiB)

I'd love to be able to enable the optimization for your binaries, but it likely will have to wait until #104821 is fixed somehow, or until they accept a fix upstream (feel free to comment on the issue by the way !)

In any case I'd really like to be able to get this one change merged so that I can benefit from the optimization if you're fine with it

@liggitt
Copy link
Member

liggitt commented Jun 13, 2025

@liggitt i was planning on doing a verify script with the whydeadcode utility which is more useful than looking for the html/template so did not offer up that suggestion. If you think we should do both, i'll suggest something tomorrow.

I didn't know about that one... I just assumed the importer checker would be the easiest path. I'd probably wait to merge until we have a verify script ready (either using existing import tools or something else, and whether it's in this PR or a separate PR, but ready)

@dims
Copy link
Member

dims commented Jun 13, 2025

@liggitt @pgimalac let's do the following here. I will rework later in a follow up PR

diff --git a/cmd/kubeadm/app/cmd/util/join.go b/cmd/kubeadm/app/cmd/util/join.go
index cbf58238ab0c..feded1d06d75 100644
--- a/cmd/kubeadm/app/cmd/util/join.go
+++ b/cmd/kubeadm/app/cmd/util/join.go
@@ -19,7 +19,7 @@ package util
 import (
        "bytes"
        "crypto/x509"
-       "html/template"
+       "html/template" //nolint:depguard
        "strings"

        "k8s.io/client-go/tools/clientcmd"
diff --git a/hack/golangci.yaml b/hack/golangci.yaml
index 2306878a6bfe..94dcd95e711d 100644
--- a/hack/golangci.yaml
+++ b/hack/golangci.yaml
@@ -268,6 +268,8 @@ linters:
           deny:
             - pkg: "github.com/google/go-cmp/cmp"
               desc: "cmp is allowed only in test files"
+            - pkg: "html/template"
+              desc: "template is allowed only in test files as it disables dead code elimination"
     forbidigo:
       analyze-types: true
       forbid:

@k8s-ci-robot k8s-ci-robot added area/kubeadm sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. labels Jun 13, 2025
@pgimalac
Copy link
Contributor Author

@dims I pushed a commit with your suggestions 👍

@liggitt
Copy link
Member

liggitt commented Jun 13, 2025

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 13, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 1b03a93824b3404b96e857942c5bfa3c34943241

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 13, 2025
@dims
Copy link
Member

dims commented Jun 13, 2025

@pgimalac oops! forgot to mention there are 3 files in there where we have to add the deny snippet

❯ ls -1 hack/golang*
hack/golangci-hints.yaml
hack/golangci.yaml
hack/golangci.yaml.in

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 13, 2025
@k8s-ci-robot k8s-ci-robot requested a review from liggitt June 13, 2025 13:12
@dims
Copy link
Member

dims commented Jun 13, 2025

/lgtm
/approve

🤞🏾

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 13, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: f27d3417f6901334dab62239c7916c07df6abf3f

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 13, 2025
@k8s-ci-robot k8s-ci-robot requested a review from dims June 13, 2025 13:15
@pgimalac
Copy link
Contributor Author

@dims One nolint was missing in apiserver/pkg/server/routes/flags.go

@dims
Copy link
Member

dims commented Jun 13, 2025

@pgimalac if you want to squash the commits, that would be great as well.

@pgimalac pgimalac force-pushed the apiserver-avoid-template-for-dce branch from b469a08 to 40c7188 Compare June 13, 2025 13:29
@dims
Copy link
Member

dims commented Jun 13, 2025

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 13, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: bb27f89663a6f858735c232fb348d09b2b5941f8

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, liggitt, pgimalac

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 77bd3f8 into kubernetes:master Jun 13, 2025
13 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.34 milestone Jun 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants