Skip to content

Pod Certificates: Preliminary implementation of KEP-4317 #128010

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ahmedtd
Copy link
Contributor

@ahmedtd ahmedtd commented Oct 11, 2024

This PR implements KEP-4317 (Pod Certificates).

  • Define PodCertificateRequest and PodCertificate projected volume sources.
  • Enable support for these types on kube-apiserver (behind a feature gate).
  • Implement the projected volume support.
  • Implement a cleaner for PodCertificateRequests
  • Add an example signer implementation agnhost (to be used in e2e tests).
  • Run make update to do all the regeneration for the new types.

/kind feature
/kind api-change

Enable kube-apiserver support for PodCertificateRequest and PodCertificate projected volumes (behind the PodCertificateRequest feature gate).
- [KEP]: https://github.com/kubernetes/enhancements/tree/25777c1bba657eac76eb294c012acbbbd519299c/keps/sig-auth/4317-pod-certificates

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 11, 2024
@ahmedtd
Copy link
Contributor Author

ahmedtd commented Oct 11, 2024

/sig auth

@k8s-ci-robot k8s-ci-robot added the sig/auth Categorizes an issue or PR as relevant to SIG Auth. label Oct 11, 2024
@k8s-ci-robot k8s-ci-robot added area/apiserver area/code-generation sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 11, 2024
@ahmedtd
Copy link
Contributor Author

ahmedtd commented Oct 11, 2024

/assign liggit
/assign enj

@k8s-ci-robot
Copy link
Contributor

@ahmedtd: GitHub didn't allow me to assign the following users: liggit.

Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign liggit
/assign enj

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ahmedtd ahmedtd force-pushed the pod-certificates-types branch 2 times, most recently from 6027570 to bda6204 Compare October 12, 2024 00:25
@k8s-triage-robot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@aramase
Copy link
Member

aramase commented Oct 14, 2024

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 14, 2024
@ahmedtd ahmedtd force-pushed the pod-certificates-types branch from bda6204 to 6f72603 Compare November 1, 2024 19:33
@ahmedtd ahmedtd force-pushed the pod-certificates-types branch 2 times, most recently from 9e39f17 to 035990e Compare February 28, 2025 06:46
Comment on lines 313 to 315
// SignerName indicates the request signer.
SignerName string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's document the well-known signer.

Comment on lines +310 to +330
// PodName is the name of the pod into which the certificate will be mounted.
PodName string
// PodUID is the UID of the pod into which the certificate will be mounted.
PodUID types.UID

// ServiceAccountname is the name of the service account the pod is running as.
ServiceAccountName string
// ServiceAccountUID is the UID of the service account the pod is running as.
ServiceAccountUID types.UID

// NodeName is the name of the node the pod is assigned to.
NodeName types.NodeName
// NodeUID is the UID of the node the pod is assigned to.
NodeUID types.UID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indicate these are all required. +required (currently)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines +328 to +336
// If omitted, kube-apiserver will set it to 86400(24 hours). kube-apiserver
// will reject values shorter than 3600 (1 hour).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defaulting for all will be 24hr. Custom signer users can specify this value if they need a non-default

Copy link
Contributor Author

@ahmedtd ahmedtd Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Though I'm wondering if it might be cleaner to not do any explicit defaulting, but instead treat the field being absent as meaning 24 hours. That way people who just use a JSON / yaml parser won't be confused by missing the magical behavior of the Kubernetes parsing defaulter.

Since we check the issued certificate against maxExpirationSeconds during status update validation, we can guarantee that third-party signer implementations have to be aware of the "missing means 24 hours" behavior.

Comment on lines +331 to +340
// The PodCertificateMaxExpiration admission plugin, if enabled, will
// shorten the value to the maximum expiration configured for the requested
// signer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conditional defaulting has not aged well in the past. Let's not do mutating admission (or conditional mutating strategy).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I still see the original, did you push the changes?

Comment on lines +325 to +333
// maxExpirationSeconds is the maximum lifetime permitted for the
// certificate.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document maximum allowable range. the kubernetes ones will be 24hrs or less. not kube ones will be 91 days or less. shortest possible for all is 1hr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


var allErrors field.ErrorList
allErrors = append(allErrors, ValidatePodCertificateRequest(newReq, opts)...)
allErrors = append(allErrors, apivalidation.ValidateObjectMetaUpdate(&newReq.ObjectMeta, &oldReq.ObjectMeta, field.NewPath("metadata"))...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

func ValidatePodCertificateRequestUpdate(newReq, oldReq *certificates.PodCertificateRequest, optsIn ValidatePodCertificateRequestOptions) field.ErrorList {
opts := optsIn
if bytes.Equal(newReq.Spec.PKIXPublicKey, oldReq.Spec.PKIXPublicKey) && bytes.Equal(newReq.Spec.ProofOfPossession, oldReq.Spec.ProofOfPossession) && newReq.Status.CertificateChain == oldReq.Status.CertificateChain {
opts.SkipCryptoValidation = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SkipCryptoValidation is no longer necessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

var allErrors field.ErrorList
allErrors = append(allErrors, ValidatePodCertificateRequest(newReq, opts)...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't delegate because the update is immutable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

// Bail out if the certificate chain hasn't been set.
if req.Status.CertificateChain == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.status updates need to be a different function. .status updates must never fail on invalid .spec values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 672 to 676
leafCert, err := x509.ParseCertificate(leafBlock.Bytes)
if err != nil {
allErrors = append(allErrors, field.Invalid(certChainPath, req.Status.CertificateChain, "leaf certificate does not parse as valid X.509"))
return allErrors
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@enj can you be sure this does the validation we desire?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, for the entire status bits. Be sure you're happy. Discussion suggests it's not as obvious as I might have thought.

@tico88612
Copy link
Member

Hi @ahmedtd,
We'd like to remind you of the code freeze time. If there's anything we can do, please let us know.
The code freeze is starting 02:00 UTC Friday 21st March 2025 (about 3 weeks from now). Please make sure the PR has both lgtm and approved labels before the code freeze. Thanks!

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 8, 2025
@ahmedtd ahmedtd force-pushed the pod-certificates-types branch from 035990e to 694bcad Compare March 16, 2025 02:24
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 16, 2025
@ahmedtd ahmedtd force-pushed the pod-certificates-types branch from 694bcad to b889242 Compare March 16, 2025 03:48
@ahmedtd
Copy link
Contributor Author

ahmedtd commented Mar 16, 2025

/retest

@ahmedtd ahmedtd force-pushed the pod-certificates-types branch from b889242 to 085e872 Compare March 16, 2025 04:12
@tico88612
Copy link
Member

Hello, @ahmedtd @deads2k
Appreciate all of your efforts with this PR! Is the plan still to resolve it for v1.33 release?
If so, a gentle reminder that the code freeze has started 02:00 UTC Friday 21st March 2025 . Please make sure any PRs have both lgtm and approved labels ASAP, and file an Exception if you haven't done it yet.
Thanks!

@ahmedtd
Copy link
Contributor Author

ahmedtd commented Mar 17, 2025

@deads2k This is ready for another review pass. Apologies, it took a lot longer than I expected to convert the Kubelet code to be informer-based, but I think it has significantly improved it.

There's one change from the last time you reviewed that I should call out: I have locked down the allowable status conditions on PodCertificateRequest:

  1. There can be only one condition entry.
  2. It must be one of Issued, Denied, or Failed
  3. The condition entry is immutable once it is set (similar to CSR).

This simplified a lot of the validation logic, and seems OK because

  1. Only Kubelet is really consuming these conditions anyways.
  2. We can relax the restrictions (for example, by permitting signers to add other informative conditions) in the future without compatibility concerns.

All other changes are just implementing review feedback.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 18, 2025
Comment on lines +331 to +340
// The PodCertificateMaxExpiration admission plugin, if enabled, will
// shorten the value to the maximum expiration configured for the requested
// signer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I still see the original, did you push the changes?

//
// Signer implementations do not need to support all key types supported by
// kube-apiserver and kubelet. If a signer does not support the key type
// used for a given PodCertificateRequest, it should deny the request, with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still see "should", did you push the changes?

Comment on lines 82 to 94
if pcr.ObjectMeta.CreationTimestamp.Time.After(time.Now().Add(-30 * time.Minute)) {
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to have different retention for PCRs with at least one condition (=> addressed) vs PCRs with none (ignored for some reason, maybe faulty signer implementation)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's a persistent problem, then there is always going to be one you can query and look at with no conditions near the end of its life. If it isn't persistent, there's nothing to look at.

It could be interesting to have a metric count of PCRs deleted without any conditions on them. Maybe separated by the kube signer and everyone else.

}

func (m *IssuingManager) refreshOnePod(ctx context.Context, pod *corev1.Pod) error {
for _, v := range pod.Spec.Volumes {
Copy link
Member

@stlaz stlaz Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe you could map pod -> slice[credRecord] or pod -> podCertProjectedVolume -> credRecord so that we don't have to loop through the volumes every time?

Comment on lines 194 to 197
serviceAccount, err := m.kc.CoreV1().ServiceAccounts(pod.ObjectMeta.Namespace).Get(ctx, pod.Spec.ServiceAccountName, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("while fetching service account: %w", err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one live fetch should do for the whole pod

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect there is a reason we don't do cached SA lookups, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to be one fetch for the pod.

Ehhh, I am not sure of the importance of having a live lookup here. It's probably fine to work from a cache if I can figure out where Kubelet is caching service accounts.

Copy link
Contributor

@deads2k deads2k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so far. We start tomorrow with the kubelet

return allErrors
}

if len(req.Status.Conditions) > 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than this, make the three known conditions mutually exclusive. Keeping the "must be true" is fine. This will allow us to add other conditions without having a two release dance of loosening validation one release and adding the condition in a second release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think I've done this. The three known conditions are now mutually exclusive, but other conditions may be set.

clock.RealClock{},
)
klet.podCertificateManager = pcm
kubeInformers.Start(wait.NeverStop)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be able to use ctx.Done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"k8s.io/utils/clock"
)

// TODO(KEP-4317): Tests for IssuingManager
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, don't forget to add

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving a note for myself: This is probably easiest to test under the integration test framework, with a real kube-apiserver.

Namespace string
PodName string
PodUID types.UID
ServiceAccountName string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird to have the service account name as part of a pod identity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed PodIdentity --- it was vestigial after the IssuingManager refactor to be informer-driven

)

func (m *IssuingManager) Run(ctx context.Context) {
if !cache.WaitForCacheSync(ctx.Done(), m.pcrInformer.HasSynced) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait for the node lister to be synced

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 166 to 172
uniquePath := source.PodCertificate.CredentialBundlePath
if uniquePath == "" {
uniquePath = source.PodCertificate.KeyPath
}
if uniquePath == "" {
uniquePath = source.PodCertificate.CertificateChainPath
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is different from how uniquePath is calculated in TrackPod and ForgetPod

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've switched all this to be based on the volume name and the index of the projected volume source.

Comment on lines +253 to +226
pod.ObjectMeta.Namespace,
pod.ObjectMeta.Name, pod.ObjectMeta.UID,
pod.Spec.ServiceAccountName, serviceAccount.ObjectMeta.UID,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you could just pass both pod's and SA's objectMeta

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of like the current spelling.

volumeName: v.Name,
path: uniquePath,
}
delete(m.validCredentials, key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are invalidCredentials ever cleaned up?
Should we also remove pendingCredentials? I see they get removed in handlePCRDelete() so I guess it might be fine to defer deletion till then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's clean them all

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe all our map entries should be dropped once the pod or podcertificaterequest they correspond to has been dropped.

}
newPCR := new.(*certificatesv1alpha1.PodCertificateRequest)

klog.InfoS("Handling PodCertificateRequest update", "pcr", key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logs in this method seem a bit too verbose. Probably debug logs? Some of them might be useful at some debug level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to V(2)

reason := ""
message := ""
for _, cond := range newPCR.Status.Conditions {
// PCRs can have only one condition
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this to properly handle other conditions.

v.Name, uniquePath,
source.PodCertificate.SignerName, source.PodCertificate.KeyType,
)
if err != nil {
Copy link
Member

@stlaz stlaz Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we try to optimistically create all PCRs and only log errors? And eventually reattempt the PCR creates that failed?


// TrackPod visits all PodCertificate projected volume sources within the pod,
// creating an initial PodCertificateRequest for each one.
func (m *IssuingManager) TrackPod(ctx context.Context, pod *corev1.Pod) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be reading this wrong, but is this the case?

Conditions (either, or):

  1. SA is not yet present in the NS
  2. network blip during SA fetch/PCR create

What happens:

  1. pendingCredentials is never populated, PCR create is never reattempted
  2. GetPodCertificateCredentialBundle() will always fail in backoff when the kubelet tries to populate the volume with files

If that's the case, shouldn't we have a retry mechanism built into the manager?

// 2) Kubelet created this PCR, but then restarted.
//
// In both cases, just do nothing.
klog.InfoS("PodCertificateRequest update was for stray", "pcr", key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
klog.InfoS("PodCertificateRequest update was for stray", "pcr", key)
klog.InfoS("PodCertificateRequest isn't tracked for any known pods", "pcr", key)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


// On delete, clean up the pending credential record corresponding
// to this PodCertificateRequest, if any exists.
func (m *IssuingManager) handlePCRDelete(old any) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand it correctly, kubectl delete podcertificaterequest -A --all will make all pending pods requesting a cert never to come up. Is that ok?

Comment on lines 194 to 197
serviceAccount, err := m.kc.CoreV1().ServiceAccounts(pod.ObjectMeta.Namespace).Get(ctx, pod.Spec.ServiceAccountName, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("while fetching service account: %w", err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect there is a reason we don't do cached SA lookups, right?

}

// Add informer functions. We don't follow the typical controller pattern
// where we use a workqueue, because we don't have to do any RPCs or state
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this eliminates the de-duping capability of a workqueue.

Copy link
Contributor

@deads2k deads2k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments to be addressed before our next review session.

allPods := m.podManager.GetPods()
for _, pod := range allPods {
if err := m.refreshOnePod(ctx, pod); err != nil {
klog.ErrorS(err, "Error while refreshing pod", "namespace", pod.ObjectMeta.Namespace, "name", pod.ObjectMeta.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does kubelet not use utilruntime.HandleError

Comment on lines 166 to 172
uniquePath := source.PodCertificate.CredentialBundlePath
if uniquePath == "" {
uniquePath = source.PodCertificate.KeyPath
}
if uniquePath == "" {
uniquePath = source.PodCertificate.CertificateChainPath
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the index of the projected volume and the index projected volume source as the key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// refresh.
return
}
if m.clock.Now().After(credRecord.beginRefreshAt) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also be sure that beginRefreshAt is not zero value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API validation places strong constraints on BefinRefreshAt --- it must be between NotBefore and NotAfter.

What would a zero check guard against here?

continue
}

serviceAccount, err := m.kc.CoreV1().ServiceAccounts(pod.ObjectMeta.Namespace).Get(ctx, pod.Spec.ServiceAccountName, metav1.GetOptions{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must document why this is done. Add beta criteria to the KEP to consider the alternative of single-item list/watch as we do for secrets and configmaps. We have another one of these in projected service account tokens for image pulls.

Comment on lines +399 to +397
path := source.PodCertificate.CredentialBundlePath
if path == "" {
path = source.PodCertificate.KeyPath
}
if path == "" {
path = source.PodCertificate.CredentialBundlePath
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to use indexes.

Namespace: s.pod.ObjectMeta.Namespace,
PodName: s.pod.ObjectMeta.Name,
PodUID: s.pod.ObjectMeta.UID,
ServiceAccountName: s.pod.Spec.ServiceAccountName,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are two possible choices

  1. SA name is enough, but if this is true, then it doesn' tmatter because SA name cannot change
  2. SA name is not enough and we need a UID, and we must collect the correct UID.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocker for merge

There must be static pod integration tests. We want static pods with pod certificate projected volumes to

  1. kubelet rejects the pod and does X to report this to someone
  2. kubelet runs the pod and doesn't create the projected volume
  3. we do NOT want a state where the kubelet runs the pod and cannot write the pod to the API.

Information to make the choice, what do we do with static pods with service accounts.

volumeName: v.Name,
path: uniquePath,
}
delete(m.validCredentials, key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's clean them all

return fmt.Errorf("while PEM-encoding private key: %w", err)
}

req := &certificatesv1alpha1.PodCertificateRequest{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add owner reference to the pod so we clean up automatically. And test with the OwnerReferencesPermissionEnforcement admission plugin. You may have to add the deletion powers back, but that seems worth the tradeoff. This prevents build up of podcertificaterequests when pods are short lived.

@liggitt liggitt modified the milestones: v1.33, v1.34 Apr 9, 2025
@Rajalakshmi-Girish Rajalakshmi-Girish moved this from Tracked to Pending inclusion in [sig-release] Bug Triage May 21, 2025
ahmedtd added 2 commits June 14, 2025 04:37
* Define feature gate
* Define and serve PodCertificateRequest
* Implement Kubelet projected volume source
* kube-controller-manager GCs PodCertificateRequests
* Add agnhost subcommand that implements a toy signer for testing

Change-Id: Id7ed030d449806410a4fa28aab0f2ce4e01d3b10
Change-Id: I9edc0f3d7080330b164f6be78bb8f16350869ffa
@ahmedtd ahmedtd force-pushed the pod-certificates-types branch from 085e872 to 4a9326a Compare June 14, 2025 05:01
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 14, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ahmedtd
Once this PR has been reviewed and has the lgtm label, please assign liggitt for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Copy link
Contributor

@ahmedtd: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-apidiff-client-go 4a9326a link false /test pull-kubernetes-apidiff-client-go
pull-kubernetes-unit 4a9326a link true /test pull-kubernetes-unit
pull-kubernetes-linter-hints 4a9326a link false /test pull-kubernetes-linter-hints
pull-kubernetes-unit-windows-master 4a9326a link false /test pull-kubernetes-unit-windows-master
pull-kubernetes-integration 4a9326a link true /test pull-kubernetes-integration
pull-kubernetes-verify 4a9326a link true /test pull-kubernetes-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-review Categorizes an issue or PR as actively needing an API review. area/apiserver area/code-generation area/dependency Issues or PRs related to dependency changes area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/etcd Categorizes an issue or PR as relevant to SIG Etcd. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: Changes requested
Status: In Progress
Status: Changes Requested
Status: PRs Waiting on Author
Status: Pending inclusion
Development

Successfully merging this pull request may close these issues.

10 participants