
Kubernetes (Tag)
Active · Public

Details

Description

A tag for anything related to Kubernetes. For the discussion see T147187: Create a tag for #kubernetes.


Recent Activity

Wed, Jun 11

gerritbot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Change #1153978 merged by jenkins-bot:

[operations/deployment-charts@master] cfssl-issuer: Allow to provide a custom CA certificate store

https://gerrit.wikimedia.org/r/1153978

Wed, Jun 11, 12:54 PM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Change #1154266 merged by JMeybohm:

[operations/software/cfssl-issuer@main] Make simple-cfssl usable for local WMF PKI deployments

https://gerrit.wikimedia.org/r/1154266

Wed, Jun 11, 10:02 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T389080: Fix dependencies between admin_ng deployments.

Change #1155212 merged by jenkins-bot:

[operations/deployment-charts@master] Add a script to visualize the dependencies of admin_ng environments

https://gerrit.wikimedia.org/r/1155212

Wed, Jun 11, 9:51 AM · Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
akosiaris added a comment to T352956: Handling inbound IPIP traffic on low traffic LVS k8s based realservers.

I've gone ahead and switched all of aux-k8s to MTU 1460. This time around, I went for a more hands-off approach, namely:

Wed, Jun 11, 7:33 AM · Patch-For-Review, Prod-Kubernetes, Kubernetes, serviceops, Traffic
gerritbot added a comment to T352956: Handling inbound IPIP traffic on low traffic LVS k8s based realservers.

Change #1155543 merged by Alexandros Kosiaris:

[operations/puppet@production] aux-k8s: Switch MTU to 1460

https://gerrit.wikimedia.org/r/1155543

Wed, Jun 11, 7:07 AM · Patch-For-Review, Prod-Kubernetes, Kubernetes, serviceops, Traffic
gerritbot added a comment to T352956: Handling inbound IPIP traffic on low traffic LVS k8s based realservers.

Change #1155543 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] aux-k8s: Switch MTU to 1460

https://gerrit.wikimedia.org/r/1155543

Wed, Jun 11, 7:04 AM · Patch-For-Review, Prod-Kubernetes, Kubernetes, serviceops, Traffic

Tue, Jun 10

cmooney added a comment to T352956: Handling inbound IPIP traffic on low traffic LVS k8s based realservers.

@akosiaris a quick question about this:

meaning that ICMP traffic to e.g. coredns gets dropped

In terms of PMTUD, that means that if CoreDNS sends large UDP packets - which get dropped elsewhere - it won't get the ICMP "packet too big" messages back. But that is not really a worry: the CoreDNS pods have a lower MTU than pretty much everything on the network, so they are not going to send packets that are too large for anything else.

Agreed on this.

Are the physical K8s hosts blocked from sending ICMP? So for instance if a 1500-byte UDP packet was sent to a pod IP - and couldn't get there because we have reduced the MTU on the veth interface connecting the POD - can the host send an ICMP back to the client?

No, they are not blocked. Indeed, the host could send that ICMP back instead. I didn't see that happening in my tests; however, I also didn't specifically try this out. We can test that.

Tue, Jun 10, 4:32 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, serviceops, Traffic
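
The packet sizes discussed in this thread can be checked with a little arithmetic. The sketch below is illustrative and not taken from the task: it assumes a standard 1500-byte MTU on the rest of the path (the thread only states the 1460 value for the pod side) and minimal IPv4/IPv6 headers without options or extension headers. The function names are hypothetical.

```python
# Header sizes in bytes. IPv4 assumes no options; IPv6 assumes no
# extension headers.
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8

POD_MTU = 1460       # value mentioned in the thread
ETHERNET_MTU = 1500  # assumption: standard Ethernet MTU elsewhere on the path


def max_udp_payload(mtu: int, ipv6: bool = False) -> int:
    """Largest UDP payload that fits in a single unfragmented IP packet."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - UDP_HEADER


def fits(packet_size: int, mtu: int) -> bool:
    """True if an IP packet of packet_size bytes can cross a link with this MTU."""
    return packet_size <= mtu


print(max_udp_payload(POD_MTU))             # 1432
print(max_udp_payload(POD_MTU, ipv6=True))  # 1412
print(fits(1500, POD_MTU))                  # False: inbound 1500-byte packet is too big
print(fits(POD_MTU, ETHERNET_MTU))          # True: pod-sized packets fit a 1500-byte link
```

Under these assumptions, a pod limited to 1460 bytes never emits a packet that a 1500-byte link would reject, which matches the point made above that missing "packet too big" messages are only a concern for traffic heading towards the pod.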
cmooney added a comment to T352956: Handling inbound IPIP traffic on low traffic LVS k8s based realservers.

@akosiaris thanks for confirming. So overall my thinking is:

Tue, Jun 10, 4:30 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, serviceops, Traffic
akosiaris added a comment to T352956: Handling inbound IPIP traffic on low traffic LVS k8s based realservers.

@akosiaris a quick question about this:

meaning that ICMP traffic to e.g. coredns gets dropped

In terms of PMTUD, that means that if CoreDNS sends large UDP packets - which get dropped elsewhere - it won't get the ICMP "packet too big" messages back. But that is not really a worry: the CoreDNS pods have a lower MTU than pretty much everything on the network, so they are not going to send packets that are too large for anything else.

Tue, Jun 10, 3:43 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, serviceops, Traffic
cmooney added a comment to T352956: Handling inbound IPIP traffic on low traffic LVS k8s based realservers.

@akosiaris a quick question about this:

Tue, Jun 10, 3:36 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, serviceops, Traffic
JMeybohm closed T389080: Fix dependencies between admin_ng deployments, a subtask of T341984: Update Kubernetes clusters to 1.31, as Resolved.
Tue, Jun 10, 1:01 PM · Patch-For-Review, collaboration-services, Data-Platform-SRE, Kubernetes, Prod-Kubernetes, serviceops
JMeybohm closed T389080: Fix dependencies between admin_ng deployments as Resolved.

Dependencies are now fixed for wikikube clusters. Other cluster maintainers might want to check/update releases which are not part of what gets deployed to all clusters.
I've added the script to the root of the repo since we already have a bunch of scripts there.

Tue, Jun 10, 1:01 PM · Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
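
For readers unfamiliar with the idea, a dependency graph between releases can be ordered and rendered roughly as follows. This is a minimal sketch and not the script added to the repository: the `needs` edges below are hypothetical and only reuse a few chart names from this feed for illustration, and the helper names are assumptions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical "release -> releases it needs" mapping, for illustration only.
needs = {
    "namespaces": [],
    "calico": ["namespaces"],
    "coredns": ["calico"],
    "istio": ["namespaces"],
    "envoyfilters": ["istio"],
}


def deploy_order(needs):
    """Return releases ordered so that each one comes after everything it needs."""
    return list(TopologicalSorter(needs).static_order())


def to_dot(needs):
    """Render the graph as Graphviz DOT, one edge per release -> dependency."""
    lines = ["digraph needs {"]
    for release, deps in sorted(needs.items()):
        for dep in deps:
            lines.append(f'  "{release}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)


order = deploy_order(needs)
print(order)
print(to_dot(needs))
```

`TopologicalSorter` raises `CycleError` if the mapping contains a cycle, so the same sketch doubles as a basic sanity check on the declared dependencies.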
gerritbot added a comment to T389080: Fix dependencies between admin_ng deployments.

Change #1155212 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] Add a script to visualize the dependencies of admin_ng environments

https://gerrit.wikimedia.org/r/1155212

Tue, Jun 10, 12:51 PM · Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Change #1153975 merged by jenkins-bot:

[operations/deployment-charts@master] Use Wikimedia DNS IPs as mock

https://gerrit.wikimedia.org/r/1153975

Tue, Jun 10, 10:23 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
CodeReviewBot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Add gitlab-ci pipeline to build a cfssl container image

Tue, Jun 10, 9:35 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops

Mon, Jun 9

CodeReviewBot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Add sre/cfssl to the trusted runners with dockerfile support

Mon, Jun 9, 3:00 PM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops

Fri, Jun 6

gerritbot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Change #1154293 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] kind.sh can bootstrap a wikikube like cluster with kind

https://gerrit.wikimedia.org/r/1154293

Fri, Jun 6, 1:40 PM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T389080: Fix dependencies between admin_ng deployments.

Change #1153982 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Fix dependencies/needs of helmfiles

https://gerrit.wikimedia.org/r/1153982

Fri, Jun 6, 11:11 AM · Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T389080: Fix dependencies between admin_ng deployments.

Change #1153979 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: Split envoyfilters installation into a separate release

https://gerrit.wikimedia.org/r/1153979

Fri, Jun 6, 11:11 AM · Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Change #1154266 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/software/cfssl-issuer@main] Make simple-cfssl usable for local WMF PKI deployments

https://gerrit.wikimedia.org/r/1154266

Fri, Jun 6, 10:34 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
CodeReviewBot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

jayme opened https://gitlab.wikimedia.org/repos/sre/cfssl/-/merge_requests/2

Fri, Jun 6, 10:29 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
CodeReviewBot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

jayme opened https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/113

Fri, Jun 6, 10:28 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops

Thu, Jun 5

JMeybohm reopened T325385: Trusted gitlab runner containers need access to staging k8s cluster as "Open".

We dismantled the CI access to k8s staging in T288629: Automated validation of mediawiki-multiversion images, so access from GitLab runners to the ci namespace is no longer possible (the namespace does not exist anymore). If this is actually used/required, we need to find a different way. Otherwise we should remove kubestagemaster from the allowed_services

Thu, Jun 5, 11:04 AM · Kubernetes, serviceops, collaboration-services, GitLab
Clement_Goubert added a comment to T389080: Fix dependencies between admin_ng deployments.

I made a small script to visualize dependencies while fixing them, not sure where to put it so:

Thu, Jun 5, 10:40 AM · Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T389080: Fix dependencies between admin_ng deployments.

Change #1153982 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Fix dependencies/needs of helmfiles

https://gerrit.wikimedia.org/r/1153982

Thu, Jun 5, 10:36 AM · Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a project to T389080: Fix dependencies between admin_ng deployments: Patch-For-Review.
Thu, Jun 5, 10:36 AM · Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T389080: Fix dependencies between admin_ng deployments.

Change #1153979 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Split envoyfilters installation into a separate release

https://gerrit.wikimedia.org/r/1153979

Thu, Jun 5, 10:36 AM · Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Change #1153979 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] admin_ng: Split envoyfilters installation into a separate release

https://gerrit.wikimedia.org/r/1153979

Thu, Jun 5, 10:31 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Change #1153978 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] cfssl-issuer: Allow to provide a custom CA certificate store

https://gerrit.wikimedia.org/r/1153978

Thu, Jun 5, 10:31 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Change #1153977 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] coredns: Run coredns on an unprivileged port (5353) instead of 53

https://gerrit.wikimedia.org/r/1153977

Thu, Jun 5, 10:31 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Change #1153976 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] calico: Add support to manage CNI installation by daemonset

https://gerrit.wikimedia.org/r/1153976

Thu, Jun 5, 10:31 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a project to T396107: Make it possible to bootstrap a production linke k8s cluster locally: Patch-For-Review.
Thu, Jun 5, 10:31 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T396107: Make it possible to bootstrap a production linke k8s cluster locally.

Change #1153975 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] Use Wikimedia DNS IPs as mock

https://gerrit.wikimedia.org/r/1153975

Thu, Jun 5, 10:31 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
JMeybohm created T396107: Make it possible to bootstrap a production linke k8s cluster locally.
Thu, Jun 5, 10:30 AM · Patch-For-Review, Kubernetes, Prod-Kubernetes, serviceops
JMeybohm added a comment to T389080: Fix dependencies between admin_ng deployments.

I made a small script to visualize dependencies while fixing them, not sure where to put it so:

Thu, Jun 5, 10:11 AM · Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
JMeybohm claimed T389080: Fix dependencies between admin_ng deployments.
Thu, Jun 5, 9:16 AM · Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops

Wed, Jun 4

gerritbot added a comment to T341984: Update Kubernetes clusters to 1.31.

Change #1114000 merged by jenkins-bot:

[operations/cookbooks@master] k8s.pool-depool-node: Add support to downtime/remove downtime

https://gerrit.wikimedia.org/r/1114000

Wed, Jun 4, 7:44 AM · Patch-For-Review, collaboration-services, Data-Platform-SRE, Kubernetes, Prod-Kubernetes, serviceops

Tue, Jun 3

JMeybohm created T395870: Remove docker related parts from kubernetes puppet code.
Tue, Jun 3, 7:06 AM · Prod-Kubernetes, Kubernetes, serviceops

Mon, Jun 2

Maintenance_bot removed a project from T388390: Ensure the correct helm version is used for each cluster: Patch-For-Review.
Mon, Jun 2, 11:31 AM · collaboration-services, Data-Platform-SRE, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T388390: Ensure the correct helm version is used for each cluster.

Change #1127950 merged by jenkins-bot:

[operations/deployment-charts@master] aux-k8s-services/*: use the correct helm version in each cluster

https://gerrit.wikimedia.org/r/1127950

Mon, Jun 2, 10:35 AM · collaboration-services, Data-Platform-SRE, Kubernetes, Prod-Kubernetes, serviceops

Fri, May 30

elukey closed T369493: Migrate ml-staging/ml-serve clusters off of Pod Security Policies as Resolved.

Recycled all the pods in ml-serve-eqiad to be sure, no PSS violation registered. Migration completed!

Fri, May 30, 9:29 AM · Patch-For-Review, Machine-Learning-Team, Kubernetes
gerritbot added a comment to T369493: Migrate ml-staging/ml-serve clusters off of Pod Security Policies.

Change #1152194 merged by Elukey:

[operations/puppet@production] kubernetes: disable PSP for ml-serve-eqiad

https://gerrit.wikimedia.org/r/1152194

Fri, May 30, 9:10 AM · Patch-For-Review, Machine-Learning-Team, Kubernetes
Gehel moved T388388: Ensure all required kubectl versions are installed on deploy hosts from Backlog - project to Reported on the Data-Platform-SRE (2025.05.24 - 2025.06.13) board.
Fri, May 30, 7:55 AM · Data-Platform-SRE (2025.05.24 - 2025.06.13), collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
Gehel edited projects for T388388: Ensure all required kubectl versions are installed on deploy hosts, added: Data-Platform-SRE (2025.05.24 - 2025.06.13); removed Data-Platform-SRE.
Fri, May 30, 7:53 AM · Data-Platform-SRE (2025.05.24 - 2025.06.13), collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
gerritbot added a comment to T369493: Migrate ml-staging/ml-serve clusters off of Pod Security Policies.

Change #1152194 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] kubernetes: disable PSP for ml-serve-eqiad

https://gerrit.wikimedia.org/r/1152194

Fri, May 30, 7:23 AM · Patch-For-Review, Machine-Learning-Team, Kubernetes
gerritbot added a comment to T369493: Migrate ml-staging/ml-serve clusters off of Pod Security Policies.

Change #1152190 merged by Elukey:

[operations/deployment-charts@master] admin_ng: disable PSP and enable PSS for ml-serve-eqiad

https://gerrit.wikimedia.org/r/1152190

Fri, May 30, 7:01 AM · Patch-For-Review, Machine-Learning-Team, Kubernetes
gerritbot added a comment to T369493: Migrate ml-staging/ml-serve clusters off of Pod Security Policies.

Change #1152190 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/deployment-charts@master] admin_ng: disable PSP and enable PSS for ml-serve-eqiad

https://gerrit.wikimedia.org/r/1152190

Fri, May 30, 6:53 AM · Patch-For-Review, Machine-Learning-Team, Kubernetes

Thu, May 29

elukey added a comment to T369493: Migrate ml-staging/ml-serve clusters off of Pod Security Policies.

@klausman since it was very quiet for ML today, I took the opportunity to apply all the changes stated in T369493#10792884 (including recycling all the isvc pods in ml-serve-eqiad).

Thu, May 29, 3:29 PM · Patch-For-Review, Machine-Learning-Team, Kubernetes
gerritbot added a comment to T369493: Migrate ml-staging/ml-serve clusters off of Pod Security Policies.

Change #1151604 merged by Elukey:

[operations/deployment-charts@master] admin_ng: set secure-pod-defaults to "enabled" for knative clusters

https://gerrit.wikimedia.org/r/1151604

Thu, May 29, 12:35 PM · Patch-For-Review, Machine-Learning-Team, Kubernetes
gerritbot added a comment to T369493: Migrate ml-staging/ml-serve clusters off of Pod Security Policies.

Change #1151600 merged by Elukey:

[operations/deployment-charts@master] kserve-inference: set seccomp defaults in the chart

https://gerrit.wikimedia.org/r/1151600

Thu, May 29, 9:37 AM · Patch-For-Review, Machine-Learning-Team, Kubernetes