
Option for acknowledging terminating Pods in Deployment rolling update #107920

Open
@atiratree

Description

What would you like to be added?

It could make sense to wait for a pod to be terminated before scheduling a new one, for the reasons mentioned below. Even though some of the issues can be partially mitigated by a proper setup of maxUnavailable and maxSurge, this is not applicable to all of them.

I would like to propose a new opt-in behaviour that would solve this. The Deployment controller would include Terminating pods in the computation of currently running replicas when deciding whether the new ReplicaSet should scale up (or the old one, in the case of proportional scaling).

This could be configured, for example, in .spec.strategy.rollingUpdate.scalingPolicy with the possible values (see the sketch after this list):

  1. IgnoreTerminatingPods - the default and current behaviour
  2. WaitForTerminatingPods
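
A minimal sketch of where the proposed field could live in a Deployment manifest. Note that scalingPolicy and its values exist only in this proposal and are not part of the current apps/v1 API; the rest of the manifest is an ordinary, illustrative Deployment.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
spec:
  replicas: 10
  selector:
    matchLabels:
      app: example
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
      # Proposed, hypothetical field: count Terminating pods when deciding
      # whether the new ReplicaSet may scale up. Not part of the current API.
      scalingPolicy: WaitForTerminatingPods
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: app
        image: registry.k8s.io/pause:3.9
```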

The disadvantage of this feature is a slower rollout in environments that are not resource constrained. So, using this feature would be advised only for use cases similar to the ones mentioned below.

Why is this needed?

In some cases people are surprised that their Deployment can momentarily have more pods during a rollout than described (replicas - maxUnavailable <= availableReplicas <= replicas + maxSurge). The culprits are Terminating pods, which can run in addition to the Running and Starting pods.
Even though Terminating pods are not considered part of a Deployment, this can cause problems with resource usage and scheduling, as in the worked example and the two scenarios below:
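
As an illustration with assumed numbers (not taken from any of the linked reports):

```yaml
# Illustrative rollout arithmetic (assumed values):
spec:
  replicas: 10
  strategy:
    rollingUpdate:
      maxSurge: 2        # at most replicas + maxSurge = 12 non-terminating pods
      maxUnavailable: 1  # at least replicas - maxUnavailable = 9 available pods
# If 3 old pods are still Terminating (e.g. a long terminationGracePeriodSeconds),
# the cluster may momentarily run 12 + 3 = 15 pods of this Deployment.
```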

  1. Unnecessary autoscaling of nodes in tight environments, driving up cloud costs. This can hurt especially if

    relevant issues:

  2. A problem also arises in contended environments where pods are fighting for resources. The exponential backoff for pods that have not yet started can grow to large values and unnecessarily delay their start until they pop from the scheduling queue again, even once there are computing resources to run them. This can slow down the rollout considerably.

    relevant issue: During a rolling update, replica start gets caught in exponential backoff, causing unnecessary delay of up to 16 minutes. #98656

    In this issue the resources were limited by a quota, but this can happen for other reasons as well. In our use case we noticed that this can also occur in high-availability scenarios where pods are expected to run only on certain nodes and pod anti-affinity forbids running two pods on the same node (see the sketch below).
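
A minimal sketch of that kind of constraint (names and labels are illustrative): with a required anti-affinity on kubernetes.io/hostname, a pod that is still Terminating can keep occupying its node slot, so the replacement pod cannot be scheduled onto that node until the old one is fully gone.

```yaml
# Illustrative pod template fragment: at most one pod of this app per node.
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: example
        topologyKey: kubernetes.io/hostname
```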

Metadata

Assignees

No one assigned

Labels

kind/feature: Categorizes issue or PR as related to a new feature.
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
sig/apps: Categorizes an issue or PR as relevant to SIG Apps.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.
