1
0
mirror of https://github.com/kubernetes-sigs/descheduler.git synced 2026-01-28 06:29:29 +01:00

feat: Implement preferredDuringSchedulingIgnoredDuringExecution for RemovePodsViolatingNodeAffinity (#1210)

* feat: Implement preferredDuringSchedulingIgnoredDuringExecution for RemovePodsViolatingNodeAffinity

Now, the descheduler can detect and evict pods that are not optimally
allocated according to the "preferred..." node affinity. It only evicts
a pod if it can be scheduled on a node that scores higher in terms of
preferred node affinity than the current one.

This can be activated by enabling the RemovePodsViolatingNodeAffinity
plugin and passing "preferredDuringSchedulingIgnoredDuringExecution" in
the args.

For example, imagine we have a pod that prefers nodes with label "key1:
value1" with a weight of 10. If this pod is scheduled on a node that
doesn't have "key1: value1" as label but there's another node that has
this label and where this pod can potentially run, then the descheduler
will evict the pod.

Another effect of this commit is that the
RemovePodsViolatingNodeAffinity plugin will not remove pods that don't
fit in the current node but for other reasons than violating the node
affinity. Before that, enabling this plugin could cause evictions on
pods that were running on tainted nodes without the necessary
tolerations.

This commit also fixes the wording of some tests from
node_affinity_test.go and some parameters and expectations of these
tests, which were wrong.

* Optimization on RemovePodsViolatingNodeAffinity

Before checking if a pod can be evicted or if it can be scheduled
somewhere else, we first check if it has the corresponding nodeAffinity
field defined. Otherwise, the pod is automatically discarded as a
candidate.

Apart from that, the method that calculates the weight that a pod
gives to a node based on its preferred node affinity has been
renamed to better reflect what it does.
This commit is contained in:
Jordi Piqué Sellés
2023-08-04 11:08:21 +01:00
committed by GitHub
parent 1be0ab2bd1
commit 31704047c5
8 changed files with 516 additions and 74 deletions

View File

@@ -436,7 +436,10 @@ profiles:
This strategy makes sure all pods violating
[node affinity](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity)
are eventually removed from nodes. Node affinity rules allow a pod to specify
`requiredDuringSchedulingIgnoredDuringExecution` type, which tells the scheduler
`requiredDuringSchedulingIgnoredDuringExecution` and/or
`preferredDuringSchedulingIgnoredDuringExecution`.
The `requiredDuringSchedulingIgnoredDuringExecution` type tells the scheduler
to respect node affinity when scheduling the pod but kubelet to ignore
in case node changes over time and no longer respects the affinity.
When enabled, the strategy serves as a temporary implementation
@@ -449,6 +452,14 @@ of scheduling. Over time nodeA stops to satisfy the rule. When the strategy gets
executed and there is another node available that satisfies the node affinity rule,
podA gets evicted from nodeA.
The `preferredDuringSchedulingIgnoredDuringExecution` type tells the scheduler
to respect node affinity when scheduling if that's possible. If not, the pod
gets scheduled anyway. It may happen that, over time, the state of the cluster
changes and now the pod can be scheduled on a node that actually fits its
preferred node affinity. When enabled, the strategy serves as a temporary
implementation of `preferredDuringSchedulingPreferredDuringExecution`, so the
pod will be evicted if it can be scheduled on a "better" node.
**Parameters:**
|Name|Type|