Bug 1995924: Set `Upgradeable: false` when HA workloads are incorrectly spread #1330
@dgrisonnet: This pull request references Bugzilla bug 1995924, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Force-pushed from 2eafad1 to c00189a.
Force-pushed from fbc7374 to d2b5a64.
/test e2e-agnostic-upgrade
/test e2e-agnostic
/test e2e-agnostic-operator
/test e2e-agnostic
Indeed, the `prometheus-k8s` pods were all scheduled on the same node. From `oc get pods -A`:
Correlating from the kube-scheduler logs, it seems that when the prometheus operator provisions the prometheus-k8s statefulset,
Force-pushed from d2b5a64 to 0817169.
/skip
/retest
/test e2e-agnostic
/lgtm
Force-pushed from c0a06d5 to 94372a1.
Force-pushed from 3768c5d to d15cb46.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: dgrisonnet, jan--f, simonpasquier
@dgrisonnet: The following test failed, say
/retest-required
Unholding since openshift/release#21564 has been merged. /unhold
/retest-required Please review the full test history for this PR and help us cut down flakes.
@dgrisonnet: All pull requests linked via external trackers have merged. Bugzilla bug 1995924 has been moved to the MODIFIED state.
Bug 1995924: Revert "Merge pull request #1330 from dgrisonnet/ha-upgradeable"
Bug 1995924: Set `Upgradeable: false` when HA workloads are incorrectly spread
… spread (#1431)

* Merge pull request #1330 from dgrisonnet/ha-upgradeable

  Bug 1995924: Set `Upgradeable: false` when HA workloads are incorrectly spread

* pkg/rebalancer: sort resources for deletion

  The resources that are marked for deletion are sorted by their PVC creation timestamp, from the newest to the oldest to make the deletion consistent.

  Signed-off-by: Damien Grisonnet <[email protected]>

* pkg: improve rebalancer logging

  Signed-off-by: Damien Grisonnet <[email protected]>

* pkg/rebalancer: sort based on pod names

  Sort PVCs by their creation timestamps, from the newest to the oldest to make sure that the oldest PVC is retained in case all of them are annotated. If some PVCs have the same creation timestamp, they will be sorted based on their pod name.

  Signed-off-by: Damien Grisonnet <[email protected]>

* pkg/rebalancer: split annotation removal

  Add EnsurePVCsAreNotAnnoted function that makes sure that none of the PVCs of the given workload have the openshift.io/cluster-monitoring-drop-pvc annotation after the rebalancing is done. In case one of the PVC has the annotation, it will be removed to prevent deleting the PVC in a future cycle.

  Signed-off-by: Damien Grisonnet <[email protected]>

* pkg/rebalancer: sort resources to delete by age

  Signed-off-by: Damien Grisonnet <[email protected]>

* test/e2e: fix framework

  Signed-off-by: Damien Grisonnet <[email protected]>

Co-authored-by: OpenShift Merge Robot <[email protected]>
… spread (openshift#1431): same squashed commit message as above (cherry picked from commit 8500e0f)
We are now setting the upgradeable status based on whether the workloads
with PVCs are correctly spread across multiple nodes in HA topology.
This will prevent 4.8 clusters with soft-affinity on hostname from
encountering scheduling issues when updating to 4.9, where hard-affinity is
defined on hostname.
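To make the check concrete, here is a rough sketch of how a spread check could feed the `Upgradeable` condition. It is not the operator's implementation: `podPlacement`, `condition`, `badlySpread` and `upgradeable` are hypothetical names, the node names are made up, and "correctly spread" is assumed to mean that no two PVC-backed pods of the same workload share a node, which is what hard anti-affinity on hostname would require.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// podPlacement is a hypothetical, simplified view of a pod: the workload it
// belongs to, the node it runs on, and whether it mounts a PVC.
type podPlacement struct {
	Workload string
	Node     string
	HasPVC   bool
}

// condition is a hypothetical, trimmed-down status condition.
type condition struct {
	Type    string
	Status  string
	Message string
}

// badlySpread returns the sorted names of workloads whose PVC-backed pods
// share at least one node (assumed definition of "incorrectly spread").
func badlySpread(pods []podPlacement) []string {
	podCount := map[string]int{}
	nodes := map[string]map[string]struct{}{}
	for _, p := range pods {
		if !p.HasPVC {
			continue
		}
		podCount[p.Workload]++
		if nodes[p.Workload] == nil {
			nodes[p.Workload] = map[string]struct{}{}
		}
		nodes[p.Workload][p.Node] = struct{}{}
	}
	var bad []string
	for w, n := range podCount {
		if len(nodes[w]) < n {
			bad = append(bad, w)
		}
	}
	sort.Strings(bad)
	return bad
}

// upgradeable derives the Upgradeable condition from the pod placements.
func upgradeable(pods []podPlacement) condition {
	bad := badlySpread(pods)
	if len(bad) == 0 {
		return condition{Type: "Upgradeable", Status: "True"}
	}
	return condition{
		Type:    "Upgradeable",
		Status:  "False",
		Message: "HA workloads incorrectly spread: " + strings.Join(bad, ", "),
	}
}

func main() {
	pods := []podPlacement{
		{Workload: "prometheus-k8s", Node: "node-a", HasPVC: true},
		{Workload: "prometheus-k8s", Node: "node-a", HasPVC: true},
		{Workload: "other-workload", Node: "node-a", HasPVC: true},
		{Workload: "other-workload", Node: "node-b", HasPVC: true},
	}
	fmt.Printf("%+v\n", upgradeable(pods))
}
```

In this example both `prometheus-k8s` pods sit on `node-a`, so the condition comes back with status `False`, while the second workload alone would not block the upgrade.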
I completely decoupled setting the `Upgradeable` status from the usual status flow; it is now handled entirely at the end of the sync.