The customer reports that the eks-operator is constantly sending DeleteCluster calls to the AWS API for clusters that have already been deleted from Rancher. Restarting Rancher does not help; the calls repeat every 150 seconds.
Business impact:
This is not causing workload outages, but it is a persistent annoyance for the customer.
Troubleshooting steps:
We held several calls to trace where the requests were coming from. We found references to the clusters in the ekscc (EKSClusterConfig) objects, but in some of the newer logs the original clusters of concern no longer appeared.
Actual behavior:
The cluster in question was removed from both Rancher and EKS. However, Rancher continues to send requests to delete it.
Expected behavior:
When a cluster is removed from Rancher, it should be deleted once, and no further DeleteCluster requests should be sent to AWS.
Files, logs, traces:
(See JIRA)
Additional notes:
Note that the customer has unusual permission restrictions on the AWS side, applied "for security reasons" that they could not explain further. This accounts for the errors seen in the AWS logs.
See SURE-8366 for the rest of the logs & the impacted cluster list
I have asked the customer to provide the output of the following commands:
kubectl get clusters.management.cattle.io -A
kubectl get clusters.provisioning.cattle.io -A
kubectl get eksclusterconfigs.eks.cattle.io -A
kubectl get clusters.management.cattle.io -A -o yaml
kubectl get clusters.provisioning.cattle.io -A -o yaml
kubectl get eksclusterconfigs.eks.cattle.io -A -o yaml
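A leftover EKSClusterConfig object stuck in deletion (i.e. with a deletion timestamp set but a finalizer still present) is a plausible cause of this kind of repeating delete loop. As a supplement to the commands above, a minimal sketch of how to surface that state at a glance; the column layout is illustrative, and no specific finalizer name is assumed:

```shell
# Show each EKSClusterConfig's deletion timestamp and finalizers.
# An object with a non-empty DELETION column that never disappears
# suggests a finalizer is blocking cleanup and the operator keeps retrying.
kubectl get eksclusterconfigs.eks.cattle.io -A \
  -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,DELETION:.metadata.deletionTimestamp,FINALIZERS:.metadata.finalizers'
```

If an orphaned object shows up here for a cluster that no longer exists in Rancher or EKS, its full YAML (from the commands above) should identify which controller owns the blocking finalizer.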