
[SURE-8366] Rancher keeps sending delete request for an already deleted EKS cluster #723

Open
kkaempf opened this issue Jul 31, 2024 · 3 comments


kkaempf commented Jul 31, 2024

SURE-8366

Issue description:

The customer reports that the eks-operator constantly sends DeleteCluster calls to the AWS API for clusters that have already been deleted (from Rancher). Restarting Rancher does not help; the calls recur every 150 seconds.

Business impact:

This isn't causing workload outages, but it is an ongoing annoyance for the customer.

Troubleshooting steps:

Over several calls we tried to trace where the requests were coming from. The ekscc (EKSClusterConfig) objects still referenced some clusters, but in some of the newer logs the original clusters of concern were no longer present.

Actual behavior:

The cluster in question was removed from both Rancher and EKS. However, Rancher continues to send requests to delete it.

Expected behavior:

When a cluster is removed from Rancher, it should be deleted once, and no further cluster-deletion requests should be sent to AWS afterwards.

Files, logs, traces:

(See JIRA)

Additional notes:

It's important to note that the customer has made some unusual permission changes on the AWS side "for security reasons", which they could not explain further. This accounts for the errors seen in the AWS logs.

See SURE-8366 for the rest of the logs & the impacted cluster list

@kkaempf kkaempf added kind/bug Something isn't working JIRA Must shout labels Jul 31, 2024
@kkaempf kkaempf added this to the v2.9-Next1 milestone Jul 31, 2024
@mjura mjura self-assigned this Aug 1, 2024

kkaempf commented Aug 6, 2024

Waiting for customer feedback.


mjura commented Aug 13, 2024

Still waiting for customer feedback. A solution was provided, so we could consider closing this issue.

@kkaempf kkaempf modified the milestones: v2.9-Next1, v2.9-Next2 Aug 13, 2024

mjura commented Sep 16, 2024

I have asked the customer to provide the output of the following commands:

kubectl get clusters.management.cattle.io -A
kubectl get clusters.provisioning.cattle.io -A
kubectl get eksclusterconfigs.eks.cattle.io -A
kubectl get clusters.management.cattle.io -A -o yaml
kubectl get clusters.provisioning.cattle.io -A -o yaml
kubectl get eksclusterconfigs.eks.cattle.io -A -o yaml

kubectl logs -n cattle-system eks-operator-ID
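One common cause of this symptom (not confirmed for this case) is an EKSClusterConfig object left with a deletionTimestamp set but finalizers still attached: the operator then re-runs its delete logic on every resync interval, which would match the 150-second cadence. As a minimal sketch, the YAML output gathered above could be scanned for such stuck objects; the helper name and the sample object below are hypothetical, shaped only loosely like the real resource.

```python
# Hypothetical helper: flag objects stuck in deletion, i.e. with a
# deletionTimestamp set but finalizers still present. Feed it the parsed
# YAML (as a dict) from, e.g.:
#   kubectl get eksclusterconfigs.eks.cattle.io -A -o yaml
def find_stuck_deletions(cluster_list):
    """Return (namespace, name, finalizers) for items stuck in deletion."""
    stuck = []
    for item in cluster_list.get("items", []):
        meta = item.get("metadata", {})
        if meta.get("deletionTimestamp") and meta.get("finalizers"):
            stuck.append(
                (meta.get("namespace"), meta.get("name"), meta.get("finalizers"))
            )
    return stuck


# Made-up sample: one object stuck in deletion, one healthy object.
sample = {
    "items": [
        {
            "metadata": {
                "namespace": "cattle-global-data",
                "name": "c-deleted-eks",
                "deletionTimestamp": "2024-07-30T10:00:00Z",
                "finalizers": ["example.cattle.io/eks-finalizer"],
            }
        },
        {
            "metadata": {
                "namespace": "cattle-global-data",
                "name": "c-healthy-eks",
            }
        },
    ]
}

print(find_stuck_deletions(sample))
# → [('cattle-global-data', 'c-deleted-eks', ['example.cattle.io/eks-finalizer'])]
```

If such objects turn up, comparing their finalizers against the operator logs would show whether the reconcile loop is what keeps re-issuing DeleteCluster.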
