[BUG] flytepropeller fails trying to get pod resource using the kubeClient #4730
Comments
🤖 Hello @andresgomezfrr! I'm Dosu, a friendly bot here to assist you with your Flyte issues and queries. I'm here to help you while we wait for a human maintainer to join us. Don't worry, together we'll figure this out! Thank you for reporting this issue. It seems like you're experiencing a problem similar to a previously resolved issue in the Flyte repository: [BUG] Pods stuck on Terminating with finalizer. This issue was resolved by a pull request: Abort subworkflow on subnode failure. However, since you're using a newer version of the propeller (
These are potential causes and further investigation would be needed to confirm the exact cause. As for the error message you're seeing in the logs, it's a generic error message that's logged when the propeller fails to get the pod resource using the kubeClient. The actual error details are provided by the "err" variable in the Go code, which is not visible in the log message you provided. To help us investigate this issue further, could you please provide more details about your environment? Specifically, it would be helpful to know:
This information will help us understand the context in which the issue is occurring and potentially identify the cause. Thank you for your cooperation.
Describe the bug
Yesterday we upgraded the propeller to version 1.10.6, and today we noticed that a lot of pods in our cluster are in the Terminating state. After some checks, we found that the log emitted at the following location had started to appear many times: flyte/flytepropeller/pkg/controller/nodes/task/k8s/plugin_manager.go, line 269 in 38d1833.
We deleted all the pods manually and downgraded the version. The logs disappeared, and pods are now deleted again without issues.
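To find the affected pods before deleting them by hand, something like the following minimal sketch may help. It assumes the stuck pods are being held by a finalizer, which this report does not confirm, and it works on the JSON produced by `kubectl get pods -A -o json`. The function name `stuck_terminating_pods` and the finalizer string in the sample data are hypothetical; only the `metadata.deletionTimestamp` and `metadata.finalizers` fields come from the Kubernetes object model.

```python
from typing import Any


def stuck_terminating_pods(pod_list: dict[str, Any]) -> list[tuple[str, str, list[str]]]:
    """Return (namespace, name, finalizers) for pods that are marked for
    deletion but still carry at least one finalizer."""
    stuck = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        finalizers = meta.get("finalizers") or []
        # deletionTimestamp is set once deletion has been requested; the
        # object remains until every finalizer has been removed.
        if meta.get("deletionTimestamp") and finalizers:
            stuck.append((meta.get("namespace", ""), meta.get("name", ""), list(finalizers)))
    return stuck


# Hypothetical sample shaped like `kubectl get pods -A -o json` output.
sample = {
    "items": [
        {"metadata": {"namespace": "dev", "name": "pod-a",
                      "deletionTimestamp": "2024-01-16T10:00:00Z",
                      "finalizers": ["example.com/hypothetical-finalizer"]}},
        {"metadata": {"namespace": "dev", "name": "pod-b"}},
        {"metadata": {"namespace": "dev", "name": "pod-c",
                      "deletionTimestamp": "2024-01-16T10:05:00Z"}},
    ]
}

for namespace, name, finalizers in stuck_terminating_pods(sample):
    print(f"{namespace}/{name}: {', '.join(finalizers)}")
```

With real data, `sample` would be replaced by the parsed `kubectl` output, for example `json.load(sys.stdin)`. The sketch only lists candidates; it does not remove finalizers or delete anything.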
Expected behavior
The pods should be deleted properly and not stuck in the Terminating state.
Additional context to reproduce
Upgrade to the latest version of the propeller and execute pods.
Screenshots
No response
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?