[BUG] Map tasks running on SPOT instances - pods stuck in terminating state forever #2701
Thank you for opening your first issue here! 🛠
TL;DR I think the reproduction steps are explainable, but we may want to add a flag making it work as intended. It would help to have more information to debug this (Pod info on a SPOT instance).

OK, I've had some time to explore this in depth, a few things:
@hamersaw FYI, Vijay is my intern, and his project is to try to use map tasks as a replacement for some of our Spark tasks. I'll try to get you a little bit more information about the exact sequence of events so we can determine why the pods get stuck. As you described above, I would think that the issue is that the K8s array plugin controller refuses to process subtask pods that have been deleted externally.
Posting a link to the subtask finalizer (for myself so that I can find it again): https://github.com/flyteorg/flyteplugins/blob/master/go/tasks/plugins/array/k8s/subtask.go#L216
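For context on why the finalizer matters here: a pod that carries a finalizer stays in Terminating after a delete until some controller removes that finalizer. Below is a minimal client-go sketch, not the actual Flyte code, of what clearing such a finalizer looks like; the package and function names are made up for illustration.

```go
// Hypothetical sketch, not the Flyte subtask code: if the controller that owns a
// finalizer never reconciles a deleted pod (e.g. because delete events are ignored),
// the pod stays in Terminating forever. Removing the finalizer lets deletion finish.
package subtaskcleanup

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// removeFinalizer strips the named finalizer from the pod and updates it, which
// allows the API server to complete the pending deletion.
func removeFinalizer(ctx context.Context, cs kubernetes.Interface, pod *corev1.Pod, name string) error {
	kept := make([]string, 0, len(pod.Finalizers))
	for _, f := range pod.Finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	pod.Finalizers = kept
	_, err := cs.CoreV1().Pods(pod.Namespace).Update(ctx, pod, metav1.UpdateOptions{})
	return err
}
```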
@convexquad @vijaysaravana thanks for the clarification! I think this all sounds reasonable. I'll get started on adding a flag to FlytePropeller along the lines of what was described above.
I should note that this code is a near copy-paste of the plugin_manager in FlytePropeller. We have an open issue to refactor this. I'm noting it because it means that map task subtasks and regular K8s container and pod plugins execute almost identically, so we will have to update both locations to ensure that external deletes are recognized.
I checked the sequence of events in my K8s audit logs, and they do happen as we suspected.
@hamersaw does Propeller already have an informer that notices pod deletion events for task (and subtask) pods? I am curious what logic we can add to Propeller to make this work. I guess it would work something like this: when the informer sees a pod deletion event, check whether the pod belongs to a still-running (sub)task and, if so, treat the deletion as a failure so the subtask can be retried.
FYI, not trying to bribe you, but if we can successfully simplify some of our ETL pipelines with Flyte map tasks (in place of Spark tasks), we are planning to write a nice blog post about it. I think it would get a lot of attention, as many, many people find Spark difficult to use (of course, sometimes you absolutely need Spark, basically when you need shuffle->reduce operations, but we have many use cases that would be much simpler for users with map tasks).
This is great to hear, happy to validate that this is the actual problem.
Yes, in FlytePropeller we have an informer which enqueues the parent workflow when a Pod is updated. Currently, deletes are ignored, but this logic should change in our case. Then, when processing the node, we already have a check for deletion. Rather than ignoring a deleted pod there, we should check whether it was deleted externally and handle it as a failure. As I mentioned, the plugin_manager is basically copy-pasted into the subtask processing in map tasks, so we'll have to make sure this logic is consistent across both (or, better yet, remove the duplication).
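To make this concrete, here is a minimal client-go sketch of the kind of change being described; it is not FlytePropeller's actual code, and the label name and enqueue key format are assumptions for illustration. The idea is a pod informer that enqueues the owning workflow on deletes as well as updates, so an externally deleted subtask pod gets re-evaluated instead of being ignored.

```go
// Hypothetical sketch of a pod watcher that re-enqueues the parent workflow when a
// pod is updated OR deleted, instead of ignoring deletes.
package podwatch

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/util/workqueue"
)

// workflowLabel is an assumed label carrying the parent workflow name; the label
// FlytePropeller actually uses may differ.
const workflowLabel = "workflow-name"

// enqueue pushes the owning workflow onto the work queue for re-evaluation.
func enqueue(q workqueue.Interface, obj interface{}) {
	// Deletes can arrive as a tombstone when the informer missed the final state.
	if tombstone, ok := obj.(cache.DeletedFinalStateUnknown); ok {
		obj = tombstone.Obj
	}
	pod, ok := obj.(*corev1.Pod)
	if !ok {
		return
	}
	if wf := pod.Labels[workflowLabel]; wf != "" {
		q.Add(pod.Namespace + "/" + wf)
	}
}

// StartPodWatcher wires a shared informer so both updates and deletes re-enqueue the
// owning workflow.
func StartPodWatcher(cs kubernetes.Interface, q workqueue.Interface, stop <-chan struct{}) {
	factory := informers.NewSharedInformerFactory(cs, 30*time.Second)
	informer := factory.Core().V1().Pods().Informer()
	informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(_, newObj interface{}) { enqueue(q, newObj) },
		DeleteFunc: func(obj interface{}) { enqueue(q, obj) }, // previously this event was ignored
	})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
}
```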
@convexquad @vijaysaravana looks like this may be much simpler than anticipated. PRs above to support external deletion of map task subtasks. I'll work on getting these pushed through quickly.
@hamersaw thanks for the update, it is good to hear that this may be simple. Also, thanks for your explanation of how the Flyte informer works. One last request from me: it is super important that subtask retries work correctly, since the whole idea is to replace large map-only Spark tasks with Flyte map tasks that run in our mixed on-demand + spot GPU ASGs, so that subtasks get correctly retried when there is either node maintenance or spot pre-emption.
Great point. I will double check that everything is working correctly for this specific case, but we did change the map task implementation so that subtasks are retried individually with this PR. So everything should be working as you described.
@hamersaw we updated our Propeller and I do believe your fix has resolved the issue.
@convexquad great to hear! Thanks for being so thoroughly descriptive and responsive, it really helps resolve issues like this quickly!
Describe the bug
When map tasks are run on SPOT instances and the node dies, the pod stays in a "Terminating" state forever (it never actually finishes terminating and erroring out). This causes the subtask to think it is still "running" forever instead of retrying. Other types of Flyte tasks seem to handle this correctly.
Expected behavior
If a SPOT instance becomes unavailable, the subtask should terminate gracefully and be retried up to the specified number of retries.
Additional context to reproduce
Run a map task on SPOT instances; when a node is reclaimed, the subtask pods get stuck in the terminating state.
Screenshots
Pods list:
'Terminating' pod information: