[Core Feature] Allow tasks/config to specify max queue/wait time #1149

Open
EngHabu opened this issue Jun 15, 2021 · 3 comments
Labels
backlogged For internal use. Reserved for contributor team workflow. enhancement New feature or request exo flytekit FlyteKit Python related issue propeller Issues related to flyte propeller

Comments

@EngHabu
Contributor

EngHabu commented Jun 15, 2021

Motivation: Why do you think this is important?
When the underlying execution engine (AWS Batch, K8s, Spark, Hive, AWS EMR, GCP BigQuery, etc.) has trouble scheduling Flyte workloads, a workload can get stuck. Flyte has a concept of timeout, but it only measures the overall execution timeout, so users cannot express how long they are willing to wait in a queue before a task starts executing.

Goal: What should the final outcome look like, ideally?
Expose an additional queue_timeout flag that can be set at a global scope through configs or at a task scope (ideally also at the project, domain, and workflow levels). When flytepropeller detects that a task has not started executing within that period, it should abort the task.
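To make the proposal concrete, here is a minimal sketch of the check described above, written in plain Python rather than against any real Flyte API. Everything in it is hypothetical: `TaskState`, `resolve_queue_timeout`, and `should_abort_for_queue_timeout` are illustrative names, and the assumption that a task-level setting overrides the global one is mine, not something the issue specifies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TaskState:
    """Hypothetical view of a task as the engine might track it."""
    queued_at: datetime
    started_at: Optional[datetime] = None  # None while still waiting in the queue


def resolve_queue_timeout(
    task_level: Optional[timedelta],
    global_level: Optional[timedelta],
) -> Optional[timedelta]:
    # Assumption: the most specific scope wins; fall back to the global config.
    return task_level if task_level is not None else global_level


def should_abort_for_queue_timeout(
    state: TaskState,
    now: datetime,
    queue_timeout: Optional[timedelta],
) -> bool:
    # No queue timeout configured: never abort for queueing.
    if queue_timeout is None:
        return False
    # The task already started executing, so the queue timeout no longer applies.
    if state.started_at is not None:
        return False
    return now - state.queued_at > queue_timeout


queued = datetime(2021, 6, 15, 12, 0)
limit = resolve_queue_timeout(task_level=None, global_level=timedelta(minutes=30))
stuck = TaskState(queued_at=queued)
print(should_abort_for_queue_timeout(stuck, queued + timedelta(minutes=45), limit))
```

The key point is that the clock starts when the task is queued and stops mattering once execution begins, which keeps it independent of the existing overall timeout.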

@EngHabu EngHabu added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Jun 15, 2021
@EngHabu EngHabu added this to the 0.16.0 milestone Jun 15, 2021
@EngHabu EngHabu modified the milestones: 0.16.0, 0.17.0 Aug 2, 2021
@EngHabu EngHabu added flytekit FlyteKit Python related issue propeller Issues related to flyte propeller labels Aug 25, 2021
@EngHabu EngHabu removed this from the 0.17.0 milestone Sep 2, 2021
@github-actions

Hello 👋, This issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 7 days. Thank you for your contribution and understanding! 🙏

@github-actions github-actions bot added the stale label Aug 26, 2023
@github-actions

github-actions bot commented Sep 3, 2023

Hello 👋, This issue has been inactive for over 9 months and hasn't received any updates since it was marked as stale. We'll be closing this issue for now, but if you believe this issue is still relevant, please feel free to reopen it. Thank you for your contribution and understanding! 🙏

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 3, 2023
@eapolinario eapolinario reopened this Nov 2, 2023
@hamersaw hamersaw added exo backlogged For internal use. Reserved for contributor team workflow. and removed stale labels Nov 9, 2023
@hamersaw hamersaw removed the untriaged This issues has not yet been looked at by the Maintainers label Dec 5, 2023
@flixr
Contributor

flixr commented Apr 11, 2024

While it's nice that we can now configure a pod-pending-timeout, we have pretty much the opposite problem.
We thought that the timeout refers only to the actual execution time (excluding pending time), as the docs suggest:
https://docs.flyte.org/en/latest/api/flytekit/generated/flytekit.TaskMetadata.html?highlight=timeout#flytekit.TaskMetadata
We can only set a meaningful timeout for the execution itself, and tasks are sometimes queued for quite a long time until a GPU node is available (which is what we want).
So IMHO we should either change the current timeout to mean only the execution timeout (as per the current docs) or add a new parameter that sets only the execution timeout.
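A small sketch of the two interpretations may help, assuming (as this comment implies) that the current timeout clock also counts time spent pending. It is plain Python, not Flyte code, and the function names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def exceeds_overall_timeout(queued_at: datetime, now: datetime, timeout: timedelta) -> bool:
    # Clock starts when the task is queued, so pending time counts against it.
    return now - queued_at > timeout


def exceeds_execution_timeout(
    started_at: Optional[datetime], now: datetime, timeout: timedelta
) -> bool:
    # Clock starts only once the task is actually running.
    if started_at is None:
        return False
    return now - started_at > timeout


# Example: the task waits 3 hours for a GPU node, then has run for 10 minutes.
queued_at = datetime(2024, 4, 11, 9, 0)
started_at = queued_at + timedelta(hours=3)
now = started_at + timedelta(minutes=10)
timeout = timedelta(hours=1)

print(exceeds_overall_timeout(queued_at, now, timeout))
print(exceeds_execution_timeout(started_at, now, timeout))
```

With a one-hour timeout, the first interpretation would fail the task after only 10 minutes of real work, while the second would let it keep running.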
