[Core feature] Slurm agent #5634
Comments
Thank you for opening your first issue here! 🛠
To integrate Slurm with Flyte, you can create a Flyte agent that interacts with the Slurm scheduler using its REST API. This approach leverages Flyte's extensibility and allows for efficient task submission to HPC resources managed by Slurm.
This is an AI-generated response. Mention @RunLLM to ask follow-up questions.
@BerndDoser, would you be interested in collaborating on this? The guide on how to develop a new agent has a lot of details, including how to develop agents locally (so that the dev cycle is tight).
Hi @eapolinario, thank you for your interest in the feature request. I am generally interested in collaborating. For example, I can set up a Slurm container to test such a feature. I am currently on vacation but will be back at the beginning of September.
Please let us know when you try.
Motivation: Why do you think this is important?
Slurm is a widely used workload management system in many HPC (High-Performance Computing) compute clusters. It plays a vital role in efficiently allocating compute resources, running work on these allocated resources, and overseeing a queue of pending work.
Integrating Slurm with Flyte presents a strategic opportunity to greatly enhance the utilization of Flyte workflows on HPC clusters, providing many advantages.
Goal: What should the final outcome look like, ideally?
The aim is the implementation of a Flyte agent that facilitates task submission to HPC resources using the Slurm scheduler.
Typically, users interact with Slurm through its command-line interface (CLI). For instance, the `sbatch` command submits a job script for later execution. An optional Slurm daemon also offers a REST API for interacting with the Slurm system.
Describe alternatives you've considered
I don't know of anything comparable.
Propose: Link/Inline OR Additional context
I am available to offer support using Slurm and to test the Flyte agent.
https://github.com/JBris/slurm-rest-api-docker can be used for testing the Slurm CLI and the Slurm REST API.
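To make the CLI route described under Goal a bit more concrete, here is a minimal Python sketch of how an agent might wrap `sbatch`. It is only an illustration under stated assumptions, not a proposed design: the helper names (`build_sbatch_command`, `parse_job_id`, `submit`) are hypothetical, and it assumes `sbatch` is on the `PATH` of the machine running the agent. The `--parsable`, `--job-name`, and `--partition` flags are standard `sbatch` options.

```python
import re
import subprocess
from typing import List, Optional


def build_sbatch_command(
    script_path: str,
    job_name: Optional[str] = None,
    partition: Optional[str] = None,
) -> List[str]:
    """Build the sbatch argument list for a job script (nothing is executed here)."""
    # --parsable makes sbatch print only the job ID (optionally ';cluster').
    cmd = ["sbatch", "--parsable"]
    if job_name:
        cmd.append(f"--job-name={job_name}")
    if partition:
        cmd.append(f"--partition={partition}")
    cmd.append(script_path)
    return cmd


def parse_job_id(sbatch_output: str) -> str:
    """Extract the job ID from sbatch output.

    Accepts '--parsable' output ('12345' or '12345;cluster') as well as
    the default message 'Submitted batch job 12345'.
    """
    text = sbatch_output.strip()
    match = re.fullmatch(r"(?:Submitted batch job )?(\d+)(?:;\S+)?", text)
    if match is None:
        raise ValueError(f"Unrecognized sbatch output: {text!r}")
    return match.group(1)


def submit(script_path: str, **options: Optional[str]) -> str:
    """Hypothetical helper: run sbatch and return the Slurm job ID."""
    result = subprocess.run(
        build_sbatch_command(script_path, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if sbatch exits non-zero
    )
    return parse_job_id(result.stdout)
```

Calling `submit("job.sh", job_name="demo")` on a machine with Slurm installed would return the job ID as a string, which an agent could store and later use to poll or cancel the job. The same shape would presumably apply to the REST API route, with an HTTP request in place of the subprocess call.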
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?