Feature: scheduler #275

Draft
wants to merge 16 commits into
base: main

@superstar54 superstar54 commented Aug 27, 2024

Background

When running a workflow (such as a WorkChain or WorkGraph), each workflow is associated with a corresponding process. This process launches and waits for the child processes (e.g., CalcJob processes). In nested workflows like the PwBandsWorkChain, you may encounter multiple Workflow processes in a waiting state, with only one CalcJob process actively running. These waiting Workflow processes can be seen as inefficient resource usage.

In a WorkChain, the workflow logic is encapsulated within the new WorkChain class, making it challenging to eliminate these waiting processes at the moment. However, in a WorkGraph, the logic is more explicitly defined, and it has strict rules on who can execute this logic.

In addition, it is undesirable to run task processes and WorkGraph processes in the same runner, since waiting WorkGraph processes take up slots that the tasks could use.

Proposal

To address this, I propose a Scheduler for the WorkGraph in this PR. The Scheduler handles the following:

  • It creates the WorkGraph process only in the database; no daemon worker actually runs the process.
  • It analyzes task dependencies and, if a task is a CalcJob, submits it to a daemon worker as usual. The key difference is that the Scheduler uses the WorkGraph's PK as the parent PID, thereby maintaining correct provenance.
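
The dependency analysis described above can be pictured with a minimal sketch. It is not the code from this PR: the function name `plan_launches`, the task names, and the dictionary layout are all hypothetical, and a real launch would go through the AiiDA engine rather than return a plain dictionary.

```python
from typing import Dict, List, Set


def plan_launches(
    workgraph_pk: int,
    dependencies: Dict[str, List[str]],
    finished: Set[str],
    running: Set[str],
) -> List[dict]:
    """Return one launch record per task whose dependencies have all finished.

    Each record carries the WorkGraph's PK as ``parent_pid`` (hypothetical layout),
    mirroring how the Scheduler keeps the provenance link.
    """
    launches = []
    for task in sorted(dependencies):
        if task in finished or task in running:
            continue  # already handled
        if all(dep in finished for dep in dependencies[task]):
            launches.append({"task": task, "parent_pid": workgraph_pk})
    return launches


# Hypothetical linear graph: relax -> scf -> bands
deps = {"relax": [], "scf": ["relax"], "bands": ["scf"]}
first = plan_launches(42, deps, finished=set(), running=set())
after_relax = plan_launches(42, deps, finished={"relax"}, running=set())
```

In this toy graph only `relax` is ready at the start, and `scf` becomes ready once `relax` has finished.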

Let's compare the process count for the PwBands case. Suppose we launch 100 PwBands WorkGraphs:

  • Old Approach: 300 Workflow processes (Bands, Relax, Base), 100 CalcJob processes.
  • New Approach: 1 Scheduler process, 100 CalcJob processes.
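
The arithmetic behind this comparison can be written out as a short sketch. The helper `process_counts` is hypothetical; it assumes three nested workflow levels per PwBands WorkGraph (Bands, Relax, Base) and one active CalcJob per WorkGraph at a time, as in the numbers above.

```python
def process_counts(n_workgraphs: int, workflow_levels: int = 3, calcjobs_per_workgraph: int = 1):
    """Count active processes under the old and the new approach (toy model)."""
    calcjobs = n_workgraphs * calcjobs_per_workgraph
    # Old approach: every nesting level keeps its own waiting workflow process.
    old = {"workflow": n_workgraphs * workflow_levels, "calcjob": calcjobs}
    # New approach: one Scheduler process replaces all waiting workflow processes.
    new = {"scheduler": 1, "calcjob": calcjobs}
    return old, new


old, new = process_counts(100)
old_total = sum(old.values())  # 300 workflow + 100 CalcJob processes
new_total = sum(new.values())  # 1 Scheduler + 100 CalcJob processes
```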

The benefit is clear: the new approach significantly reduces the number of active processes. Moreover, the Scheduler runs in a separate daemon that does not listen to process launching tasks, thereby eliminating the possibility of deadlocks that could occur with the old approach.

This is also related to these issues:

  • A workflow process may spawn child processes and wait on them. If no slots are left, the child never runs and the parent waits indefinitely while blocking a slot. More details in "Deadlock when creating many processes" (aiida-core#1397).

  • Users want to limit the maximum number of jobs running on a computer.
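
The deadlock in the first issue can be illustrated with a toy model. This is only a sketch: `is_deadlocked` is a hypothetical helper, and it assumes that every waiting parent has at least one child still queued for a slot.

```python
def is_deadlocked(slots: int, waiting_parents: int, running_children: int) -> bool:
    """Toy check: all slots are held by parents waiting on children that cannot start.

    Assumes each waiting parent has at least one child queued for a slot.
    """
    occupied = waiting_parents + running_children
    no_free_slot = occupied >= slots
    # If any child is running, it will eventually finish and unblock its parent.
    return waiting_parents > 0 and running_children == 0 and no_free_slot


all_parents = is_deadlocked(slots=4, waiting_parents=4, running_children=0)
one_child_runs = is_deadlocked(slots=4, waiting_parents=3, running_children=1)
free_slots = is_deadlocked(slots=4, waiting_parents=2, running_children=0)
```

With the Scheduler, WorkGraph processes do not occupy a worker slot at all, so the first case cannot arise for them.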

Note: this scheduler is designed for WorkGraph only. It does not work for WorkChain.

Usage

https://aiida-workgraph--275.org.readthedocs.build/en/275/howto/scheduler.html

Scheduler

Add a daemon runner for the scheduler:

  • Multiple runners (daemons) can be run for the scheduler. Each runner listens to the scheduler_queue with prefetch_count set to 1, so each runner can launch only one Scheduler process.
  • The Scheduler launches the process (CalcJob) to the ordinary daemon workers.
  • A calcfunction still runs locally inside the Scheduler process, which is not good because it blocks the Scheduler from arranging other tasks. The solution would be to allow the Scheduler to submit the calcfunction, too.
  • The Scheduler process listens to the workgraph_queue to launch WorkGraphs.
  • The Scheduler receives RPC calls to launch a WorkGraph.
  • Users can submit a WorkGraph to the workgraph_queue, or select the Scheduler that should run it by PK.
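
The effect of prefetch_count = 1 can be modelled in plain Python. This is a simplified sketch, not RabbitMQ client code: the `dispatch` function and the message and runner names are hypothetical, and it only captures the rule that a runner never holds more than prefetch_count unacknowledged messages.

```python
from collections import deque


def dispatch(queue_messages, runners, prefetch_count=1):
    """Hand out messages so that no runner holds more than prefetch_count unacked ones."""
    pending = deque(queue_messages)
    assigned = {runner: [] for runner in runners}
    for runner in runners:
        while pending and len(assigned[runner]) < prefetch_count:
            assigned[runner].append(pending.popleft())
    # Whatever is left stays in the queue until some runner acknowledges its message.
    return assigned, list(pending)


assigned, waiting = dispatch(
    ["scheduler-1", "scheduler-2", "scheduler-3"],
    ["runner-a", "runner-b"],
)
```

With two runners and three queued Scheduler messages, each runner picks up exactly one and the third message stays in the queue.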

Keep provenance

  • When running a task, the Scheduler passes the WorkGraph's PK to the task's process as the parent_pid, so that there is a link between the WorkGraph and the task's process.
  • If the task is itself a WorkGraph, the Scheduler saves that WorkGraph with the parent_pid and launches it inside the Scheduler.

Use one scheduler process or scale the number of processes when needed.

While a single scheduler suffices for most use cases, scaling up the number of schedulers may be beneficial when significantly increasing the number of task workers (created by verdi daemon start). A general rule is to maintain a ratio of less than 5 workers per scheduler.
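
As a worked example of this rule of thumb, the smallest number of schedulers that keeps the ratio strictly below 5 workers per scheduler can be computed as follows. The helper `min_schedulers` is hypothetical and only encodes the ratio stated above.

```python
def min_schedulers(n_workers: int, max_ratio: int = 5) -> int:
    """Smallest number of schedulers s such that n_workers / s < max_ratio."""
    if n_workers < 0 or max_ratio <= 0:
        raise ValueError("n_workers must be >= 0 and max_ratio must be > 0")
    # n_workers / s < max_ratio  <=>  s > n_workers / max_ratio
    return n_workers // max_ratio + 1


few_workers = min_schedulers(4)     # 4 / 1 = 4, already below 5
exactly_five = min_schedulers(5)    # 5 / 1 = 5 is not below 5, so 2 are needed
many_workers = min_schedulers(12)   # 12 / 3 = 4, below 5
```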

Circus

Similar to the worker daemon, we use circus to manage the scheduler daemon.

Commands

  • start: Start the scheduler application.
  • status: Print the status of the scheduler daemon.
  • stop: Stop the scheduler daemon.

Todo

  • Handle failed tasks
  • Delete a WorkGraph's data when it is finished
  • Restart the Scheduler process instead of launching a new one.
  • Submit a WorkGraph inside the Scheduler
  • Move the report from the Scheduler process to the WorkGraph process
  • The scheduler_queue in RabbitMQ keeps growing because the runner does not ack. For example, when the runner stops while the Scheduler process is still running, the runner never acks back to RabbitMQ. When the runner restarts, it processes the first message in the queue, but I also send a new message to the queue to continue the Scheduler. This is a bug: there is no need to send a message to continue, because one is already in the queue.
  • Add setting to the Web app (New PR)

Checkpoint

How do we save the checkpoint? Instead of saving all data every time, it would be better to update only the context related to the affected WorkGraph.

Solution 1

Save the ctx data for a WorkGraph to the extras of that WorkGraph.
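
The idea of per-WorkGraph checkpoints can be sketched with an in-memory stand-in. Everything here is hypothetical: `InMemoryExtras` only imitates a key-value store attached to each WorkGraph node, and the key name `ctx` is an assumption. In AiiDA itself the extras live on the node, which this sketch does not touch.

```python
class InMemoryExtras:
    """Hypothetical stand-in for the extras of WorkGraph nodes, keyed by PK."""

    def __init__(self):
        self._extras = {}
        self.writes = 0  # counts how many contexts were written

    def set(self, pk, key, value):
        self._extras.setdefault(pk, {})[key] = value
        self.writes += 1

    def get(self, pk, key):
        return self._extras[pk][key]


def checkpoint(store, contexts, changed_pks):
    """Write only the contexts of the WorkGraphs that changed since the last checkpoint."""
    for pk in changed_pks:
        store.set(pk, "ctx", dict(contexts[pk]))


store = InMemoryExtras()
contexts = {11: {"step": 1}, 12: {"step": 4}}
checkpoint(store, contexts, changed_pks=[11, 12])  # initial checkpoint: both WorkGraphs
contexts[11]["step"] = 2
checkpoint(store, contexts, changed_pks=[11])      # later: only WorkGraph 11 is rewritten
```

The second checkpoint costs one write instead of two, which is the saving the proposal is after.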

Submit calcfunction

I tested this: a calcfunction can be submitted if it is defined inside a package, because the daemon can then load it back using importlib.import_module. A calcfunction defined on the fly raises an error.
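
The distinction can be illustrated with a small check of whether a function can be found again through importlib.import_module. This is a sketch under assumptions: `can_be_reloaded` is a hypothetical helper, it works on plain Python functions rather than decorated calcfunctions, and it treats `__main__` as not reloadable because a daemon worker has a different `__main__`.

```python
import importlib
import json


def can_be_reloaded(func) -> bool:
    """Return True if ``func`` can be located again via its module and qualified name."""
    module_name = getattr(func, "__module__", None)
    qualname = getattr(func, "__qualname__", "")
    # Functions defined inside another function, or in the running script,
    # cannot be imported by a separate daemon process.
    if not module_name or module_name == "__main__" or "<locals>" in qualname:
        return False
    try:
        module = importlib.import_module(module_name)
    except (ImportError, ValueError):
        return False
    obj = module
    for part in qualname.split("."):
        obj = getattr(obj, part, None)
        if obj is None:
            return False
    return obj is func


def make_on_the_fly():
    """Build a function at run time, like a calcfunction defined on the fly."""
    def add(x, y):
        return x + y
    return add


packaged = can_be_reloaded(json.dumps)          # lives in an importable module
on_the_fly = can_be_reloaded(make_on_the_fly())  # only exists in this process
```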

Other features after this PR

  • Control max number of running jobs on a computer
  • Control running priority of WorkGraph.
  • How to ack the workgraph_queue?
  • Should a WorkGraph launched inside the scheduler go to the workgraph_queue, or run directly in the same scheduler?
    • Going to the workgraph_queue keeps the schedulers more balanced.
    • Running in the same scheduler is more controllable, and it does not occupy a slot, which avoids the deadlock.

codecov-commenter commented Aug 27, 2024

Codecov Report

Attention: Patch coverage is 16.40431% with 1009 lines in your changes missing coverage. Please review.

Project coverage is 67.30%. Comparing base (5937b88) to head (8299fe5).
Report is 60 commits behind head on main.

Files with missing lines                         Patch %   Lines
aiida_workgraph/engine/scheduler/scheduler.py    12.00%    814 Missing ⚠️
aiida_workgraph/engine/scheduler/client.py       25.16%    116 Missing ⚠️
aiida_workgraph/engine/override.py               21.87%    25 Missing ⚠️
tests/conftest.py                                22.22%    14 Missing ⚠️
aiida_workgraph/tasks/test.py                    23.07%    10 Missing ⚠️
aiida_workgraph/engine/utils.py                  61.90%    8 Missing ⚠️
aiida_workgraph/workgraph.py                     42.85%    8 Missing ⚠️
tests/test_scheduler.py                          52.94%    8 Missing ⚠️
aiida_workgraph/utils/control.py                 33.33%    6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #275      +/-   ##
==========================================
- Coverage   75.75%   67.30%   -8.45%     
==========================================
  Files          70       70              
  Lines        4615     6123    +1508     
==========================================
+ Hits         3496     4121     +625     
- Misses       1119     2002     +883     
Flag          Coverage Δ
python-3.11   67.23% <16.40%> (-8.43%) ⬇️
python-3.12   67.22% <16.40%> (?)
python-3.9    67.24% <16.33%> (-8.50%) ⬇️

@superstar54 superstar54 linked an issue Aug 27, 2024 that may be closed by this pull request
return result, process.node


def instantiate_process(

@sphuber, I modified the instantiate_process function from aiida-core.

- Pick up the Scheduler process instead of launching a new one.
- Submit a WorkGraph inside the Scheduler
- Move the report from the Scheduler process to the WorkGraph process
The scheduler listens for tasks on the scheduler_queue
1) Multiple runners (daemons) can be run for the scheduler. Each runner listens to the `scheduler_queue` with prefetch_count set to 1, so each runner can launch only one Scheduler process.
2) The Scheduler process listens to the `workgraph_queue` to launch WorkGraphs
3) The Scheduler receives RPC calls to launch a WorkGraph
4) Users can submit a WorkGraph to the workgraph queue, or select the Scheduler that should run it by PK
@superstar54 superstar54 self-assigned this Sep 18, 2024
Development

Successfully merging this pull request may close these issues.

Add WorkGraph Scheduler
2 participants