scx_rustland: introduce fifo mode #305
Simplify the CPU idle selection logic by relying on the built-in logic. If this logic can be improved, that should be done in the backend by changing the default idle selection logic; rustland doesn't need to do anything special here for now. Signed-off-by: Andrea Righi <[email protected]>
Provide a knob in scx_rustland_core to automatically turn the scheduler into a simple FIFO when the system is underutilized. This choice is based on the assumption that, when the system is underutilized (fewer tasks running than available CPUs), the best scheduling policy is FIFO. With this option enabled, the scheduler starts in FIFO mode. If most of the CPUs are busy (nr_running >= num_cpus - 1), the scheduler immediately exits FIFO mode and starts applying the logic implemented by the user-space component. The scheduler can then switch back to FIFO mode if there are no tasks waiting to be scheduled (evaluated using a moving average). The user-space scheduler can enable or disable this option through the fifo_sched parameter in BpfScheduler: if set, the BPF component periodically checks system utilization and switches in and out of FIFO mode accordingly. This improves the performance of workloads that use only a small number of the available CPUs, while maintaining the same good level of performance for interactive tasks when the system is overcommitted. In certain video games, such as Baldur's Gate 3 or Counter-Strike 2, running in "normal" system conditions, we can see an fps boost of approximately 4-8% with this change applied. Signed-off-by: Andrea Righi <[email protected]>
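The switching rule described in this commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the BPF code from the PR: `struct fifo_state`, `fifo_update()`, `AVG_SCALE` and the 1/2 weighting of the moving average are all hypothetical; only the `nr_running >= num_cpus - 1` exit condition and the "no tasks waiting, evaluated using a moving average" re-entry condition come from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fixed-point scale for the moving average (an assumption). */
#define AVG_SCALE 1024u

/* Hypothetical state for the FIFO-mode switch; not the PR's actual code. */
struct fifo_state {
	bool fifo_mode;       /* true while the simple FIFO policy is active */
	uint64_t avg_waiting; /* moving average of waiting tasks, scaled */
};

/* With the option enabled, the scheduler starts in FIFO mode. */
static void fifo_state_init(struct fifo_state *s)
{
	s->fifo_mode = true;
	s->avg_waiting = 0;
}

/*
 * Periodic utilization check. Returns true if FIFO mode is active
 * after the update.
 */
static bool fifo_update(struct fifo_state *s, uint32_t nr_running,
			uint32_t nr_waiting, uint32_t num_cpus)
{
	/* Moving average with weight 1/2 (the weighting is an assumption). */
	s->avg_waiting = (s->avg_waiting + (uint64_t)nr_waiting * AVG_SCALE) / 2;

	if ((uint64_t)nr_running + 1 >= num_cpus) {
		/* nr_running >= num_cpus - 1: leave FIFO mode immediately. */
		s->fifo_mode = false;
	} else if (s->avg_waiting == 0) {
		/* No tasks waiting on average: fall back to FIFO mode. */
		s->fifo_mode = true;
	}
	return s->fifo_mode;
}
```

In this model the exit from FIFO mode is immediate, while the return to it is delayed until the average has decayed to zero, which damps rapid flapping between the two policies.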
Do not always assign the maximum time slice to interactive tasks; instead, use the same dynamic time slice value for all tasks. This seems to prevent potential audio crackling when the system is overcommitted. Signed-off-by: Andrea Righi <[email protected]>
The shared DSQ is typically used to prioritize tasks and dispatch them on the first CPU available, so consume from the shared DSQ before the local CPU DSQ. Signed-off-by: Andrea Righi <[email protected]>
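The consume order described in this commit can be illustrated with two toy queues. This is a minimal sketch, not the sched_ext API: `struct dsq`, `dsq_push()`, `dsq_pop()` and `pick_next()` are hypothetical names, and the fixed-size ring buffer is an assumption made only to keep the example self-contained.

```c
#include <stdbool.h>
#include <stddef.h>

#define DSQ_CAP 16

/* Toy dispatch queue: a fixed-size ring of task PIDs (hypothetical). */
struct dsq {
	int pids[DSQ_CAP];
	size_t head, tail;
};

static bool dsq_push(struct dsq *q, int pid)
{
	if (q->tail - q->head >= DSQ_CAP)
		return false; /* queue full */
	q->pids[q->tail++ % DSQ_CAP] = pid;
	return true;
}

static bool dsq_pop(struct dsq *q, int *pid)
{
	if (q->head == q->tail)
		return false; /* queue empty */
	*pid = q->pids[q->head++ % DSQ_CAP];
	return true;
}

/*
 * Pick the next task for a CPU: the shared queue is consumed first,
 * the CPU-local queue only when the shared one is empty.
 */
static bool pick_next(struct dsq *shared, struct dsq *local, int *pid)
{
	if (dsq_pop(shared, pid))
		return true;
	return dsq_pop(local, pid);
}
```

With this ordering, a task placed on the shared queue runs on whichever CPU looks for work first, ahead of tasks already pinned to that CPU's local queue.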
Dispatch non-interactive tasks on the CPU selected by the built-in idle selection logic and allow interactive tasks to be dispatched on any CPU. Signed-off-by: Andrea Righi <[email protected]>
Signed-off-by: Andrea Righi <[email protected]>
*/
cpu = scx_bpf_select_cpu_dfl(p, prev_cpu, wake_flags, &is_idle);
- if (is_idle) {
+ if (is_idle && !full_user) {
So, this would mark the picked CPU as not idle and then, if !full_user, ignore it, which will strand the CPU for a while. It probably would make sense to test full_user before calling scx_bpf_select_cpu_dfl().
@htejun the idea here was to use the built-in logic to pick an idle CPU, but not dispatch directly to it if full_user is specified, so that the task can be forced to go through the user-space scheduler.
With full_user enabled it probably makes more sense to simply return prev_cpu and let the user-space scheduler decide which CPU to use, according to its own idle tracking logic. In the end, full_user is provided mostly for debugging purposes, so I'm not really worried about performance here.
Thinking more about this, I much prefer simply ignoring the built-in idle selection logic in full-user mode, also from a design perspective: --full-user means that all the scheduling decisions are delegated to the user-space scheduler, idle selection logic included.
Therefore I pushed a change on top to completely ignore the built-in idle selection logic when running in full-user mode and make this option incompatible with --builtin-idle.
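The outcome of this thread can be sketched with a toy model of the idle mask. It is an illustration only: `idle_mask`, `toy_select_cpu_dfl()` and `select_cpu()` are hypothetical, and the toy helper merely imitates the behaviour pointed out in the review, namely that picking an idle CPU also marks it as not idle. Checking `full_user` before calling the helper is what avoids stranding a CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy idle tracking: bit n set means CPU n is idle (hypothetical model). */
static uint64_t idle_mask;

/*
 * Toy stand-in for the built-in idle selection: prefer prev_cpu if it is
 * idle, otherwise the lowest idle CPU. The chosen CPU is claimed by
 * clearing its idle bit, as described in the review comment.
 */
static int toy_select_cpu_dfl(int prev_cpu, bool *is_idle)
{
	if (prev_cpu >= 0 && prev_cpu < 64 && (idle_mask & (1ULL << prev_cpu))) {
		idle_mask &= ~(1ULL << prev_cpu);
		*is_idle = true;
		return prev_cpu;
	}
	for (int cpu = 0; cpu < 64; cpu++) {
		if (idle_mask & (1ULL << cpu)) {
			idle_mask &= ~(1ULL << cpu);
			*is_idle = true;
			return cpu;
		}
	}
	*is_idle = false;
	return prev_cpu;
}

/*
 * In full-user mode, skip the built-in idle selection entirely and
 * return prev_cpu, so no idle CPU is claimed and then left unused.
 */
static int select_cpu(int prev_cpu, bool full_user, bool *dispatch_direct)
{
	bool is_idle = false;
	int cpu;

	*dispatch_direct = false;
	if (full_user)
		return prev_cpu; /* user space picks the CPU later */

	cpu = toy_select_cpu_dfl(prev_cpu, &is_idle);
	if (is_idle)
		*dispatch_direct = true; /* idle CPU claimed: dispatch directly */
	return cpu;
}
```

In this model a full-user call leaves `idle_mask` untouched, whereas the ordering from the reviewed diff would have cleared the bit of the picked CPU and then ignored the result.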
@htejun sorry... I just realized that I've pushed too much stuff in this PR... but it's still something that I've tested a lot anyway, so it's not totally bad. :) If you think it's better I can revert the additional commits and create a separate PR. The extra changes seem to mitigate the audio crackling issues that I was getting when the system is massively overloaded. Just for the record, the extra changes are the following:
... actually let me do things properly, I'll revert this one and will send a new one with the right commits.
This merge included additional commits that were supposed to be included in a separate pull request and have nothing to do with the fifo-mode changes. Therefore, revert the whole pull request and create a separate one with the correct list of commits required to implement this feature. Signed-off-by: Andrea Righi <[email protected]>
This is a v2 of the FIFO mode feature, more tested and tuned, based on additional experimental results (specifically to determine the optimal USERSCHED_TIMER_NS and the conditions to automatically enter and exit FIFO mode).
The idea is to provide an option to automatically transition to a FIFO scheduler when the system is underutilized and switch to the user-space scheduler only when the system is overcommitted.
This maximizes performance during regular system use, for example gaming without additional stress tests running, while also ensuring responsiveness if a CPU-intensive workload is suddenly started.
FIFO mode can lead to less predictable performance (due to the potential transitions between the scheduling policies), therefore it is provided as an optional feature that can be disabled when performance predictability is crucial, such as in real-time audio applications or during live streaming.