-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rusty: Check LOCAL_DSQ length for WAKE_SYNC #206
Conversation
In rusty_select_cpu(), if a task is WAKE_SYNC, we'll currently migrate the task to that CPU if there are any idle cores on the system. As in [0], this condition is insufficient, as there could be idle cores elsewhere on the system, but still tasks piled up on a single local DSQ. Let's add a condition that the local DSQ has to be empty in order to apply the WAKE_SYNC migration. Before patch: [void@maniforge src]$ hackbench Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks) Each sender will pass 100 messages of 100 bytes Time: 0.433 With patch: [void@maniforge src]$ hackbench Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks) Each sender will pass 100 messages of 100 bytes Time: 0.035 Signed-off-by: David Vernet <[email protected]>
Just a slight improvement on hackbench ;) |
|
||
if (!(BPF_CORE_READ(current, flags) & PF_EXITING) && | ||
taskc->dom_id < MAX_DOMS) { | ||
cpu = bpf_get_smp_processor_id(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look to be a problem but it feels a little weird to set the cpu here just so it can used in the if check below because if the if check fails then the cpu is still set to this value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I noticed that, but we do it all over. like e.g. setting cpu = scx_bpf_pick_idle_cpu(), and then checking if we succeeded. and it's functionally the same as before: if we failed the ->cpus_ptr check, we'd just continue on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I noticed that, but we do it all over.
Fighting my OCD urge to change it....
@@ -531,10 +531,12 @@ s32 BPF_STRUCT_OPS(rusty_select_cpu, struct task_struct *p, s32 prev_cpu, | |||
* local dsq of the waker. | |||
*/ | |||
if (wake_flags & SCX_WAKE_SYNC) { | |||
struct task_struct *current = (void *)bpf_get_current_task(); | |||
struct task_struct *current = (void *)bpf_get_current_task_btf(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JC, why the change to this function call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It returns a trusted pointer, so we don't have to bother with a probed read. Probably should've put it in a separate PR.
In rusty_select_cpu(), if a task is WAKE_SYNC, we'll currently migrate the task to that CPU if there are any idle cores on the system. As in [0], this condition is insufficient, as there could be idle cores elsewhere on the system, but still tasks piled up on a single local DSQ. Let's add a condition that the local DSQ has to be empty in order to apply the WAKE_SYNC migration.
Before patch:
[void@maniforge src]$ hackbench
Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks) Each sender will pass 100 messages of 100 bytes
Time: 0.433
With patch:
[void@maniforge src]$ hackbench
Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks) Each sender will pass 100 messages of 100 bytes
Time: 0.035