-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
scx_layered: Add topology awareness for NUMA nodes and LLCs #446
Conversation
Add a cpus method various subfields in the topology struct to easily get the map of CPUs for nodes/LLCs. Signed-off-by: Daniel Hodges <[email protected]>
Add NUMA node topology awareness for scx_layared. This borrows some of the NUMA handling from scx_rusty and allows layers to set a node mask. Different layer kinds will use the node mask differently. Signed-off-by: Daniel Hodges <[email protected]>
@@ -86,6 +88,18 @@ struct cpu_ctx { | |||
u64 ran_current_for; | |||
}; | |||
|
|||
struct cache_ctx { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might not be using this, so I think I can delete this.
return 0; | ||
} | ||
|
||
static s32 create_node(u32 node_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was mostly highjacked from scx_rusty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good as the first step. That said, the overhead of looping in the dispatch path may be noticeable under contention and the resulting behavior might not be ideal (e.g. DSQs for lower numbered LLCs are always favored). Given that there are a lot more changes needed around dispatch path and LB, I think iterating in tree makes sense, so please feel free to land it and keep building on top.
// return the dsq id for the layer based on the LLC id. | ||
static inline u64 layer_dsq_id(u32 layer_id, u32 llc_id) | ||
{ | ||
return (layer_id*nr_llcs) + llc_id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you format it like (layer_id * nr_llcs) + llc_id
?
@@ -649,7 +695,9 @@ void BPF_STRUCT_OPS(layered_enqueue, struct task_struct *p, u64 enq_flags) | |||
goto find_cpu; | |||
} | |||
|
|||
scx_bpf_dispatch_vtime(p, tctx->layer, slice_ns, vtime, enq_flags); | |||
u32 llc_id = cpu_to_llc_id(tctx->last_cpu >= 0 ? tctx->last_cpu: 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto, plesae keep a space between operators and tokens.
if (layers[idx].preempt && scx_bpf_consume(idx)) | ||
return; | ||
bpf_for(idx, 0, nr_layers) { | ||
bpf_for(llc_id, 0, nr_llcs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose there's no automatic locality implemented in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but that should hopefully be an easy win.
bpf_for(idx, 0, nr_layers) { | ||
bpf_for(llc_id, 0, nr_llcs) { | ||
dsq_id = layer_dsq_id(idx, llc_id); | ||
if (layers[idx].preempt && scx_bpf_consume(dsq_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the number of DSQs can be pretty high, down the line, we probably need to make the walk more efficient, but let's revisit that later as this part of code would need substantial updates to support automatic locality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't like these double nested for loops.
Signed-off-by: Daniel Hodges <[email protected]>
Add support for NUMA node and LLC layer configuration for
scx_layered
. If thenodes
orllcs
fields are unset then all CPUs are used. If both fields have values the cpuset is or'ed. This is useful for runningscx_layered
on NUMA machines where setting a soft affinity to a NUMA node or LLC may be beneficial.Testing for NUMA nodes using this config:
CPU configuration:
Running
stress-ng
with 20 CPUs puts load on NUMA node 1Similar test with LLCs:
Running
stress-ng
shows similar results for NUMA pinning as this machine only has a single LLC per NUMA node.