
scx_layered: Add per layer timeslice #578

Merged
merged 1 commit into from
Aug 28, 2024

Conversation

hodgesds
Contributor

Allow setting a different timeslice per layer.

Tested by adding some debug logs:

diff --git a/scheds/rust/scx_layered/src/bpf/main.bpf.c b/scheds/rust/scx_layered/src/bpf/main.bpf.c
index 14e6fc9..ebf3362 100644
--- a/scheds/rust/scx_layered/src/bpf/main.bpf.c
+++ b/scheds/rust/scx_layered/src/bpf/main.bpf.c
@@ -1230,20 +1230,21 @@ void BPF_STRUCT_OPS(layered_running, struct task_struct *p)
        if (tctx->last_cpu >= 0 && tctx->last_cpu != task_cpu)
                lstat_inc(LSTAT_MIGRATION, layer, cctx);
        tctx->last_cpu = task_cpu;
 
        if (vtime_before(layer->vtime_now, p->scx.dsq_vtime))
                layer->vtime_now = p->scx.dsq_vtime;
 
        cctx->current_preempt = layer->preempt;
        cctx->current_exclusive = layer->exclusive;
        tctx->running_at = bpf_ktime_get_ns();
+       dbg("CFG: layer: %d slice: %llu", layer->idx, layer->slice_ns);
 
        /*
         * If this CPU is transitioning from running an exclusive task to a
         * non-exclusive one, the sibling CPU has likely been idle. Wake it up.
         */
        if (cctx->prev_exclusive && !cctx->current_exclusive) {
                s32 sib = sibling_cpu(task_cpu);
                struct cpu_ctx *sib_cctx;
 
                /*

output:

   stress-ng-cpu-3306897 [023] dN..1 434025.244529: bpf_trace_printk: CFG: layer: 2 slice: 4000000
   stress-ng-cpu-3306898 [045] dN..1 434025.244529: bpf_trace_printk: CFG: layer: 2 slice: 4000000
   stress-ng-cpu-3306903 [074] dN..1 434025.244529: bpf_trace_printk: CFG: layer: 2 slice: 4000000
   grep-3306226 [051] dN..1 434025.244529: bpf_trace_printk: CFG: layer: 1 slice: 1500000

@hodgesds hodgesds requested a review from htejun August 28, 2024 14:38
@hodgesds hodgesds force-pushed the layered-timeslice branch 2 times, most recently from d3a91b2 to ca595d1, August 28, 2024 15:11
     layer.min_exec_ns = min_exec_us * 1000;
     layer.yield_step_ns = if *yield_ignore > 0.999 {
         0
     } else if *yield_ignore < 0.001 {
-        opts.slice_us * 1000
+        layer.slice_ns * 1000
Contributor Author
I guess this conversion doesn't need to be done since it already gets converted above...

@@ -788,8 +790,9 @@ static bool keep_running(struct cpu_ctx *cctx, struct task_struct *p)
 	if (!(tctx = lookup_task_ctx(p)) || !(layer = lookup_layer(tctx->layer)))
 		goto no;
 
+	u64 layer_slice_ns = layer->slice_ns > 0 ? layer->slice_ns : slice_ns;
Contributor
You may also want to use layer_slice_ns to refill p->scx.slice if the task keeps running; otherwise, if I'm not wrong, the task's time slice is automatically refilled to SCX_SLICE_DFL.

Maybe something like this (untested):

diff --git a/scheds/rust/scx_layered/src/bpf/main.bpf.c b/scheds/rust/scx_layered/src/bpf/main.bpf.c
index 14e6fc9..cb9ccbe 100644
--- a/scheds/rust/scx_layered/src/bpf/main.bpf.c
+++ b/scheds/rust/scx_layered/src/bpf/main.bpf.c
@@ -817,6 +817,7 @@ static bool keep_running(struct cpu_ctx *cctx, struct task_struct *p)
 		 */
 		if (disable_topology) {
 			if (!scx_bpf_dsq_nr_queued(layer->idx)) {
+				p->scx.slice = layer_slice_ns;
 				lstat_inc(LSTAT_KEEP, layer, cctx);
 				return true;
 			}
@@ -825,6 +826,7 @@ static bool keep_running(struct cpu_ctx *cctx, struct task_struct *p)
 						   tctx->last_cpu :
 						   bpf_get_smp_processor_id());
 			if (!scx_bpf_dsq_nr_queued(dsq_id)) {
+				p->scx.slice = layer_slice_ns;
 				lstat_inc(LSTAT_KEEP, layer, cctx);
 				return true;
 			}
@@ -850,6 +852,7 @@ static bool keep_running(struct cpu_ctx *cctx, struct task_struct *p)
 		scx_bpf_put_idle_cpumask(idle_cpumask);
 
 		if (has_idle) {
+			p->scx.slice = layer_slice_ns;
 			lstat_inc(LSTAT_KEEP, layer, cctx);
 			return true;
 		}

Contributor Author

I think that worked well. I spent a bit of time tuning different layer slice lengths, and it's interesting to see how the slice can affect throughput vs latency.

Allow setting a different timeslice per layer.

Signed-off-by: Daniel Hodges <[email protected]>
@hodgesds hodgesds merged commit f2ddae6 into sched-ext:main Aug 28, 2024
2 checks passed
@hodgesds hodgesds deleted the layered-timeslice branch August 28, 2024 19:13