[scx_lavd] Getting large pauses and stutters in-game #234

Closed
ChrisLane opened this issue Apr 18, 2024 · 63 comments

Comments

@ChrisLane

I read about scx_lavd on Phoronix and decided to give it a try and get some basic benchmarks.

I've observed that I can quite reliably reproduce stutters and pauses, sometimes several seconds long, while running around my ship in Helldivers 2.

Here's the output from sudo scx_lavd while I was running the game; several stutters occurred during that period (not sure of the exact timestamps):
https://gist.github.com/ChrisLane/1945f66b5d8a2f36a530ca9ac8abcfa1

Please let me know if I can do anything else to debug or provide more useful info for your investigation.

System:
Distro: Arch Linux
Kernel: 6.8.7-1-cachyos
GPU: AMD Radeon RX 6800 XT (RADV NAVI21)
CPU: AMD Ryzen 9 5900X 12-Core Processor
RAM: 36GB

@Quackdoc

Quackdoc commented Apr 19, 2024

Did you verify that the stutters don't happen when the scheduler isn't loaded? I've been having very similar issues with all of the CachyOS kernels (CachyOS/linux-cachyos#244). I'm using the cachyos-eevdf kernel, for instance, and see very similar issues.

@ChrisLane
Author

ChrisLane commented Apr 19, 2024 via email

@Cifer9516

Cifer9516 commented Apr 19, 2024

Same thing here: heavy-load scenarios time out the scheduler and it just unloads, after which the program continues running without a problem.
This happened in a vkd3d game and during compilation.

@skygrango

skygrango commented Apr 20, 2024

I tested compiling a Rust program and playing a game at the same time, and got an scx_lavd error.

log file : scx_lavd_dump-2024-04-20-1.txt

Edit: updated the log file.

@multics69
Contributor

multics69 commented Apr 23, 2024

Thank you for reporting the issues. This will be really helpful for improving and tuning LAVD. The preemption logic is under development and will be improved in the next cycle.

@ALL -- When collecting the log, running sudo scx_lavd -s $(nproc) would be super helpful.

@ChrisLane -- Thank you for opening the issue and sharing the log!

@Cifer9516 -- If you don't mind, could you share the log and the game scenario that you played?

@skygrango -- If you don't mind, could you share the game scenario that you played? It seems that you ran an old version of scx_lavd. If possible, could you try the latest version (git HEAD)?

@bunkbail

bunkbail commented Apr 24, 2024

Here's a dump log of sudo scx_lavd -s $(nproc) on the latest scx_lavd (commit 9a9b4d). I'm on Debian with a kernel I compiled myself. What I did: I launched scx_lavd, ran the game "Infinite Wealth", ran around for a bit while the game was freezing all over, and tabbed out until I saw lavd crash. The game runs fine after lavd crashed.
dump.txt

@bunkbail

bunkbail commented Apr 24, 2024

Ran another test (commit 9a9b4d) and attached the full log here; apologies if it's a bit hard to read since I use oh-my-zsh styling in my terminal.

dump.txt

@Galcian79

I think I'm facing the same issue here: stuttering in-game and then LAVD crashes.
journalctl.log

@multics69
Contributor

Thanks @bunkbail and @Galcian79 for sharing the logs.

@htejun
Contributor

htejun commented Apr 25, 2024

 *R LikeADragon8.ex[3041] -3ms
      scx_state/flags=3/0x5 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=(n/a)
      cpus=00000fff


  R LikeADragon8.ex[2869] -40782ms
      scx_state/flags=3/0x9 dsq_flags=0x0 ops_state/qseq=0/0
      sticky/holding_cpu=-1/-1 dsq_id=0x8000000000000002
      cpus=00000fff

So, this suggests that 2869 has been sitting on the local DSQ for over 40s, which should only be possible if either the previous task keeps running or tasks keep getting queued ahead of it. The latter happens only with explicit ENQ_HEAD or ENQ_PREEMPT, neither of which is currently being used by lavd, so the situation could be that 3041 is running constantly.

Now, a gaming thread being runnable constantly wouldn't be too surprising (maybe it's busy polling for some event?), but what should have happened is it running down its slice and then taking turns with other runnable tasks, which doesn't seem to have happened. Hmm... so, one peculiarity of lavd is that it assigns the slice from ops.running(). I can imagine a scenario like the following:

  1. The offending thread becomes runnable.
  2. It starts running, ops.running() sets its slice according to calc_time_slice().
  3. It's running within its slice but it interacts with e.g. graphics driver and wakes up an RT thread.
  4. The RT thread preempts the running task. ops.stopping() is invoked. Because the task hasn't exhausted its slice yet, it gets pushed at the top of the local DSQ so that it can continue its slice once the RT thread is done.
  5. RT thread yields the CPU back. The task which got preempted is picked and starts running again. Go back to (2).

So, if I'm reading the code right, if the task keeps waking up the RT thread before its slice expires, the task can get its time slice refreshed indefinitely. If this is the case, the remedy may be:

a. Distinguish the first ops.running() after ops.runnable() and replenish slice only on the first transition.
b. Implement ops.cpu_release() to detect preemption by RT / DL threads and explicitly handle the condition.

I of course could be completely wrong, so please take it with a pinch of salt. From sched_ext core side, I've been thinking about adding the ability for BPF scheduler implementation to add scheduler specific information to the debug dump. Hopefully, that should make debugging from user reports easier.
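
To make remedy (a) concrete, here is a minimal sketch of the idea, assuming a hypothetical per-task context kept in BPF task storage. The names (task_ctx, task_ctx_stor, SLICE_NS, sketch_*) are made up for illustration; this is not lavd's actual code.

/*
 * Minimal sketch of remedy (a): replenish the slice only on the first
 * ops.running() after ops.runnable(). Hypothetical names throughout.
 */
#include <scx/common.bpf.h>

char _license[] SEC("license") = "GPL";

#define SLICE_NS 5000000ULL     /* assumed 5 ms slice, purely illustrative */

struct task_ctx {
        bool refresh_slice;
};

struct {
        __uint(type, BPF_MAP_TYPE_TASK_STORAGE);
        __uint(map_flags, BPF_F_NO_PREALLOC);
        __type(key, int);
        __type(value, struct task_ctx);
} task_ctx_stor SEC(".maps");

/* A task became runnable: allow exactly one slice refresh. */
void BPF_STRUCT_OPS(sketch_runnable, struct task_struct *p, u64 enq_flags)
{
        struct task_ctx *tctx;

        tctx = bpf_task_storage_get(&task_ctx_stor, p, 0,
                                    BPF_LOCAL_STORAGE_GET_F_CREATE);
        if (tctx)
                tctx->refresh_slice = true;
}

/* Refresh the slice only on the first running() after runnable(). */
void BPF_STRUCT_OPS(sketch_running, struct task_struct *p)
{
        struct task_ctx *tctx;

        tctx = bpf_task_storage_get(&task_ctx_stor, p, 0, 0);
        if (tctx && tctx->refresh_slice) {
                p->scx.slice = SLICE_NS;
                tctx->refresh_slice = false;
        }
        /* Later running() calls after an RT preemption keep the old slice. */
}

As the discussion further down notes, this alone doesn't handle a task that exhausts its slice while staying runnable; ops.stopping() would also have to re-arm the flag in that case (see the later sketch).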

@arighi
Contributor

arighi commented Apr 25, 2024

So, if I'm reading the code right, if the task keeps waking up the RT thread before its slice expires, the task can get its time slice refreshed indefinitely. If this is the case, the remedy may be:

a. Distinguish the first ops.running() after ops.runnable() and replenish slice only on the first transition. b. Implement ops.cpu_release() to detect preemption by RT / DL threads and explicitly handle the condition.

I was wondering if it could be helpful to have a way to easily migrate all tasks to the SCHED_EXT class, even RT/DL tasks (maybe excluding kthreads).

At some point I had a similar issue with rustland and RT audio tasks, where audio was performing better (fewer audio cracks) by moving the RT tasks to the SCHED_EXT class. I did that from user-space (via schedtool), but having a way directly in sched-ext to "disable" the other sched classes (even if it sounds a bit too extreme) could be useful, especially for debugging, since we could reliably take RT tasks out of the equation. Opinions?
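
For reference, a rough user-space sketch of what "move a task into SCHED_EXT" amounts to, using sched_setattr() directly instead of schedtool. The SCHED_EXT policy number (7 here) is an assumption based on the sched_ext kernel patches; verify it against your kernel headers before relying on it.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SCHED_EXT
#define SCHED_EXT 7             /* assumed policy id on sched_ext kernels */
#endif

struct sched_attr {
        uint32_t size;
        uint32_t sched_policy;
        uint64_t sched_flags;
        int32_t  sched_nice;
        uint32_t sched_priority;
        uint64_t sched_runtime;
        uint64_t sched_deadline;
        uint64_t sched_period;
};

int main(int argc, char **argv)
{
        struct sched_attr attr;
        pid_t pid = argc > 1 ? (pid_t)atoi(argv[1]) : 0;  /* 0 = this thread */

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.sched_policy = SCHED_EXT;

        if (syscall(SYS_sched_setattr, pid, &attr, 0)) {
                perror("sched_setattr");
                return 1;
        }
        printf("pid %d moved to SCHED_EXT\n", (int)pid);
        return 0;
}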

@skygrango

@skygrango -- If you don't mind, please share the game scenario that you played? It seems that you ran the old version of scx_lavd. If possible, could you try the lastest version (git HEAD).

After I compiled the latest git version, scx_lavd no longer crashes.

But when I run a compilation with 100% CPU usage in the background, the game FPS is still very unstable and low.

@htejun
Contributor

htejun commented Apr 25, 2024

I was wondering if it could be helpful to have a way to easily migrate all tasks to the SCHED_EXT class, even RT/DL tasks (maybe excluding kthreads).

Yeah, maybe. It's not technically difficult to implement but it's also kinda nice to have the guarantee that RT tasks will always run no matter what happens to scx schedulers. Lemme think over it.

Thanks.

@Galcian79

@skygrango -- If you don't mind, please share the game scenario that you played? It seems that you ran the old version of scx_lavd. If possible, could you try the lastest version (git HEAD).

after I compile the latest git version, scx_lavd no longer crashes.

but when I run compilation with 100% cpu usage in the background, the game fps is still very unstable and low.

I synced to git too. The issue is still there, I think, but far less noticeable. It doesn't crash anymore.

@Galcian79

Wouldn't it be useful to have something like scx_lavd --version?

@sirlucjan
Contributor

Wouldn't be useful to have some tool like scx_lavd --version?

If you use the “git” version, you can always check what you are using. For example, in Arch Linux:

lucjan at cachyos ~ 22:08:34    
❯ pacman -Qs scx-scheds-git                       
local/scx-scheds-git 0.1.8.r59.g973aded-1
    sched_ext schedulers

You can very easily see which commit it was built from, and then you know which version you are running (in this case, at the moment I'm posting this comment, the latest one).

Your idea is not bad, although I see one drawback. Suppose we assign version “0.1.0”: how would it be bumped? On every release? On every commit/pull request? To be clear, this is not a criticism, just a request to develop the idea.

@Galcian79

Wouldn't be useful to have some tool like scx_lavd --version?

If you use the “git” version, you can always check what you are using. For example, in Arch Linux:

lucjan at cachyos ~ 22:08:34    
❯ pacman -Qs scx-scheds-git                       
local/scx-scheds-git 0.1.8.r59.g973aded-1
    sched_ext schedulers

you can very easily indicate which commit I built from and then you know what version you have at your place (in this case, for the moment in which I publish a comment - the latest version).

Your idea is not bad, although I see one drawback. Suppose we give version “0.1.0” - and now the question is, how would it be bumped up? What release? Every commit/pull request? I say in advance this is not a criticism just a request to develop the idea.

I see now. So all the schedulers are tied to the same version.

@sirlucjan
Contributor

I mean, your idea is not bad; you would just have to work out how to bump the version. However, checking which commit you are based on is also a good solution, because then you know globally what version you are using.

@Galcian79

I mean your idea is not bad. Only you would have to work out how to update the version. However, checking what commit you are based on is also a good solution, because then you know globally what version you are using.

I mean, usually you would want something like x.x.x-dev; that's what binaries compiled from git usually report. But of course, for the -dev part you would always refer to the git commit.

@bunkbail

Edited my previous comments, adding the commit hash version of the build.

Here's a new dump log of the latest build (commit 973aded). It now took longer for lavd to crash (around 5 minutes, compared to about 1 minute with the previous version). The game was still lagging and freezing throughout.

dump.txt

@multics69
Contributor

@htejun @arighi After looking at the logs, the watchdog timeout happened only when a task was in a local DSQ. It seems your hypothesis is correct -- an RT task keeps preempting an scx task, so the scx task's time slice is replenished on every preemption, letting it run indefinitely and causing the watchdog timeout error.

Here are my two suggestions:

  1. I think it is worth implementing your option (a). If possible, it would be better to replenish the remaining time slice (not the full 20 msec).
  2. In the watchdog logs, it would be worth printing out the task's time slice. This would give us more hints about where the problem comes from.

What do you think?

@arighi
Contributor

arighi commented Apr 26, 2024

@multics69 I need to look at the code more closely (maybe @htejun or @Byte-Lab already have a better answer to this), but I think it should already be possible to replenish only the remaining time slice: implement ops.cpu_release(), check whether a higher-priority task takes the CPU, and restore the remaining time slice in ops.cpu_acquire()...?

Maybe another approach to attack the problem could be to have multiple SCHED_EXT classes (following what Joel suggested here https://lore.kernel.org/all/CAEXW_YR02g=DetfwM98ZoveWEbGbGGfb1KAikcBeC=Pkvqf4OA@mail.gmail.com/). Having multiple SCHED_EXT classes would give us more options to decide which classes of tasks a sched-ext scheduler can grab, and eventually provide cmdline options to grab all tasks, exclude RT tasks, etc.

But I'm just brainstorming here... this is something to think about more.

@multics69
Contributor

multics69 commented Apr 26, 2024

@arighi -- Yes, restoring the remaining time slice is possible using the cpu_release()/cpu_acquire() callbacks. However, if this could be a problem in all schedulers, I think it would be better implemented at the framework level. Again, @htejun and @Byte-Lab are the right people to answer this. hehe :-)

Supporting multiple sched_ext classes sounds like an interesting direction. I will think about it more as well.

@htejun
Contributor

htejun commented Apr 26, 2024

I don't think lavd needs to implement ops.cpu_release()/acquire() to solve this problem. The problem is specific to lavd in that currently no other schedulers override p->scx.slice from ops.running(), and lavd is currently not considering the fact that a task may transition multiple times between ops.running() and ops.stopping() between a pair of ops.runnable() and ops.quiescent(). Given those, here are possible solutions that come to mind:

  • Have a per-task flag, e.g. refresh_slice, which is set by ops.runnable(), and make ops.running() set the slice iff the flag is set and then clear the flag.
  • I suppose the reason why p->scx.slice is being set from ops.running() is that scx_bpf_dispatch() is too early to determine the slice. This is a bit of a hole in the scx API design. Originally, there were only local DSQs, so the expectation was that a task which is scx_bpf_dispatch()'d would get executed right away, and thus that's where the slice was set. However, with user DSQs this is no longer the case. Tasks can sit in user DSQs for an arbitrary amount of time, so depending on the scheduler implementation, it may not be desirable to set the slice at the time a given task is dispatched to such a user DSQ. Hopefully, this should be resolved with the newly introduced DSQ iterators (will update the scx repo with support later today). Once this is in place, lavd should be able to walk its user DSQ, pick the target task to run next, set its slice, and then consume the task from ops.dispatch(), which would make it unnecessary to muck with the slice from ops.running() (rough sketch below).
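
A rough sketch of what that second bullet could look like once DSQ iterators exist. The bpf_for_each(scx_dsq, ...) form, MY_DSQ and calc_slice() are assumed/hypothetical here, since the iterator support had not landed at the time of this comment:

/*
 * Hypothetical ops.dispatch(): peek at the head of the (deadline-ordered)
 * user DSQ, set its slice, then move it to the local DSQ so it runs.
 */
void BPF_STRUCT_OPS(sketch_dispatch, s32 cpu, struct task_struct *prev)
{
        struct task_struct *p;

        bpf_for_each(scx_dsq, p, MY_DSQ, 0) {
                /* This is the task that will run next; set its slice now. */
                p->scx.slice = calc_slice(p);   /* hypothetical helper */
                break;
        }
        scx_bpf_consume(MY_DSQ);
}

This only illustrates the ordering (pick the task, set its slice, then consume); verifier and API details would depend on how the iterator actually lands.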

Other than the immediate lavd misbehavior, there are three issues raised in this thread:

  1. Taking over all tasks including RT / DL. There's more to think about here. e.g. some RT kthreads are used for the kernel's basic operation and stalling them can lead to system-level failures. e.g. if the kernel's hang-check timer kthread isn't scheduled in time, it will trigger, which on some systems is configured to panic and reset the machine, so this has the problem of weakening one of the important properties that SCX tries to guarantee. So, right now, I'm leaning more towards leaving them alone. I think, as a group, SCX scheduler writers would need to learn that there are other sched classes on top of us and deal with it. All the necessary interface is already there, but we could definitely benefit from better documentation.
  2. The ordering between the kernel's fair class and SCX. This seems to be a feature which has been requested by multiple parties and there are no safety concerns. I'll think more on it and likely add this in some form.
  3. Adding more information to debug dumps. Yeah, for sure. However, I'm not sure a remaining-slice printout would have helped much here. It would have looked like the task just has a normal slice duration, as the problem is the slice being repeatedly replenished (if the current hypothesis is correct). What may have been more useful is the total runtime since the last runnable transition. Also, I'm planning to add scheduler-specific dump information so that schedulers can dump whatever they want.

@htejun
Contributor

htejun commented Apr 26, 2024

Hmmm.... @arighi did mention excluding RT kthreads. That probably doesn't break any kernel mechanisms. I'll think more on it.

@multics69
Contributor

multics69 commented Apr 27, 2024

Thank you @htejun for the detailed explanation.

Setting the time slice at ops.running() seems to be the source of the problem as you explained.

Regarding the scx iterator, I don't think the iterator helps much here. That is because the global DSQ is already a vtime-based DSQ, and in lavd the DSQ's vtime is the task's deadline, so scx_bpf_dispatch() should always return the task with the earliest deadline. Or do you mean we should set the task's time slice at ops.dispatch() using the scx iterator?

Regarding refresh_slice, I thought a transition such as ops.runnable() -> ops.running() -> ops.stopping() -> ops.running() ... is possible even without RT task preemption. I thought it could happen when a task's time slice is exhausted but the task is still in a runnable state. Am I correct?

@htejun
Contributor

htejun commented Apr 27, 2024

Regarding the scx iterator, I don't think the iterator helps much here. That is because the global DSQ is already vtime-based DSQ. And, in lavd, vtime of DSQ is task' deadline, so scx_bpf_dispatch() should always return a task with the earliest deadline. Or we set task's time slice at ops.dispatch() using scx iterator?

Yes, that's what I meant. ops.dispatch() can set the slice using the iterator as it will have access to what will get consumed.

Regarding the refresh_slice, I thought a transition such that ops.runnable() -> ops.running() -> ops.stopping() -> ops.running() ... is possible even without RT task preemption. I thought it could happen when a task's time slice is exhausted but a task is still in a runnable state. Am I correct?

You're right. I guess that's why you put the slice setting in ops.running() in the first place. Would reasserting refresh_slice from ops.stopping() when the task has exhausted its slice work?
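
Continuing the hypothetical refresh_slice sketch from earlier, that suggestion would look roughly like this: re-arm the flag from ops.stopping() only when the slice was actually used up, so a task preempted mid-slice (e.g. by an RT thread) keeps its remaining slice instead of getting a fresh one. Again, names and structure are assumed for illustration, not lavd's actual code.

void BPF_STRUCT_OPS(sketch_stopping, struct task_struct *p, bool runnable)
{
        struct task_ctx *tctx;

        tctx = bpf_task_storage_get(&task_ctx_stor, p, 0, 0);
        if (tctx && p->scx.slice == 0)
                tctx->refresh_slice = true;     /* exhausted: refresh next time */
}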

@htejun
Contributor

htejun commented Apr 27, 2024

Or rather, how about ops.runnable() setting slice to 0 and ops.running() refreshing slice iff it's 0?

@htejun
Contributor

htejun commented Apr 27, 2024

Oops, that won't work right now; the kernel will trigger a zero-slice warning.

@bunkbail

Another update from me. I've built lavd from the latest commit (d9ea53c) and also cherry-picked two patches, from PR #247 and PR #250. Now lavd no longer crashes and the game no longer stutters/freezes like before. In my limited testing, everything is smooth as butter. Awesome work @multics69 and @htejun!

@Galcian79

Galcian79 commented Apr 30, 2024

Apex Legends is F2P, I could give it a try too.

@multics69 multics69 pinned this issue Apr 30, 2024
@DasLeo

DasLeo commented Apr 30, 2024

Apex Legends is FTP i could give it a try too.

If you test it, make sure to enable the shader-compilation messages in the DXVK_HUD, as shaders that haven't finished compiling also produce lots of stutters.
Also unlock the FPS with the %command% +fps_max unlimited launch argument.

Working in the Firing Range is already enough; no need to play online multiplayer.

@Galcian79

Apex Legends is FTP i could give it a try too.

If you test it, make sure to enable the shader messages DXVK_HUD as they also produce lots of stutters, when not yet finished.

Even with GPL?

@DasLeo

DasLeo commented Apr 30, 2024

Apex Legends is FTP i could give it a try too.

If you test it, make sure to enable the shader messages DXVK_HUD as they also produce lots of stutters, when not yet finished.

Even with GPL?

@Galcian79 if you mean this, it's actually as stated in the docs. Shader compilation starts as soon as the mode (map) is selected in the menu, but it takes a while and keeps running during the game phase.
Apex shaders can go up to 2 GB, at least in my $HOME/.cache/nvidia/GLCache/.
I also had to use __GL_SHADER_DISK_CACHE_SKIP_CLEANUP=1 to prevent the NVIDIA driver from cleaning them up after 256 MB or so.

Once the caching is done, the game runs butter-smooth, but with a slight difference when enabling scx_lavd.

@LethalManBoob

I'll add here that the CS2 main menu runs at something like 10 FPS with lavd; it runs fine without it.

@Galcian79

Galcian79 commented Apr 30, 2024

I started the game with fps unlocked, all graphic settings set to max. I played on the training ground waiting for dxvk to compile the shaders. Didn't notice any stutter.
Then I started lavd and took the logs while playing a 10-minute match. I didn't notice any stutter either.
(BTW, we won 🥇 )
inxi.log

1172470_stamps.log

@LethalManBoob

I started the game with fps unlocked, all graphic settings set to max. I played on the training ground waiting for dxvk to compile the shaders. Didn't notice any stutter. Then i started lavd and took the logs while playing a 10 mins match. I didn't notice any stutter either. (BTW, we won 🥇 ) inxi.log 1172470_stamps.log

I don't know what your specs are, but I run NVIDIA on Arch.

@Galcian79

Galcian79 commented Apr 30, 2024

I started the game with fps unlocked, all graphic settings set to max. I played on the training ground waiting for dxvk to compile the shaders. Didn't notice any stutter. Then i started lavd and took the logs while playing a 10 mins match. I didn't notice any stutter either. (BTW, we won 🥇 ) inxi.log 1172470_stamps.log

I dont know what your specs are but i run nvidia arch

That's why I included my inxi.

@multics69 multics69 unpinned this issue May 1, 2024
@LethalManBoob

I started the game with fps unlocked, all graphic settings set to max. I played on the training ground waiting for dxvk to compile the shaders. Didn't notice any stutter. Then i started lavd and took the logs while playing a 10 mins match. I didn't notice any stutter either. (BTW, we won 🥇 ) inxi.log 1172470_stamps.log

I dont know what your specs are but i run nvidia arch

That's why i included my inxi.

It doesn't show your specs...

@Galcian79

I started the game with fps unlocked, all graphic settings set to max. I played on the training ground waiting for dxvk to compile the shaders. Didn't notice any stutter. Then i started lavd and took the logs while playing a 10 mins match. I didn't notice any stutter either. (BTW, we won 🥇 ) inxi.log 1172470_stamps.log

I dont know what your specs are but i run nvidia arch

That's why i included my inxi.

it doesnt show your specs...

Which specs are we talking about?

@LethalManBoob

(screenshot)
Didn't know these two links were separate.

@Galcian79

Galcian79 commented May 1, 2024

(screenshot) Didn't know these two links were separate.

Yeah, my fault, I was editing on my phone and didn't realize they were on the same line.

@Galcian79

Fixed my OP.

@DasLeo

DasLeo commented May 1, 2024

Hey folks,

I ran a test again just now; attached are my inxi.log + scx_lavd-01-05-2024.log.

Some facts:

  • I was standing still in Apex Legends to compare FPS -> it dropped from 170 to 145 after starting scx_lavd
  • Constant micro-stutter (while moving the mouse) as soon as scx_lavd is started, but butter-smooth on plain BORE+EEVDF
  • Running Arch, X11, KDE without Composition, Nvidia GTX 1080 on proprietary driver 550.76
  • Find the rest in the inxi.log

Hope this helps, otherwise just ping me.

@Galcian79

Here is the log of another game. I just found that the logs cannot run for more than 3 hours.

1658280_stamps.tar.gz

@multics69
Contributor

Thanks @DasLeo and @Galcian79 for narrowing down the problem.
I will take a close look and improve it in the next version (hopefully).

@multics69
Contributor

@ALL -- Could you please test if the problem still exists? PR #274 should fix the problem.

@ptr1337
Contributor

ptr1337 commented May 10, 2024

@ALL -- Could you please test if the problem still exists? PR #274 should fix the problem.

FYI, please test on the 6.9 linux-cachyos-rc kernel, since LAVD requires cpufreq support, which is not currently included in the CachyOS 6.8 kernel.
I've updated the scx-scheds-git package with the above-mentioned patch.

@bunkbail

bunkbail commented May 10, 2024

I wanted to test the latest lavd, but I only have kernel 6.8.9 and the latest lavd refuses to launch there. I tried reverting commit a24e1d7 (reverting commit 6892898 causes a merge conflict) and building, but it still refuses to launch, whereas pretty much all the other scx schedulers run just fine. I'll wait until 6.9 hits stable; then I'll be able to test lavd.

@Galcian79

1604030_lavd.tar.gz

@multics69
Contributor

@Galcian79 Thank you for sharing your log. Did you notice any stuttering? Is the problem getting worse or better?

@Galcian79

@Galcian79 Thank you for sharing your log. Did you notice any stuttering? Is the problem getting worse or better?

I haven't noticed any stuttering.

@DasLeo

DasLeo commented May 11, 2024

Tried to build Cachy 6.9-rc7 (I guess) yesterday, but I got BTF errors in the final vmlinux step; I need to figure that out before further testing.

@DasLeo

DasLeo commented May 12, 2024

I was testing for a while today on 6.7-rc7 and didn't have any issues or micro-stutter.
FPS stays the same with and without the scheduler attached.

Great work folks! 👍👍👍

Here are my logs if you still wanna have a look:

Will use the scx_lavd from now on. ✌️

@bunkbail

Same here. I'm on the stable 6.9 kernel, no micro-stutter during gaming and desktop responsiveness under load is also good. Great work @multics69!

@ChrisLane
Author

ChrisLane commented May 13, 2024

Compiled at fa1c146 and tested with the same setup but on 6.9.0-rc7-5-cachyos-rc, and I didn't experience any of the severe stuttering that I was having before 🎉

Closing this issue. Thank you for the help!
