Improve performance of --blur #129

Closed
AladW opened this issue Aug 7, 2019 · 14 comments

Comments

@AladW

AladW commented Aug 7, 2019

This issue is a....

[ ] Bug
[X] Other kind of issue (Please describe in detail)

Current Behavior

i3lock -B gets increasingly slower as the blur radius is increased. On my system (run-of-the-mill Core i5 with integrated graphics):

| Command | Duration (s) | CPU usage (%) |
|---|---|---|
| i3lock -B 0.5 | 0.119 | 75 |
| i3lock -B 2 | 0.114 | 78 |
| i3lock -B 5 | 0.176 | 82 |
| i3lock -B 10 | 0.529 | 93 |
| i3lock -B 20 | 1.883 | 97 |
| i3lock -B 50 | 11.588 | 99 |

Expected Behavior

A related discussion occurred in meskarune/i3lock-fancy#6, where the last comment advertised https://github.com/yvbbrjdr/i3lock-fancy-rapid. Using 3 blur passes, I get the following results:

| Command | Duration (s) | CPU usage (%) |
|---|---|---|
| i3lock-fancy-rapid 0.5 3 | 0.232 | 167 |
| i3lock-fancy-rapid 2 3 | 0.243 | 164 |
| i3lock-fancy-rapid 5 3 | 0.238 | 166 |
| i3lock-fancy-rapid 10 3 | 0.230 | 172 |
| i3lock-fancy-rapid 20 3 | 0.234 | 167 |
| i3lock-fancy-rapid 50 3 | 0.224 | 171 |

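In summary, the run time of the "rapid" implementation stays roughly constant as the radius grows, while the run time of i3lock-color grows much faster than linearly (roughly quadratically in these measurements: going from -B 20 to -B 50 multiplies the duration by about 6).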

I haven't studied the code in detail, but one step is the use of the openmp pragma:

#pragma omp parallel for

https://github.com/yvbbrjdr/i3lock-fancy-rapid/blob/master/i3lock-fancy-rapid.c#L15
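
For illustration, here is a minimal sketch of where such a pragma typically sits in a blur loop. It is not code from either project: the function name box_blur_rows and its parameters are assumptions, and it uses a naive single-channel box blur to stay short. The point it shows is that each row reads only from src and writes only to its own slice of dst, so rows can be handed to different threads without sharing any temporaries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: one horizontal box-blur pass over a single-channel
 * 8-bit image. Rows are independent, so the outer loop can be split
 * across threads. Build with -fopenmp to enable the pragma; without it
 * the pragma is ignored and the loop runs serially with the same result. */
void box_blur_rows(const uint8_t *src, uint8_t *dst,
                   int width, int height, int radius)
{
    #pragma omp parallel for
    for (int y = 0; y < height; y++) {
        const uint8_t *in = src + (size_t)y * width;
        uint8_t *out = dst + (size_t)y * width;
        for (int x = 0; x < width; x++) {
            /* sum and count are declared inside the loop, so every
             * thread gets its own copy */
            unsigned sum = 0;
            int count = 0;
            for (int k = -radius; k <= radius; k++) {
                int xi = x + k;
                if (xi < 0 || xi >= width)
                    continue;           /* skip samples past the edge */
                sum += in[xi];
                count++;
            }
            out[x] = (uint8_t)((sum + count / 2) / count);
        }
    }
}
```

Note that this naive version does work proportional to the radius for every pixel; threads divide that work but do not change how it scales with the radius.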

Reproduction Instructions

I used time for the above results on an updated (July 08) Arch Linux system.

Environment

Output of i3lock --version:

i3lock: version 2.12.c (2018-10-02, branch "tags/2.12.c") © 2010 Michael Stapelberg, © 2015 Cassandra Fox

Where'd you get i3lock-color from?

[ ] AUR package (which one?)
[ ] Built from source yourself
[X] Other (Please describe in detail)

The package which I maintain in the Arch Linux [community] repository.

@AladW
Author

AladW commented Aug 7, 2019

I'll also see if I can use perf to find some slow code paths, because I doubt that openmp alone can explain these big differences in performance.

@PandorasFox

I think the parallelizing does mostly account for reducing the blur times to linear. I'll give this a crack this weekend.

@AladW
Author

AladW commented Aug 9, 2019

OK, I think I found the issue. The number of passes i3lock-color computes (https://github.com/PandorasFox/i3lock-color/blob/master/blur.c#L75) is very high:

| Sigma | Passes (n) |
|---|---|
| 0.5 | 3 |
| 2 | 3 |
| 10 | 25 |
| 20 | 100 |
| 50 | 625 |

With the same number of passes, i3lock-fancy-rapid looks way off and is even slower. I guess you could make the number of passes configurable here, and/or default to a lower number.
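
As a rough check on those numbers: the pass counts listed above are consistent with growth like (sigma / 2)^2 with a floor of 3. The sketch below is only a curve fit to those five values, not the expression used in blur.c, and passes_from_table is a hypothetical name.

```c
/* Hypothetical curve fit to the listed pass counts; NOT taken from blur.c.
 * Reproduces 3, 3, 25, 100, 625 for sigma = 0.5, 2, 10, 20, 50. */
int passes_from_table(double sigma)
{
    double half = sigma / 2.0;
    int n = (int)(half * half + 0.5);   /* round to nearest integer */
    return n < 3 ? 3 : n;               /* never fewer than 3 passes */
}
```

If the real formula behaves like this fit, doubling sigma quadruples the number of passes, which would match the steep growth in the timings.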

@AladW
Copy link
Author

AladW commented Aug 9, 2019

I tried adding the openmp pragma as well (appending -fopenmp to CFLAGS) and did not have much luck. Adding it in blur_impl_horizontal_pass_sse2 before the outer for loop resulted in grey rather than blurred images. Adding it before the for loop in blur_image_surface resulted in even worse performance.

Maybe there's some way to make openmp work, but I'd say investigating the number of passes is more relevant. The issue is that setting the number of passes too low removes any distinction between, say, -B 5 and -B 50, and setting it higher makes the blur slow.
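
One common way around this trade-off, sketched here as an assumption rather than anything either project is confirmed to do, is to keep the number of box-blur passes fixed and grow the box width with sigma instead. A box of odd width w has variance (w^2 - 1) / 12, and variances add across passes, so n passes approximate a Gaussian with sigma^2 = n * (w^2 - 1) / 12. The helper name box_width_for_sigma is hypothetical.

```c
/* Hypothetical helper: smallest odd box width w such that `passes`
 * repeated box blurs reach at least the variance of a Gaussian with the
 * given sigma, using variance(box of width w) = (w*w - 1) / 12. */
int box_width_for_sigma(double sigma, int passes)
{
    if (passes < 1)
        passes = 1;                     /* guard against a non-terminating loop */
    double target = sigma * sigma;      /* variance the blur should reach */
    int w = 1;                          /* odd widths only: 1, 3, 5, ... */
    while (passes * ((double)w * w - 1.0) / 12.0 < target)
        w += 2;
    return w;
}
```

With 3 passes this gives a width of 11 for sigma 5 and 101 for sigma 50, so the two settings stay visually distinct. If each pass is implemented with a running sum, its cost does not depend on the width, which would keep the total time roughly flat across sigma values.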

ffmpeg is a good comparison case: it's always fast and while the blur is not as "fine", it has a reasonable distinction between lower and higher sigma.

AladW changed the title from "Use of openmp for faster implementation" to "Improve performance of --blur" on Aug 9, 2019
@AladW
Author

AladW commented Sep 26, 2019

If it's useful to someone, I'm doing the blur manually as follows:

#!/bin/bash
# define blur radius
sigma=15

# options passed to i3lock at the end
i3lock_options=()

# create temporary directory for the screenshots
tmp=$(mktemp -d) || exit
trap 'rm -rf "$tmp"' EXIT

# take screenshot of root window and blur it
if maim --hidecursor "$tmp"/in.png; then
    if ffmpeg -loglevel 0 -i "$tmp"/in.png -vf "gblur=sigma=$sigma" "$tmp"/out.png; then
        i3lock_options+=(--image="$tmp"/out.png)
    fi
fi

# lock the screen (assumed final step; not part of the original snippet)
i3lock "${i3lock_options[@]}"

@lepz0r

lepz0r commented Nov 6, 2019

Downscale the screenshot before blurring, then upscale it back to its original resolution after blurring.
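
A minimal sketch of that idea, assuming a single-channel 8-bit image and an integer scale factor that divides both dimensions; the function names downscale and upscale are hypothetical and not part of i3lock-color. The blur itself would run on the small image between the two calls, with its radius divided by the same factor.

```c
#include <stddef.h>
#include <stdint.h>

/* Shrink by an integer factor, averaging each factor x factor block.
 * Assumes w and h are multiples of factor to keep the sketch short;
 * dst must hold (w / factor) * (h / factor) bytes. */
void downscale(const uint8_t *src, int w, int h, int factor, uint8_t *dst)
{
    int dw = w / factor, dh = h / factor;
    for (int y = 0; y < dh; y++) {
        for (int x = 0; x < dw; x++) {
            unsigned sum = 0;
            for (int j = 0; j < factor; j++)
                for (int i = 0; i < factor; i++)
                    sum += src[(size_t)(y * factor + j) * w + (x * factor + i)];
            dst[(size_t)y * dw + x] = (uint8_t)(sum / (unsigned)(factor * factor));
        }
    }
}

/* Enlarge by the same factor with nearest-neighbour sampling;
 * dst must hold (sw * factor) * (sh * factor) bytes. */
void upscale(const uint8_t *src, int sw, int sh, int factor, uint8_t *dst)
{
    int w = sw * factor, h = sh * factor;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[(size_t)y * w + x] = src[(size_t)(y / factor) * sw + x / factor];
}
```

With a factor of 4, the blur touches 16 times fewer pixels. Nearest-neighbour upscaling leaves visible blocks when the blur is weak, so a real implementation would more likely interpolate.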

@Raymo111
Owner

Raymo111 commented Mar 8, 2020

@AladW I'm now maintaining this repo, but I have zero experience with blurring. Can you draft a PR with what you have above? I'd be more than happy to merge it.

@AladW
Author

AladW commented Mar 15, 2020

Sure, but I don't know how useful it would be to i3lock-color. The script above doesn't use the in-built blur at all.

@Raymo111
Owner

@AladW What would you suggest then? Also, if you don't mind updating the Arch Community package, that would be great.

@AladW
Author

AladW commented Mar 15, 2020

I'm not familiar enough with image processing or the ffmpeg codebase to propose specific code changes to improve performance.

I suppose a note could be added to the README that --blur may be slow and that people can consider alternatives like ffmpeg for this purpose.

@bendardenne

I've created PR #152 which might help with this. It allows you to set a transparent background color, and if you use compton with blur enabled, you can let it do the blurring. It's noticeably faster, and you get "true" transparency (stuff in the background is still updated and not "frozen").

@Raymo111
Owner

@bendardenne With an alpha-color overlay, wouldn't there be no blur at all, though?

@bendardenne

Here's a short recording of what it looks like with color = 000000BA which adds some dimming as well: https://youtu.be/Um3ArF4UQgI

The flickering is a recording artefact and doesn't actually happen in real use.

On this setup (4k screen + laptop monitor, not pictured), i3lock -B 5 takes a couple of seconds, while this is basically instantaneous. But yes, if you want the screen to freeze, it won't help.
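
For readers unfamiliar with the 8-digit color form used above, here is a small sketch of how a value such as 000000BA splits into red, green, blue and alpha bytes. The helper parse_rgba is hypothetical and is not i3lock-color's own option parser.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split an "RRGGBBAA" hex string into four bytes.
 * Returns 0 on success, -1 if the string is not exactly 8 hex digits. */
int parse_rgba(const char *hex, uint8_t rgba[4])
{
    if (strlen(hex) != 8)
        return -1;
    for (int i = 0; i < 8; i++)
        if (!isxdigit((unsigned char)hex[i]))
            return -1;
    for (int i = 0; i < 4; i++) {
        char pair[3] = { hex[2 * i], hex[2 * i + 1], '\0' };
        rgba[i] = (uint8_t)strtoul(pair, NULL, 16);
    }
    return 0;
}
```

For 000000BA this yields black with an alpha of 186 out of 255, i.e. an overlay that is roughly 73% opaque, which is where the dimming comes from; any blur is left to the compositor.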

Raymo111 added a commit that referenced this issue Apr 17, 2020
Use 4-byte -c option to allow compositors to use translucency as an alternate blurring method, thus speeding up blurring and allowing for dynamic background. Somewhat addresses #129.
@Raymo111
Owner

I've merged #152, which should mitigate this issue, and I'm rolling out a new release. @AladW If you could kindly update the Community repo, that would be great.

julio-b pushed a commit to julio-b/i3lock-color that referenced this issue Jan 25, 2021
fecet added a commit to fecet/dwm that referenced this issue Mar 25, 2024