Systematically benchmark compression algorithm, compression factor, block size #44

Open
probonopd opened this issue Aug 8, 2024 · 5 comments

Comments

@probonopd
Member

Execute a systematic and reproducible benchmark to find the optimal combination of

  • Compression algorithm
  • Compression factor
  • Block size

in terms of

  • File size
  • Application startup time
  • zsync efficiency (delta update size)

for

  • zlib
  • lzma
  • zstandard
  • dwarfs
  • libdeflate
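The zsync metric is the hardest of the three to pin down, so here is a rough sketch of one way to approximate it. It counts how many bytes of a new image sit in fixed-size blocks that do not occur in the old image. This is an assumption-laden simplification: real zsync scans the old file with a rolling checksum and can match blocks at any offset, whereas this sketch only compares block-aligned chunks, so it tends to overestimate the download size. The function names and the `block_size` default are hypothetical.

```python
import hashlib


def aligned_block_hashes(data: bytes, block_size: int) -> set[bytes]:
    """Hash every block-aligned chunk of `data`."""
    return {
        hashlib.sha1(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    }


def estimate_delta_bytes(old: bytes, new: bytes, block_size: int = 2048) -> int:
    """Count bytes of `new` whose aligned blocks are not present in `old`.

    Simplification: only block-aligned matches in `old` are found, so the
    result is a pessimistic estimate compared to a rolling-checksum scan.
    """
    known = aligned_block_hashes(old, block_size)
    missing = 0
    for i in range(0, len(new), block_size):
        chunk = new[i:i + block_size]
        if hashlib.sha1(chunk).digest() not in known:
            missing += len(chunk)
    return missing
```

Running this on two consecutive releases of the same image, once per row of the test matrix, would give a comparable (if pessimistic) number for the delta update size column.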

References:

@probonopd
Member Author

probonopd commented Aug 8, 2024

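This test matrix covers 5 compression algorithms (zlib, lzma, zstandard, dwarfs, and libdeflate), up to 3 compression factors each (low, mid, high), and 3 block sizes (256KB, 512KB, and 1MB). A full 5 × 3 × 3 grid would have 45 test cases; the table below lists 36, because zlib, dwarfs, and libdeflate each appear with only two compression factors.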

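| Compression Algorithm | Compression Factor | Block Size | File Size | Application Startup Time | zsync Efficiency (Delta Update Size) |
| --- | --- | --- | --- | --- | --- |
| squashfs with zlib | 6 | 256KB | | | |
| squashfs with zlib | 6 | 512KB | | | |
| squashfs with zlib | 6 | 1MB | | | |
| squashfs with zlib | 9 | 256KB | | | |
| squashfs with zlib | 9 | 512KB | | | |
| squashfs with zlib | 9 | 1MB | | | |
| squashfs with lzma | 0 (fast) | 256KB | | | |
| squashfs with lzma | 0 (fast) | 512KB | | | |
| squashfs with lzma | 0 (fast) | 1MB | | | |
| squashfs with lzma | 6 (normal) | 256KB | | | |
| squashfs with lzma | 6 (normal) | 512KB | | | |
| squashfs with lzma | 6 (normal) | 1MB | | | |
| squashfs with lzma | 9 (ultra) | 256KB | | | |
| squashfs with lzma | 9 (ultra) | 512KB | | | |
| squashfs with lzma | 9 (ultra) | 1MB | | | |
| squashfs with zstandard | 3 | 256KB | | | |
| squashfs with zstandard | 3 | 512KB | | | |
| squashfs with zstandard | 3 | 1MB | | | |
| squashfs with zstandard | 10 | 256KB | | | |
| squashfs with zstandard | 10 | 512KB | | | |
| squashfs with zstandard | 10 | 1MB | | | |
| squashfs with zstandard | 19 | 256KB | | | |
| squashfs with zstandard | 19 | 512KB | | | |
| squashfs with zstandard | 19 | 1MB | | | |
| dwarfs | 128 | 256KB | | | |
| dwarfs | 128 | 512KB | | | |
| dwarfs | 128 | 1MB | | | |
| dwarfs | 256 | 256KB | | | |
| dwarfs | 256 | 512KB | | | |
| dwarfs | 256 | 1MB | | | |
| libdeflate | 6 | 256KB | | | |
| libdeflate | 6 | 512KB | | | |
| libdeflate | 6 | 1MB | | | |
| libdeflate | 9 | 256KB | | | |
| libdeflate | 9 | 512KB | | | |
| libdeflate | 9 | 1MB | | | |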
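
To keep the runs reproducible, the rows above could be generated rather than typed by hand. The following is a minimal sketch: the levels and block sizes are copied from the table, while the function names are hypothetical. The command builder assumes squashfs-tools' `mksquashfs` and only covers the gzip (zlib) and zstd compressors, since those accept `-Xcompression-level`; the other rows would need their own tooling.

```python
from itertools import product

# Compression factors per algorithm, copied from the table above.
LEVELS = {
    "zlib": ["6", "9"],
    "lzma": ["0 (fast)", "6 (normal)", "9 (ultra)"],
    "zstandard": ["3", "10", "19"],
    "dwarfs": ["128", "256"],
    "libdeflate": ["6", "9"],
}
BLOCK_SIZES = ["256KB", "512KB", "1MB"]


def test_matrix():
    """Return one (algorithm, factor, block size) tuple per table row."""
    return [
        (algo, level, block)
        for algo, levels in LEVELS.items()
        for level, block in product(levels, BLOCK_SIZES)
    ]


def mksquashfs_command(source_dir, image, comp, level, block):
    """Build a mksquashfs argument list (assumes squashfs-tools).

    `comp` is the mksquashfs compressor name ("gzip" or "zstd") and
    `block` uses mksquashfs suffixes, e.g. "256K" or "1M".
    """
    if comp not in ("gzip", "zstd"):
        raise ValueError(f"no -Xcompression-level mapping sketched for {comp}")
    return [
        "mksquashfs", source_dir, image,
        "-comp", comp, "-Xcompression-level", str(level),
        "-b", block, "-noappend",
    ]


if __name__ == "__main__":
    rows = test_matrix()
    print(len(rows), "test cases")
    print(" ".join(mksquashfs_command("AppDir", "app.squashfs", "zstd", 19, "1M")))
```

Enumerating the table this way yields 36 rows, which is a quick check that no combination was dropped or duplicated.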

@probonopd
Member Author

Volunteers?

@Samueru-sama

Another thing to take into account is how long the same application normally takes to start when it is not packaged as an AppImage.

For example, we might see that a certain algorithm is 30% faster on a very big application. But if that application takes several seconds to start even outside an AppImage because of its size, that 30% ends up being a very small share of the overall delay.

The same goes for very small applications: the speed difference might not matter much, because they take almost no time to start regardless.

Where problems can happen is with mid-size applications that you normally expect to start fast, such as web browsers.
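
To put rough numbers on that, here is a small sketch. It assumes "30% faster" means the AppImage-specific overhead (mounting plus decompression) shrinks by 30% while the application's native startup time stays the same. The function name and all timings are made up for illustration.

```python
def total_startup_saving(native_s, overhead_s, overhead_reduction):
    """Fraction of total startup time saved when only the overhead shrinks.

    native_s: startup time of the app outside an AppImage (seconds)
    overhead_s: extra time added by the AppImage (seconds)
    overhead_reduction: e.g. 0.30 for a 30% faster algorithm
    """
    before = native_s + overhead_s
    after = native_s + overhead_s * (1 - overhead_reduction)
    return (before - after) / before


# Hypothetical big app: 5 s native startup, 0.5 s AppImage overhead.
big = total_startup_saving(5.0, 0.5, 0.30)    # roughly 3% of total startup
# Hypothetical mid-size app: 0.3 s native startup, same 0.5 s overhead.
mid = total_startup_saving(0.3, 0.5, 0.30)    # roughly 19% of total startup
print(f"big app: {big:.1%}, mid-size app: {mid:.1%}")
```

With the same overhead and the same 30% improvement, the mid-size app sees a much larger share of its startup time disappear, which is why that class of application is the interesting one to benchmark.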

Right now zstd with the current default block size is actually very good. I will do some benchmarks comparing size and startup times, but I can't measure zsync efficiency, since that seems to be quite a bit more work.

I'm also interested to know how this affects very old hardware, e.g. a pre-Sandy Bridge CPU. My hardware is from 2016 and not the worst, I would say lol.
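
For the startup-time part, a timing harness along these lines might be enough to get comparable numbers. It is only a sketch: it measures wall-clock time until the launched process exits, which is not the same as time to first window for a GUI application, and it does nothing about page-cache state, so cold and warm runs would have to be controlled separately. The function names are hypothetical.

```python
import statistics
import subprocess
import time


def time_command(argv, runs=5):
    """Return wall-clock durations in seconds for `runs` executions of argv."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to min, median and max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }
```

Usage would look like `summarize(time_command(["./Some.AppImage", "--version"]))`, where the AppImage path and the `--version` argument are placeholders for whatever makes the application start and exit on its own.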

@probonopd
Member Author

Application startup times also depend on the hardware. On systems with a slow disk but a fast CPU, a highly compressed image may lead to faster application launch times than uncompressed files (iirc, I have seen this myself with a large application in the past, likely with a spinning drive). It always depends on where the performance bottleneck is on a particular system.
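
A back-of-the-envelope model of that bottleneck, as a sketch only: it assumes reading and decompression happen one after the other with no overlap, and every throughput figure below is a made-up placeholder rather than a measurement.

```python
def load_time_s(size_mb, ratio, disk_mb_s, decompress_mb_s=None):
    """Estimate the time to load `size_mb` of application data.

    ratio: compression ratio (1.0 means stored uncompressed)
    disk_mb_s: sequential read throughput of the disk
    decompress_mb_s: decompression throughput in uncompressed MB/s,
        or None when the data is not compressed
    """
    read = (size_mb / ratio) / disk_mb_s
    decompress = 0.0 if decompress_mb_s is None else size_mb / decompress_mb_s
    return read + decompress


# Hypothetical 300 MB application on a slow spinning disk (100 MB/s).
hdd_plain = load_time_s(300, 1.0, 100)
hdd_packed = load_time_s(300, 3.0, 100, decompress_mb_s=500)

# The same application on a fast SSD (2000 MB/s).
ssd_plain = load_time_s(300, 1.0, 2000)
ssd_packed = load_time_s(300, 3.0, 2000, decompress_mb_s=500)

print(f"HDD: plain {hdd_plain:.2f} s, compressed {hdd_packed:.2f} s")
print(f"SSD: plain {ssd_plain:.2f} s, compressed {ssd_packed:.2f} s")
```

Under these assumed numbers the compressed image wins on the slow disk and loses on the fast one, which matches the point that the result depends on where the bottleneck sits.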

So for this to be really scientific, we'd have to execute the test matrix for typical defined machines.

But then, we are not exactly writing a dissertation here ;-)

@Drsheppard01

I'm probably a bit late, but I think EROFS is an interesting option as well.

  • wide kernel support (since 5.4, Ubuntu 20.04, Alpine since 3.11)
  • FUSE support
  • compression:
    • zstd
    • lzma
    • lz4

dwarfs is incredibly performant, but GPLv3 makes it impossible to use with proprietary programs packaged as AppImages. At the same time, AppImage packaging is used by large companies, so introducing dwarfs would cut off a significant part of the audience.
