zstd support #539

Merged
merged 7 commits into from
Apr 27, 2023

rlerdorf
Contributor

This adds zstd compression support.

The current two options, zlib and fastlz, are basically a choice between performance and compression ratio. You would choose zlib if you are memory-bound and fastlz if you are CPU-bound.
With zstd, you get the performance of fastlz with the compression of zlib. And often it wins on both. See this benchmark I ran on json files of varying sizes:

https://gist.github.com/rlerdorf/788f3d0144f9c5514d8fee9477cbe787

Taking just a 40k json blob, we see that zstd at compression level 3 reduces it to 8862 bytes. Our current zlib 1 gets worse compression at 10091 bytes and takes longer both to compress and decompress.

      C Size  ratio%     C MB/s     D MB/s   SCORE      Name            File
        8037    19.9       0.58    2130.89       0.08   zstd 22         file-39.54k-json
        8204    20.3      31.85    2381.59       0.01   zstd 10         file-39.54k-json
        8371    20.7      47.52     547.12       0.01   zlib 9          file-39.54k-json
        8477    20.9      74.84     539.83       0.01   zlib 6          file-39.54k-json
        8862    21.9     449.86    2130.89       0.01   zstd 3          file-39.54k-json
        9171    22.7     554.62    2381.59       0.01   zstd 1          file-39.54k-json
       10091    24.9     153.94     481.99       0.01   zlib 1          file-39.54k-json
       10646    26.3      43.39    8097.40       0.01   lz4 16          file-39.54k-json
       10658    26.3      72.30    8097.40       0.01   lz4 10          file-39.54k-json
       13004    32.1    1396.10    6747.83       0.01   lz4 1           file-39.54k-json
       13321    32.9     440.08    1306.03       0.01   fastlz 2        file-39.54k-json
       14807    36.6     444.91    1156.77       0.01   fastlz 1        file-39.54k-json
       15517    38.3    1190.79    4048.70       0.02   zstd -10        file-39.54k-json
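
The ratio column and the zstd 3 versus zlib 1 comparison can be re-derived from the table with a little arithmetic. The sketch below is only illustrative: the original size of 40489 bytes is an assumption inferred from the "39.54k" in the file name (39.54 × 1024), and the helper names are hypothetical. Under that assumption, zstd 3 comes out about 12% smaller than zlib 1, while compressing roughly 2.9 times faster and decompressing roughly 4.4 times faster.

```c
#include <stdio.h>

/* Assumption: the uncompressed file is 39.54 KiB, about 40489 bytes.
 * This is inferred from the file name; the benchmark does not state it. */
#define ORIGINAL_BYTES 40489.0

/* Compressed size as a percentage of the original size (the "ratio%" column). */
double ratio_percent(double compressed_bytes)
{
    return 100.0 * compressed_bytes / ORIGINAL_BYTES;
}

/* How much smaller `a` is than `b`, in percent of `b`. */
double percent_smaller(double a, double b)
{
    return 100.0 * (b - a) / b;
}

/* Print the zstd 3 versus zlib 1 comparison using the figures from the table. */
void print_zstd3_vs_zlib1(void)
{
    printf("zstd 3 ratio:      %.1f%%\n", ratio_percent(8862.0));
    printf("zlib 1 ratio:      %.1f%%\n", ratio_percent(10091.0));
    printf("zstd 3 is smaller: %.1f%%\n", percent_smaller(8862.0, 10091.0));
    printf("compress speedup:  %.2fx\n", 449.86 / 153.94);
    printf("decompress speedup: %.2fx\n", 2130.89 / 481.99);
}
```

Calling `print_zstd3_vs_zlib1()` from a `main` function prints the derived figures.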

The fact that decompression is dramatically faster with zstd is a win for most common memcache uses, since they tend to be read-heavy.

The PR also adds a memcached.compression_level INI switch, which currently only applies to zstd compression. It could probably be made to also apply to zlib and fastlz.

}
else
#endif
Contributor

(for existing block below) -- I am refreshing myself on the other compression routines, but it looks like they are hard requirements to compile the php-memcached package and so they're never conditional on the libraries. Is that because PHP always offers these routines? (I'll catch up on reading, asking out loud for transparency)

Contributor Author

Yes, both zlib and fastlz are mandatory. If the system fastlz is not available it will use a bundled copy. In theory we could do the same with zstd, but it is pretty widely available in every distro these days, so we could also just make it a hard requirement.

Contributor

@sodabrew sodabrew left a comment

lgtm! let me know if you have any other work you want to do on the PR, otherwise I'll go ahead and land it

@@ -3001,6 +3029,9 @@ int php_memc_set_option(php_memc_object_t *intern, long option, zval *value)
case MEMC_OPT_COMPRESSION_TYPE:
lval = zval_get_long(value);
if (lval == COMPRESSION_TYPE_FASTLZ ||
#ifdef HAVE_ZSTD_H
lval == COMPRESSION_TYPE_ZSTD ||
#endif
Contributor

Looks like the scenario here is that if you set the compression type to zstd but it's not compiled in, you get an invalid argument error. Seems reasonable!
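
To make that behavior concrete, here is a minimal standalone sketch of the same conditional-compilation pattern. It is not the code from php_memcached.c: the enum values and the function name `compression_type_is_supported` are hypothetical, and only the `HAVE_ZSTD_H` guard and the `COMPRESSION_TYPE_*` names are taken from the diff. When `HAVE_ZSTD_H` is not defined, the zstd branch is compiled out, so a request for zstd falls through to the "unsupported" result.

```c
#include <stdbool.h>

/* Hypothetical numeric values; the real constants in php-memcached may differ. */
enum compression_type {
    COMPRESSION_TYPE_ZLIB   = 1,
    COMPRESSION_TYPE_FASTLZ = 2,
    COMPRESSION_TYPE_ZSTD   = 3
};

/* Returns true if `lval` names a compression type this build can use.
 * The zstd comparison only exists when the build defines HAVE_ZSTD_H. */
bool compression_type_is_supported(long lval)
{
    return lval == COMPRESSION_TYPE_FASTLZ ||
#ifdef HAVE_ZSTD_H
           lval == COMPRESSION_TYPE_ZSTD ||
#endif
           lval == COMPRESSION_TYPE_ZLIB;
}
```

A caller would report an invalid-argument error whenever this returns false, which is what happens for zstd in a build without the library.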

Contributor Author

> lgtm! let me know if you have any other work you want to do on the PR, otherwise I'll go ahead and land it

I am not planning anything else for the moment. There is some interesting stuff that could be done around training zstd and using dictionary compression, but that is probably overkill here.

@sodabrew sodabrew merged commit 5833590 into php-memcached-dev:master Apr 27, 2023
sodabrew pushed a commit that referenced this pull request May 3, 2023
…540)

Make it possible to use setOption to set Memcached::OPT_COMPRESSION_LEVEL which
was missed in the original zstd PR #539

zlib compression was using the default zlib compression level of 6. With this PR
it is now possible to choose other levels for zlib as well. The default remains
at 6 so nothing will change for people upgrading unless they explicitly set a
different level.

Here is some more benchmarking data using php serialized data
https://gist.github.com/rlerdorf/b9bae385446d5a30b65e6e241e34d0a8

fastlz is not really useful at any value size anymore. Anybody looking for
lightning quick compression and decompression should use zstd at level 1.

compression_level is not applied to fastlz because it only has 2 levels and
php-memcached already switches from level 1 to 2 automatically for values larger
than 65535 bytes. Forcing it to one or the other doesn't seem useful.
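
The automatic fastlz level switch described above can be sketched as a one-line rule. This is an illustration of the stated behavior only, not the actual fastlz or php-memcached code; the function name and the macro are hypothetical, and the 65535-byte threshold is taken from the commit message.

```c
#include <stddef.h>

/* Threshold from the commit message: values larger than this use level 2. */
#define FASTLZ_LEVEL2_THRESHOLD 65535

/* Hypothetical helper: pick the fastlz level from the value size alone,
 * which is why a user-supplied compression_level adds little for fastlz. */
int fastlz_level_for_size(size_t value_len)
{
    return value_len > FASTLZ_LEVEL2_THRESHOLD ? 2 : 1;
}
```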
@tylerchr tylerchr mentioned this pull request Dec 5, 2023
m6w6 added a commit that referenced this pull request Sep 26, 2024
- Add #515 option to locally enforce payload size limit
- Add #539 zstd support
- Add #540 compression_level option
- Mark password as a sensitive param for PHP 8.2
- Fix Windows PHP 8 compatibility
- Fix #518 Windows msgpack support
- Fix #522 signed integer overflow
- Fix #523 incorrect PHP reflection type for Memcached::cas $cas_token
- Fix #546 don't check key automatically, unless client-side verify_key is enabled
- Fix #555 incompatible pointer types (32-bit)
m6w6 added a commit that referenced this pull request Oct 4, 2024
    - Add #515 option to locally enforce payload size limit
    - Add #539 zstd support
    - Add #540 compression_level option
    - Mark password as a sensitive param for PHP 8.2
    - Fix Windows PHP 8 compatibility
    - Fix #518 Windows msgpack support
    - Fix #522 signed integer overflow
    - Fix #523 incorrect PHP reflection type for Memcached::cas $cas_token
    - Fix #546 don't check key automatically, unless client-side verify_key is enabled