Replace Zstandard wrapper with native Go implementation #396
Conversation
Thanks a lot for this contribution @pascaldekloe! What do you think about keeping the compression level at 5? I'm concerned that programs out there which rely on this package will suddenly see a change in behavior that could lead to exhausting disk space on Kafka brokers. The default compression level in kafka-go would then differ from what the zstd dependency sets, but we can document why we're making this decision. Could you share a CPU and memory profile of the decompression benchmark? I'm curious to see where the difference comes from and whether there's anything we can do about it. Let me know! |
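As context for the level discussion: if a deployment really depends on the old default, the level could be pinned explicitly instead of relying on whatever the codec defaults to. A minimal sketch, assuming the underlying klauspost/compress/zstd package is used directly (dst is a hypothetical io.Writer; this is not the kafka-go codec API):
// Sketch only: pin the level explicitly instead of relying on the default.
// EncoderLevelFromZstd maps a numeric zstd level onto the nearest level
// that the pure Go implementation supports.
enc, err := zstd.NewWriter(dst, zstd.WithEncoderLevel(zstd.EncoderLevelFromZstd(5)))
if err != nil {
	return err
}
defer enc.Close()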
The benchmark numbers look a bit odd to me. Could you share the benchmark code if it is not already included? |
@klauspost you can run the benchmark with I took a closer look at the |
Where do I find this?
Edit... Ahhh... From the Go source.... I don't have |
There is something wonky with your benchmarks. If you only do the compression, the numbers are drastically different, e.g.:
func benchmarkCompression(b *testing.B, codec kafka.CompressionCodec, buf *bytes.Buffer, payload []byte) float64 {
	// In case only the decompression benchmarks are run, we use this flag to
	// detect whether we have to compress the payload before the decompression
	// benchmarks.
	b.Run("compress", func(b *testing.B) {
		r := bytes.NewReader(payload)
		b.SetBytes(int64(len(payload)))
		b.ResetTimer()
		for i := 0; i < b.N; i++ {
			buf.Reset()
			r.Reset(payload)
			w := codec.NewWriter(buf)
			_, err := io.Copy(w, r)
			if err != nil {
				b.Fatal(err)
			}
			if err := w.Close(); err != nil {
				b.Fatal(err)
			}
		}
	})
	return 1 - (float64(buf.Len()) / float64(len(payload)))
}
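A matching decode-side benchmark was not included in the comment; a rough sketch of what it could look like is below. It assumes the kafka.CompressionCodec interface also exposes NewReader(io.Reader) io.ReadCloser, that buf already holds the payload compressed by the function above, and that ioutil is imported; treat it as an illustration rather than the PR's code.
func benchmarkDecompression(b *testing.B, codec kafka.CompressionCodec, buf *bytes.Buffer, payload []byte) {
	compressed := buf.Bytes()
	b.Run("decompress", func(b *testing.B) {
		b.SetBytes(int64(len(payload)))
		b.ResetTimer()
		for i := 0; i < b.N; i++ {
			// Wrap the already-compressed bytes and stream them through the codec.
			r := codec.NewReader(bytes.NewReader(compressed))
			n, err := io.Copy(ioutil.Discard, r)
			if err != nil {
				b.Fatal(err)
			}
			if err := r.Close(); err != nil {
				b.Fatal(err)
			}
			if n != int64(len(payload)) {
				b.Fatalf("decompressed %d bytes, want %d", n, len(payload))
			}
		}
	})
}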
|
As a side note, since you already have the dependency, swapping out
With
But it only really becomes an advantage on long streams, longer than the test stream. (900 MB/s is low) |
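The inline code references in that comment were lost, so the exact packages being suggested are unclear. Purely as an illustration of the drop-in swap pattern that the klauspost/compress modules support (an assumption about what was meant, not a quote from the comment), replacing the standard library gzip only needs an import change:
import (
	// gzip "compress/gzip"
	gzip "github.com/klauspost/compress/gzip"
)

// No other changes are needed: the package is API-compatible, so existing
// calls such as gzip.NewWriter and gzip.NewReader keep working as before.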
Nice! Thanks for the tip on The compression ratio is different in your benchmarks; is it due to a different default compression level, or to different trade-offs in the implementation? |
Yes, it is for "default" to be a more reasonable default in terms of the speed/size tradeoff.
https://blog.klauspost.com/rebalancing-deflate-compression-levels/ But it also has a lot of other optimizations, for example incompressible data being skipped more than 50x faster. |
To be clear, compression level 3 is the only compression level supported by the Go library; other values silently default to 3. The compressed-size win of level 5 is highly unlikely to exceed 20%. Kafka works well with fully packed disks thanks to its low fragmentation by design, so I don't think anyone will run out of space unexpectedly due to this loss. Those debug lines are more or less harmless:
func printf(format string, a ...interface{}) {
	if debug {
		log.Printf(format, a...)
	}
}
The benchmark results in the description were measured on a mid-range iMac from 2017. I had a quick look, and most of the slowdown comes from the high amount of data copies. The decoder does not make use of the advertised frame size. Also, the data parsing uses abstract methods, which prevents all sorts of optimisations. This is all fixable in the long run of course. |
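On the data-copy point: one pattern that avoids per-message allocations on the caller side is the Decoder's DecodeAll method in klauspost/compress/zstd, which appends into a reusable slice. This is only an illustration of the idea, not what the PR does; compressedMessages and process are hypothetical names.
// Reuse one Decoder and one scratch buffer across messages.
dec, err := zstd.NewReader(nil) // a nil reader is fine when only DecodeAll is used
if err != nil {
	log.Fatal(err)
}
defer dec.Close()

var scratch []byte
for _, msg := range compressedMessages {
	scratch, err = dec.DecodeAll(msg, scratch[:0])
	if err != nil {
		log.Fatal(err)
	}
	process(scratch)
}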
(assuming you are talking about zstd). A level 1 equivalent is also available. |
So what should we do?
Cons
Pros
|
@pascaldekloe Do you have an updated set of benchmarks now that you've changed the compression level and upgraded the library? Also, zstd is expected to be slower than snappy, so I wouldn't list that as a con. The compression ratio should be very favorable, though, so it's all about tradeoffs. 😄 |
|
Looking at the numbers, the benchmarks still look wrong. Try comparing to the "clean" one I posted above. Maybe just split the encode/decode benchmarks, since simpler may be better in this case. LZ4 should perform on a level with Snappy, and only the fastest level is really worth it for lz4 (using the pierrec library). The rest are slow and a waste of CPU. I would propose setting the default to that. |
Thanks for the tip! Let's make this another follow-up to this PR as well, to keep the focus on Zstd here 👍 |
This change is looking good to me 👍 Thanks for the contribution!
Follow-up on stale #303 with full use of the streaming API from @klauspost.
Decompression is slower, while the (default) compression level went down from 5 to 3.
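For reference, a minimal sketch of the streaming pattern referred to above (klauspost/compress/zstd's Encoder used as an io.WriteCloser and its Decoder as an io.Reader); this is an illustration only, not the PR's code:
package main

import (
	"bytes"
	"io"
	"log"
	"strings"

	"github.com/klauspost/compress/zstd"
)

func main() {
	var compressed bytes.Buffer

	// Compress by streaming through the Encoder.
	enc, err := zstd.NewWriter(&compressed)
	if err != nil {
		log.Fatal(err)
	}
	if _, err := io.Copy(enc, strings.NewReader("hello zstd")); err != nil {
		log.Fatal(err)
	}
	if err := enc.Close(); err != nil {
		log.Fatal(err)
	}

	// Decompress by streaming through the Decoder.
	dec, err := zstd.NewReader(&compressed)
	if err != nil {
		log.Fatal(err)
	}
	defer dec.Close()

	var out bytes.Buffer
	if _, err := io.Copy(&out, dec); err != nil {
		log.Fatal(err)
	}
	log.Printf("round-tripped: %q", out.String())
}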