Amortize allocations in snappyDecode #446

potocnyj · 2015-05-08T19:37:53Z

Profiling the Kafka consumers at VividCortex showed me that a large portion of the allocated memory in our code comes from sarama.snappyDecode. This change reduces the average number of allocated bytes by pre-allocating the destination slice, and by re-using the chunk slice for cases that make multiple calls to snappy.Decode. Doing so should reduce GC pressure by a modest amount.
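As a rough illustration of the two techniques described above, here is a minimal sketch rather than the code from this PR. It assumes a simplified framing in which each chunk is preceded by a 4-byte big-endian length. `decodeChunks`, `copyDecode`, and `frame` are hypothetical names, and `copyDecode` stands in for a real decompressor with the same signature as `snappy.Decode`.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// decodeFunc has the same signature as snappy.Decode.
type decodeFunc func(dst, src []byte) ([]byte, error)

// copyDecode is a hypothetical stand-in for a real decompressor. It reuses
// dst when its capacity is sufficient and allocates a new slice otherwise.
func copyDecode(dst, src []byte) ([]byte, error) {
	if cap(dst) < len(src) {
		dst = make([]byte, len(src))
	}
	dst = dst[:len(src)]
	copy(dst, src)
	return dst, nil
}

// decodeChunks decodes a sequence of length-prefixed chunks (assumed
// framing: 4-byte big-endian size, then the chunk body).
func decodeChunks(src []byte, decode decodeFunc) ([]byte, error) {
	// Pre-allocate the destination so early appends do not regrow it.
	dst := make([]byte, 0, len(src))
	// chunk is handed back to decode on every iteration so that its
	// backing array can be reused instead of allocated per chunk.
	var chunk []byte
	for pos := 0; pos < len(src); {
		if len(src)-pos < 4 {
			return nil, errors.New("truncated chunk header")
		}
		size := binary.BigEndian.Uint32(src[pos : pos+4])
		pos += 4
		if uint64(size) > uint64(len(src)-pos) {
			return nil, errors.New("truncated chunk body")
		}
		var err error
		chunk, err = decode(chunk, src[pos:pos+int(size)])
		if err != nil {
			return nil, err
		}
		pos += int(size)
		dst = append(dst, chunk...)
	}
	return dst, nil
}

// frame builds test input in the assumed length-prefixed framing.
func frame(chunks ...[]byte) []byte {
	var out []byte
	for _, c := range chunks {
		var hdr [4]byte
		binary.BigEndian.PutUint32(hdr[:], uint32(len(c)))
		out = append(out, hdr[:]...)
		out = append(out, c...)
	}
	return out
}

func main() {
	out, err := decodeChunks(frame([]byte("hello "), []byte("world")), copyDecode)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // prints "hello world"
}
```

With a real decompressor the output can be larger than the input, so `len(src)` is only a starting capacity and `append` may still grow `dst`. The pre-allocation simply removes the first few regrowth steps.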


eapache · 2015-05-08T19:40:05Z

Makes sense to me. I'd be curious to see the actual before/after benchmark results if you can share those?

potocnyj · 2015-05-08T19:42:56Z

The numbers I have so far from local testing are fairly noisy. I'd be happy to collect some numbers from our production env and report back later, though.

potocnyj · 2015-05-08T21:50:22Z

Right, so here are some numbers on patched vs. un-patched on one of our production consumers:

snappyDecode went from being 43.9% of total alloc_space to 33.9%, with the underlying snappy.Decode call dropping from 6.2% to 0.7%.
On the cpu side, snappyDecode went from 6.6% of cpu to 4.8%. runtime.growslice and runtime.makeslice show decreases that appear to account for this difference.
bgsweep dropped from 7.8% to 6.7% of cpu, and runtime.sweepone went from 9.7% to 7.8% of total cpu.

eapache · 2015-05-08T23:39:00Z

👍


wvanbergen · 2015-05-08T23:39:49Z

Awesome improvement!!


eapache · 2015-05-29T21:17:38Z

Definitely awesome! After rolling this out to our cluster a little while ago, we've seen a distinct ~15% reduction in memory usage across our producers. Thanks!

eapache added a commit that referenced this pull request May 8, 2015

Merge pull request #446 from potocnyj/amortize-snappy-allocations

440808c

Amortize allocations in snappyDecode

eapache merged commit 440808c into IBM:master May 8, 2015

eapache mentioned this pull request Jan 12, 2016

Sarama Snappy Compression #593

Closed
