Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compaction multi-threading #754

Merged
merged 12 commits into from
Jun 15, 2021
Merged

Conversation

mdisibio
Copy link
Contributor

@mdisibio mdisibio commented Jun 10, 2021

What this PR does:
Compaction is single-threaded and solidly limited to using a single core. This PR introduces some basic multi-threading by moving iteration to a separate goroutine, which improves performance.

Benchmark was added and shows ~12% improvement:

pkg: github.com/grafana/tempo/tempodb
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkCompaction-12     	    3594	    329848 ns/op
BenchmarkCompaction2-12    	    4042	    290434 ns/op

Testing in a real env is better:
objects and bytes written up ~20%
image
image

Some notes:

  • New tunable: the channel buffer size. Unfortunately it is in traces (not bytes). Defaulted to 1000 based on testing.
  • ID and data slices must be copied to escape the iterators which increases GC and memory pressure slightly. Interested in trying pooling, but the current Iterator interface doesn't expose the ability return slices from Next(). Will explore this in a later PR unless there is interest in trying it now.

Which issue(s) this PR fixes:
n/a

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@joe-elliott
Copy link
Member

Ha! you deleted compactor bookmarks. This is my new favorite PR.

@mdisibio
Copy link
Contributor Author

Ha! you deleted compactor bookmarks. This is my new favorite PR.

Sorry then to say they were actually just moved into iterator_multiblock.go. Although there was a slight change to retain currentError.

Copy link
Member

@joe-elliott joe-elliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments included.

I'm finding the logic of the iterators and channels to be a bit confusing to follow. I think it's mostly b/c you directly translated the gross bookmark logic into a goroutine. The error path (actual error vs EOF) is I think the part I find most confusing.

CHANGELOG.md Outdated Show resolved Hide resolved
docs/tempo/website/configuration/manifest.md Show resolved Hide resolved
tempodb/compactor.go Show resolved Hide resolved
tempodb/encoding/iterator_multiblock.go Outdated Show resolved Hide resolved
tempodb/encoding/iterator_multiblock.go Outdated Show resolved Hide resolved
tempodb/encoding/iterator_multiblock.go Show resolved Hide resolved
tempodb/encoding/iterator_multiblock.go Outdated Show resolved Hide resolved
@mdisibio mdisibio force-pushed the compactor-parallel branch from 4821abf to 18c06b0 Compare June 14, 2021 16:37
@mdisibio mdisibio requested a review from joe-elliott June 14, 2021 17:12
Copy link
Member

@joe-elliott joe-elliott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@annanay25
Copy link
Contributor

Nice! Couldn't give this a thorough review but the decoupling of reads and writes using a buffered message queue looks great

@mdisibio mdisibio merged commit c24b20e into grafana:main Jun 15, 2021
@mdisibio mdisibio deleted the compactor-parallel branch September 15, 2021 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants