Follower index is sometimes not cleaned up correctly #6237
Comments
It seems the follower quite often receives a snapshot where the snapshot index is much bigger than the last index of the journal. This means it deletes the log and replaces it with the snapshot. The problem here is that the journal index is not reset and keeps growing.
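A minimal sketch of the suspected flow, with hypothetical names (this is not the actual Zeebe/Atomix API), to illustrate where the reset could be missed:

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch (not the real Zeebe/Atomix API) of the follower's
// snapshot-replace path, illustrating where the index reset could be missed.
final class FollowerSnapshotSketch {
  private final ConcurrentSkipListMap<Long, Long> journalIndex = new ConcurrentSkipListMap<>();
  private long lastIndex;

  void onSnapshotReceived(long snapshotIndex) {
    if (snapshotIndex > lastIndex) {
      deleteLog();               // existing log segments are thrown away
      lastIndex = snapshotIndex; // the journal restarts after the snapshot
      // Suspected bug: the in-memory index is NOT cleared here, so mappings
      // for the deleted log entries stay reachable and keep accumulating:
      // journalIndex.clear();
    }
  }

  private void deleteLog() { /* segment deletion elided */ }
}
```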
Another issue I see here is that, after the log is deleted, the snapshot seems to be rolled back and also deleted 👀 Looks problematic. Any thoughts @deepthidevaki @npepinpe?
Update: OK, never mind, it seems the logs are a bit confusing.
Please backport this to 0.25 and 0.26, and we will create patch releases for both afterwards.
Describe the bug
TL;DR: It seems that the growing heap usage is related to the journal index on a follower.
We observed growing heap memory usage in the stable cluster benchmark.
I took heap dumps at different times on different nodes, and I saw that when memory usage was quite high we often had a lot of long-living Long objects. These objects are referenced, for example, by a TreeMap, which is used by the SparseJournalIndex, but some are also referenced by the ConcurrentSkipList used by the ZeebeIndexAdapter.
One of the heaps I checked was from node two, which was leader for partitions one and two and follower for partition three. This node had quite high heap usage.
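For context, here is a rough sketch of how such a sparse index could be laid out (simplified, with illustrative names, not the actual Zeebe classes); every retained entry boxes two Long values, which would explain the millions of long-living Longs:

```java
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Simplified sketch of a sparse journal index; class and method names are
// illustrative, not the actual Zeebe implementation.
final class SparseIndexSketch {
  // One entry is kept every `density` records; each put boxes two Longs.
  private final TreeMap<Long, Long> positionToIndex = new TreeMap<>();
  private final ConcurrentSkipListMap<Long, Long> indexToPosition = new ConcurrentSkipListMap<>();
  private final int density = 100;

  void onAppend(long index, long position) {
    if (index % density == 0) {
      positionToIndex.put(position, index);
      indexToPosition.put(index, position);
    }
  }

  // Compaction must trim everything below the compacted index; if this is
  // never invoked (or invoked with a stale bound), both maps grow forever.
  void onCompact(long compactedIndex) {
    indexToPosition.headMap(compactedIndex).clear();
    positionToIndex.values().removeIf(index -> index < compactedIndex);
  }
}
```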
We can see it had 15 million Long objects and a lot of TreeMap and ConcurrentSkipList nodes. Based on an OQL query, I was able to find relatively big TreeMaps.
Based on that, I was able to find the SparseJournalIndex and the related partition.
If we check the ZeebeIndexAdapter, we see the usage of the ConcurrentSkipList.
Node 0
If we check another node (broker 0), which is follower for all partitions, we see around 60k entries in the journal indexes.
It seems to be a problem with compacting. After a leader change happened, we saw different heap usage for a short time.
I also checked the logs, and it seems that we are able to compact the log on node 2 for partition 3.
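As a hedged diagnostic idea (hypothetical accessors, not an existing Zeebe utility): comparing the lowest key still in the index against the journal's first index after compaction would show whether the index retains entries for log that no longer exists:

```java
import java.util.NavigableMap;

// Hedged diagnostic sketch with hypothetical accessors: if the smallest key
// still in the index is far below the journal's first index after compaction,
// the index has kept entries for a part of the log that no longer exists.
final class IndexBoundsCheck {
  static void check(NavigableMap<Long, Long> indexToPosition, long journalFirstIndex) {
    if (!indexToPosition.isEmpty() && indexToPosition.firstKey() < journalFirstIndex) {
      System.out.printf("stale index entries: lowest=%d, journalFirstIndex=%d, staleCount=%d%n",
          indexToPosition.firstKey(), journalFirstIndex,
          indexToPosition.headMap(journalFirstIndex).size());
    }
  }
}
```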
To Reproduce
Run a benchmark on stable nodes.
Expected behavior
The journal index does not grow unbounded and can be compacted.
Environment: