[Enhancement] metadata support LRU memory evict strategy in shared-nothing cluster (backport #48832) #49170
## Why I'm doing:
In the previous implementation, there is no memory limit for `metadata` in a shared-nothing cluster, so if there are too many tablets and segment files, BE will OOM. We need a controllable `metadata` memory strategy.

## What I'm doing:
Add an LRU cache for `metadata`. Its capacity is controlled by `metadata_cache_memory_limit_percent` in `be.conf`.

When a `Rowset` performs a load action, it adds the `Rowset` to the LRU cache. If the LRU cache memory exceeds the limit, the cache selectively evicts loaded `Rowset`s, which keeps `metadata` memory under control.

This strategy currently supports only non-PK tables.
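
The load-then-evict behaviour described above can be sketched roughly as follows. This is a minimal Python illustration, not the BE code from this PR: `MetadataLruCache`, `on_rowset_loaded`, `touch`, and `capacity_from_percent` are hypothetical names, and the base that `metadata_cache_memory_limit_percent` is applied to is an assumption, since the description does not state it.

```python
from __future__ import annotations

from collections import OrderedDict


def capacity_from_percent(total_bytes: int, percent: int) -> int:
    """Turn a percentage setting into a byte budget.

    Assumption: the percentage applies to some total memory budget;
    the PR description does not say which one.
    """
    return total_bytes * percent // 100


class MetadataLruCache:
    """Toy LRU cache that enforces a byte budget instead of an entry count."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # rowset id -> metadata size in bytes, least recently used first.
        self._entries: OrderedDict[str, int] = OrderedDict()

    def on_rowset_loaded(self, rowset_id: str, size_bytes: int) -> list[str]:
        """Record a loaded rowset; return the ids evicted to get back under budget."""
        if rowset_id in self._entries:
            # Reloading replaces the old accounting for this rowset.
            self.used_bytes -= self._entries.pop(rowset_id)
        self._entries[rowset_id] = size_bytes
        self.used_bytes += size_bytes

        evicted = []
        # Evict least recently used rowsets, but never the one just loaded.
        while self.used_bytes > self.capacity_bytes and len(self._entries) > 1:
            victim_id, victim_size = self._entries.popitem(last=False)
            self.used_bytes -= victim_size
            evicted.append(victim_id)
        return evicted

    def touch(self, rowset_id: str) -> None:
        """Mark a rowset as recently used so that it is evicted later."""
        if rowset_id in self._entries:
            self._entries.move_to_end(rowset_id)


if __name__ == "__main__":
    cache = MetadataLruCache(capacity_from_percent(total_bytes=1000, percent=10))
    cache.on_rowset_loaded("rowset-1", 40)
    cache.on_rowset_loaded("rowset-2", 40)
    cache.touch("rowset-1")  # rowset-2 is now the least recently used
    print(cache.on_rowset_loaded("rowset-3", 40))  # rowset-2 is evicted
```

The sketch only tracks sizes. In the real change, evicting an entry would also have to release the `Rowset`'s loaded metadata.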
There are two cases when evicting a `Rowset`:
- `ROWSET_LOADED` to `ROWSET_UNLOADED`: the memory is then released.
- `ROWSET_LOADED` to `ROWSET_UNLOADING`: the memory is released after the compaction or query finishes.

## Test Result
- Before this change, `metadata` memory continues to increase without limit.
- With the LRU cache, `metadata` memory is stabilized at 1GB.

## What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check:
This is an automatic backport of pull request #48832 done by [Mergify](https://mergify.com).