👑 Readonly B-Tree SST index #1483
kunga added the area/datashard label (Issues related to datashard tablets (relational table partitions)) on Jan 31, 2024.
This was linked to pull requests on Feb 2, 2024 (merged).
This was unlinked from pull requests on Feb 2, 2024 (merged).
TPC-C 26-03-24: https://nda.ya.ru/t/n6PUIdLH7595hj (B-Tree + Flat, Flat, B-Tree) [Figures: B-Tree index, Flat index]
YCSB 26-03-24: https://nda.ya.ru/t/jHlSyKq3759KBW (B-Tree, Flat) [Figures: B-Tree index, Flat index]
BuildStats: https://nda.ya.ru/t/u4oX-oPx759rBA [Figures: Flat index, B-Tree index]
TPC-C 30-04-2024, 16K: https://nda.ya.ru/t/KHu9a09t75gmqt [Figures: Flat index, B-Tree index]
YCSB 30-04-2024: https://nda.ya.ru/t/sMAtUlme75hHrn [Figures: Flat index, B-Tree index]
TPC-C 01-05-2024, 12K: https://nda.ya.ru/t/E-eHq6GZ75hyB5 [Figures: Flat index, B-Tree index]
TPC-C 02-05-2024, 12K, 54GB cache: https://nda.ya.ru/t/oyxKo3hD75iN9U [Figures: Flat index, B-Tree index]
TPC-C 02-05-2024, 12K, 18GB cache: https://nda.ya.ru/t/MYwKhca_75iuBR [Figures: Flat index, B-Tree index]
TPC-C 03-05-2024, 12K, 2GB cache: https://nda.ya.ru/t/lJkYsNo375joSi [Figures: Flat index, B-Tree index]
YCSB 04-05-2024, 12K, 2GB cache: https://nda.ya.ru/t/llFWtuzo75kKft [Figures: Flat index, B-Tree index]
This was referenced Nov 12, 2024
Description
Implement a readonly B-Tree SST index instead of the current flat one.
This improvement opens up the possibility of not always keeping SST indexes in memory. We will be able to store a large amount of "cold" data without dozens of compute nodes and load the needed parts of indexes on demand. It also improves the stability of the Shared Cache and reduces the passive bytes segment.
Because an SST is written once, during compaction, its indexes are readonly. The B-Tree structure minimizes the disk I/O operations needed for lookups.
We will also extend the current index format with additional information about data size and erased row count. This will let us navigate the index easily and compute statistics without loading the whole index into memory.
Internal design doc.
Steps
1. Support simple flat index loading in all the usages
Instead of loading all SST indexes at tablet start, load each index on its first usage. Until then, the index takes up no memory.
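The idea of loading on first usage can be illustrated with a minimal Python sketch. It is not the project's implementation (which is C++); the `LazyIndex` class, its `loader` callback and the `load_count` counter are hypothetical names used only for illustration.

```python
from typing import Callable, List, Optional


class LazyIndex:
    """Holds an index that is loaded on first access, not at construction."""

    def __init__(self, loader: Callable[[], List[int]]) -> None:
        self._loader = loader                    # hypothetical: reads index pages from storage
        self._pages: Optional[List[int]] = None  # nothing is kept in memory yet
        self.load_count = 0                      # illustration only: number of loads performed

    def is_loaded(self) -> bool:
        return self._pages is not None

    def pages(self) -> List[int]:
        # Load only when somebody actually needs the index.
        if self._pages is None:
            self._pages = self._loader()
            self.load_count += 1
        return self._pages


index = LazyIndex(lambda: [10, 20, 30])  # "tablet start": no load happens here
started_loaded = index.is_loaded()
first = index.pages()                    # first usage triggers the load
second = index.pages()                   # later usages reuse the loaded index
```

Every consumer in the checklist for this step would go through an accessor like `pages()` rather than assuming the index is already resident.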
- IIndexIter interface that loads index on the first usage 66a3bbb
- TRunIt 66a3bbb
- IPages implementations aab08a6 10fe953
- TCharge 55d8ae0
- TKeysLoader 0f2a2ff bac30c7
- BuildStats 287b4bb a18f545
- TForward 5ad1dcc, BTreeIndex Bugfix load indexes in scan #657
- TDump cc271cd
- TPart 16b0f8d
2. Implement readonly B-Tree index
Implement the B-Tree data structure, build and write it together with an SST, and implement searches.
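As a rough illustration of why a readonly tree is simple to build, the following Python sketch constructs one bottom-up from the sorted first keys of data pages and then searches it. The names `Node`, `build`, `find_page` and the `fanout` parameter are assumptions made for this example. They do not describe the actual on-disk format or the C++ classes listed in this step.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Node:
    keys: List[int]                     # separators: first key of every child except the first
    children: List[Union["Node", int]]  # child nodes, or data page ids on the lowest level


def build(first_keys: List[int], fanout: int = 4) -> Node:
    """Build the tree bottom-up from the sorted first keys of data pages."""
    if not first_keys:
        raise ValueError("at least one data page is required")
    # Each level is a list of (first key of subtree, child) pairs.
    level = [(key, page_id) for page_id, key in enumerate(first_keys)]
    while True:
        parents = []
        for i in range(0, len(level), fanout):
            group = level[i:i + fanout]
            node = Node(keys=[k for k, _ in group[1:]],
                        children=[c for _, c in group])
            parents.append((group[0][0], node))
        if len(parents) == 1:
            return parents[0][1]        # a single node is left: it is the root
        level = parents


def find_page(root: Node, key: int) -> Tuple[int, int]:
    """Return (data page id, number of index nodes touched)."""
    node, touched = root, 0
    while isinstance(node, Node):
        touched += 1
        node = node.children[bisect_right(node.keys, key)]
    return node, touched


# 16 data pages whose first keys are 0, 10, ..., 150.
root = build(list(range(0, 160, 10)), fanout=4)
page, touched = find_page(root, 95)
```

With 16 pages and a fanout of 4 the lookup touches two index nodes, whereas a flat index has to be resident as a whole before it can be searched.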
- TPartWriter 1ccc850
- TRunIt with a new B-Tree index iterator (TPartBtreeIndexIt) 2f7d8b6
- EnableLocalDBBtreeIndex setting ba6b7d6
- TKeysLoader with a new B-Tree index iterator 897ada7
3. Implement Precharge over readonly B-Tree index
Implement Precharge that works over a tree-structured index instead of the old flat one.
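One way to picture precharging over a tree is to descend only through pages that are already cached and report the ones that must be fetched before a key range can be read. The Python sketch below works under that assumption. The `Node` layout, the `store` dictionary, the `cache` set and the `precharge` signature are all hypothetical and are not taken from the ICharge interface.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Node:
    keys: List[int] = field(default_factory=list)      # separators between children
    children: List[int] = field(default_factory=list)  # child page ids; empty for data pages


def precharge(store: Dict[int, Node], root_id: int, lo: int, hi: int,
              cache: Set[int]) -> List[int]:
    """Return ids of pages that must be fetched before keys in [lo, hi] can be read."""
    to_load: List[int] = []
    stack = [root_id]
    while stack:
        page_id = stack.pop()
        if page_id not in cache:
            to_load.append(page_id)  # cannot look inside a page that is not loaded
            continue
        node = store[page_id]
        if node.children:
            first = bisect_right(node.keys, lo)
            last = bisect_right(node.keys, hi)
            # Visit only children whose key ranges intersect [lo, hi], left to right.
            stack.extend(reversed(node.children[first:last + 1]))
    return to_load


# Root page 100 over three index nodes; data pages 0..9 each cover ten keys.
store = {page_id: Node() for page_id in range(10)}
store[101] = Node(keys=[10, 20, 30], children=[0, 1, 2, 3])
store[102] = Node(keys=[50, 60, 70], children=[4, 5, 6, 7])
store[103] = Node(keys=[90], children=[8, 9])
store[100] = Node(keys=[40, 80], children=[101, 102, 103])

cold = precharge(store, 100, 35, 55, cache={100})
warm = precharge(store, 100, 35, 55, cache={100, 101, 102})
```

In this model a caller would fetch the returned pages and call `precharge` again until the list is empty, so each round goes one level deeper at most.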
- ICharge interface and extract ChargeRange methods e7f64cc, BTreeIndex Precharge Simplification #682
4. Implement Build Stats over readonly B-Tree index
Implement Build Stats that works over a tree-structured index instead of the old flat one. We expect that it will no longer need to load the full index and will use only the upper-level B-Tree index nodes.
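The expectation above relies on each index entry carrying aggregated counters for its subtree. A minimal Python sketch of that idea follows. The field names mirror the quantities mentioned in the description (row count, data size, erased row count), but the `Child` and `IndexNode` layout and the helper functions are assumptions for illustration, not the real format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Child:
    row_count: int
    data_size: int
    erased_row_count: int
    node: Optional["IndexNode"] = None  # None when the child is a data page


@dataclass
class IndexNode:
    children: List[Child]


def totals(entries: List[Child]) -> Tuple[int, int, int]:
    """Sum (row count, data size, erased row count) over the given entries."""
    return (sum(e.row_count for e in entries),
            sum(e.data_size for e in entries),
            sum(e.erased_row_count for e in entries))


def entry_for(node: IndexNode) -> Child:
    """Aggregated entry that a parent stores for `node` (computed at write time)."""
    rows, size, erased = totals(node.children)
    return Child(rows, size, erased, node)


def entries_at_depth(root: IndexNode, depth: int) -> List[Child]:
    """Entries `depth` levels below the root; depth 0 reads only the root node."""
    entries = root.children
    for _ in range(depth):
        deeper: List[Child] = []
        for e in entries:
            deeper.extend(e.node.children if e.node else [e])
        entries = deeper
    return entries


leaf_a = IndexNode([Child(100, 4096, 3), Child(120, 8192, 0)])
leaf_b = IndexNode([Child(80, 2048, 10)])
root = IndexNode([entry_for(leaf_a), entry_for(leaf_b)])

total = totals(entries_at_depth(root, 0))  # uses the root node only
```

Going one level deeper with `entries_at_depth(root, 1)` gives finer-grained entries with the same totals, which is the trade-off between statistics resolution and the number of index nodes read.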
5. Implement Scan over readonly B-Tree index
Implement Scan that works over a tree-structured index.
6. Make fixes
Manually check suspicious places and fix what is broken with B-Tree index.
- TSchemeShard::Clear method, BTreeIndex Bugfix SchemeShard Clear #655
- EnableLocalDBBtreeIndex setting, BTreeIndex Iterator benchmark #653, BTreeIndex Bugfix index size #671
- EnableLocalDBFlatIndex setting #2546
- EnableLocalDBFlatIndex = false, EnableLocalDBBtreeIndex = true #2547
- "Cache line head don't want to do fetch but should" verify, Forward cache bugfix index pages queue verify #3134
- Limit leaf B-Tree index nodes small enough for skipping them in BuildStats
- Temporary use bigger index resolution to avoid full B-Tree index loading, BTreeIndex Split Flush method, use bigger resolution #3182
6. Test
Verify that everything works together and is not dramatically worse than before. Run tests on a slice.
- TRunIt over B-Tree index, BTreeIndex Iterator benchmark #653
- EnableLocalDBBtreeIndex and EnableLocalDBFlatIndex settings work
7. Make optimizations
8. Release
- EnableLocalDBBtreeIndex setting on pre-prod clusters (both indexes will be produced) (for some databases first)
- EnableLocalDBBtreeIndex setting on prod clusters (for some databases first)
- EnableLocalDBBtreeIndex setting by default
- EnableLocalDBFlatIndex setting