[Enhancement] Support online optimize table #43747

Merged: meegoo merged 1 commit into StarRocks:main from the online_optimize branch on Jun 17, 2024

@meegoo meegoo commented Apr 8, 2024

Why I'm doing:

Currently, Optimize Table redistributes data at partition granularity. If new data is ingested into a partition while that partition is being optimized, the two operations conflict. The default strategy prioritizes ingestion, so the optimization job for that partition fails. As a result, Optimize Table cannot handle:

  • The latest partition of a partitioned table, which generally receives real-time updates.
  • Non-partitioned tables, which are usually Primary Key tables and often receive real-time updates.
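
The conflict described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not StarRocks code: the `Partition` and `OptimizeJob` classes, the version counter used to detect concurrent ingestion, and the `FINISHED`/`FAILED` strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical partition whose version is bumped by every ingestion."""
    name: str
    version: int = 1

    def ingest(self) -> None:
        # Ingestion always succeeds and advances the partition version.
        self.version += 1


@dataclass
class OptimizeJob:
    """Hypothetical per-partition optimize job (assumption, not the real job class)."""
    partition: Partition
    start_version: int = field(init=False)

    def __post_init__(self) -> None:
        # Remember the version the job started from.
        self.start_version = self.partition.version

    def finish(self) -> str:
        # "Ingestion wins": if data arrived while the job was running,
        # the job for this partition fails instead of blocking the load.
        if self.partition.version != self.start_version:
            return "FAILED"
        return "FINISHED"


def run_demo() -> dict:
    cold = Partition("p_cold")  # historical partition, no new data
    hot = Partition("p_hot")    # latest partition, receives real-time updates
    jobs = [OptimizeJob(cold), OptimizeJob(hot)]
    hot.ingest()                # a load lands while both jobs are running
    return {job.partition.name: job.finish() for job in jobs}


if __name__ == "__main__":
    print(run_demo())
```

In this model only the job on the partition that received data fails, which mirrors why the latest partition and frequently updated non-partitioned tables are the problem cases.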

What I'm doing:

Support Online Optimize Table, allowing Optimize Table and DML operations to run concurrently without conflict.

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This PR needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport PR

Bugfix cherry-pick branch check:

  • I have checked the version labels that determine which target branches the PR will be auto-backported to
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

@wanpengfei-git wanpengfei-git requested review from a team April 8, 2024 12:15
@mergify mergify bot assigned meegoo Apr 8, 2024
@wanpengfei-git wanpengfei-git requested a review from a team April 8, 2024 12:16
@meegoo meegoo force-pushed the online_optimize branch 3 times, most recently from c16063b to e29d447 Compare April 8, 2024 19:17
@wanpengfei-git wanpengfei-git removed the request for review from a team April 8, 2024 19:17
@meegoo meegoo marked this pull request as ready for review April 10, 2024 07:37
@meegoo meegoo requested review from a team as code owners April 10, 2024 07:37
@meegoo meegoo force-pushed the online_optimize branch 2 times, most recently from a872872 to 1c8ae7e Compare April 10, 2024 08:29
@meegoo meegoo changed the title [Feature] Support online optimize table [Enhancement] Support online optimize table Apr 10, 2024
@wanpengfei-git wanpengfei-git removed the request for review from a team April 10, 2024 08:31
@github-actions github-actions bot added the 3.3 label Apr 10, 2024
@meegoo meegoo force-pushed the online_optimize branch 4 times, most recently from b8f0d49 to fcaff5a Compare April 15, 2024 13:02
@meegoo meegoo force-pushed the online_optimize branch 3 times, most recently from b9d3704 to 44932a2 Compare May 8, 2024 13:06
@meegoo meegoo force-pushed the online_optimize branch 4 times, most recently from ec1fd30 to 95442a6 Compare May 15, 2024 12:15
@meegoo meegoo force-pushed the online_optimize branch 2 times, most recently from 0a6354f to 6438402 Compare May 31, 2024 09:22
@meegoo meegoo enabled auto-merge (squash) June 5, 2024 03:01
sonarcloud bot commented Jun 14, 2024

Quality Gate failed

Failed conditions
21.9% Duplication on New Code (required ≤ 3%)
C Reliability Rating on New Code (required ≥ A)

[FE Incremental Coverage Report]

pass : 496 / 579 (85.66%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| 🔵 com/starrocks/alter/OptimizeTask.java | 0 | 6 | 00.00% | [54, 58, 59, 62, 66, 67] |
| 🔵 com/starrocks/qe/StmtExecutor.java | 7 | 22 | 31.82% | [2233, 2234, 2235, 2236, 2237, 2238, 2240, 2241, 2242, 2243, 2244, 2245, 2246, 2248, 2249] |
| 🔵 com/starrocks/task/PublishVersionTask.java | 5 | 7 | 71.43% | [78, 80] |
| 🔵 com/starrocks/transaction/InsertTxnCommitAttachment.java | 7 | 9 | 77.78% | [75, 76] |
| 🔵 com/starrocks/catalog/OlapTable.java | 11 | 13 | 84.62% | [383, 386] |
| 🔵 com/starrocks/alter/OnlineOptimizeJobV2.java | 356 | 407 | 87.47% | [135, 136, 143, 153, 171, 179, 204, 205, 206, 233, 245, 254, 328, 359, 360, 362, 363, 370, 371, 372, 376, 430, 439, 440, 441, 450, 451, 452, 454, 470, 486, 488, 495, 496, 497, 498, 511, 532, 536, 539, 540, 541, 547, 567, 596, 604, 627, 636, 759, 760, 770] |
| 🔵 com/starrocks/transaction/DatabaseTransactionMgr.java | 40 | 45 | 88.89% | [1039, 1040, 1201, 1202, 1203] |
| 🔵 com/starrocks/transaction/OlapTableTxnStateListener.java | 5 | 5 | 100.00% | [] |
| 🔵 com/starrocks/catalog/Partition.java | 1 | 1 | 100.00% | [] |
| 🔵 com/starrocks/planner/OlapScanNode.java | 3 | 3 | 100.00% | [] |
| 🔵 com/starrocks/common/Config.java | 1 | 1 | 100.00% | [] |
| 🔵 com/starrocks/planner/OlapTableSink.java | 25 | 25 | 100.00% | [] |
| 🔵 com/starrocks/sql/ast/InsertStmt.java | 4 | 4 | 100.00% | [] |
| 🔵 com/starrocks/transaction/OlapTableTxnLogApplier.java | 10 | 10 | 100.00% | [] |
| 🔵 com/starrocks/alter/OptimizeJobV2Builder.java | 9 | 9 | 100.00% | [] |
| 🔵 com/starrocks/persist/gson/GsonUtils.java | 1 | 1 | 100.00% | [] |
| 🔵 com/starrocks/transaction/TransactionState.java | 6 | 6 | 100.00% | [] |
| 🔵 com/starrocks/transaction/PartitionCommitInfo.java | 5 | 5 | 100.00% | [] |
🔵 com/starrocks/transaction/PartitionCommitInfo.java 5 5 100.00% []

[BE Incremental Coverage Report]

pass : 194 / 233 (83.26%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| 🔵 be/src/runtime/load_channel_mgr.cpp | 1 | 3 | 33.33% | [182, 183] |
| 🔵 be/src/runtime/load_channel.cpp | 14 | 20 | 70.00% | [161, 163, 164, 228, 230, 231] |
| 🔵 be/src/storage/segment_replicate_executor.cpp | 3 | 4 | 75.00% | [151] |
| 🔵 be/src/agent/publish_version.cpp | 15 | 20 | 75.00% | [169, 170, 171, 172, 173] |
| 🔵 be/src/exec/multi_olap_table_sink.cpp | 59 | 74 | 79.73% | [94, 95, 96, 98, 117, 118, 119, 120, 121, 123, 126, 129, 130, 131, 133] |
| 🔵 be/src/storage/txn_manager.cpp | 34 | 42 | 80.95% | [313, 314, 320, 321, 322, 323, 330, 348] |
| 🔵 be/src/runtime/tablets_channel.h | 6 | 7 | 85.71% | [109] |
| 🔵 be/src/exec/tablet_sink.cpp | 21 | 22 | 95.45% | [712] |
| 🔵 be/src/runtime/lake_tablets_channel.cpp | 1 | 1 | 100.00% | [] |
| 🔵 be/src/exec/pipeline/fragment_executor.cpp | 3 | 3 | 100.00% | [] |
| 🔵 be/src/storage/tablet.cpp | 8 | 8 | 100.00% | [] |
| 🔵 be/src/exec/multi_olap_table_sink.h | 1 | 1 | 100.00% | [] |
| 🔵 be/src/exec/pipeline/olap_table_sink_operator.h | 1 | 1 | 100.00% | [] |
| 🔵 be/src/exec/tablet_sink.h | 1 | 1 | 100.00% | [] |
| 🔵 be/src/exec/tablet_sink_index_channel.cpp | 3 | 3 | 100.00% | [] |
| 🔵 be/src/exec/pipeline/olap_table_sink_operator.cpp | 3 | 3 | 100.00% | [] |
| 🔵 be/src/exec/data_sink.cpp | 13 | 13 | 100.00% | [] |
| 🔵 be/src/exec/async_data_sink.h | 1 | 1 | 100.00% | [] |
| 🔵 be/src/runtime/local_tablets_channel.cpp | 3 | 3 | 100.00% | [] |
| 🔵 be/src/storage/tablet_updates.cpp | 3 | 3 | 100.00% | [] |

@meegoo meegoo merged commit 16acc23 into StarRocks:main Jun 17, 2024
53 of 56 checks passed
@Mergifyio backport branch-3.3

@github-actions github-actions bot removed the 3.3 label Jun 17, 2024
mergify bot commented Jun 17, 2024

backport branch-3.3

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Jun 17, 2024
Signed-off-by: meegoo <[email protected]>
(cherry picked from commit 16acc23)

# Conflicts:
#	be/src/agent/publish_version.cpp
#	be/src/exec/CMakeLists.txt
#	be/src/exec/tablet_sink_sender.cpp
#	be/src/exec/tablet_sink_sender.h
#	be/src/exec/write_combined_txn_log.cpp
#	be/src/exec/write_combined_txn_log.h
#	be/src/runtime/lake_tablets_channel.cpp
#	fe/fe-core/src/main/java/com/starrocks/planner/OlapScanNode.java
#	fe/fe-core/src/main/java/com/starrocks/task/PublishVersionTask.java
#	gensrc/proto/internal_service.proto
#	gensrc/thrift/AgentService.thrift
#	gensrc/thrift/DataSinks.thrift
meegoo added a commit that referenced this pull request Aug 20, 2024