feat: partition table query optimize #1594

zealchen · 2024-11-12T17:49:14Z

Rationale

Detailed Changes

TLDR

The performance issue with inlist queries comes from the extra overhead of bloom-filter-like directory lookups when scanning each SST file for rows: every partition is probed with the full key list, including keys that cannot belong to it. The solution is to create a separate predicate for each partition, containing only the keys relevant to that partition. Since the current partition filter only supports BinaryExpr(Column, operator, Literal) and non-negated InList expressions, this change addresses only those cases.

Changes

During the scan building process, when identifying the partitions for a query, we create a PartitionedFilterKeyIndex variable to store the predicate key indices for each expression.
In the compute_partition_for_keys_group function, we use a HashMap<partition_id, HashMap<filter_index, BTreeSet<key_index>>> to record the indices of keys involved in partition computation for each group.
In the partitioned_predicates function, we construct the final predicates for each partition.
In resolve_partitioned_scan_internal, we generate separate requests for each partition.
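
As a rough illustration of the nested index described for compute_partition_for_keys_group, the sketch below records which in-list key positions fall into which partition. The alias names and the record_key helper are hypothetical and not taken from the PR; only the HashMap/BTreeSet shape comes from the description above.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical aliases for illustration only.
type PartitionId = usize;
type FilterIndex = usize;
type KeyIndex = usize;

/// partition -> filter -> positions of the in-list keys that map to that partition.
type KeyIndexByPartition = HashMap<PartitionId, HashMap<FilterIndex, BTreeSet<KeyIndex>>>;

/// Records that the key at position `key_idx` of filter `filter_idx`
/// contributes to `partition`. Repeated inserts are deduplicated by the BTreeSet.
fn record_key(
    index: &mut KeyIndexByPartition,
    partition: PartitionId,
    filter_idx: FilterIndex,
    key_idx: KeyIndex,
) {
    index
        .entry(partition)
        .or_default()
        .entry(filter_idx)
        .or_default()
        .insert(key_idx);
}
```

The BTreeSet keeps the key positions sorted, so a predicate rebuilt from it preserves the original in-list order.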

e.g.
conditions:

table schema: col_ts, col1, col2, in which col1 and col2 are both keys, and with two partitions
sql: select * from table where col1 = '33' and col2 in ("aa", "bb", "cc", "dd")

partition expectations:
yield two predicates
p0: col1 = '33' and col2 in ("aa", "bb", "cc");
p1: col1 = '33' and col2 in ("dd")
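
The expectation above can be reproduced with a small standalone sketch. It takes the key-to-partition assignment as given input (the real engine derives it from the partition rule), and the function name and string-based predicates are assumptions made purely for illustration.

```rust
use std::collections::BTreeMap;

/// Builds one predicate string per partition.
/// `assignment[i]` is the partition that `in_list[i]` maps to; it is supplied by
/// the caller here, whereas the engine would compute it from the partition rule.
fn partitioned_predicates_sketch(
    eq_filter: &str,
    column: &str,
    in_list: &[&str],
    assignment: &[usize],
) -> BTreeMap<usize, String> {
    // Group the in-list keys by the partition they belong to.
    let mut keys_by_partition: BTreeMap<usize, Vec<&str>> = BTreeMap::new();
    for (key, partition) in in_list.iter().zip(assignment) {
        keys_by_partition.entry(*partition).or_default().push(*key);
    }

    // Emit a predicate per partition containing only that partition's keys.
    keys_by_partition
        .into_iter()
        .map(|(partition, keys)| {
            let list = keys
                .iter()
                .map(|k| format!("\"{}\"", k))
                .collect::<Vec<_>>()
                .join(", ");
            (partition, format!("{} and {} in ({})", eq_filter, column, list))
        })
        .collect()
}
```

Calling it with the filter col1 = '33', the column col2, the keys "aa", "bb", "cc", "dd" and the assignment [0, 0, 0, 1] yields the two predicates listed under p0 and p1.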

Other issues discovered

When an inlist has fewer than three key args, the Expr is rewritten into nested BinaryExpr nodes, which bypass the FilterExtractor.

e.g.
SQL: select * from table where col1 in ("aa", "bb") and col2 in (1,2,3,4,5...1000)
Since ("aa", "bb") has fewer than three elements, the col1 key filter is not included in partition computation, which interrupts the partitioning process in the get_candidate_partition_keys_groups function, as contains_empty_filter is set to true.
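
A toy model of this effect, under stated assumptions: the Expr enum, the rewrite function and the extractor below are hypothetical stand-ins, not DataFusion or HoraeDB APIs, and the threshold is passed in as a parameter rather than read from any real optimizer.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Eq { column: String, value: String },
    InList { column: String, values: Vec<String> },
    Or(Box<Expr>, Box<Expr>),
}

/// Toy rewrite: an IN list with fewer than `threshold` values becomes a chain of
/// OR'ed equality comparisons; longer lists are left untouched.
fn rewrite_short_in_list(expr: Expr, threshold: usize) -> Expr {
    match expr {
        Expr::InList { column, values } if !values.is_empty() && values.len() < threshold => {
            let mut eqs = values.into_iter().map(|value| Expr::Eq {
                column: column.clone(),
                value,
            });
            let first = eqs.next().expect("list is non-empty");
            eqs.fold(first, |acc, eq| Expr::Or(Box::new(acc), Box::new(eq)))
        }
        other => other,
    }
}

/// Toy extractor: recognizes only a single comparison or an IN list,
/// so an OR chain yields no key filter.
fn extract_key_column(expr: &Expr) -> Option<&str> {
    match expr {
        Expr::Eq { column, .. } | Expr::InList { column, .. } => Some(column.as_str()),
        Expr::Or(..) => None,
    }
}
```

In this model, a two-element list on col1 is extractable before the rewrite but not after it, which mirrors how the col1 filter drops out of partition computation.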

Test Plan

UT: test_partitioned_predicate
Manual test.

src/df_engine_extensions/src/dist_sql_query/resolver.rs

jiacai2050 · 2024-11-20T08:34:51Z

src/partition_table_engine/src/scan_builder.rs

    },
    provider::TableScanBuilder,
    remote::model::TableIdentifier,
    table::ReadRequest,
 };

+use super::partitioned_predicates;


Our codebase should prefer absolute imports over super imports.

src/partition_table_engine/src/scan_builder.rs

src/table_engine/src/partition/rule/df_adapter/mod.rs

jiacai2050 · 2024-11-20T09:33:36Z

src/table_engine/src/partition/rule/key.rs

        let mut partitions = BTreeSet::new();
+        // Retrieve all the key DatumView instances along with their corresponding
+        // indices related to their positions in the predicate inlist. Since DatumView


horaedb/src/common_types/src/datum.rs

Line 1313 in e065ba7

impl<'a> std::hash::Hash for DatumView<'a> {

DatumView already implements Hash.

The Hash trait alone is not enough, since a HashSet requires its elements to implement both Eq and Hash. Why don't you implement Eq, btw?

The BTreeSet in PartitionedFilterKeyIndex is used to deduplicate the inlist key values that contribute to the partition calculation.
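
For reference, a minimal sketch of the trait requirement discussed in this thread. KeyView is a hypothetical stand-in and not the real DatumView, and first_occurrence_indices is an illustrative helper rather than code from the PR. It shows that a HashSet-based deduplication only compiles once the element type implements both Eq and Hash.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical borrowed key value; not the real `DatumView`.
#[derive(Debug, PartialEq, Eq)]
enum KeyView<'a> {
    Int(i64),
    Str(&'a str),
}

// Manual Hash impl, kept consistent with the derived Eq:
// values that compare equal feed identical data to the hasher.
impl<'a> Hash for KeyView<'a> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        match self {
            KeyView::Int(v) => {
                0u8.hash(state);
                v.hash(state);
            }
            KeyView::Str(s) => {
                1u8.hash(state);
                s.hash(state);
            }
        }
    }
}

/// Returns the positions of the first occurrence of each distinct key.
/// `HashSet::insert` requires `Eq + Hash` on the element type.
fn first_occurrence_indices<'a>(keys: &[KeyView<'a>]) -> Vec<usize> {
    let mut seen: HashSet<&KeyView<'a>> = HashSet::new();
    keys.iter()
        .enumerate()
        .filter(|(_, key)| seen.insert(*key))
        .map(|(idx, _)| idx)
        .collect()
}
```

Removing the Eq derive from KeyView makes the seen.insert call fail to compile, which is the constraint raised in the reply above.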

integration_tests/cases/env/cluster/ddl/partition_table.sql

jiacai2050

LGTM

jiacai2050 · 2024-11-25T09:33:21Z

src/table_engine/src/partition/rule/df_adapter/mod.rs

    },
    BuildPartitionRule, PartitionInfo, Result,
 };

 mod extractor;

+pub type PartitionId = usize; // partition number (id)


👍 for comments, they make the code more readable!

feat: partition table query optimize

30e85ee

github-actions bot added the feature label Nov 12, 2024

Merge branch 'main' into feat_optimize_query_with_inlist

c69d4bc

jiacai2050 self-requested a review November 13, 2024 03:03

jiacai2050 reviewed Nov 20, 2024

src/df_engine_extensions/src/dist_sql_query/resolver.rs

jiacai2050 reviewed Nov 20, 2024

src/partition_table_engine/src/scan_builder.rs (outdated)

jiacai2050 reviewed Nov 20, 2024

src/partition_table_engine/src/scan_builder.rs (outdated)

jiacai2050 reviewed Nov 20, 2024

zealchen added 4 commits November 22, 2024 22:13

cr fix

c891698

cr fix

dbc4df1

ut fix

6d99284

integration test fix

0cc9a0b

zealchen commented Nov 24, 2024

integration_tests/cases/env/cluster/ddl/partition_table.sql

zealchen and others added 2 commits November 24, 2024 23:35

optimize partitioned predicates creation

ce4bae5

refactor

b1374a7

jiacai2050 approved these changes Nov 25, 2024

jiacai2050 reviewed Nov 25, 2024

jiacai2050 merged commit e2970b1 into apache:main Nov 25, 2024
14 checks passed
