[Enhancement] show persistent index disk cost in be_tablets #35615

Merged
merged 1 commit into from
Nov 29, 2023

@luohaha luohaha commented Nov 23, 2023

Why I'm doing:
The persistent index disk usage currently cannot be read from any system table.

What I'm doing:
Add a new column, INDEX_DISK, to be_tablets that reports each tablet's persistent index disk usage.

mysql> select * from be_tablets where tablet_id = 10121\G
*************************** 1. row ***************************
       BE_ID: 10004
    TABLE_ID: 10119
PARTITION_ID: 10118
   TABLET_ID: 10121
 NUM_VERSION: 4
 MAX_VERSION: 6
 MIN_VERSION: 4
  NUM_ROWSET: 1
     NUM_ROW: 5
   DATA_SIZE: 710
   INDEX_MEM: 119
 CREATE_TIME: 1700701817
       STATE: RUNNING
        TYPE: PRIMARY
    DATA_DIR: /home/disk5/luoyixin/starrocks-4/starrocks/output/be/storage
    SHARD_ID: 48
 SCHEMA_HASH: 1872891624
  INDEX_DISK: 152
1 row in set (0.02 sec)

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.2
    • 3.1
    • 3.0
    • 2.5

// INDEX_DISK
fill_column_with_slot<TYPE_BIGINT>(column.get(), (void*)&info.index_disk_usage);
break;
}
default:
break;
}

The riskiest bug in this code is:
An incorrect memory size definition for TYPE_VARCHAR columns in the _s_columns[] array initializer.

You can modify the code like this:

@@ -34,7 +34,7 @@ SchemaScanner::ColumnDesc SchemaBeTabletsScanner::_s_columns[] = {
     {"INDEX_MEM", TYPE_BIGINT, sizeof(int64_t), false},     {"CREATE_TIME", TYPE_BIGINT, sizeof(int64_t), false},
     // SIZE_VARCHAR has been changed from sizeof(StringValue) to an appropriate fixed buffer size or variable length handling mechanism
     {"STATE", TYPE_VARCHAR, appropriate_buffer_size, false},    {"TYPE", TYPE_VARCHAR, appropriate_buffer_size, false},
     {"DATA_DIR", TYPE_VARCHAR, appropriate_buffer_size, false}, {"SHARD_ID", TYPE_BIGINT, sizeof(int64_t), false},
-    {"SCHEMA_HASH", TYPE_BIGINT, sizeof(int64_t), false},
+    {"SCHEMA_HASH", TYPE_BIGINT, sizeof(int64_t), false},   {"INDEX_DISK", TYPE_BIGINT, sizeof(int64_t), false},
};

Remember to replace appropriate_buffer_size with an actual fixed size, or with dynamic size handling logic, based on the real use case and data requirements.

Explanation:
A TYPE_VARCHAR column likely holds a string value and therefore needs a buffer length large enough for the string data. Depending on how StringValue is defined, sizeof(StringValue) might not reflect the size the VARCHAR data actually needs. VARCHAR types commonly either specify a maximum length or rely on dynamic sizing logic in the system to handle variable-length strings.
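
Whether this concern applies depends on what the VARCHAR slot actually stores. As a sketch under the assumption that the slot holds a fixed-size pointer-and-length descriptor rather than the characters themselves, the slot size is the same for every string. `StringRef`, `make_ref`, and `kVarcharSlotSize` below are hypothetical names for illustration and are not the project's `StringValue`.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical descriptor: points at character data owned elsewhere.
struct StringRef {
    const char* ptr;
    std::size_t len;
};

// Wraps a NUL-terminated string without copying its characters.
StringRef make_ref(const char* s) {
    return StringRef{s, std::strlen(s)};
}

// Under this assumption, the slot size is independent of the string length:
// only the descriptor lives in the slot.
constexpr std::size_t kVarcharSlotSize = sizeof(StringRef);
```

If the slot instead stored the characters inline, a maximum length or dynamic sizing would be needed, as the comment suggests.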

sonarcloud bot commented Nov 28, 2023

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

Coverage: 0.0%
Duplication: 0.0%

Warning: The version of Java (11.0.21) you have used to run this analysis is deprecated and we will stop accepting it soon. Please update to at least Java 17.

[FE Incremental Coverage Report]

pass : 1 / 1 (100.00%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| 🔵 com/starrocks/catalog/system/information/BeTabletsSystemTable.java | 1 | 1 | 100.00% | [] |

[BE Incremental Coverage Report]

fail : 0 / 14 (00.00%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| 🔵 src/storage/tablet_updates.cpp | 0 | 10 | 00.00% | [2823, 2824, 2828, 2829, 3969, 3970, 3971, 3974, 3975, 3977] |
| 🔵 src/exec/schema_scanner/schema_be_tablets_scanner.cpp | 0 | 4 | 00.00% | [95, 188, 190, 191] |

@chaoyli chaoyli merged commit 956f397 into StarRocks:main Nov 29, 2023
48 of 49 checks passed
@Mergifyio backport branch-3.2

@github-actions github-actions bot removed the 3.2 label Nov 29, 2023
@Mergifyio backport branch-3.1

@github-actions github-actions bot removed the 3.1 label Nov 29, 2023
mergify bot commented Nov 29, 2023

backport branch-3.2

✅ Backports have been created

mergify bot commented Nov 29, 2023

backport branch-3.1

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Nov 29, 2023
(cherry picked from commit 956f397)
mergify bot pushed a commit that referenced this pull request Nov 29, 2023
(cherry picked from commit 956f397)

# Conflicts:
#	be/src/storage/tablet_updates.cpp
#	fe/fe-core/src/test/resources/sql/scheduler/schema_scan.sql
chaoyli pushed a commit that referenced this pull request Nov 30, 2023
luohaha added a commit to luohaha/starrocks that referenced this pull request Dec 1, 2023
…s#35615)

luohaha added a commit to luohaha/starrocks that referenced this pull request Dec 1, 2023
…s#35615)

luohaha added a commit to luohaha/starrocks that referenced this pull request Dec 2, 2023
…s#35615)
