feat(executor): Introduce `QueryProfileManager` to collect query profilings #11760

leiysky · 2023-06-14T16:56:19Z

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

Introduce a QueryProfileManager to collect query profilings.

Each databend-query process has a global instance of QueryProfileManager. Every time an EXPLAIN ANALYZE statement is executed, the manager collects the profile information and stores it in an LRU store with a fixed capacity (20 queries by default).

Currently the store is volatile, so restarting the databend-query server loses the query profiles. Once the format of the query profile is stable, we can move the data to persistent storage, e.g. S3.
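To illustrate the fixed-capacity LRU behaviour described above, here is a minimal sketch using only the standard library. It is not the code from this PR: `ProfileStore`, `PlanProfile`, and their fields are hypothetical names chosen to mirror the columns of system.query_profile, and the real QueryProfileManager may be structured quite differently.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical per-operator profile entry. The fields mirror the columns of
/// `system.query_profile`; they are not the actual types from the PR.
#[derive(Clone, Debug, PartialEq)]
pub struct PlanProfile {
    pub plan_id: u32,
    pub plan_name: String,
    pub cpu_time_ns: u64,
}

/// Illustrative in-memory store that keeps the profiles of at most
/// `capacity` queries and evicts the least recently used one.
pub struct ProfileStore {
    capacity: usize,
    profiles: HashMap<String, Vec<PlanProfile>>,
    /// Query ids ordered from least to most recently used.
    order: VecDeque<String>,
}

impl ProfileStore {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self {
            capacity,
            profiles: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Move `query_id` to the most-recently-used end of the queue.
    fn touch(&mut self, query_id: &str) {
        self.order.retain(|id| id.as_str() != query_id);
        self.order.push_back(query_id.to_string());
    }

    /// Store (or overwrite) the profile of a query, evicting old entries
    /// once the capacity is exceeded.
    pub fn insert(&mut self, query_id: &str, profile: Vec<PlanProfile>) {
        self.profiles.insert(query_id.to_string(), profile);
        self.touch(query_id);
        while self.order.len() > self.capacity {
            if let Some(evicted) = self.order.pop_front() {
                self.profiles.remove(&evicted);
            }
        }
    }

    /// Look up a profile and mark it as recently used.
    pub fn get(&mut self, query_id: &str) -> Option<&Vec<PlanProfile>> {
        if self.profiles.contains_key(query_id) {
            self.touch(query_id);
        }
        self.profiles.get(query_id)
    }

    pub fn len(&self) -> usize {
        self.profiles.len()
    }
}
```

With a capacity of 20, the 21st profiled query would evict whichever stored query was touched least recently. The linear `retain` is fine at this size; a larger store would likely want a dedicated LRU structure.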

The profiling data can be queried from the system table system.query_profile. Here's an example:

mysql> explain analyze select sum(cast(cast(a as string) as int)) from t;
+------------------------------------------------------------------------------------------------------------------------------------------+
| explain                                                                                                                                  |
+------------------------------------------------------------------------------------------------------------------------------------------+
| EvalScalar                                                                                                                               |
| ├── expressions: [sum(CAST(CAST(a AS STRING) AS Int32)) (#2)]                                                                            |
| ├── estimated rows: 1.00                                                                                                                 |
| ├── total cpu time: 0.010499999999999999ms                                                                                               |
| └── AggregateFinal                                                                                                                       |
|     ├── group by: []                                                                                                                     |
|     ├── aggregate functions: [sum(sum_arg_0)]                                                                                            |
|     ├── estimated rows: 1.00                                                                                                             |
|     ├── total cpu time: 0.271459ms                                                                                                       |
|     └── AggregatePartial                                                                                                                 |
|         ├── group by: []                                                                                                                 |
|         ├── aggregate functions: [sum(sum_arg_0)]                                                                                        |
|         ├── estimated rows: 1.00                                                                                                         |
|         ├── total cpu time: 8.070584ms                                                                                                   |
|         └── EvalScalar                                                                                                                   |
|             ├── expressions: [to_int32(to_string(t.a (#0)))]                                                                             |
|             ├── estimated rows: 1000000.00                                                                                               |
|             ├── total cpu time: 267.459292ms                                                                                             |
|             └── TableScan                                                                                                                |
|                 ├── table: default.default.t                                                                                             |
|                 ├── read rows: 1000000                                                                                                   |
|                 ├── read bytes: 2724149                                                                                                  |
|                 ├── partitions total: 10                                                                                                 |
|                 ├── partitions scanned: 10                                                                                               |
|                 ├── pruning stats: [segments: <range pruning: 10 to 10>, blocks: <range pruning: 10 to 10, bloom pruning: 0 to 0>]       |
|                 ├── push downs: [filters: [], limit: NONE]                                                                               |
|                 └── estimated rows: 1000000.00                                                                                           |
+------------------------------------------------------------------------------------------------------------------------------------------+
27 rows in set (0.12 sec)
Read 1000000 rows, 3.81 MiB in 0.072 sec., 13.91 million rows/sec., 53.05 MiB/sec.

mysql> select * from system.query_profile;
+--------------------------------------+---------+------------------+-------------+-----------+
| query_id                             | plan_id | plan_name        | description | cpu_time  |
+--------------------------------------+---------+------------------+-------------+-----------+
| e91b8537-a410-4cab-bc8a-c5bf512144b3 |       0 | TableScan        |             |         0 |
| e91b8537-a410-4cab-bc8a-c5bf512144b3 |       1 | EvalScalar       |             | 267459292 |
| e91b8537-a410-4cab-bc8a-c5bf512144b3 |       2 | AggregatePartial |             |   8070584 |
| e91b8537-a410-4cab-bc8a-c5bf512144b3 |       3 | AggregateFinal   |             |    271459 |
| e91b8537-a410-4cab-bc8a-c5bf512144b3 |       4 | EvalScalar       |             |     10500 |
+--------------------------------------+---------+------------------+-------------+-----------+
5 rows in set (0.03 sec)
Read 5 rows, 443.00 B in 0.005 sec., 981.81 rows/sec., 84.95 KiB/sec.
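Comparing the two outputs above, the cpu_time column appears to be in nanoseconds (267459292 in the table corresponds to 267.459292ms in the plan). Assuming that unit, the following sketch converts the values and computes each operator's share of the total. The function names and the tuple layout are illustrative only and are not part of Databend.

```rust
/// Convert a `cpu_time` value to milliseconds, assuming it is in nanoseconds.
pub fn ns_to_ms(ns: u64) -> f64 {
    ns as f64 / 1_000_000.0
}

/// Given `(plan_name, cpu_time_ns)` rows, return each operator's fraction of
/// the total recorded CPU time. Returns 0.0 for every row if the total is 0.
pub fn cpu_share(rows: &[(&str, u64)]) -> Vec<(String, f64)> {
    let total: u64 = rows.iter().map(|(_, ns)| *ns).sum();
    rows.iter()
        .map(|(name, ns)| {
            let share = if total == 0 {
                0.0
            } else {
                *ns as f64 / total as f64
            };
            (name.to_string(), share)
        })
        .collect()
}

/// The rows from the `system.query_profile` output above.
pub fn example_rows() -> Vec<(&'static str, u64)> {
    vec![
        ("TableScan", 0),
        ("EvalScalar", 267_459_292),
        ("AggregatePartial", 8_070_584),
        ("AggregateFinal", 271_459),
        ("EvalScalar", 10_500),
    ]
}
```

On these rows the inner EvalScalar (the double cast) accounts for the large majority of the recorded time, which matches the EXPLAIN ANALYZE tree.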

Future work

We can introduce an enable_query_profiling setting so that profile information is stored for all common queries, e.g. SELECT and INSERT ... SELECT statements.

It's possible to display the profile data in a graphical way in the future to help analyze queries.

In addition, we can fetch the query profile data from other nodes in the same Databend cluster and aggregate it into a complete query profile for distributed queries.
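As a rough sketch of what that aggregation could look like, the function below sums CPU time per plan node across the profiles reported by several nodes. It assumes that plan_id values identify the same operator on every node, which this PR does not state; `merge_node_profiles` and the `(plan_id, cpu_time_ns)` tuple layout are hypothetical.

```rust
use std::collections::BTreeMap;

/// Merge per-node profiles into one cluster-wide profile by summing the
/// CPU time of entries that share a `plan_id`.
///
/// Each inner `Vec` holds the `(plan_id, cpu_time_ns)` pairs reported by one
/// node. Assumption: plan ids are consistent across nodes.
pub fn merge_node_profiles(nodes: &[Vec<(u32, u64)>]) -> BTreeMap<u32, u64> {
    let mut merged: BTreeMap<u32, u64> = BTreeMap::new();
    for node in nodes {
        for &(plan_id, cpu_ns) in node {
            *merged.entry(plan_id).or_insert(0) += cpu_ns;
        }
    }
    merged
}
```

A `BTreeMap` keeps the result ordered by plan_id, which matches how the system table lists operators.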

Part of #4238

BohuTANG · 2023-06-15T00:31:30Z

Can we show the 'TableScan' cost time? This is an important metric.

leiysky · 2023-06-15T02:56:56Z

> Can we show the 'TableScan' cost time? This is an important metric.

It will be supported later. For now we only record the execution time of Processor::process, so the cost of TableScan cannot be measured properly.
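To make the limitation concrete, here is a simplified sketch of timing a synchronous process call with a wrapper. The `Processor` trait below is a reduced, hypothetical stand-in and not Databend's actual trait; `ProfiledProcessor` and `SumProcessor` are invented for illustration, and `Instant` measures wall-clock time rather than true CPU time.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::time::Instant;

/// Simplified, hypothetical stand-in for a pipeline processor.
pub trait Processor {
    fn process(&mut self) -> Result<(), String>;
}

/// Wrapper that adds the elapsed time of every `process` call to a shared
/// counter (in nanoseconds).
pub struct ProfiledProcessor<P: Processor> {
    inner: P,
    cpu_time_ns: Arc<AtomicU64>,
}

impl<P: Processor> ProfiledProcessor<P> {
    pub fn new(inner: P, cpu_time_ns: Arc<AtomicU64>) -> Self {
        Self { inner, cpu_time_ns }
    }

    pub fn inner(&self) -> &P {
        &self.inner
    }
}

impl<P: Processor> Processor for ProfiledProcessor<P> {
    fn process(&mut self) -> Result<(), String> {
        let start = Instant::now();
        let result = self.inner.process();
        // Only time spent inside this call is recorded; work done elsewhere
        // (for example while waiting on I/O outside `process`) is not.
        let elapsed = start.elapsed().as_nanos() as u64;
        self.cpu_time_ns.fetch_add(elapsed, Ordering::Relaxed);
        result
    }
}

/// Toy processor used to exercise the wrapper.
pub struct SumProcessor {
    pub input: Vec<u64>,
    pub output: u64,
}

impl Processor for SumProcessor {
    fn process(&mut self) -> Result<(), String> {
        self.output = self.input.iter().sum();
        Ok(())
    }
}
```

Under this scheme an operator whose work mostly happens outside the timed call would report a value near zero, which would be consistent with the TableScan row above.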

leiysky requested review from BohuTANG and zhang2014 June 14, 2023 16:56

mergify bot added the pr-feature label (this PR introduces a new feature to the codebase) Jun 14, 2023

leiysky added 3 commits June 15, 2023 13:37

introduce profile manager to collect query profilings

8634040

fix license header

255dd7e

fix license header

c38d941

leiysky force-pushed the profile-manager branch from 2d67eb9 to c38d941 Compare June 15, 2023 05:40

format

17b4ae1

andylokandy approved these changes Jun 15, 2023

BohuTANG merged commit ffed29b into databendlabs:main Jun 15, 2023

leiysky deleted the profile-manager branch June 15, 2023 06:28

leiysky mentioned this pull request Jun 27, 2023

Tracking: query profiling framework #11874

Closed

3 tasks
