StarRocks Migration Tool (SMT). Supports migration from MySQL, PostgreSQL, Oracle, ClickHouse, SQL Server and TiDB #36508
Replies: 5 comments 2 replies
-
RFE to support PostgreSQL at #36509
-
I'm struggling to get this to work against a Greenplum database, which is essentially the same as PostgreSQL. The error output is not very helpful; this was the most common error:
-
Hi - Is there a source repo we can fork and contribute to for SMT? I can't seem to find it on GitHub.
-
Another "better" method: Apache Flink CDC 3.0. This tutorial shows how to quickly build a streaming ELT job from MySQL to StarRocks using Flink CDC 3.0, including syncing all tables of one database, schema change evolution, and syncing sharded tables into one table.
-
As of Jan 3, 2023, it's been confirmed with a StarRocks Technical Steering Committee member that SMT will be deprecated in favor of Apache Flink CDC 3.0. Read https://ververica.github.io/flink-cdc-connectors/master/content/quickstart/mysql-starrocks-pipeline-tutorial.html for more info about Apache Flink with StarRocks. Another option is to just use StarRocks' MySQL wire-compatible protocol to write into the database.
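On the wire-protocol option mentioned above: the StarRocks frontend accepts connections from any standard MySQL client on its query port (9030 by default), so a plain SQL `INSERT` works for small writes. A minimal sketch; the host, database, and table names are illustrative:

```sql
-- Connect with any MySQL client, e.g.:
--   mysql -h <fe_host> -P 9030 -u root
-- (database and table names below are placeholders)
INSERT INTO demo_db.orders (order_id, item_id, amount)
VALUES (1001, 42, 3);
```

Note that per-row INSERTs are fine for testing or trickle writes, but bulk or streaming loads should go through Stream Load or the Flink connector instead.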
----- Historical content
Synchronization process:
1. Synchronize database & table schema.
   The SMT reads the schema of the source database & table to be synchronized and generates SQL files for creating the destination database & table in StarRocks. This operation is based on the source database and StarRocks information in SMT's configuration file.
2. Synchronize data.
   a. The Flink SQL client executes the data loading statement `INSERT INTO SELECT` to submit one or more Flink jobs to the Flink cluster.
   b. The Flink cluster runs the Flink jobs to obtain data. The Flink CDC connector first reads full historical data from the source database, then seamlessly switches to incremental reading, and sends the data to flink-starrocks-connector.
   c. flink-starrocks-connector accumulates data in mini-batches, and synchronizes each batch of data to StarRocks.
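The steps above can be sketched in Flink SQL. This is an illustrative fragment, not SMT's generated output: the hostnames, credentials, and table names are placeholders, and the connector options follow the `mysql-cdc` and `starrocks` connectors' documented settings:

```sql
-- Step 2a: a MySQL CDC source table and a StarRocks sink table,
-- then a continuous INSERT INTO ... SELECT job.
CREATE TABLE orders_src (
    order_id BIGINT,
    item_id  BIGINT,
    amount   INT,
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    'connector'     = 'mysql-cdc',
    'hostname'      = 'mysql-host',
    'port'          = '3306',
    'username'      = 'flink_user',
    'password'      = '******',
    'database-name' = 'demo_db',
    'table-name'    = 'orders'
);

CREATE TABLE orders_sink (
    order_id BIGINT,
    item_id  BIGINT,
    amount   INT,
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    'connector'     = 'starrocks',
    'jdbc-url'      = 'jdbc:mysql://starrocks-fe:9030',
    'load-url'      = 'starrocks-fe:8030',
    'database-name' = 'demo_db',
    'table-name'    = 'orders',
    'username'      = 'root',
    'password'      = ''
);

-- Submits a Flink job that reads the full snapshot first (step 2b),
-- then switches to incremental binlog reading; the sink connector
-- batches and loads the rows into StarRocks (step 2c).
INSERT INTO orders_sink SELECT * FROM orders_src;
```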
Note
Only data manipulation language (DML) operations in the source database can be synchronized to StarRocks. Data definition language (DDL) operations cannot be synchronized.
Scenarios
Real-time synchronization from a source database such as MySQL has a broad range of use cases wherever data is constantly changing. Take the real-world use case "real-time ranking of commodity sales" as an example.
Flink calculates the real-time ranking of commodity sales based on the original order table in MySQL and synchronizes the ranking to StarRocks' Primary Key table in real time. Users can connect a visualization tool to StarRocks to view the ranking in real time to gain on-demand operational insights.
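For the scenario above, the destination would be a StarRocks Primary Key table, so Flink's upserts keep exactly one current row per commodity. A hypothetical table definition (column names and types are illustrative):

```sql
-- Illustrative ranking table: PRIMARY KEY lets incoming upserts
-- overwrite each item's row in place instead of appending duplicates.
CREATE TABLE commodity_sales_ranking (
    item_id     BIGINT NOT NULL,
    item_name   VARCHAR(256),
    total_sales BIGINT,
    update_time DATETIME
) PRIMARY KEY (item_id)
DISTRIBUTED BY HASH (item_id);
```

A visualization tool can then query this table over StarRocks' MySQL-compatible port and always see the latest ranking.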
Download it at https://www.starrocks.io/download/community