StarRocks Migration Tool (SMT). Supports migration from MySQL, PostgreSQL, Oracle, ClickHouse, SQL Server and TiDB #36508
Replies: 5 comments 2 replies
-
RFE to support PostgreSQL at #36509
-
I'm struggling to get this to work against a Greenplum database, which is essentially the same as PostgreSQL. The error output is not very helpful; this was the most common error:
-
Hi - Is there a source repo we can fork and contribute to for SMT? I can't seem to find it on GitHub.
-
Another "better" method: Apache Flink CDC 3.0. This tutorial shows how to quickly build a streaming ELT job from MySQL to StarRocks using Flink CDC 3.0, including syncing all tables of one database, schema change evolution, and syncing sharded tables into one table.
-
As of Jan 3, 2023, it's been confirmed with a StarRocks Technical Steering Committee member that SMT will be deprecated in favor of Apache Flink CDC 3.0. Read https://ververica.github.io/flink-cdc-connectors/master/content/quickstart/mysql-starrocks-pipeline-tutorial.html for more info about Apache Flink with StarRocks. Another option is to just use StarRocks' MySQL wire-compatible protocol to write into the database.
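On the wire-protocol option mentioned above: the StarRocks frontend accepts connections from any standard MySQL client on its query port (9030 by default), so a plain SQL `INSERT` works for small writes. A minimal sketch; the host, database, and table names are illustrative:

```sql
-- Connect with any MySQL client, e.g.:
--   mysql -h <fe_host> -P 9030 -u root
-- (database and table names below are placeholders)
INSERT INTO demo_db.orders (order_id, item_id, amount)
VALUES (1001, 42, 3);
```

Note that per-row INSERTs are fine for testing or trickle writes, but bulk or streaming loads should go through Stream Load or the Flink connector instead.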
----- Historical content
Synchronization process:
1. Synchronize database & table schema.
   The SMT reads the schema of the source database & table to be synchronized and generates SQL files for creating the destination database & table in StarRocks. This operation is based on the source database and StarRocks information in SMT's configuration file.
2. Synchronize data.
   a. The Flink SQL client executes the data loading statement `INSERT INTO SELECT` to submit one or more Flink jobs to the Flink cluster.
   b. The Flink cluster runs the Flink jobs to obtain data. The Flink CDC connector first reads full historical data from the source database, then seamlessly switches to incremental reading, and sends the data to flink-starrocks-connector.
   c. flink-starrocks-connector accumulates data in mini-batches, and synchronizes each batch of data to StarRocks.
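The steps above can be sketched in Flink SQL. This is an illustrative fragment, not SMT's generated output: the hostnames, credentials, and table names are placeholders, and the connector options follow the `mysql-cdc` and `starrocks` connectors' documented settings:

```sql
-- Step 2a: a MySQL CDC source table and a StarRocks sink table,
-- then a continuous INSERT INTO ... SELECT job.
CREATE TABLE orders_src (
    order_id BIGINT,
    item_id  BIGINT,
    amount   INT,
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    'connector'     = 'mysql-cdc',
    'hostname'      = 'mysql-host',
    'port'          = '3306',
    'username'      = 'flink_user',
    'password'      = '******',
    'database-name' = 'demo_db',
    'table-name'    = 'orders'
);

CREATE TABLE orders_sink (
    order_id BIGINT,
    item_id  BIGINT,
    amount   INT,
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    'connector'     = 'starrocks',
    'jdbc-url'      = 'jdbc:mysql://starrocks-fe:9030',
    'load-url'      = 'starrocks-fe:8030',
    'database-name' = 'demo_db',
    'table-name'    = 'orders',
    'username'      = 'root',
    'password'      = ''
);

-- Submits a Flink job that reads the full snapshot first (step 2b),
-- then switches to incremental binlog reading; the sink connector
-- batches and loads the rows into StarRocks (step 2c).
INSERT INTO orders_sink SELECT * FROM orders_src;
```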
Note
Only data manipulation language (DML) operations in the source database can be synchronized to StarRocks. Data definition language (DDL) operations cannot be synchronized.
Scenarios
Real-time synchronization from a source database such as MySQL has a broad range of use cases wherever data is constantly changing. Take the real-world use case "real-time ranking of commodity sales" as an example.
Flink calculates the real-time ranking of commodity sales based on the original order table in MySQL and synchronizes the ranking to StarRocks' Primary Key table in real time. Users can connect a visualization tool to StarRocks to view the ranking in real time to gain on-demand operational insights.
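For the scenario above, the destination would be a StarRocks Primary Key table, so Flink's upserts keep exactly one current row per commodity. A hypothetical table definition (column names and types are illustrative):

```sql
-- Illustrative ranking table: PRIMARY KEY lets incoming upserts
-- overwrite each item's row in place instead of appending duplicates.
CREATE TABLE commodity_sales_ranking (
    item_id     BIGINT NOT NULL,
    item_name   VARCHAR(256),
    total_sales BIGINT,
    update_time DATETIME
) PRIMARY KEY (item_id)
DISTRIBUTED BY HASH (item_id);
```

A visualization tool can then query this table over StarRocks' MySQL-compatible port and always see the latest ranking.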
Download it at https://www.starrocks.io/download/community