I am using the DataHub project and have cloned the repository to my local environment. I successfully ran the debug container using Docker and accessed the UI via port 9002.
However, when ingesting metadata via the CLI, I found that comments for tables and fields from my data source cannot be parsed. Upon analyzing the logs, I identified the root cause of the issue.
In a native MySQL environment, the metadata ingestion works fine, and the comments are correctly ingested into the description attributes of the entities. However, my actual production environment uses StarRocks. Since DataHub does not directly support StarRocks, I access the metadata using the MySQL compatibility mode provided by StarRocks.
The problem arises because the SHOW CREATE TABLE DDL output from StarRocks differs from native MySQL. Specifically, StarRocks uses double quotes (") to enclose comments in the COMMENT section of the DDL, whereas native MySQL uses single quotes ('). This difference prevents DataHub from parsing the comments correctly.
For example:
[2024-12-02 18:10:39,281] WARNING {py.warnings:109} - /projects/datahub/metadata-ingestion/src/datahub/ingestion/source/sql/sql_common.py:928: SAWarning: Unknown schema content: 'COMMENT "XXXXXX"'
columns = inspector.get_columns(table, schema)
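The quoting difference described above can be illustrated with a small standalone sketch. This is not DataHub's or SQLAlchemy's actual parsing code; `COMMENT_RE` and `extract_comment` are hypothetical names used only to show how a pattern could accept a `COMMENT` clause quoted with either single or double quotes. It assumes backslash escapes inside the comment and does not handle doubled-quote escaping.

```python
import re

# Hypothetical pattern (an assumption for illustration, not library code):
# matches COMMENT followed by a string quoted with either ' or ".
# The backreference (?P=q) requires the closing quote to match the opening one.
COMMENT_RE = re.compile(
    r"""COMMENT\s+(?P<q>['"])(?P<text>(?:\\.|(?!(?P=q)).)*)(?P=q)""",
    re.IGNORECASE | re.DOTALL,
)


def extract_comment(ddl_fragment):
    """Return the comment text from a column/table DDL fragment, or None."""
    match = COMMENT_RE.search(ddl_fragment)
    return match.group("text") if match else None


# MySQL-style (single quotes) and StarRocks-style (double quotes) fragments.
mysql_style = "`id` INT COMMENT 'user id'"
starrocks_style = '`id` INT COMMENT "user id"'

print(extract_comment(mysql_style))
print(extract_comment(starrocks_style))
```

With a pattern like this, both quoting styles yield the same comment text, which is the behavior the ingestion would need in order to fill the description attribute for StarRocks tables and columns.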