Unable to Parse Comments in StarRocks Metadata Using DataHub MySQL Ingestion #12009

Open
szRyu666 opened this issue Dec 3, 2024 · 0 comments
Labels
bug Bug report

I am using the DataHub project and have cloned the repository to my local environment. I successfully ran the debug container using Docker and accessed the UI via port 9002.

However, when I ingest metadata via the CLI, the comments on tables and fields from my data source are not picked up. After analyzing the logs, I identified the root cause.

In a native MySQL environment, the metadata ingestion works fine, and the comments are correctly ingested into the description attributes of the entities. However, my actual production environment uses StarRocks. Since DataHub does not directly support StarRocks, I access the metadata using the MySQL compatibility mode provided by StarRocks.

The problem arises because the SHOW CREATE TABLE DDL output from StarRocks differs from native MySQL. Specifically, StarRocks uses double quotes (") to enclose comments in the COMMENT section of the DDL, whereas native MySQL uses single quotes ('). This difference prevents DataHub from parsing the comments correctly.

For example, the ingestion log shows this warning:
[2024-12-02 18:10:39,281] WARNING {py.warnings:109} - /projects/datahub/metadata-ingestion/src/datahub/ingestion/source/sql/sql_common.py:928: SAWarning: Unknown schema content: 'COMMENT "XXXXXX"'
columns = inspector.get_columns(table, schema)
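
To illustrate the quoting difference, here is a minimal sketch of a comment extractor that accepts either quote style. It is not the parser that SQLAlchemy or DataHub actually uses; the names `COMMENT_RE` and `extract_comment` are hypothetical, and the pattern only handles backslash escapes, not doubled quotes such as `''`.

```python
import re

# Hypothetical pattern for illustration only; this is NOT the regex used by
# SQLAlchemy's MySQL dialect or by DataHub. It accepts a COMMENT clause whose
# value is quoted with either ' (native MySQL) or " (StarRocks).
COMMENT_RE = re.compile(
    r"""COMMENT\s+(?P<quote>['"])(?P<text>(?:\\.|(?!(?P=quote)).)*)(?P=quote)""",
    re.IGNORECASE | re.DOTALL,
)


def extract_comment(ddl_fragment):
    """Return the comment text from a column/table DDL fragment, or None."""
    match = COMMENT_RE.search(ddl_fragment)
    return match.group("text") if match else None


if __name__ == "__main__":
    # Native MySQL style: single quotes.
    print(extract_comment("`id` int NOT NULL COMMENT 'user id'"))
    # StarRocks style: double quotes.
    print(extract_comment('`id` int NOT NULL COMMENT "user id"'))
```

The backreference `(?P=quote)` makes the closing quote match whichever character opened the comment, so a single quote inside a double-quoted comment is kept as part of the text.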

szRyu666 added the bug label on Dec 3, 2024