Hudi exception reading data. com.google.common.util.concurrent.UncheckedExecutionException: java.lang.NullPointerException: null value in entry: date=2018-08-31=null #23374
In Beeline you can see that I can do SHOW TABLES and show the data in the tables.

Showing the output in SR.
This issue is due to StarRocks relying on metadata from the Hive Metastore for Hudi queries. We will fix this issue later. Currently, there is a workaround:
Just to confirm: this is an issue with our integration with Apache Hudi and not something Hudi would fix.
Confirmed with StarRocks engineering that it is an issue on StarRocks' side.
It worked when I applied the repair and refresh. I also tried it again, and I only needed the refresh command to make it work.
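For reference, a minimal sketch of what "repair and refresh" likely means here, assuming the repair is run on the Hive side in Beeline and the refresh in StarRocks; the table name `stock_ticks_cow` is taken from the Hudi docker demo, and the catalog/database names are assumptions:

```sql
-- In Beeline (Hive side): re-register any partitions missing from the metastore.
MSCK REPAIR TABLE stock_ticks_cow;

-- In StarRocks: invalidate the cached metadata for the external table.
REFRESH EXTERNAL TABLE hudi_catalog.default.stock_ticks_cow;
```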
We have marked this issue as stale because it has been inactive for 6 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to StarRocks!
I think this needs to be reopened. Trying today with the 3.2.2 allin1 image and Hudi 0.14.1, I am seeing:
Issue a refresh and try the query:
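A sketch of that step, assuming the same demo table and catalog names as above (`stock_ticks_cow` is from the Hudi docker demo; `hudi_catalog` and the database name are assumptions):

```sql
REFRESH EXTERNAL TABLE hudi_catalog.default.stock_ticks_cow;

SELECT symbol, max(ts)
FROM hudi_catalog.default.stock_ticks_cow
GROUP BY symbol;
```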
The main cause of this issue is that we use HMS (the Hive Metastore) to get metadata, and in Hudi, users need to enable metadata sync to HMS when Spark or Flink modifies a Hudi table. That is, users need to set the hive-sync options.
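The exact option was not captured above; a plausible sketch, assuming the standard Hudi hive-sync configuration is set in a spark-sql session before writing (the metastore URI shown is the one used by the Hudi docker demo and is an assumption for this setup):

```sql
-- Enable Hudi's metadata sync to the Hive Metastore (standard Hudi options;
-- the exact values needed for this setup are assumptions).
SET hoodie.datasource.hive_sync.enable=true;
SET hoodie.datasource.hive_sync.mode=hms;
SET hoodie.datasource.hive_sync.metastore.uris=thrift://hivemetastore:9083;
```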
I will check, thanks so much!
@wangsimo0 Can you show me how to configure this? I am using the Docker Demo at https://hudi.apache.org/docs/docker_demo/. I tried these env vars in the containers:
But no change, I still have to refresh.
@DanRoscigno By "still have to refresh", do you mean you get an error after a SELECT? That is possible if you are querying new partitions. The core reason is the same as when selecting from Hive: we cache metadata in StarRocks, so if the Hudi table is being ingested into or updated, StarRocks cannot get the latest information. We have not yet supported refreshing the metadata cache periodically, so there is no way for us to know about a metadata update.

So an error may happen when a user queries new partitions, because StarRocks doesn't have those partitions in its cache. Also, if an old partition is updated, StarRocks will return the old data because of the cache. This is absolutely not user-friendly. We are planning to use the Hudi SDK to get Hudi metadata, which would solve this problem completely; however, we don't have sufficient manpower, and we are looking for community developers who are interested in working on this with us. For now, unfortunately, we do have this limitation.
After a refresh everything seems fine. Note that with OneHouse Hudi @alberttwong did not have to refresh; everything worked on the first try. I will update the docs to include a refresh of each table. Please let me know when this changes and I will remove the refresh from the docs. Thanks for explaining this to me and for your help @wangsimo0
We have marked this issue as stale because it has been inactive for 6 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to StarRocks!
I hit the same exception when using StarRocks to read AWS Glue external tables.
Also, I could only refresh the external table by specifying partitions. When I removed the partition clause, the refresh did not work.
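A sketch of the two forms, using the StarRocks REFRESH EXTERNAL TABLE syntax (the table name and partition value here are hypothetical):

```sql
-- Refreshing with an explicit partition worked:
REFRESH EXTERNAL TABLE hudi_table PARTITION ('date=2018-08-31');

-- Refreshing the whole table (no PARTITION clause) was the form that failed:
REFRESH EXTERNAL TABLE hudi_table;
```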
Debugged the issue locally. It seems to be related to max_hive_partitions_per_rpc.
Why I'm doing: See #23374 (comment)
What I'm doing: Made max_hive_partitions_per_rpc mutable.
Signed-off-by: Xingcan Cui <[email protected]>
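Once that PR makes the setting mutable, it should be adjustable at runtime. A hedged sketch, assuming max_hive_partitions_per_rpc is a frontend (FE) config and that the value shown is only illustrative:

```sql
-- Adjust the FE config at runtime (value is illustrative, not a recommendation).
ADMIN SET FRONTEND CONFIG ("max_hive_partitions_per_rpc" = "5000");
```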
Instructions
Follow the Hudi Docker Quickstart. https://hudi.apache.org/docs/docker_demo
Modified docker-compose_hadoop284_hive233_spark244_mac_aarch64.yml to include StarRocks in the Hudi docker compose. Also, you need to apply apache/hudi#8700 if it hasn't been merged yet.
Do all the steps in the Hudi docker compose quickstart. When you can do a SHOW TABLES with Beeline, you know that the tables are ready and SR should be able to connect.
Log in to the SR container within the Hudi docker compose,
then execute the SQL commands (see the sketch below).
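A hedged sketch of those SQL commands, assuming a Hudi external catalog pointed at the demo's metastore (the catalog name, database name, and metastore URI are assumptions; `stock_ticks_cow` is the table from the Hudi demo):

```sql
-- Create an external catalog backed by the demo's Hive Metastore.
CREATE EXTERNAL CATALOG hudi_catalog
PROPERTIES (
    "type" = "hudi",
    "hive.metastore.uris" = "thrift://hivemetastore:9083"
);

-- Switch to the catalog and query the demo table.
SET CATALOG hudi_catalog;
SHOW DATABASES;
USE `default`;
SHOW TABLES;
SELECT symbol, max(ts) FROM stock_ticks_cow GROUP BY symbol;
```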