Hive external table tpch/tpcds query failed with 1T hdfs orc/parquet data: fail to read file #1315

Closed
tiannan-sr opened this issue Nov 16, 2021 · 4 comments
Labels: external/hive, type/bug (Something isn't working)

Comments


Steps to reproduce the behavior (Required)

  1. Create a Hive table stored as ORC or Parquet, with 1T of data.
  2. Create a Hive external table in StarRocks.
  3. Run the TPC-H/TPC-DS queries. Some queries fail:
    fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/item/part-00000-24b3c82c-15b8-4d7e-93e7-40d240850fc6-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073896452_155628 file=/stability_data/tpcds/parquet/item/part-00000-24b3c82c-15b8-4d7e-93e7-40d240850fc6-c000.snappy.parquet'), and time is: 2021-11-16 17:10:43
    {'status': False, 'msg': (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/item/part-00000-24b3c82c-15b8-4d7e-93e7-40d240850fc6-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073896452_155628 file=/stability_data/tpcds/parquet/item/part-00000-24b3c82c-15b8-4d7e-93e7-40d240850fc6-c000.snappy.parquet

Expected behavior (Required)

All queries return the correct result.

Real behavior (Required)

Hive external table TPC-DS 1T test: the Hive tables are stored in Parquet format on HDFS ...
== begin execute query54 ===
== begin execute query21 ===
== begin execute query14 ===
execute sql error and messages: (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/web_sales/part-00141-4b629069-7b34-45ab-8aec-54bad8becd95-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073913104_172280 file=/stability_data/tpcds/parquet/web_sales/part-00141-4b629069-7b34-45ab-8aec-54bad8becd95-c000.snappy.parquet'), and time is: 2021-11-16 16:34:40
{'status': False, 'msg': (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/web_sales/part-00141-4b629069-7b34-45ab-8aec-54bad8becd95-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073913104_172280 file=/stability_data/tpcds/parquet/web_sales/part-00141-4b629069-7b34-45ab-8aec-54bad8becd95-c000.snappy.parquet')}
== begin execute query63 ===
== begin execute query82 ===
== begin execute query98 ===
== begin execute query03 ===
== begin execute query01 ===
== begin execute query66 ===
== begin execute query12 ===
== begin execute query26 ===
== begin execute query32 ===
== begin execute query81 ===
== begin execute query51 ===
== begin execute query69 ===
== begin execute query38 ===
== begin execute query84 ===
== begin execute query05 ===
== begin execute query78 ===
execute sql error and messages: (1064, 'Memory exceed limit. QUERY Backend: 172.26.194.118, fragment: 0ac919be-46bc-11ec-81ff-00163e0a1b73 Used: 77421662232, Limit: 77309411328. Mem usage has exceed the limit of query pool'), and time is: 2021-11-16 17:04:49
{'status': False, 'msg': (1064, 'Memory exceed limit. QUERY Backend: 172.26.194.118, fragment: 0ac919be-46bc-11ec-81ff-00163e0a1b73 Used: 77421662232, Limit: 77309411328. Mem usage has exceed the limit of query pool')}
== begin execute query96 ===
execute sql error and messages: (1064, 'Memory exceed limit. AggrNode Backend: 172.26.194.118, fragment: 409d2913-46bc-11ec-81ff-00163e0a1afb Used: 77549723720, Limit: 77309411328. Mem usage has exceed the limit of query pool'), and time is: 2021-11-16 17:04:49
{'status': False, 'msg': (1064, 'Memory exceed limit. AggrNode Backend: 172.26.194.118, fragment: 409d2913-46bc-11ec-81ff-00163e0a1afb Used: 77549723720, Limit: 77309411328. Mem usage has exceed the limit of query pool')}
== begin execute query52 ===
execute sql error and messages: (1064, 'Memory exceed limit. AggrNode Backend: 172.26.194.118, fragment: 40a58d85-46bc-11ec-81ff-00163e0a1b05 Used: 77725048440, Limit: 77309411328. Mem usage has exceed the limit of query pool'), and time is: 2021-11-16 17:04:49
{'status': False, 'msg': (1064, 'Memory exceed limit. AggrNode Backend: 172.26.194.118, fragment: 40a58d85-46bc-11ec-81ff-00163e0a1b05 Used: 77725048440, Limit: 77309411328. Mem usage has exceed the limit of query pool')}
== begin execute query41 ===
execute sql error and messages: (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/item/part-00000-24b3c82c-15b8-4d7e-93e7-40d240850fc6-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073896452_155628 file=/stability_data/tpcds/parquet/item/part-00000-24b3c82c-15b8-4d7e-93e7-40d240850fc6-c000.snappy.parquet'), and time is: 2021-11-16 17:10:43
{'status': False, 'msg': (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/item/part-00000-24b3c82c-15b8-4d7e-93e7-40d240850fc6-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073896452_155628 file=/stability_data/tpcds/parquet/item/part-00000-24b3c82c-15b8-4d7e-93e7-40d240850fc6-c000.snappy.parquet')}
== begin execute query13 ===
== begin execute query97 ===
== begin execute query77 ===
== begin execute query94 ===
== begin execute query87 ===
== begin execute query47 ===
execute sql error and messages: (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/store_sales/part-00864-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073908743_167919 file=/stability_data/tpcds/parquet/store_sales/part-00864-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet'), and time is: 2021-11-16 17:38:24
{'status': False, 'msg': (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/store_sales/part-00864-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073908743_167919 file=/stability_data/tpcds/parquet/store_sales/part-00864-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet')}
== begin execute query37 ===
== begin execute query33 ===
execute sql error and messages: (1064, 'Memory exceed limit. QUERY Backend: 172.26.194.118, fragment: f4f7acde-46c0-11ec-81ff-00163e0a1af7 Used: 78619624672, Limit: 77309411328. Mem usage has exceed the limit of query pool'), and time is: 2021-11-16 17:39:41
{'status': False, 'msg': (1064, 'Memory exceed limit. QUERY Backend: 172.26.194.118, fragment: f4f7acde-46c0-11ec-81ff-00163e0a1af7 Used: 78619624672, Limit: 77309411328. Mem usage has exceed the limit of query pool')}
== begin execute query90 ===
== begin execute query44 ===
execute sql error and messages: (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/store_sales/part-00604-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073906202_165378 file=/stability_data/tpcds/parquet/store_sales/part-00604-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet'), and time is: 2021-11-16 17:43:15
{'status': False, 'msg': (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/store_sales/part-00604-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073906202_165378 file=/stability_data/tpcds/parquet/store_sales/part-00604-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet')}
== begin execute query09 ===
execute sql error and messages: (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/store_sales/part-00071-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073900746_159922 file=/stability_data/tpcds/parquet/store_sales/part-00071-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet'), and time is: 2021-11-16 17:46:13
{'status': False, 'msg': (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/store_sales/part-00071-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073900746_159922 file=/stability_data/tpcds/parquet/store_sales/part-00071-98421a60-c176-43dd-a30d-c84da1d7e359-c000.snappy.parquet')}
== begin execute query62 ===
== begin execute query92 ===
== begin execute query08 ===
== begin execute query89 ===
== begin execute query46 ===
== begin execute query02 ===
== begin execute query55 ===
== begin execute query43 ===
== begin execute query59 ===
== begin execute query27 ===
== begin execute query71 ===
== begin execute query36 ===
== begin execute query75 ===
execute sql error and messages: (1064, 'Memory exceed limit. HashJoinNode Backend: 172.26.194.122, fragment: 6444168d-46c8-11ec-81ff-00163e0a1c24 Used: 77317693528, Limit: 77309411328. Mem usage has exceed the limit of query pool'), and time is: 2021-11-16 18:32:40
{'status': False, 'msg': (1064, 'Memory exceed limit. HashJoinNode Backend: 172.26.194.122, fragment: 6444168d-46c8-11ec-81ff-00163e0a1c24 Used: 77317693528, Limit: 77309411328. Mem usage has exceed the limit of query pool')}
== begin execute query24 ===
execute sql error and messages: (1064, 'Memory exceed limit. QUERY Backend: 172.26.194.122, fragment: 863e5c65-46c8-11ec-81ff-00163e0a1ba5 Used: 78406370264, Limit: 77309411328. Mem usage has exceed the limit of query pool'), and time is: 2021-11-16 18:32:41
{'status': False, 'msg': (1064, 'Memory exceed limit. QUERY Backend: 172.26.194.122, fragment: 863e5c65-46c8-11ec-81ff-00163e0a1ba5 Used: 78406370264, Limit: 77309411328. Mem usage has exceed the limit of query pool')}
== begin execute query10 ===
execute sql error and messages: (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/catalog_sales/part-00134-a8e7e4f9-b48f-4d6c-bd86-3fbbb1127fb4-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073886164_145340 file=/stability_data/tpcds/parquet/catalog_sales/part-00134-a8e7e4f9-b48f-4d6c-bd86-3fbbb1127fb4-c000.snappy.parquet'), and time is: 2021-11-16 18:33:31
{'status': False, 'msg': (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/catalog_sales/part-00134-a8e7e4f9-b48f-4d6c-bd86-3fbbb1127fb4-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073886164_145340 file=/stability_data/tpcds/parquet/catalog_sales/part-00134-a8e7e4f9-b48f-4d6c-bd86-3fbbb1127fb4-c000.snappy.parquet')}
== begin execute query16 ===
execute sql error and messages: (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/catalog_sales/part-00043-a8e7e4f9-b48f-4d6c-bd86-3fbbb1127fb4-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073885387_144563 file=/stability_data/tpcds/parquet/catalog_sales/part-00043-a8e7e4f9-b48f-4d6c-bd86-3fbbb1127fb4-c000.snappy.parquet'), and time is: 2021-11-16 18:36:35
{'status': False, 'msg': (1064, 'fail to read file, file=hdfs://172.26.194.118:9002/stability_data/tpcds/parquet/catalog_sales/part-00043-a8e7e4f9-b48f-4d6c-bd86-3fbbb1127fb4-c000.snappy.parquet, error=error=Error(255): Unknown error 255, root_cause=BlockMissingException: Could not obtain block: BP-1847522660-172.26.194.118-1621306357667:blk_1073885387_144563 file=/stability_data/tpcds/parquet/catalog_sales/part-00043-a8e7e4f9-b48f-4d6c-bd86-3fbbb1127fb4-c000.snappy.parquet')}
== begin execute query74 ===
execute sql error and messages: (1064, 'Memory exceed limit. QUERY Backend: 172.26.194.121, fragment: 12beaf18-46c9-11ec-81ff-00163e0a1b39 Used: 77354211848, Limit: 77309411328. Mem usage has exceed the limit of query pool'), and time is: 2021-11-16 18:41:36
{'status': False, 'msg': (1064, 'Memory exceed limit. QUERY Backend: 172.26.194.121, fragment: 12beaf18-46c9-11ec-81ff-00163e0a1b39 Used: 77354211848, Limit: 77309411328. Mem usage has exceed the limit of query pool')}
== begin execute query76 ===
== begin execute query50 ===
== begin execute query99 ===
== begin execute query70 ===
== begin execute query73 ===
== begin execute query15 ===
== begin execute query40 ===
== begin execute query07 ===
== begin execute query68 ===
== begin execute query20 ===
== begin execute query57 ===
== begin execute query80 ===
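The log above mixes two distinct failure types: HDFS read failures (`BlockMissingException`) and query memory-limit failures. A minimal Python sketch for separating them is shown here. The function names and category labels are hypothetical, and the sketch assumes each error line belongs to the most recent `== begin execute queryNN ===` marker, which the log itself does not confirm (several errors share a timestamp, so the queries may have run concurrently).

```python
import re

QUERY_RE = re.compile(r"== begin execute (query\d+) ===")
MEM_RE = re.compile(r"Used: (\d+), Limit: (\d+)")


def classify_failures(log_text):
    """Map each failed query to a coarse failure category.

    Assumption: an error line is attributed to the last query marker seen.
    """
    failures = {}
    current = None
    for line in log_text.splitlines():
        marker = QUERY_RE.search(line)
        if marker:
            current = marker.group(1)
            continue
        if current is None or not line.startswith("execute sql error"):
            continue
        if "BlockMissingException" in line:
            failures[current] = "hdfs_block_missing"
        elif "Memory exceed limit" in line:
            failures[current] = "memory_limit"
        else:
            failures[current] = "other"
    return failures


def memory_overage(line):
    """Return Used - Limit in bytes for a memory-limit message, else None."""
    match = MEM_RE.search(line)
    if not match:
        return None
    used, limit = int(match.group(1)), int(match.group(2))
    return used - limit


# Sample built from two messages in the log (the HDFS message is shortened).
SAMPLE_LOG = "\n".join([
    "== begin execute query78 ===",
    "execute sql error and messages: (1064, 'Memory exceed limit. QUERY Backend: 172.26.194.118, "
    "fragment: 0ac919be-46bc-11ec-81ff-00163e0a1b73 Used: 77421662232, Limit: 77309411328. "
    "Mem usage has exceed the limit of query pool'), and time is: 2021-11-16 17:04:49",
    "== begin execute query41 ===",
    "execute sql error and messages: (1064, 'fail to read file, error=error=Error(255): Unknown error 255, "
    "root_cause=BlockMissingException: Could not obtain block'), and time is: 2021-11-16 17:10:43",
    "== begin execute query13 ===",
])

if __name__ == "__main__":
    print(classify_failures(SAMPLE_LOG))
    print(memory_overage(SAMPLE_LOG.splitlines()[1]))
```

The limit reported in every memory message, 77309411328 bytes, is exactly 72 GiB (72 * 1024**3). The query78 fragment in the sample exceeded it by 112250904 bytes, about 107 MiB.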

StarRocks version (Required)

  • You can get the StarRocks version by executing SQL select current_version()
    mysql> select current_version();
    +-------------------+
    | current_version() |
    +-------------------+
    | UNKNOWN be897f6   |
    +-------------------+
    1 row in set (0.02 sec)
tiannan-sr added the type/bug and external/hive labels on Nov 16, 2021

stdpain commented Nov 17, 2021

This looks like a corrupt file on HDFS. Try the following command:

$HADOOP_HOME/bin/hadoop fs -checksum $HDFS_URL/stability_data/tpcds/parquet/catalog_sales/part-00043-a8e7e4f9-b48f-4d6c-bd86-3fbbb1127fb4-c000.snappy.parquet

If the command reports an error, the HDFS file is corrupt.
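Since several different files appear in the failures, the same check can be generated for each of them. The sketch below is a hypothetical helper, not part of StarRocks or Hadoop: it pulls the distinct file paths out of the `BlockMissingException` messages and only builds the `hadoop fs -checksum` command strings, without running them. `$HADOOP_HOME` and `$HDFS_URL` are left unexpanded, as in the command above.

```python
import re

# Capture the HDFS path that follows the block ID in a BlockMissingException message.
FILE_RE = re.compile(
    r"root_cause=BlockMissingException: Could not obtain block: \S+ file=(/[^\s']+)"
)


def checksum_commands(log_text):
    """Return one `hadoop fs -checksum` command per distinct failing file."""
    paths = []
    for path in FILE_RE.findall(log_text):
        if path not in paths:  # each error is logged twice, so de-duplicate
            paths.append(path)
    return ["$HADOOP_HOME/bin/hadoop fs -checksum $HDFS_URL" + p for p in paths]


# Sample rebuilt from the query16 failure in the issue log.
PATH = ("/stability_data/tpcds/parquet/catalog_sales/"
        "part-00043-a8e7e4f9-b48f-4d6c-bd86-3fbbb1127fb4-c000.snappy.parquet")
BLOCK = "BP-1847522660-172.26.194.118-1621306357667:blk_1073885387_144563"
MSG = (f"fail to read file, file=hdfs://172.26.194.118:9002{PATH}, "
       "error=error=Error(255): Unknown error 255, "
       f"root_cause=BlockMissingException: Could not obtain block: {BLOCK} file={PATH}")
SAMPLE = (f"execute sql error and messages: (1064, '{MSG}'), "
          "and time is: 2021-11-16 18:36:35\n"
          f"{{'status': False, 'msg': (1064, '{MSG}')}}\n")

if __name__ == "__main__":
    for command in checksum_commands(SAMPLE):
        print(command)
```

Printing the commands instead of executing them keeps the sketch side-effect free; the output can be reviewed and then run on a host with a configured Hadoop client.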


stdpain commented Nov 17, 2021

We should check the dfs.datanode.max.transfer.threads setting on the DataNode. The DataNode log shows:

2021-11-17 14:58:46,779 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: stability03:50012:DataXceiverServer:
java.io.IOException: Xceiver count 4097 exceeds the limit of concurrent xcievers: 4096
        at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:150)
        at java.lang.Thread.run(Thread.java:745)

The problem appears to be caused by the DataNode's transfer thread limit being too low.
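To see how often the DataNode hits this limit, its log can be scanned for the message shown above. This is a minimal sketch; the function name is hypothetical and the pattern is taken only from the single log line quoted in this thread (including its `xcievers` spelling), so other Hadoop versions may word the message differently.

```python
import re

# Pattern copied from the DataNode log line quoted in this thread.
XCEIVER_RE = re.compile(
    r"Xceiver count (\d+) exceeds the limit of concurrent xcievers: (\d+)"
)


def xceiver_overflows(lines):
    """Return (count, limit) pairs for every line reporting an exceeded limit."""
    hits = []
    for line in lines:
        match = XCEIVER_RE.search(line)
        if match:
            hits.append((int(match.group(1)), int(match.group(2))))
    return hits


SAMPLE_LINES = [
    "2021-11-17 14:58:46,779 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: "
    "stability03:50012:DataXceiverServer:",
    "java.io.IOException: Xceiver count 4097 exceeds the limit of concurrent xcievers: 4096",
    "        at java.lang.Thread.run(Thread.java:745)",
]

if __name__ == "__main__":
    print(xceiver_overflows(SAMPLE_LINES))
```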


caneGuy commented Nov 17, 2021

You could raise dfs.datanode.max.transfer.threads to 4096*2 (8192), but check the I/O utilization and CPU load of the DataNodes first.

dfs.datanode.max.transfer.threads
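As an illustration of the suggested change, the sketch below renders the `<property>` entry that would go into `hdfs-site.xml` on the DataNodes. The helper name and its parameters are hypothetical, and doubling the limit is only the suggestion made in this thread, not a verified fix. A DataNode typically needs a restart to pick up the new value.

```python
import xml.etree.ElementTree as ET


def transfer_threads_property(current_limit=4096, factor=2):
    """Render an hdfs-site.xml <property> entry with the raised thread limit."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "dfs.datanode.max.transfer.threads"
    ET.SubElement(prop, "value").text = str(current_limit * factor)
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    print(transfer_threads_property())
```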


stdpain commented Nov 26, 2021

Closed by #1394.

caneGuy pushed a commit to caneGuy/starrocks that referenced this issue Mar 28, 2023