
[SUPPORT] Hive meta sync null pointer issue #11955

Closed
bytesid19 opened this issue Sep 18, 2024 · 13 comments
@bytesid19 commented Sep 18, 2024


Describe the problem you faced

We recently migrated our Hudi version from 0.10.1 to 0.15.0 in our Spark jobs.
Hudi options at that time:

'hoodie.table.name': 'table_name',
'hoodie.datasource.write.table.name':'table_name',
'hoodie.datasource.write.operation': 'upsert',
'hoodie.datasource.write.keygenerator.class': 'org.apache.hudi.keygen.ComplexKeyGenerator',
'hoodie.datasource.write.recordkey.field': hudi_record_key,
'hoodie.datasource.write.partitionpath.field': "txn_date",
'hoodie.datasource.write.precombine.field': 'ingestion_timestamp',
'hoodie.upsert.shuffle.parallelism': 100,
'hoodie.insert.shuffle.parallelism': 100

At that time, Hive meta sync was not enabled.

We then had the requirement to sync data to Hive via HMS mode, for which we added the following options to the above:

'hoodie.datasource.meta.sync.enable': 'true',
'hoodie.datasource.hive_sync.mode': 'hms',
'hoodie.database.name': '##db_name##',
'hoodie.datasource.hive_sync.metastore.uris': '##hive_uri##'

When we triggered a new Spark job after that to write new data to the same Hudi table, the job ran successfully and added the new data, but did not update the Hudi properties.
Now, whenever we trigger the Spark job to write new data, it fails with a NullPointerException:

at org.apache.hudi.common.table.timeline.TimelineUtils.lambda$null$5(TimelineUtils.java:114)
at java.util.HashMap.forEach(HashMap.java:1290)
at org.apache.hudi.common.table.timeline.TimelineUtils.lambda$getDroppedPartitions$6(TimelineUtils.java:113)

To Reproduce

Steps to reproduce the behavior:

  1. Migrate the Hudi version from 0.10.1 to 0.15.0 and start writing to the Hudi table without Hive sync.
  2. Add Hive meta sync related properties to the Hudi options.
  3. Write to the Hudi table via Spark jobs.

Expected behavior

  • The job should complete successfully.
  • Hudi table to be updated with new data.
  • Hudi options to include hoodie.database.name.
  • Metadata to be synced to hive successfully.

Environment Description

  • Hudi version :
    0.15.0

  • Spark version :
    3.3.2

  • Hive version :
    3.1.2

  • Hadoop version :
    3.0.0

  • Storage (HDFS/S3/GCS..) :
    S3

  • Running on Docker? (yes/no) :
    no


Stacktrace

org.apache.hudi.exception.HoodieMetaSyncException: Could not sync using the meta sync class org.apache.hudi.hive.HiveSyncTool
	at org.apache.hudi.sync.common.util.SyncUtilHelpers.runHoodieMetaSync(SyncUtilHelpers.java:81)
	at org.apache.hudi.HoodieSparkSqlWriterInternal.$anonfun$metaSync$2(HoodieSparkSqlWriter.scala:1015)
	at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
	at org.apache.hudi.HoodieSparkSqlWriterInternal.metaSync(HoodieSparkSqlWriter.scala:1013)
	at org.apache.hudi.HoodieSparkSqlWriterInternal.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:1112)
	at org.apache.hudi.HoodieSparkSqlWriterInternal.writeInternal(HoodieSparkSqlWriter.scala:508)
	at org.apache.hudi.HoodieSparkSqlWriterInternal.write(HoodieSparkSqlWriter.scala:187)
	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:125)
	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:168)
	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:49)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.$anonfun$sideEffectResult$1(commands.scala:82)
	at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:80)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:79)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:91)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$3(QueryExecution.scala:256)
	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:165)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$2(QueryExecution.scala:256)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$9(SQLExecution.scala:258)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:448)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:203)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1073)
	at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:131)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:398)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$1(QueryExecution.scala:255)
	at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$withMVTagsIfNecessary(QueryExecution.scala:238)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:251)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:244)
	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:519)
	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:106)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:519)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:339)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:335)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:495)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$1(QueryExecution.scala:244)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:395)
	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:244)
	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:198)
	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:189)
	at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:305)
	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:964)
	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:429)
	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:396)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:250)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
	at py4j.Gateway.invoke(Gateway.java:306)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
	at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hudi.exception.HoodieException: Got runtime exception when hive syncing LH820YOhaC01kt
	at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:170)
	at org.apache.hudi.sync.common.util.SyncUtilHelpers.runHoodieMetaSync(SyncUtilHelpers.java:79)
	... 58 more
Caused by: java.lang.NullPointerException
	at org.apache.hudi.common.table.timeline.TimelineUtils.lambda$null$5(TimelineUtils.java:114)
	at java.util.HashMap.forEach(HashMap.java:1290)
	at org.apache.hudi.common.table.timeline.TimelineUtils.lambda$getDroppedPartitions$6(TimelineUtils.java:113)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
	at org.apache.hudi.common.table.timeline.TimelineUtils.getDroppedPartitions(TimelineUtils.java:110)
	at org.apache.hudi.sync.common.HoodieSyncClient.getDroppedPartitionsSince(HoodieSyncClient.java:97)
	at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:289)
	at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:179)
	at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:167)
	... 59 more

@danny0405 (Contributor)
It looks like the clean instant is missing:

          try {
            HoodieCleanMetadata cleanMetadata = TimelineMetadataUtils.deserializeHoodieCleanMetadata(cleanerTimeline.getInstantDetails(instant).get());
            cleanMetadata.getPartitionMetadata().forEach((partition, partitionMetadata) -> {
              if (partitionMetadata.getIsPartitionDeleted()) {
                partitionToLatestDeleteTimestamp.put(partition, instant.getTimestamp());
              }
            });
          }
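For context, here is an illustrative sketch (not Hudi code) of why that loop can throw: in Java, a `Boolean` getter that returns null throws a `NullPointerException` the moment it is auto-unboxed in an `if` condition. A field such as `isPartitionDeleted`, absent from clean metadata written by an older release like 0.10.1, could deserialize to null and fail in exactly this way. The class and method names below are hypothetical:

```java
public class UnboxingNpeDemo {
    // Null-safe variant of the check: treats a missing (null) flag as "not deleted".
    static boolean isDeleted(Boolean flag) {
        return Boolean.TRUE.equals(flag);
    }

    public static void main(String[] args) {
        Boolean fromOldMetadata = null; // simulates a field absent from metadata written by an old release

        try {
            // Auto-unboxing a null Boolean in a condition throws NullPointerException,
            // the same shape of failure as in the stacktrace above.
            if (fromOldMetadata) {
                System.out.println("deleted");
            }
        } catch (NullPointerException e) {
            System.out.println("unboxing null Boolean -> NPE");
        }

        // The guard avoids the NPE:
        System.out.println("isDeleted = " + isDeleted(fromOldMetadata));
    }
}
```

If this is the failure mode, a `Boolean.TRUE.equals(...)` style guard (or backfilling the missing field in the metadata) would avoid the crash.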

@bytesid19 (Author)

@danny0405 What is the above code doing, and how will it help solve our use case?

@danny0405 (Contributor)

Did you check your clean metadata file to see what the partition metadata looks like?

@bytesid19 (Author) commented Sep 19, 2024

@danny0405 Before that, I want to understand what you think might have happened that led you to propose the above.
Why is hive sync throwing a null pointer in this case?

It would have worked fine without exceptions had we migrated directly from 0.10.1 to 0.15.0 with hive sync enabled in the Hudi properties.

But we migrated from 0.10.1 to 0.15.0 without hive sync initially, and then enabled it in the Hudi options.

Can you explain what difference would have led to this issue?

@bytesid19 (Author)

Also, the partition metadata in my .clean file looks like:

{
   "2023-11-02":{
      "partitionPath":"2023-11-02",
      "policy":"KEEP_LATEST_COMMITS",
      "deletePathPatterns":[
         "e3af0439-4282-4505-88f9-0c70b39c8882-0_0-189-1130_20231126175139078.parquet"
      ],
      "successDeleteFiles":[
         "e3af0439-4282-4505-88f9-0c70b39c8882-0_0-189-1130_20231126175139078.parquet"
      ],
      "failedDeleteFiles":[
         
      ],
      "isPartitionDeleted":false
   },
   "2023-11-03":{
      "partitionPath":"2023-11-03",
      "policy":"KEEP_LATEST_COMMITS",
      "deletePathPatterns":[
         "bc234ec1-0b3f-4a15-86b6-7df504f80551-0_0-53-1247_20231126174353127.parquet",
         "bc234ec1-0b3f-4a15-86b6-7df504f80551-0_0-189-1104_20231126172830225.parquet",
         "bc234ec1-0b3f-4a15-86b6-7df504f80551-0_0-245-1376_20231126173720486.parquet"
      ],
      "successDeleteFiles":[
         "bc234ec1-0b3f-4a15-86b6-7df504f80551-0_0-53-1247_20231126174353127.parquet",
         "bc234ec1-0b3f-4a15-86b6-7df504f80551-0_0-189-1104_20231126172830225.parquet",
         "bc234ec1-0b3f-4a15-86b6-7df504f80551-0_0-245-1376_20231126173720486.parquet"
      ],
      "failedDeleteFiles":[
         
      ],
      "isPartitionDeleted":false
   }
}

@danny0405 (Contributor) commented Sep 19, 2024

I have no idea why the hive sync could affect the hoodie properties. Did you diff the hoodie.properties between the two kinds of operations?

Why is hive sync throwing null pointer in this case

Just from the error stacktrace, it looks related to the clean metadata.

@bytesid19 (Author) commented Sep 20, 2024

Hudi options while writing to hudi table when we migrated from 0.10.1 to 0.15.0

'hoodie.table.name': 'table_name',
'hoodie.datasource.write.table.name':'table_name',
'hoodie.datasource.write.operation': 'upsert',
'hoodie.datasource.write.keygenerator.class': 'org.apache.hudi.keygen.ComplexKeyGenerator',
'hoodie.datasource.write.recordkey.field': hudi_record_key,
'hoodie.datasource.write.partitionpath.field': "txn_date",
'hoodie.datasource.write.precombine.field': 'ingestion_timestamp',
'hoodie.upsert.shuffle.parallelism': 100,
'hoodie.insert.shuffle.parallelism': 100

hoodie.properties after that migration

#Updated at 2024-09-03T12:04:40.615Z
#Tue Sep 03 12:04:40 UTC 2024
hoodie.table.precombine.field=ingestion_timestamp
hoodie.table.partition.fields=txn_date
hoodie.table.type=COPY_ON_WRITE
hoodie.archivelog.folder=archived
hoodie.populate.meta.fields=true
hoodie.timeline.layout.version=1
hoodie.table.version=6
hoodie.table.metadata.partitions=files
hoodie.table.recordkey.fields=record_id
hoodie.table.base.file.format=PARQUET
hoodie.datasource.write.partitionpath.urlencode=false
hoodie.table.keygenerator.class=org.apache.hudi.keygen.ComplexKeyGenerator
hoodie.table.name=table_name
hoodie.table.metadata.partitions.inflight=
hoodie.datasource.write.hive_style_partitioning=false
hoodie.table.checksum=849148821

Extra Hudi options added later for hive sync:

'hoodie.datasource.meta.sync.enable': 'true',
'hoodie.datasource.hive_sync.mode': 'hms',
'hoodie.database.name': '##db_name##',
'hoodie.datasource.hive_sync.metastore.uris': '##hive_uri##'

After adding the above options, hoodie.properties never changed.

@bytesid19 (Author)

@danny0405 We are also trying the Hive sync tool, but that is also failing:

Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1380)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1361)
        at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:119)
        at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:108)
        at org.apache.hudi.hive.HiveSyncTool.main(HiveSyncTool.java:547)

when running

./run_sync_tool.sh  --sync-mode hms --metastore-uris uri:9083/qbdb --partitioned-by txn_date --base-path s3://abc/bytesid/table --database test --table table

Hadoop version: 3.3.0
Hive version: 3.1.3

Please help here.

@danny0405 (Contributor)

Exception in thread "main" java.lang.NoSuchMethodError

This is a Hadoop jar conflict.
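When chasing a jar conflict like this, a generic JVM trick (not specific to Hudi) is to print which jar the conflicting class was actually loaded from, then check whether that jar's version matches what Hadoop and Hive expect on the classpath. A minimal sketch:

```java
public class JarLocator {
    // Returns the location (jar or directory) a class was loaded from,
    // or a marker string for JDK classes on the bootstrap classpath / runtime image.
    static String locate(Class<?> cls) {
        java.security.CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "<bootstrap/runtime>";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // For the NoSuchMethodError above, you would inspect the Guava class, e.g.:
        // System.out.println(locate(Class.forName("com.google.common.base.Preconditions")));
        System.out.println(locate(JarLocator.class));
        System.out.println(locate(String.class));
    }
}
```

If the reported jar is an older Guava pulled in by a conflicting Hadoop distribution, that explains the missing `Preconditions.checkArgument` overload.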

@bytesid19 (Author)

We were able to solve that, but are still getting other exceptions.
Steps:

  • git clone https://github.com/apache/hudi.git (master)
  • built the whole package
  • installed hadoop 2.10.2 - set HADOOP_HOME and path
  • installed hive 2.3.4 - set HIVE_HOME and path
  • ran ./run_sync_tool.sh --sync-mode hms --metastore-uris uri:9083/qbdb --partitioned-by txn_date --base-path s3://abc/bytesid/table --database test --table table

Getting exception

Exception in thread "main" org.apache.hudi.exception.HoodieException: Unable to create org.apache.hudi.storage.hadoop.HoodieHadoopStorage
        at org.apache.hudi.storage.HoodieStorageUtils.getStorage(HoodieStorageUtils.java:44)
        at org.apache.hudi.common.table.HoodieTableMetaClient.getStorage(HoodieTableMetaClient.java:395)
        at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:103)
        at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:909)
        at org.apache.hudi.sync.common.HoodieSyncTool.buildMetaClient(HoodieSyncTool.java:75)
        at org.apache.hudi.hive.HiveSyncTool.lambda$new$0(HiveSyncTool.java:128)
        at org.apache.hudi.common.util.Option.orElseGet(Option.java:153)
        at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:128)
        at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:108)
        at org.apache.hudi.hive.HiveSyncTool.main(HiveSyncTool.java:547)
Caused by: org.apache.hudi.exception.HoodieException: Unable to instantiate class org.apache.hudi.storage.hadoop.HoodieHadoopStorage
        at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:75)
        at org.apache.hudi.storage.HoodieStorageUtils.getStorage(HoodieStorageUtils.java:41)
        ... 9 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:73)
        ... 10 more
Caused by: org.apache.hudi.exception.HoodieIOException: Failed to get instance of org.apache.hadoop.fs.FileSystem
        at org.apache.hudi.hadoop.fs.HadoopFSUtils.getFs(HadoopFSUtils.java:128)
        at org.apache.hudi.hadoop.fs.HadoopFSUtils.getFs(HadoopFSUtils.java:119)
        at org.apache.hudi.storage.hadoop.HoodieHadoopStorage.<init>(HoodieHadoopStorage.java:64)
        ... 15 more
Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3"
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3239)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3259)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3310)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3278)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
        at org.apache.hudi.hadoop.fs.HadoopFSUtils.getFs(HadoopFSUtils.java:126)
        ... 17 more

Please help here.

@danny0405 (Contributor)

You need to add the AWS SDK jar which contains the S3 filesystem.
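For background on the "No FileSystem for scheme" error, Hadoop resolves a URI scheme to a filesystem class via an `fs.<scheme>.impl` configuration entry (or a ServiceLoader provider on the classpath, which is what the hadoop-aws module supplies for `s3a`). The sketch below is a simplified model of that lookup using a plain map, not the actual Hadoop source:

```java
import java.util.HashMap;
import java.util.Map;

public class SchemeLookupDemo {
    // Simplified model of Hadoop's FileSystem class resolution: look up the
    // implementation class name for a URI scheme under "fs.<scheme>.impl".
    static String resolve(Map<String, String> conf, String scheme) {
        String impl = conf.get("fs." + scheme + ".impl");
        if (impl == null) {
            // Mirrors the shape of UnsupportedFileSystemException in the stacktrace.
            throw new IllegalStateException("No FileSystem for scheme \"" + scheme + "\"");
        }
        return impl;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // With hadoop-aws and the AWS SDK on the classpath, the s3a implementation
        // is available; mapping the bare "s3" scheme to it is a common workaround:
        conf.put("fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
        System.out.println(resolve(conf, "s3"));

        try {
            resolve(new HashMap<>(), "s3"); // no mapping -> same error shape as above
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In practice this means putting hadoop-aws and the matching AWS SDK bundle on the sync tool's classpath, and setting `fs.s3.impl` (or using `s3a://` paths) if the bare `s3` scheme has no provider.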

ad1happy2go moved this from ⏳ Awaiting Triage to 🏁 Triaged in Hudi Issue Support on Oct 22, 2024
@ad1happy2go (Collaborator)

@bytesid19 Let us know in case you need any more help here. Feel free to close if we are all good. Thanks.

@ad1happy2go (Collaborator)

@bytesid19 Closing the issue. Feel free to reopen or create a new one in case of any more issues. Thanks.

github-project-automation bot moved this from 🏁 Triaged to ✅ Done in Hudi Issue Support on Nov 7, 2024