Flaky-test: BookKeeperPackagesStorageTest.setUp #16860

Closed
codelipenghui opened this issue Jul 29, 2022 · 7 comments

@codelipenghui
Contributor

example failure

Error:  Tests run: 31, Failures: 1, Errors: 0, Skipped: 28, Time elapsed: 45.57 s <<< FAILURE! - in org.apache.pulsar.packages.management.storage.bookkeeper.BookKeeperPackagesStorageTest
  Error:  setUp(org.apache.pulsar.packages.management.storage.bookkeeper.BookKeeperPackagesStorageTest)  Time elapsed: 24.472 s  <<< FAILURE!
  org.apache.bookkeeper.bookie.BookieException$MetadataStoreException: Failed to initialize metadata bookie driver
  	at org.apache.bookkeeper.bookie.BookieResources.createMetadataDriver(BookieResources.java:64)
  	at org.apache.pulsar.packages.management.storage.bookkeeper.bookkeeper.test.BookKeeperClusterTestCase$ServerTester.<init>(BookKeeperClusterTestCase.java:849)
  	at org.apache.pulsar.packages.management.storage.bookkeeper.bookkeeper.test.BookKeeperClusterTestCase.startBookie(BookKeeperClusterTestCase.java:674)
  	at org.apache.pulsar.packages.management.storage.bookkeeper.bookkeeper.test.BookKeeperClusterTestCase.startAndAddBookie(BookKeeperClusterTestCase.java:654)
  	at org.apache.pulsar.packages.management.storage.bookkeeper.bookkeeper.test.BookKeeperClusterTestCase.startNewBookieAndReturnAddress(BookKeeperClusterTestCase.java:643)
  	at org.apache.pulsar.packages.management.storage.bookkeeper.bookkeeper.test.BookKeeperClusterTestCase.startNewBookie(BookKeeperClusterTestCase.java:636)
  	at org.apache.pulsar.packages.management.storage.bookkeeper.bookkeeper.test.BookKeeperClusterTestCase.startBKCluster(BookKeeperClusterTestCase.java:267)
  	at org.apache.pulsar.packages.management.storage.bookkeeper.bookkeeper.test.BookKeeperClusterTestCase.setUp(BookKeeperClusterTestCase.java:181)
  	at org.apache.pulsar.packages.management.storage.bookkeeper.bookkeeper.test.BookKeeperClusterTestCase.setUp(BookKeeperClusterTestCase.java:166)
  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
  	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:132)
  	at org.testng.internal.MethodInvocationHelper.invokeMethodConsideringTimeout(MethodInvocationHelper.java:61)
  	at org.testng.internal.ConfigInvoker.invokeConfigurationMethod(ConfigInvoker.java:366)
  	at org.testng.internal.ConfigInvoker.invokeConfigurations(ConfigInvoker.java:320)
  	at org.testng.internal.TestInvoker.runConfigMethods(TestInvoker.java:701)
  	at org.testng.internal.TestInvoker.invokeMethod(TestInvoker.java:527)
  	at org.testng.internal.TestInvoker.invokeTestMethod(TestInvoker.java:174)
  	at org.testng.internal.MethodRunner.runInSequence(MethodRunner.java:46)
  	at org.testng.internal.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:822)
  	at org.testng.internal.TestInvoker.invokeTestMethods(TestInvoker.java:147)
  	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146)
  	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:128)
  	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
  	at org.testng.TestRunner.privateRun(TestRunner.java:764)
  	at org.testng.TestRunner.run(TestRunner.java:585)
  	at org.testng.SuiteRunner.runTest(SuiteRunner.java:384)
  	at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:378)
  	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:337)
  	at org.testng.SuiteRunner.run(SuiteRunner.java:286)
  	at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:53)
  	at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:96)
  	at org.testng.TestNG.runSuitesSequentially(TestNG.java:1218)
  	at org.testng.TestNG.runSuitesLocally(TestNG.java:1140)
  	at org.testng.TestNG.runSuites(TestNG.java:1069)
  	at org.testng.TestNG.run(TestNG.java:1037)
  	at org.apache.maven.surefire.testng.TestNGExecutor.run(TestNGExecutor.java:135)
  	at org.apache.maven.surefire.testng.TestNGDirectoryTestSuite.executeSingleClass(TestNGDirectoryTestSuite.java:112)
  	at org.apache.maven.surefire.testng.TestNGDirectoryTestSuite.executeLazy(TestNGDirectoryTestSuite.java:123)
  	at org.apache.maven.surefire.testng.TestNGDirectoryTestSuite.execute(TestNGDirectoryTestSuite.java:90)
  	at org.apache.maven.surefire.testng.TestNGProvider.invoke(TestNGProvider.java:146)
  	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
  	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
  	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
  	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
  Caused by: org.apache.bookkeeper.meta.exceptions.MetadataException: Failed to create zookeeper client to localhost:37709
  	at org.apache.bookkeeper.meta.zk.ZKMetadataDriverBase.initialize(ZKMetadataDriverBase.java:236)
  	at org.apache.bookkeeper.meta.zk.ZKMetadataBookieDriver.initialize(ZKMetadataBookieDriver.java:55)
  	at org.apache.bookkeeper.bookie.BookieResources.createMetadataDriver(BookieResources.java:61)
  	... 46 more
  Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
  	at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
  	at org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase.waitForConnection(ZooKeeperWatcherBase.java:159)
  	at org.apache.bookkeeper.zookeeper.ZooKeeperClient$Builder.build(ZooKeeperClient.java:260)
  	at org.apache.bookkeeper.meta.zk.ZKMetadataDriverBase.initialize(ZKMetadataDriverBase.java:216)
  	... 48 more
  
  [INFO] 
@codelipenghui
Contributor Author

Looks similar to #15773, but the stack trace is not exactly the same.

@coderzc
Member

coderzc commented Sep 23, 2022

@poorbarcode
Contributor

congbobo184 added a commit that referenced this issue Sep 29, 2022
Fixes: #15773  #16863 #16860

### Motivation
```
  Error:  Tests run: 11, Failures: 1, Errors: 0, Skipped: 3, Time elapsed: 87.06 s <<< FAILURE! - in org.apache.pulsar.packages.management.storage.bookkeeper.BookKeeperPackagesStorageTest
  Error:  setUp(org.apache.pulsar.packages.management.storage.bookkeeper.BookKeeperPackagesStorageTest)  Time elapsed: 13.089 s  <<< FAILURE!
  org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
  	at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
  	at org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase.waitForConnection(ZooKeeperWatcherBase.java:159)
  	at org.apache.bookkeeper.zookeeper.ZooKeeperClient$Builder.build(ZooKeeperClient.java:260)
  	at org.apache.bookkeeper.test.ZooKeeperUtil.restartCluster(ZooKeeperUtil.java:133)
  	at org.apache.bookkeeper.test.ZooKeeperUtil.startCluster(ZooKeeperUtil.java:104)
  	at org.apache.pulsar.packages.management.storage.bookkeeper.bookkeeper.test.BookKeeperClusterTestCase.startZKCluster(BookKeeperClusterTestCase.java:238)
  	at org.apache.pulsar.packages.management.storage.bookkeeper.bookkeeper.test.BookKeeperClusterTestCase.setUp(BookKeeperClusterTestCase.java:178)
  	at org.apache.pulsar.packages.management.storage.bookkeeper.bookkeeper.test.BookKeeperClusterTestCase.setUp(BookKeeperClusterTestCase.java:166)
  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
  	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:132)
  	at org.testng.internal.MethodInvocationHelper.invokeMethodConsideringTimeout(MethodInvocationHelper.java:61)
  	at org.testng.internal.ConfigInvoker.invokeConfigurationMethod(ConfigInvoker.java:366)
  	at org.testng.internal.ConfigInvoker.invokeConfigurations(ConfigInvoker.java:320)
  	at org.testng.internal.TestInvoker.runConfigMethods(TestInvoker.java:701)
  	at org.testng.internal.TestInvoker.invokeMethod(TestInvoker.java:527)
  	at org.testng.internal.TestInvoker.invokeTestMethod(TestInvoker.java:174)
  	at org.testng.internal.MethodRunner.runInSequence(MethodRunner.java:46)
  	at org.testng.internal.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:822)
  	at org.testng.internal.TestInvoker.invokeTestMethods(TestInvoker.java:147)
  	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146)
  	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:128)
  	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
  	at org.testng.TestRunner.privateRun(TestRunner.java:764)
  	at org.testng.TestRunner.run(TestRunner.java:585)
  	at org.testng.SuiteRunner.runTest(SuiteRunner.java:384)
  	at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:378)
  	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:337)
  	at org.testng.SuiteRunner.run(SuiteRunner.java:286)
  	at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:53)
  	at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:96)
  	at org.testng.TestNG.runSuitesSequentially(TestNG.java:1218)
  	at org.testng.TestNG.runSuitesLocally(TestNG.java:1140)
  	at org.testng.TestNG.runSuites(TestNG.java:1069)
  	at org.testng.TestNG.run(TestNG.java:1037)
  	at org.apache.maven.surefire.testng.TestNGExecutor.run(TestNGExecutor.java:135)
  	at org.apache.maven.surefire.testng.TestNGDirectoryTestSuite.executeSingleClass(TestNGDirectoryTestSuite.java:112)
  	at org.apache.maven.surefire.testng.TestNGDirectoryTestSuite.executeLazy(TestNGDirectoryTestSuite.java:123)
  	at org.apache.maven.surefire.testng.TestNGDirectoryTestSuite.execute(TestNGDirectoryTestSuite.java:90)
  	at org.apache.maven.surefire.testng.TestNGProvider.invoke(TestNGProvider.java:146)
  	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
  	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
  	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
  	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
```

The root cause is that the ZooKeeper client randomly picks between the IPv4 and IPv6 addresses when it resolves `localhost`. The connection succeeds over IPv4 but fails over IPv6, so if the client keeps picking the IPv6 address, the connection times out.

https://github.com/apache/zookeeper/blob/bc1b231c9e32667b2978c86a6a64833470973dbd/zookeeper-server/src/main/java/org/apache/zookeeper/client/StaticHostProvider.java#L140-L146
Thanks to @poorbarcode for helping me locate the problem.
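
To make the behaviour described above easier to reproduce locally, here is a minimal sketch that lists every address a host name resolves to and then picks one at random. It is not the ZooKeeper code linked above; the class `LocalhostResolutionSketch` and its helpers `resolveAll` and `pickRandom` are hypothetical names used only for illustration, and the output depends on the resolver configuration of the machine that runs it.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class LocalhostResolutionSketch {

    /** Returns every address the local resolver reports for the given host. */
    static List<InetAddress> resolveAll(String host) {
        try {
            return new ArrayList<>(Arrays.asList(InetAddress.getAllByName(host)));
        } catch (UnknownHostException e) {
            // UnknownHostException is an IOException, so it can be wrapped unchecked.
            throw new UncheckedIOException(e);
        }
    }

    /** Shuffles a copy of the candidates and returns the first one (illustrative random pick). */
    static InetAddress pickRandom(List<InetAddress> candidates, Random random) {
        List<InetAddress> shuffled = new ArrayList<>(candidates);
        Collections.shuffle(shuffled, random);
        return shuffled.get(0);
    }

    public static void main(String[] args) {
        // On a dual-stack machine this list may contain both 127.0.0.1 and ::1.
        List<InetAddress> candidates = resolveAll("localhost");
        System.out.println("localhost resolves to: " + candidates);

        Random random = new Random();
        for (int attempt = 0; attempt < 5; attempt++) {
            InetAddress picked = pickRandom(candidates, random);
            System.out.println("attempt " + attempt + " -> " + picked.getHostAddress());
        }
    }
}
```

If the list printed for `localhost` contains an IPv6 entry and nothing is listening on that address, some attempts would target an unreachable endpoint, which matches the intermittent `ConnectionLoss` seen in the trace.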

### Modifications
- Add `@AfterMethod(alwaysRun = true)`.
- Use the address instead of the host name.
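
One way to read the second modification, sketched under the assumption that the test connects to a ZooKeeper server on the local machine: build the connect string from a literal loopback address so that no name resolution takes place. The class `LoopbackConnectStringSketch` and its method `connectString` are hypothetical and are not taken from the actual patch; the port 37709 is the one that appears in the failure log.

```java
public class LoopbackConnectStringSketch {

    /** Literal IPv4 loopback address; using it avoids resolving "localhost". */
    static final String LOOPBACK_IPV4 = "127.0.0.1";

    /** Builds a host:port connect string that always targets the IPv4 loopback address. */
    static String connectString(int port) {
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return LOOPBACK_IPV4 + ":" + port;
    }

    public static void main(String[] args) {
        // Port taken from the failure log ("localhost:37709").
        System.out.println(connectString(37709)); // prints 127.0.0.1:37709
    }
}
```

As for the first modification, TestNG's `alwaysRun = true` on an after-method makes it run even when an earlier method failed or was skipped, so a failed `setUp` should still be followed by cleanup.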

### Documentation

- [x] `doc-not-needed` 

### Matching PR in the forked repository

PR in forked repository: 

- congbobo184#1
@congbobo184
Contributor

#17865 has fixed it
