Renaissance tests fail on test-skytap-ubuntu2004-ppc64le-1 #2358

Closed
Haroon-Khel opened this issue Oct 18, 2021 · 7 comments
@Haroon-Khel
Contributor

Several Renaissance tests from the jdk11 hotspot extended perf suite fail only on test-skytap-ubuntu2004-ppc64le-1.

They pass on test-osuosl-ubuntu1804-ppc64le-1: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1776/console

Latest failure: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1775/console

Failed tests:

renaissance-als_0
renaissance-chi-square_0
renaissance-dec-tree_0
renaissance-gauss-mix_0
renaissance-log-regression_0
renaissance-movie-lens_0

The log is too long to post. It can be found here:
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/1775/tapResults/

@sxa
Member

sxa commented Oct 19, 2021

Is this likely to be the same underlying problem as this: #2355 ?

@Haroon-Khel
Contributor Author

> Is this likely to be the same underlying problem as this: #2355 ?

Possibly. The error logs do seem network related.

@Haroon-Khel
Contributor Author

https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/2546/tapResults/

21/11/15 11:16:52 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
21/11/15 11:16:52 WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
21/11/15 11:16:52 INFO SecurityManager: Changing view acls to: jenkins
21/11/15 11:16:52 INFO SecurityManager: Changing modify acls to: jenkins
21/11/15 11:16:52 INFO SecurityManager: Changing view acls groups to: 
21/11/15 11:16:52 INFO SecurityManager: Changing modify acls groups to: 
21/11/15 11:16:52 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(jenkins); groups with view permissions: Set(); users  with modify permissions: Set(jenkins); groups with modify permissions: Set()
21/11/15 11:16:52 INFO PlatformDependent: Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system unstability.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
21/11/15 11:16:52 ERROR SparkContext: Error initializing SparkContext.
java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries! Consider explicitly setting the appropriate port for the service 'sparkDriver' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
	at java.base/sun.nio.ch.Net.bind0(Native Method)
	at java.base/sun.nio.ch.Net.bind(Net.java:459)
	at java.base/sun.nio.ch.Net.bind(Net.java:448)
	at java.base/sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:227)
	at java.base/sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:80)
	at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1089)
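
`Cannot assign requested address` is the error the operating system returns when a process tries to bind to an address that is not configured on any local interface. One common cause is a hostname that resolves to a wrong or stale IP. The following is a minimal Python sketch for checking that on a test machine, not part of Spark or the test framework. The function names `can_bind` and `check_local_hostname` are illustrative assumptions.

```python
import errno
import socket
from typing import Dict, Tuple


def can_bind(address: str, port: int = 0) -> Tuple[bool, str]:
    """Try to bind a TCP socket to (address, port).

    Port 0 asks the OS for any free port, which mirrors the
    "could not bind on port 0" lines in the log.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, port))
        except OSError as exc:
            # EADDRNOTAVAIL corresponds to "Cannot assign requested address".
            return False, errno.errorcode.get(exc.errno, str(exc))
        return True, "ok"


def check_local_hostname() -> Dict[str, object]:
    """Resolve this machine's hostname and test whether the result is bindable."""
    hostname = socket.gethostname()
    try:
        resolved = socket.gethostbyname(hostname)
    except OSError as exc:  # covers socket.gaierror and socket.herror
        return {"hostname": hostname, "resolved": None,
                "bindable": False, "detail": f"resolution failed: {exc}"}
    ok, detail = can_bind(resolved)
    return {"hostname": hostname, "resolved": resolved,
            "bindable": ok, "detail": detail}


if __name__ == "__main__":
    print(check_local_hostname())
```

If `bindable` is false with detail `EADDRNOTAVAIL`, the hostname resolves to an address the machine does not own, which would be consistent with the failure shown in the log.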

@Haroon-Khel
Contributor Author

I think the problem described in #2355 (comment) affects this test case too. I have added 10.0.0.1 to the machine's /etc/hosts, and renaissance-als_0 now passes when run on the machine over ssh.
Rerunning all of the tests: https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/2550/console
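
To confirm that a hosts-file fix like this is in place on a machine, the file can be parsed and checked for the expected mapping. This is a small sketch only: the comment gives the IP (10.0.0.1) but not the name it was mapped to, so `example-test-host` is a hypothetical placeholder, and `parse_hosts` and `has_entry` are illustrative helper names.

```python
from typing import Dict, List


def parse_hosts(text: str) -> Dict[str, List[str]]:
    """Map each IP address in hosts-file text to the names listed for it."""
    entries: Dict[str, List[str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue  # skip blank and comment-only lines
        ip, *names = line.split()
        entries.setdefault(ip, []).extend(names)
    return entries


def has_entry(text: str, ip: str, name: str) -> bool:
    """Return True if `name` is listed for `ip` in the hosts-file text."""
    return name in parse_hosts(text).get(ip, [])


if __name__ == "__main__":
    # On a real machine, read the actual file instead of a sample string.
    with open("/etc/hosts", encoding="utf-8") as handle:
        content = handle.read()
    # "example-test-host" is a hypothetical name, not taken from the thread.
    print(has_entry(content, "10.0.0.1", "example-test-host"))
```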

@Haroon-Khel
Contributor Author

The tests passed 👍🏻

@Haroon-Khel
Contributor Author

I've re-enabled the machine for testing. Closing.

@sxa
Member

sxa commented Nov 23, 2021

@Haroon-Khel What is going to happen the next time the playbooks are run on the machine?
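
One way to keep a manual /etc/hosts change repeatable, whether it is applied from a playbook or a script, is an idempotent "ensure this entry exists" step that changes nothing when the entry is already there. The sketch below illustrates that idea in Python and is not taken from the project's playbooks. `ensure_hosts_entry` is an illustrative name, and the hostname in the usage note is hypothetical.

```python
def ensure_hosts_entry(text: str, ip: str, name: str) -> str:
    """Return hosts-file text that contains `ip name`.

    The text is returned unchanged when the mapping already exists,
    so applying the function repeatedly is safe.
    """
    for raw_line in text.splitlines():
        fields = raw_line.split("#", 1)[0].split()  # ignore comments
        if fields and fields[0] == ip and name in fields[1:]:
            return text  # already present: nothing to do
    if text and not text.endswith("\n"):
        text += "\n"  # make sure the new entry starts on its own line
    return text + f"{ip} {name}\n"
```

For example, `ensure_hosts_entry("127.0.0.1 localhost\n", "10.0.0.1", "example-test-host")` appends one line, and passing the result back in returns it unchanged.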
