[🐛 Bug]: the status endpoint of a restarted node is called multiple times #13646

Closed
joerg1985 opened this issue Mar 1, 2024 · 5 comments · Fixed by #13647

Comments

@joerg1985
Member

What happened?

Restarting a node registers a new node with the hub under the same status URL.
The restarted node then receives multiple calls to its status endpoint.

The GridModel does detect this state and updates its internal model correctly, but I am not sure about the Map<NodeId, Node> nodes inside the LocalDistributor. The HealthCheck inside the RemoteNode is not stopped and keeps calling the status endpoint.
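
To illustrate the suspected cause, here is a minimal, self-contained sketch. It is a simplified model, not Selenium's actual LocalDistributor code: the class `NodeRegistrySketch` and its methods are hypothetical, and a plain `UUID` stands in for `NodeId`. It only shows that a map keyed by node ID keeps one entry per registration, so a node that restarts under the same URI leaves stale entries behind unless something removes them.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical model of a registry keyed by node ID (not Selenium's real classes).
final class NodeRegistrySketch {
  // Mirrors the idea of Map<NodeId, Node>: the key is the ID, the value is just the URI.
  private final Map<UUID, URI> nodes = new LinkedHashMap<>();

  // Each (re)start of a node process gets a fresh ID, so this always adds an entry.
  UUID register(URI uri) {
    UUID id = UUID.randomUUID();
    nodes.put(id, uri);
    return id;
  }

  // Number of registered entries that point at the given URI.
  long entriesFor(URI uri) {
    return nodes.values().stream().filter(uri::equals).count();
  }

  public static void main(String[] args) {
    NodeRegistrySketch registry = new NodeRegistrySketch();
    URI nodeUri = URI.create("http://192.168.1.26:5555");
    registry.register(nodeUri); // first start
    registry.register(nodeUri); // restart: same URI, new ID
    registry.register(nodeUri); // another restart
    System.out.println(registry.entriesFor(nodeUri)); // prints 3
  }
}
```

If every entry owns its own health check, as the report suggests, each stale entry adds one more status request against the same URI.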

How can we reproduce the issue?

1. run `java -jar selenium-server-4.17.0.jar hub --host 127.0.0.1 --port 4567`
2. run `java -Dwebdriver.chrome.driver=../chromedriver.exe -jar selenium-server-4.17.0.jar node --port 5555 --hub http://127.0.0.1:4567`
3. wait 30s, then terminate the process from step 2
4. rerun the command from step 2; repeat a few times

Relevant log output

N/A

Operating System

Win 10 x64

Selenium version

4.17.0

What are the browser(s) and version(s) where you see this issue?

N/A

What are the browser driver(s) and version(s) where you see this issue?

N/A

Are you using Selenium Grid?

4.17.0

github-actions bot commented Mar 1, 2024

@joerg1985, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then the I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@diemol
Member

diemol commented Mar 1, 2024

I don't fully understand. What do you mean by multiple times?

@joerg1985
Member Author

@diemol The number of calls to the /status endpoint grows with the number of restarts.

Let's say node A is restarted 9 times, while node B is not restarted.
Each time LocalDistributor.runNodeHealthChecks is executed, the /status endpoint of node A is called 10 times and the /status endpoint of node B is called only once.
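
The arithmetic can be checked with a small simulation. This is a sketch under stated assumptions: `HealthCheckRoundSketch`, `register`, and `runRound` are hypothetical names, the host names are made up, and the model assumes exactly one status request per registered entry per round, with stale entries never removed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of one health-check round (not Selenium's real code).
final class HealthCheckRoundSketch {
  // One element per registered node entry; the value is that entry's status URL.
  private final List<String> registeredStatusUrls = new ArrayList<>();

  void register(String statusUrl) {
    registeredStatusUrls.add(statusUrl);
  }

  // Simulates one round: one status call per registered entry, counted per URL.
  Map<String, Integer> runRound() {
    Map<String, Integer> calls = new HashMap<>();
    for (String url : registeredStatusUrls) {
      calls.merge(url, 1, Integer::sum);
    }
    return calls;
  }

  public static void main(String[] args) {
    String nodeA = "http://node-a:5555/status"; // hypothetical host
    String nodeB = "http://node-b:5555/status"; // hypothetical host
    HealthCheckRoundSketch hub = new HealthCheckRoundSketch();
    hub.register(nodeA); // initial start of node A
    hub.register(nodeB); // node B is never restarted
    for (int restart = 0; restart < 9; restart++) {
      hub.register(nodeA); // each restart adds an entry with the same URL
    }
    Map<String, Integer> calls = hub.runRound();
    System.out.println(calls.get(nodeA)); // prints 10
    System.out.println(calls.get(nodeB)); // prints 1
  }
}
```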

@joerg1985
Member Author

These are the corresponding hub logs. The restarted node is registered again on each restart, and all instances (38217db3-43be-4e33-bb74-979ee92aa572, 8a7f62b9-37dd-4f6f-ba29-0f02f4f721f5, 6058d8ef-889a-454a-b9d6-1487e7690dc3, da5fe55b-768b-44aa-91aa-042f781475af) share the same status URL, so the health check reports UP for every one of them.

The PR will ensure that only da5fe55b-768b-44aa-91aa-042f781475af is reported UP and all others are reported DOWN. The stale instances will then time out and be removed from the hub.

09:10:01.122 INFO [Hub.execute] - Started Selenium Hub 4.17.0 (revision e52b1be057*): http://127.0.0.1:4567
09:10:09.491 INFO [Node.<init>] - Binding additional locator mechanisms: relative
09:10:09.784 INFO [GridModel.setAvailability] - Switching Node 38217db3-43be-4e33-bb74-979ee92aa572 (uri: http://192.168.1.26:5555) from DOWN to UP
09:10:09.784 INFO [LocalDistributor.add] - Added node 38217db3-43be-4e33-bb74-979ee92aa572 at http://192.168.1.26:5555. Health check every 120s
09:10:16.422 INFO [Node.<init>] - Binding additional locator mechanisms: relative
09:10:16.499 INFO [GridModel.add] - Re-adding node with id 8a7f62b9-37dd-4f6f-ba29-0f02f4f721f5 and URI http://192.168.1.26:5555.
09:10:16.505 INFO [GridModel.setAvailability] - Switching Node 8a7f62b9-37dd-4f6f-ba29-0f02f4f721f5 (uri: http://192.168.1.26:5555) from DOWN to UP
09:10:16.506 INFO [LocalDistributor.add] - Added node 8a7f62b9-37dd-4f6f-ba29-0f02f4f721f5 at http://192.168.1.26:5555. Health check every 120s
09:10:23.483 INFO [Node.<init>] - Binding additional locator mechanisms: relative
09:10:23.555 INFO [GridModel.add] - Re-adding node with id 6058d8ef-889a-454a-b9d6-1487e7690dc3 and URI http://192.168.1.26:5555.
09:10:23.560 INFO [GridModel.setAvailability] - Switching Node 6058d8ef-889a-454a-b9d6-1487e7690dc3 (uri: http://192.168.1.26:5555) from DOWN to UP
09:10:23.560 INFO [LocalDistributor.add] - Added node 6058d8ef-889a-454a-b9d6-1487e7690dc3 at http://192.168.1.26:5555. Health check every 120s
09:10:32.462 INFO [Node.<init>] - Binding additional locator mechanisms: relative
09:10:32.538 INFO [GridModel.add] - Re-adding node with id da5fe55b-768b-44aa-91aa-042f781475af and URI http://192.168.1.26:5555.
09:10:32.542 INFO [GridModel.setAvailability] - Switching Node da5fe55b-768b-44aa-91aa-042f781475af (uri: http://192.168.1.26:5555) from DOWN to UP
09:10:32.543 INFO [LocalDistributor.add] - Added node da5fe55b-768b-44aa-91aa-042f781475af at http://192.168.1.26:5555. Health check every 120s
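
The behavior described for the PR can be sketched as an ID-aware health check. This is only an illustration of the idea, not the code from the PR: `IdAwareHealthCheckSketch`, `check`, and the `Supplier` standing in for the status request are hypothetical, and the sketch assumes the status response reveals the ID of the node process currently running at that URL.

```java
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical ID-aware health check (an illustration, not the PR's actual code).
final class IdAwareHealthCheckSketch {
  enum Availability { UP, DOWN }

  // statusCall stands in for the request to the status URL; it is assumed to
  // return the ID of the node process that is currently answering there.
  static Availability check(UUID expectedId, Supplier<UUID> statusCall) {
    try {
      UUID reportedId = statusCall.get();
      // A different ID means another (restarted) node now owns this URL.
      return expectedId.equals(reportedId) ? Availability.UP : Availability.DOWN;
    } catch (RuntimeException e) {
      return Availability.DOWN; // unreachable nodes are DOWN as well
    }
  }

  public static void main(String[] args) {
    // IDs taken from the hub logs above: first registration and latest one.
    UUID stale = UUID.fromString("38217db3-43be-4e33-bb74-979ee92aa572");
    UUID current = UUID.fromString("da5fe55b-768b-44aa-91aa-042f781475af");
    Supplier<UUID> statusAtSharedUrl = () -> current; // the running node answers
    System.out.println(check(stale, statusAtSharedUrl));   // prints DOWN
    System.out.println(check(current, statusAtSharedUrl)); // prints UP
  }
}
```

With a check like this, entries left over from earlier starts stop reporting UP, so the hub's existing timeout handling can remove them.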

This issue has been automatically locked since there has not been any recent activity since it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 25, 2024