-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[neeo] Convert to OH jetty HttpClient #15571
Conversation
Signed-off-by: Ben Rosenblum <[email protected]>
The only issue I have with this implementation is that it's not quite as responsive when there are multiple button clicks in a short period of time (e.g. moving around a menu). Each button click sends a ton of sensor updates to the brain so it looks like it jams up. There is a thread pool which is supposed to be handling this but I'm worried that we end up bottling up at the client side since we only have one client. It may be a better approach to roll a separate client per thread in the pool? EDIT: A potential fix would be to use to generate a new HttpClient for each request. That said, the code comments explicitly say to not do this, probably to avoid the issues that this PR fixes. I think there needs to be a separate HttpClient per thread in the pool to avoid congestion, but they need to be in a pool and reused. EDIT2: And just to keep track, this is where we probably need to maintain the pool as this is what's sending the messages to api.notify() |
Signed-off-by: Ben Rosenblum <[email protected]>
I found a happy medium. For the load intensive threads (notifications), a Stack of HttpClient's is maintained. As a thread needs one, it can pop one off the top. When it's done it returns it to the stack. Stack should not get larger than the size of the thread pool (in my case that's usually about 5 threads). Remote is much more responsive now and the threads stay in control. |
Signed-off-by: Ben Rosenblum <[email protected]>
I didn't look into detail in every bit of code, but to me it looks like there is a general design issue. As an example:
You can call I believe this is also the case for there calls and you can probably fix your performance issues without doing fancy "several HTTP clients" stuff. |
I would definitely agree there are some design issues here. The binding and the additional io addon seem over complicated. Unfortunately the neeo remote has been a dead product for years now. Several of us still use them so for the moment it's worth fixing. At some point this is going to end up falling off when the juice isn't worth the squeeze. The real problem with this particular issue is the amount of data being sent per key press. I didn't count it but somewhere around 40 http post/get sessions happen for every key press on the remote. Most of it is updates to some kind of sensor data which all seem to never seem to change. That huge burst is what's causing the log jam because the actual single http session with the actual key press gets burried in there. The more hits, the more delays. I watched last night as I rapidly flew through a menu and the sensor hits went for almost 15 seconds after I stopped because of how deep the queue was. In my case also, I wasn't running any recipes at all. Just pressing a button. I have no doubt that what you're saying is spot on, there's probably a dozen of those splashed around the code. I'm unfortunately out of time to keep jamming at fixing this, yesterday was my only break for the next week or so and this won because my OH was just unstable. IMHO, the real fix for this is finding out why we're flooding the brain with sensor updates that are likely unnecessary. I appreciate you giving it a look for those of us who still use this. My understanding of how this binding works is high level at best. |
So I was REALLY wrong with my count. I just did a few tests. On average, EVERY BUTTON CLICK creates 244 sensor updates to the brain. This is causing a massive log jam. One thing to note of value, it looks like the actual doGet below isn't using the sensor values httpClient, so in theory we could just create one client for the sensors and let it take the bulk of the load and then let everything else use the built in jetty client. It would eliminate the need for the stack, just slow down the sensor updates (which I have no idea what that is, I've never used it). Also interesting to note, it looks like it runs through both of the Neeo remotes I have every time (122 per). You can see the neeo brain ID change between the top and bottom.
|
EDIT: This is a bug but not caused by this PR. If DEBUG logging is enabled this exception happens when updating anything on the UI. Disabling debug for io.neeo resolves this. This may have broke something else. I can't make changes on the UI now...
|
I am running into the same issue with the OH4.1 milestone. I never had the issue before. I had to disable the Neeo IO service to make my system stable again. I didn't have time to do further analysis on my own, but would appreciate this fix gets included. I don't think it can make things worse, and the whole product is in maintenance mode anyway. |
Even after removing the Neeo transport I still have a growth in httpclient threads. So there still is something else going on. |
I think I found the reason I still have the problem. I am also using the Neeo binding (not transport). It also has a similar servlet, probably with the same problem: https://github.com/openhab/openhab-addons/blob/main/bundles/org.openhab.binding.neeo/src/main/java/org/openhab/binding/neeo/internal/handler/NeeoForwardActionsServlet.java |
Try this... Uninstall both transport and binding. Wait for OH to dispose of everything. Shutdown OH. Install https://github.com/morph166955/openhab-addons/releases/download/neeo/org.openhab.io.neeo-4.1.0-SNAPSHOT.jar into your addons folder. Start OH. Only install the binding now. My system stabilized with the updated io module. |
I will try that. |
Not sure. My usage is limited to creating a few virtual devices and pushing them down to the brain. I let my OH rules handle the rest. The remote is nothing more than a nice interface for me. |
Did as you requested, installed your version of the transport, all OK. |
Mine leveled off after a while. Always stayed under 100 so it never destabilized the system. I have been having problems with one of the brains. I think the instability caused something to go crazy. My remote won't connect anymore, says the data from the brain is bad somehow. I have noticed that every time I power cycled the brains the threads went into a funky state and didn't release. I'll probably have to restart OH to clear those for now. |
Signed-off-by: Mark Herwege <[email protected]>
I just created a PR against your branch, applying the same changes you did for the transport to the binding. So far, that stabilizes my system. |
Awesome. It's very likely you don't need the whole stack part. You can probably just take the parts to convert to the built in httpClient from jetty. I didn't see anywhere on the binding side where it did the massive sensor bursts that the io did. |
That's what I thought, so I didn't include that in the binding. So far it works for me. If you can test as well, the combination potentially stabilizes our Neeos for a while. |
I'm on my phone at the moment. Do you have a jar somewhere I can pull? Do you want to submit a PR to my branch and we can just do one PR here? |
I already did a PR to your branch. Here is a jar: https://www.dropbox.com/scl/fi/llh10txvry7ba6zknci9w/org.openhab.binding.neeo-4.1.0-SNAPSHOT.jar?rlkey=b5pr0qx6yqhkey482qsygjjle&dl=0 |
Applied transport fix to binding
Merged your changes into this PR! Thank you! I also just finally fixed the brain that went sideways and it looks like things are working smoothly here. |
I restarted OH and this is now what I'm at:
This makes sense, I have 2 brains and each is building a thread pool that is 4 deep (I've seen 5 happen before but no larger). I'm "OK" with this because it's not really putting a huge strain on the system AND it's stable. While I do agree with @J-N-K that it's probably not the best fix, I think it's a good middle of the road that mixes stability with speed and most importantly gets those on snapshot (and milestone) back to a stable state. |
This pull request has been mentioned on openHAB Community. There might be relevant details there: https://community.openhab.org/t/neeo-remote-binding-transport/33408/427 |
The adjustments to the binding seem problematic @mherwege My remote stopped sending commands to the channels after a few hours. I can't find an error, the logs rolled on me at some point. I had to restart OH to get it to come back. Not sure why it crashed after a few hours. I did notice that it started to throw messages of:
But then it doesn't send to the channel. When I restarted OH, I didn't get the handleForwardActions messages, but it did start to update the channel again. |
I've got a 1 week uptime on my OH right now and it looks like the slow creep has stopped with the last merge. It creates a max of 5 on the stack and then disposes of them after that.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to my unique comment, I would like that you confirm that the HTTP client configuration is not updated when you are now using the common HTTP client from the core.
bundles/org.openhab.io.neeo/src/main/java/org/openhab/io/neeo/internal/NeeoApi.java
Show resolved
Hide resolved
bundles/org.openhab.io.neeo/src/main/java/org/openhab/io/neeo/internal/NeeoApi.java
Show resolved
Hide resolved
Signed-off-by: Ben Rosenblum <[email protected]>
@openhab/add-ons-maintainers - With CODEOWNERS now updated, can we resolve the pending review request for tmrobert8 and merge this in (or provide additional review as necessary so we can close this). 4.1 is currently broken for neeo users and it's crashing systems. |
I can't find anything in the code that implies to me that we are making any changes to the common client. @mherwege can you confirm? |
I also don’t see anything that changes the common client. |
openhab/openhab-core#3826 may have resolved the underlying root cause. That said, would it still be a good idea to commit this and migrate to the jetty client or do we leave it alone and close this? |
No dice, the system stayed stable but the moment I went to use the remote it went to dead again. |
So this PR remains useful because the change in core did not help? If you confirm, I will finish the review and merge. |
Correct. The core change maintains stability only when the remote is unused. The moment buttons are pushed it all crashes. This is absolutely still necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you
Thank you! |
* Convert to OH jetty HttpClient * Adds a Stack for HttpClient storage * Add synchronized to prevent exception * fix binding * Stop stack on close * Resolves exception on registring forward actions * Reduces client count to 5 to align to expected thread pool limit of 5 --------- Signed-off-by: Ben Rosenblum <[email protected]> Co-authored-by: Mark Herwege <[email protected]>
Oddly enough 4.0 didn't leak. We still never found the root cause of what dependency got upgraded which cause this. |
So it sounds like there is no reason to risk including it in 4.0.x? |
Yes, sorry, I thought users were first complaining with 4.0. Apparently I was wrong. |
I don't have any reports from 4.0 that I can find and I don't remember having the issue until 4.1. |
* Convert to OH jetty HttpClient * Adds a Stack for HttpClient storage * Add synchronized to prevent exception * fix binding * Stop stack on close * Resolves exception on registring forward actions * Reduces client count to 5 to align to expected thread pool limit of 5 --------- Signed-off-by: Ben Rosenblum <[email protected]> Co-authored-by: Mark Herwege <[email protected]>
* Convert to OH jetty HttpClient * Adds a Stack for HttpClient storage * Add synchronized to prevent exception * fix binding * Stop stack on close * Resolves exception on registring forward actions * Reduces client count to 5 to align to expected thread pool limit of 5 --------- Signed-off-by: Ben Rosenblum <[email protected]> Co-authored-by: Mark Herwege <[email protected]> Signed-off-by: querdenker2k <[email protected]>
* Convert to OH jetty HttpClient * Adds a Stack for HttpClient storage * Add synchronized to prevent exception * fix binding * Stop stack on close * Resolves exception on registring forward actions * Reduces client count to 5 to align to expected thread pool limit of 5 --------- Signed-off-by: Ben Rosenblum <[email protected]> Co-authored-by: Mark Herwege <[email protected]> Signed-off-by: Jørgen Austvik <[email protected]>
Resolves #15563
Replaces ClientBuilder/HttpClient with OH Jetty HttpClient.