
[tibber] Crash at midnight, fetching of new data stops #13417

Closed
seime opened this issue Sep 19, 2022 · 11 comments · Fixed by #13537
Labels
bug An unexpected problem or unintended behavior of an add-on

Comments

@seime
Contributor

seime commented Sep 19, 2022

Log shows refreshing happens every 5 minutes as configured. At midnight, errors occur. My guess is they originate on Tibber's side, but the problem is that the binding can't handle them and gives up while still reporting "online".

Expected behaviour: Thing goes offline and a reconnect is attempted after some time

Setup: Pi4b with Pulse dongle. 2 homes in Tibber, but only 1 active.

Running openHAB 3.3.0 with a patched Tibber binding (see #13416); stack trace line numbers are therefore off

2022-09-18 23:58:51.704 [DEBUG] [ibber.internal.handler.TibberHandler] - API response: {"data":{"viewer":{"home":{"currentSubscription":{"priceInfo":{"current":{"total":1.5027,"startsAt":"2022-09-18T23:00
:00.000+02:00","level":"VERY_CHEAP"},"tomorrow":[{"startsAt":"2022-09-19T00:00:00.000+02:00","total":1.1146},{"startsAt":"2022-09-19T01:00:00.000+02:00","total":1.2886},{"startsAt":"2022-09-19T02:00:00.00
0+02:00","total":1.2241},{"startsAt":"2022-09-19T03:00:00.000+02:00","total":1.1544},{"startsAt":"2022-09-19T04:00:00.000+02:00","total":1.3058},{"startsAt":"2022-09-19T05:00:00.000+02:00","total":2.3079}
,{"startsAt":"2022-09-19T06:00:00.000+02:00","total":4.4889},{"startsAt":"2022-09-19T07:00:00.000+02:00","total":5.1018},{"startsAt":"2022-09-19T08:00:00.000+02:00","total":5.1866},{"startsAt":"2022-09-19
T09:00:00.000+02:00","total":4.688},{"startsAt":"2022-09-19T10:00:00.000+02:00","total":4.3473},{"startsAt":"2022-09-19T11:00:00.000+02:00","total":3.1144},{"startsAt":"2022-09-19T12:00:00.000+02:00","tot
al":2.8812},{"startsAt":"2022-09-19T13:00:00.000+02:00","total":3.0343},{"startsAt":"2022-09-19T14:00:00.000+02:00","total":3.0413},{"startsAt":"2022-09-19T15:00:00.000+02:00","total":2.8485},{"startsAt":
"2022-09-19T16:00:00.000+02:00","total":3.2944},{"startsAt":"2022-09-19T17:00:00.000+02:00","total":4.6221},{"startsAt":"2022-09-19T18:00:00.000+02:00","total":5.0858},{"startsAt":"2022-09-19T19:00:00.000
+02:00","total":5.4796},{"startsAt":"2022-09-19T20:00:00.000+02:00","total":5.2424},{"startsAt":"2022-09-19T21:00:00.000+02:00","total":4.8803},{"startsAt":"2022-09-19T22:00:00.000+02:00","total":4.5256},
{"startsAt":"2022-09-19T23:00:00.000+02:00","total":3.9117}]}},"daily":{"nodes":[{"from":"2022-09-17T00:00:00.000+02:00","to":"2022-09-18T00:00:00.000+02:00","cost":169.5052764,"unitPrice":3.312591,"consu
mption":51.17,"consumptionUnit":"kWh"}]},"hourly":{"nodes":[{"from":"2022-09-18T22:00:00.000+02:00","to":"2022-09-18T23:00:00.000+02:00","cost":2.791782375,"unitPrice":1.9799875,"consumption":1.41,"consum
ptionUnit":"kWh"}]}}}}}

2022-09-19 00:02:55.208 [DEBUG] [ibber.internal.handler.TibberHandler] - Error during websocket communication: java.util.concurrent.TimeoutException: Idle timeout expired: 30000/30000 ms
2022-09-19 00:02:55.214 [DEBUG] [ibber.internal.handler.TibberHandler] - Sending websocket disconnect message
2022-09-19 00:02:55.215 [DEBUG] [ibber.internal.handler.TibberHandler] - Send message: {"type":"connection_terminate","payload":null}
2022-09-19 00:02:55.218 [DEBUG] [ibber.internal.handler.TibberHandler] - Closing a WebSocket due to Disconnected
2022-09-19 00:03:10.221 [WARN ] [org.eclipse.jetty.io.ManagedSelector] - java.nio.channels.ClosedSelectorException
2022-09-19 00:03:10.224 [WARN ] [ibber.internal.handler.TibberHandler] - CLOSE FRAME - Failed to stop websocket client: null
2022-09-19 00:03:52.062 [DEBUG] [ibber.internal.handler.TibberHandler] - API response: {"data":{"viewer":{"home":{"currentSubscription":{"priceInfo":{"current":{"total":1.1146,"startsAt":"2022-09-19T00:00
:00.000+02:00","level":"VERY_CHEAP"},"tomorrow":[]}},"daily":{"nodes":[{"from":"2022-09-18T00:00:00.000+02:00","to":"2022-09-19T00:00:00.000+02:00","cost":64.15511,"unitPrice":1.504223,"consumption":42.65
,"consumptionUnit":"kWh"}]},"hourly":{"nodes":[{"from":"2022-09-18T23:00:00.000+02:00","to":"2022-09-19T00:00:00.000+02:00","cost":null,"unitPrice":1.5026875,"consumption":null,"consumptionUnit":"kWh"}]}}
}}}

2022-09-19 00:03:52.071 [DEBUG] [ibber.internal.handler.TibberHandler] - Attempting to reopen Websocket connection
2022-09-19 00:03:52.084 [DEBUG] [ibber.internal.handler.TibberHandler] - New socket being created
2022-09-19 00:03:52.085 [DEBUG] [ibber.internal.handler.TibberHandler] - Starting Websocket connection
2022-09-19 00:03:52.104 [WARN ] [ty.util.ssl.SslContextFactory.config] - Trusting all certificates configured for Client@bc71a2[provider=null,keyStore=null,trustStore=null]
2022-09-19 00:03:52.106 [WARN ] [ty.util.ssl.SslContextFactory.config] - No Client EndPointIdentificationAlgorithm configured for Client@bc71a2[provider=null,keyStore=null,trustStore=null]
2022-09-19 00:03:52.111 [DEBUG] [ibber.internal.handler.TibberHandler] - Connecting Websocket connection
2022-09-19 00:03:52.437 [DEBUG] [ibber.internal.handler.TibberHandler] - Sending websocket connect message
2022-09-19 00:03:52.438 [DEBUG] [ibber.internal.handler.TibberHandler] - Send message: {"type":"connection_init", "payload":"token=XXXXXX"}
2022-09-19 00:03:52.511 [DEBUG] [ibber.internal.handler.TibberHandler] - Connected to Server
2022-09-19 00:03:52.513 [DEBUG] [ibber.internal.handler.TibberHandler] - Send message: {"id":"1","type":"start","payload":{"variables":null,"extensions":null,"operationName":null,"query":"subscription {\n
 liveMeasurement(homeId:\"XXXXXXX\") {\n timestamp\n power\n lastMeterConsumption\n accumulatedConsumption\n accumulatedCost\n currency\n minPower\n averagePower\n maxPower\nv
oltagePhase1\n voltagePhase2\n voltagePhase3\n currentL1\n currentL2\n currentL3\n powerProduction\n accumulatedProduction\n minPowerProduction\n maxPowerProduction\n }\n }\n"}}
2022-09-19 00:03:52.733 [DEBUG] [ibber.internal.handler.TibberHandler] - Error/terminate received from server: {"type":"data","id":"1","payload":{"errors":[{"message":"not able to start streaming for devi
ce ID XXXXXXXXXXX, stream ID api:XXXXXXXX","locations":[{"line":2,"column":2}],"path":["liveMeasurement"]}]}}
2022-09-19 00:03:52.734 [DEBUG] [ibber.internal.handler.TibberHandler] - Sending websocket disconnect message
2022-09-19 00:03:52.734 [DEBUG] [ibber.internal.handler.TibberHandler] - Send message: {"type":"connection_terminate","payload":null}
2022-09-19 00:03:52.736 [DEBUG] [ibber.internal.handler.TibberHandler] - Closing a WebSocket due to Disconnected
2022-09-19 00:04:02.114 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception:
java.lang.NullPointerException: null
        at org.openhab.binding.tibber.internal.handler.TibberHandler.open(TibberHandler.java:350) ~[?:?]
        at org.openhab.binding.tibber.internal.handler.TibberHandler.updateRequest(TibberHandler.java:253) ~[?:?]
        at org.openhab.binding.tibber.internal.handler.TibberHandler.lambda$0(TibberHandler.java:241) ~[?:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]
2022-09-19 00:04:22.738 [WARN ] [ibber.internal.handler.TibberHandler] - CLOSE FRAME - Failed to stop websocket client: java.lang.InterruptedException
@seime seime added the bug An unexpected problem or unintended behavior of an add-on label Sep 19, 2022
@kjoglum
Contributor

kjoglum commented Sep 21, 2022

@seime:
For sure there is a bug, presumably related to an issue on Tibber's side of things, and most often around midnight. This is clear from the logs, as Tibber reports the error message "not able to start streaming for device".

I have also experienced incidents where the reconnect attempts stop for some reason, but I have never seen log output like yours, with "Scheduled runnable ended with an exception", which indicates a full stop in query refresh/reconnect attempts. Are you getting consistent log results after an OH reboot, when Tibber starts and fails again? The binding is set up to attempt reconnects as part of the regular refresh queries. As mentioned, I have had some incidents too, but I haven't been able to identify from the logs what causes the reconnect attempts to stop: I usually manage to reconnect, often several times in a row, but the connection eventually fails without any log notification (after anywhere from hours to weeks).

This mechanism is also the reason why the binding remains Online. Since the binding covers both regular queries (Tibber subscription) and live measurements (Tibber Pulse), the intention has been to keep the Thing Online as long as the regular queries are responsive, allowing continuous reconnect attempts for live measurements as part of the refresh interval.

Will continue to chase the reason why the reconnect attempts stop.

@seime
Contributor Author

seime commented Sep 21, 2022

Thanks for looking into this.

I set the log level to debug after a few incidents where my water boiler would stay off for long periods, causing cold morning showers and an all-time low WAF ;). I used to rely on the (price) level attribute, but have now replaced it with better calculations based on data from #13416.

I think the binding is sometimes able to reconnect, but there are also a few warnings in the log, including another exception lately. I cannot tell whether this error is consistent in any way, as I've only recently started noticing it. I'm still adding new things to openHAB (including developing new bindings etc.), and therefore I tend to restart openHAB every now and then when I hit snags.

My logs are now at TRACE with a notification if data gets stale. I'll add more logs if I notice this again.

[19:45:41] pi@pi4b:~$ grep Tibber /var/log/openhab/openhab.log | grep WARN
2022-09-16 18:03:07.044 [WARN ] [ibber.internal.handler.TibberHandler] - IO Exception: java.util.concurrent.TimeoutException: Total timeout 20000 ms elapsed
2022-09-16 18:08:27.047 [WARN ] [ibber.internal.handler.TibberHandler] - IO Exception: java.util.concurrent.TimeoutException: Total timeout 20000 ms elapsed
2022-09-18 18:35:54.338 [WARN ] [ibber.internal.handler.TibberHandler] - CLOSE FRAME - Failed to stop websocket client: null
2022-09-18 18:36:58.784 [WARN ] [ibber.internal.handler.TibberHandler] - CLOSE FRAME - Failed to stop websocket client: null
2022-09-18 23:00:58.696 [WARN ] [ibber.internal.handler.TibberHandler] - CLOSE FRAME - Failed to stop websocket client: null
2022-09-18 23:14:48.846 [WARN ] [ibber.internal.handler.TibberHandler] - CLOSE FRAME - Failed to stop websocket client: null
2022-09-18 23:18:37.579 [WARN ] [ibber.internal.handler.TibberHandler] - IO Exception: java.util.concurrent.TimeoutException: Total timeout 20000 ms elapsed
2022-09-19 00:03:10.224 [WARN ] [ibber.internal.handler.TibberHandler] - CLOSE FRAME - Failed to stop websocket client: null
2022-09-19 00:04:22.738 [WARN ] [ibber.internal.handler.TibberHandler] - CLOSE FRAME - Failed to stop websocket client: java.lang.InterruptedException
2022-09-19 23:19:26.461 [WARN ] [ibber.internal.handler.TibberHandler] - IO Exception: java.util.concurrent.TimeoutException: Total timeout 20000 ms elapsed
2022-09-20 10:48:40.878 [WARN ] [ibber.internal.handler.TibberHandler] - CLOSE FRAME - Failed to stop websocket client: null
2022-09-20 13:14:09.636 [WARN ] [ibber.internal.handler.TibberHandler] - Websocket Client Stop Exception: java.lang.InterruptedException
2022-09-21 15:11:07.564 [WARN ] [ibber.internal.handler.TibberHandler] - Websocket Client Stop Exception: java.lang.InterruptedException
[19:45:43] pi@pi4b:~$

(one of the warnings):

2022-09-21 15:10:37.564 [WARN ] [ty.websocket.common.WebSocketSession] - Exception while notifying onError
org.eclipse.jetty.websocket.api.WebSocketException: Session closed
        at org.eclipse.jetty.websocket.common.WebSocketSession.outgoingFrame(WebSocketSession.java:350) ~[bundleFile:9.4.46.v20220331]
        at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.uncheckedSendFrame(WebSocketRemoteEndpoint.java:322) ~[bundleFile:9.4.46.v20220331]
        at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:109) ~[bundleFile:9.4.46.v20220331]
        at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString(WebSocketRemoteEndpoint.java:400) ~[bundleFile:9.4.46.v20220331]
        at org.openhab.binding.tibber.internal.handler.TibberHandler$TibberWebSocketListener.sendMessage(TibberHandler.java:530) ~[?:?]
        at org.openhab.binding.tibber.internal.handler.TibberHandler.close(TibberHandler.java:374) ~[?:?]
        at org.openhab.binding.tibber.internal.handler.TibberHandler$TibberWebSocketListener.onWebSocketError(TibberHandler.java:448) ~[?:?]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
        at org.eclipse.jetty.websocket.common.events.annotated.CallableMethod.call(CallableMethod.java:70) ~[bundleFile:9.4.46.v20220331]
        at org.eclipse.jetty.websocket.common.events.annotated.OptionalSessionCallableMethod.call(OptionalSessionCallableMethod.java:72) ~[bundleFile:9.4.46.v20220331]
        at org.eclipse.jetty.websocket.common.events.JettyAnnotatedEventDriver.onError(JettyAnnotatedEventDriver.java:192) ~[bundleFile:9.4.46.v20220331]
        at org.eclipse.jetty.websocket.common.WebSocketSession.callApplicationOnError(WebSocketSession.java:416) [bundleFile:9.4.46.v20220331]

@kjoglum
Contributor

kjoglum commented Sep 21, 2022

Appreciate the log updates. Will continue chasing my own logs as well.

The IO Exceptions relate to the query part of the binding and occur most often around hour transitions, due to delays/difficulties getting updated data from the Tibber server.

The CLOSE FRAME warnings you are seeing match the behaviour I see. These warnings indicate an OnWebSocketError on Tibber's side of things, after which the binding triggers a websocket close to ensure the websocket is fully closed (often resulting in "Failed to stop websocket client: null", as the connection has in fact already been closed as part of OnWebSocketError from the server side). Regardless, a sequence of CLOSE FRAME warnings is a strong indication of successful reconnects in between (as supported by the debug log: Attempting to reopen.. --> Connected to Server).

So, I need to figure out why/what happens when the refresh is killed and the reconnect attempts just stop.

@openhab-bot
Collaborator

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/tibber-binding-hangs-when-a-new-week-starts/139308/2

@kjoglum
Contributor

kjoglum commented Sep 27, 2022

Good catch recognising that the failure coincides with the start of a new week. I have now experienced the same issue: getting "Scheduled runnable ended with an exception" at the transition from Sunday to Monday, shutting down the live stream completely.

It appears that the Tibber API maintains the websocket session at all times other than the week transition, enabling the binding to reconnect as part of the refresh intervals. The Error/Terminate is logged because the Tibber API sends an error message (seemingly only as part of a failure during the week transition), and it furthermore appears that the Tibber API then closes the websocket session completely.

As part of the refresh interval, when trying to reconnect, the binding attempts to verify a successful reconnect/websocket establishment, i.e. if (!session.isOpen()). However, since the Tibber API has apparently closed the session, this may be the reason for the NullPointerException, which in turn causes the scheduler to terminate.

So, I will have a look at rewriting the verification part and see if this enables reconnects at week transitions as well.
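For context on why a single NPE kills all future reconnect attempts: java.util.concurrent silently cancels a periodic task after any uncaught exception, so the task is never rescheduled. A minimal, self-contained sketch (names are illustrative, not the binding's actual code):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SchedulerStopDemo {
    // Counts how many times a periodic task runs when its third run throws.
    // ScheduledExecutorService cancels a periodic task after an uncaught
    // exception, so the counter never advances past 3 even though the
    // schedule would otherwise allow many more runs during the sleep below.
    public static int runsBeforeCrash() {
        AtomicInteger runs = new AtomicInteger();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            if (runs.incrementAndGet() == 3) {
                // stands in for the NPE thrown inside TibberHandler.open()
                throw new NullPointerException("websocket session is null");
            }
        }, 0, 10, TimeUnit.MILLISECONDS);
        try {
            Thread.sleep(300); // enough time for ~30 runs if the task had survived
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        scheduler.shutdownNow();
        return runs.get();
    }
}
```

This matches the log above: after the "Scheduled runnable ended with an exception" entry, no further refresh or reconnect attempts appear.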

@kjoglum kjoglum closed this as completed Sep 27, 2022
@kjoglum kjoglum reopened this Sep 27, 2022
@kjoglum
Contributor

kjoglum commented Sep 28, 2022

Just created a new version for testing on my own behalf. If others are interested in trying it before an official PR:
Tibber 3.4.0

The change to the binding relates to the verification part: checking whether the websocket session actually becomes open during reconnect attempts (this seems to be what causes the NPE and the termination of scheduled reconnect attempts around week transitions). The binding still checks whether the websocket client is active, but now only logs a warning if the client is not recognised as active, thereby avoiding the NullPointerException and keeping reconnect attempts alive at the next refresh interval.
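For readers hitting the same NPE on older versions, the fix described here boils down to guarding the session check so a missing session is treated as "not open" instead of being dereferenced. A hedged sketch of the pattern (the interface and method names are illustrative stand-ins, not the binding's actual code):

```java
public class SafeSessionCheck {
    // Minimal stand-in for org.eclipse.jetty.websocket.api.Session
    interface WsSession {
        boolean isOpen();
    }

    // A bare "!session.isOpen()" throws NullPointerException when the server
    // has torn the session down and the reference is null. Checking for null
    // first turns that case into an ordinary "needs reconnect" result, so
    // the scheduled refresh survives and can retry at the next interval.
    static boolean needsReconnect(WsSession session) {
        return session == null || !session.isOpen();
    }
}
```

With this guard, a closed or missing session produces a log warning and a retry rather than an uncaught exception that cancels the scheduler.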

@seime
Contributor Author

seime commented Sep 29, 2022

Thanks @kjoglum, I will test the new version.

Experienced this again last night, so it appears to be a midnight problem and not just the Sunday/Monday transition.

2022-09-29 00:04:42.686 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception:
java.lang.NullPointerException: null
        at org.openhab.binding.tibber.internal.handler.TibberHandler.open(TibberHandler.java:350) ~[?:?]
        at org.openhab.binding.tibber.internal.handler.TibberHandler.updateRequest(TibberHandler.java:253) ~[?:?]
        at org.openhab.binding.tibber.internal.handler.TibberHandler.lambda$0(TibberHandler.java:241) ~[?:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]

@srednak

srednak commented Sep 29, 2022

Same thing happened to me last night.

I have just installed the new version. Thanks @kjoglum

@kjoglum
Contributor

kjoglum commented Sep 29, 2022

@seime @srednak
I also had the same issue last night. However, running the latest version, there was no NPE, the binding managed to reconnect on the third attempt, and I am still getting live measurements. So I believe we are closing in on a functional bug fix.

@seime
Contributor Author

seime commented Oct 11, 2022

Fix appears to be working; I have not noticed it going offline since updating to the latest version. Thanks @kjoglum

@openhab-bot
Collaborator

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/tibber-binding/67019/155
