
Enable binding to store historic states #3000

Closed · wants to merge 2 commits

Conversation

altaroca

Please refer to #844.
This adds a method BaseThingHandler.updateHistoricState(ChannelUID channelUID, State state, ZonedDateTime dateTime) and all the logic to route the corresponding ItemHistoricStateEvent through the core to the persistence layer and any ModifiablePersistenceService.
I have tested it using my fork of the E3DC binding and InfluxDB modified to implement ModifiablePersistenceService.
I made some choices for simplicity's sake:

  • The item's current state and its historic state only meet in the persistence layer. Thus the historic state (even if more recent) does not affect the current state.
  • The events are separate but share the same publication topic.
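For illustration, a binding-side call could look roughly like this (the handler class, channel id and sample data are invented for this sketch; updateHistoricState is the method proposed by this PR):

import java.time.ZonedDateTime;
import java.util.Map;

import org.openhab.core.library.types.DecimalType;
import org.openhab.core.thing.ChannelUID;
import org.openhab.core.thing.Thing;
import org.openhab.core.thing.binding.BaseThingHandler;
import org.openhab.core.types.Command;

public class ExampleEnergyMeterHandler extends BaseThingHandler {

    public ExampleEnergyMeterHandler(Thing thing) {
        super(thing);
    }

    @Override
    public void handleCommand(ChannelUID channelUID, Command command) {
        // read-only channel in this sketch, nothing to do
    }

    // Pushes a backlog of timestamped readings towards the persistence layer
    // via the proposed updateHistoricState method.
    private void publishBacklog(Map<ZonedDateTime, Double> samples) {
        ChannelUID powerChannel = new ChannelUID(getThing().getUID(), "power");
        samples.forEach((timestamp, value) ->
                updateHistoricState(powerChannel, new DecimalType(value), timestamp));
    }
}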

@altaroca altaroca requested a review from a team as a code owner June 14, 2022 16:01
Copy link
Member

@J-N-K J-N-K left a comment


Thanks for your contribution. I have done an initial review, but I also have a general question: What happens if the state has a date that is in the future? Also, please check the SAT errors.

@altaroca
Author

Wow, that was fast. I must apologize; I won't be able to react that quickly. Please give me a few days.

@wborn wborn added the bounty label Jun 15, 2022
@wborn wborn linked an issue Jun 15, 2022 that may be closed by this pull request
@altaroca
Author

altaroca commented Jun 20, 2022

@J-N-K I agree with your proposals and have made changes accordingly.

What happens if the state has a date that is in the future?

That date is treated like any other. In other words, the binding is responsible for providing a correct date. There may be instances where a future date is needed - maybe some kind of forecast? I cannot think of anything useful right now.

@splatch
Contributor

splatch commented Jun 21, 2022

While I very much like that this topic has come to a concrete form, I suppose a better way would be proper tagging of states instead of a multiplication of events. As outlined, bindings can provide estimations which are ahead of time, hence it would require yet another kind of event.

I think that making a type TimeBoundState[DateTime, State] would better reflect that case because:

  • This could still be a valid state.
  • It would allow the whole thing to work with channel-item profiles.
  • It can be detected by persistence adapters.

The question is whether the core is ready for such a step.

@altaroca
Author

@splatch I do not fully understand your comment. What do you mean by 'can still be a valid state'?
The PR provides the entity you propose. The only difference is that it is called HistoricState.

@splatch
Contributor

splatch commented Jun 22, 2022

The PR provides the entity you propose. The only difference is that it is called HistoricState.

@altaroca Roger, I missed the type itself and got lost in the changes in other areas. If you rely on the state, then the main question I could raise is why distinguish the situation by introducing ThingCallback.historicStateUpdated(ChannelUID channelUID, State state, ZonedDateTime dateTime), which hides access to this state from bindings. If bindings have access to String/DateTime and other representations, then this part of the change is, strictly speaking, not required.

I see a point in making such a state a first-class citizen, because some bindings could provide a timestamp for a received update. To give a more concrete example: a CAN interface controller can attach a timestamp to the message which delivers a state update. The binding then knows the exact timestamp of the update, and it is not a historic state per se, as it is an instant value with receive-time information.

@altaroca
Author

@splatch I had thought about that and decided against it. There were two reasons:

  1. Sub-classes of State are often checked using expressions like instanceof DecimalType. This breaks if there were another type hierarchy based on HistoricState, like DecimalHistoricType etc. This might be solvable, but it requires a lot of refactoring in the core and in add-ons (see the sketch after this list).
  2. I fear the complexity of deciding which state is the current one. Will a more recent HistoricState overwrite the current state? Yes, for timed measurements; no, for forecasts on the same channel. I do not see a clear rule. That's why I kept State and HistoricState separate.
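To illustrate the first point: a typical consumer looks roughly like the hypothetical class below (not code from this PR); a parallel type such as DecimalHistoricType would silently fall through both branches.

import java.math.BigDecimal;

import org.openhab.core.library.types.DecimalType;
import org.openhab.core.library.types.OnOffType;
import org.openhab.core.types.State;

// Hypothetical persistence-style consumer, for illustration only
class ExampleStateWriter {

    void persist(State state) {
        if (state instanceof DecimalType) {
            writeNumber(((DecimalType) state).toBigDecimal());
        } else if (state instanceof OnOffType) {
            writeBoolean(state == OnOffType.ON);
        }
        // a new DecimalHistoricType would match neither branch and be ignored
    }

    private void writeNumber(BigDecimal value) {
        // database write omitted
    }

    private void writeBoolean(boolean value) {
        // database write omitted
    }
}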

@splatch
Contributor

splatch commented Jun 23, 2022

  1. Why do you consider checking and casting for DecimalState a fine operation, but checking for TimeBoundState something wrong? It is a state just like the others. In case a primitive type is required, you can rely on composition, such as below:
interface WrapperState extends State /*, ComplexType? */ {
  State unwrap();
  // 24.08.2022 - navigate to specific type of state under wrapper
  <T extends State> T unwrap(Class<T> type);
  <T extends State> boolean has(Class<T> type);
}

class TimeBoundState implements WrapperState {
  private DateTime dateTime;
  private State state;

  public State unwrap() {
    State myState = state;
    // unwrapping could also be moved to State/Type itself, so DecimalType would always return "this",
    // while wrapper states unwrap themselves
    while (myState instanceof WrapperState) {
      myState = ((WrapperState) myState).unwrap();
    }
    return myState;
  }
  // ..
}

// for callers doing persistence and other parties which need to process a concrete state instance
State actualState = state instanceof WrapperState ? ((WrapperState) state).unwrap() : state;
if (actualState instanceof DecimalType) {
   // ...
}

// 24.08.2022 - unwrapping to a specific type
DecimalType decimalState = state instanceof WrapperState ? ((WrapperState) state).unwrap(DecimalType.class) : state.as(DecimalType.class);
if (state instanceof DecimalType || (state instanceof WrapperState && ((WrapperState) state).has(DecimalType.class))) {
   DecimalType dec = state instanceof DecimalType ? (DecimalType) state : ((WrapperState) state).unwrap(DecimalType.class);
}
  2. If you keep going with a HistoricState, then it's only a matter of time before we need to start dealing with FutureState, ScheduledState and related framework changes. Obviously these situations may not appear immediately, but what you signal with a historic state is paired information: state + timestamp. If you look at it that way, it doesn't matter whether it's in the past, the future or almost now.
    Processing of time-bound states might be completely omitted by items, or based on time-accuracy criteria, just like you do it right now via applyHistoricState. It's based on a separate event, but it could be based on the state kind itself.

@altaroca
Author

  1. Why do you consider checking and casting for DecimalState a fine operation, but checking for TimeBoundState something wrong?

Err, no. That's not my point. I meant to say that no existing code will understand the meaning of types like DecimalTimeBoundState because of the way everyone uses instanceof on known sub-classes of State. And we would have to refactor classes like InfluxDBStateConvertUtils.

  2. If you keep going with a HistoricState, then it's only a matter of time before we need to start dealing with FutureState, ScheduledState and related framework changes. Obviously these situations may not appear immediately, but what you signal with a historic state is paired information: state + timestamp. If you look at it that way, it doesn't matter whether it's in the past, the future or almost now.

Of course. I completely agree. And I do not care about the name. It's up for discussion what's the best name: HistoricState, TimeBoundState or something else.

What I mean is that people, seeing State and HistoricState being passed along as equals in the same channel, will start to wonder which one takes precedence, if any. Or do you suggest having entirely different channels for State and HistoricState? My current implementation uses the same channel, and you could interpret HistoricState as a kind of sub-channel on the primary channel.

@J-N-K
Member

J-N-K commented Jun 23, 2022

IMO if timestamps in the future are allowed, something like TimeboundState would be better than HistoricState as that implies that the time is in the past.

I wonder if it would not be better to extend State with a DateTime getDateTime() (I personally would prefer a Long getTimeStamp(), but that's debatable). A default implementation could return the current time (which is similar to what we have now). If the consumer does not support timed states, but the state does have time information, the time information would be discarded.

This could also replace the need for the HistoricState in the persistence implementations.
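A minimal sketch of that idea (not actual openHAB code; the method name and the use of ZonedDateTime are assumptions):

import java.time.ZonedDateTime;

import org.openhab.core.types.Type;

// Sketch only: State gains a default time accessor, so states without explicit
// time information still mean "now" and existing producers/consumers are unaffected.
public interface State extends Type {

    default ZonedDateTime getDateTime() {
        return ZonedDateTime.now();
    }
}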

@J-N-K
Member

J-N-K commented Jul 6, 2022

@wborn Since you created the original issue, I would like your opinion here. WDYT about adding a timestamp to State? This would require no changes at all for code that currently does not support timed values, but it would allow those that implement it to use the date information.

@openhab-bot
Collaborator

This pull request has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/panasonic-comfort-cloud-binding/133848/26

@J-N-K J-N-K mentioned this pull request Jul 22, 2022
@J-N-K
Member

J-N-K commented Aug 20, 2022

@openhab/core-maintainers I would really like to hear your opinion here.

@wborn
Member

wborn commented Aug 20, 2022

Thanks for working on this @altaroca! 👍

I would really like to hear your opinion here.

I don't have a strong opinion but it seems that in previous related PRs the timestamp always has been kept separate from the State as well.

See for instance this code:

@J-N-K
Member

J-N-K commented Aug 20, 2022

Yes, but I don't understand the reasoning behind that. From what @kaikreuzer said in #3033, a possible argument is that this timestamp might not be the actual timestamp of the value change. While this might be true, if the timestamp is @Nullable, a binding can choose not to set it if it is not known. But having a timestamp together with a state would remove the need for extra code to handle future values (e.g. forecasts) and historic values (retrieved from a persistence service) and would also allow bindings that CAN determine the exact point in time of a value change to communicate that. And it is fully backward compatible, because all add-ons agnostic of the timestamp will not be affected (they produce states without timestamps, just like before, and they consume states ignoring the timestamp, just like before), while persistence services can properly store the timestamp together with the value. If no timestamp is present, they use the system time, as they do now.

@altaroca
Author

I cannot foresee all downstream implications of this change. Let's approach it semantically:
state means (abbreviated from Merriam-Webster)

  1. mode or condition of being,
  2. a condition or stage in the physical being of something

This definition does not refer to a timestamp. So if you wanted to always combine a State object with a time, you should at least call it something different, such as StateWithTime or the above TimeboundState.

Now the question is: can openHAB do without a State object, and can you replace it in all places with a TimeboundState object?

@wborn
Member

wborn commented Aug 21, 2022

Adding forecasts on the same channels will probably also bring its own set of issues. You probably don't want to persist forecasted state and actual state using the same item. I can imagine there are also devices that store the forecasts they made at a certain time in a buffer. So if you then read these forecasts at a later time, you might want to persist "historic forecasts". 😉

@J-N-K
Member

J-N-K commented Aug 21, 2022

It's questionable whether anyone wants to persist forecasts at all (maybe to use restoreOnStartup). Anyhow, if the timestamp is persisted together with the value, I don't see any issues arising from that.

My concern is that we now have State, add HistoricState for persisted states (or states that shall be persisted) here, and then add FutureState for forecasts later. Then we have three non-interchangeable types for representing a state.

Leaving aside that this leads to a very bulky API (and therefore a high burden for developers), I doubt that users will benefit from it. We already separate Command and State, and even that is not understood by many. Do we really expect a majority to understand why a State is not a State depending on the presence of time information? And even more, why there are two different types of States with time information? Or would the user prefer to have one single type to deal with?

I doubt that this HistoricState can be kept internal. As soon as such functionality is added someone will request to make it available e.g. in rules.

@seime
Contributor

seime commented Aug 22, 2022

I do see use cases for persisting forecasts - electricity prices and weather are two. If my network goes down and openHAB is restarted, I would lose the ability to make decisions based on these data without persistence.

IMHO a State should have a timestamp reflecting the actual time this value was sampled/produced - by the sensor/service or by the framework if the underlying sensor/service lacks this.

The State timestamp is typically different from the Event timestamp, which is the time at which the Event itself was produced. I think we need both - if only to not mix the two concepts and create confusion.

Could we explore the feasibility of creating new Event types and topics for historic and forecast time series? I.e. one (set) for bindings that produce these time series and that the framework listens for (persisting the included States with their timestamps), and another set of events for rule engines to listen for when new time series are available?
I'm not very familiar with the inner workings of openHAB here, so bear with me if this breaks backwards compatibility, creates a lot of work or is simply a bad idea.

@splatch
Contributor

splatch commented Apr 8, 2023

Changing historic or future values in a database is something substantially different to me - yes, we should allow a binding to do that (possibly by means of a new method in the handler callback), but similar to PersistenceResource.putItemState() this does not have to go through the CommunicationManager, but rather be handled by the persistence layer, i.e. the PersistenceManager directly.

The thing handler callback already has a bunch of methods; adding one more does not impact it much (it's already over-bloated and does everything from configuration validation through thing status to channel updates).
What I'd be careful of is the assumption that past or future states are reserved for persistence alone. Short-term forecasts are most relevant for pre-heating/cooling, watering and other "plan"-based situations, which will often be based on rules (when should a forecast change trigger them?).

The basic fact that the REST resource does not use the CommunicationManager does not mean that it can't be used elsewhere. It is an internal component which is not used directly by any REST endpoint (even SSE, which is closest by nature).

The reason why I was mentioning profiles is that they could be used e.g. to calculate a mean/average over a moving window, effectively delivering a re-computed state from forecast and time.

@kaikreuzer
Member

The reason why I was mentioning profiles is that they could be used e.g. to calculate a mean/average over a moving window, effectively delivering a re-computed state from forecast and time.

I do not see much value here, as forecast values will very often come in bulk. If you process 240 events for updated values for the next 24 hours sequentially, it will not only cause huge unnecessary CPU utilization, but can also result in inadequate intermediate results, because the full picture simply wasn't there yet.
What is true is that rules might be interested to know when an update of future values has happened; but this could be provided as an event (of a new type) by the persistence manager after receiving a bulk update.

@splatch
Contributor

splatch commented Apr 8, 2023

I do not see much value here, as forecast values will very often come in bulk. If you process 240 events for updated values for the next 24 hours sequentially, it will not only cause huge unnecessary CPU utilization, but can also result in inadequate intermediate results, because the full picture simply wasn't there yet.

@kaikreuzer I didn't mean ingestion of future/past states. What I meant is the necessity to continuously poll the database in order to re-calculate items which are derivatives of these states. While rules have a cron trigger, using that mechanism to average e.g. future temperature predictions will cost extra. Doing that in a profile which can keep the last couple of states will be lighter than the above.
At present the PersistenceManager does not post any events; in fact, it is a listener for state changes/updates. I do see a risk of an endless loop, but without code it is just an unjustified fear.

@henfiber

henfiber commented Apr 9, 2023

I am also interested in this for various use cases. I drafted this flowchart:

[Flowchart: oh-historical-and-future-states]

Some notes:

  • Since Persistence services are not called imperatively but listen for events, these timestamps should be included in events.
  • Since the State is not always updated (for historical or future timestamps), we cannot reuse the *StateUpdated events. I have introduced two new events (with the suffix Received instead of Updated, since the State is not updated).
  • Besides events, adding lastUpdate to the State would allow the framework to compare with the previous lastUpdate and act accordingly (update if newer than the previous one and not in the future).
  • The separate HistoricalStateReceived and FutureStateReceived events allow new Persistence strategies to be added with the same names. For instance, some users may opt to use a different database for future states, or disable historical or future values entirely for items where they would be considered erroneous.

Therefore, I am thinking that adding timestamps both to the State and to Events may seem like more work, but it may help with ingestion and persistence of historical/future states. With the added benefit that it will make it easier (and more performant) for users to access lastUpdate in rules and the UI (with some extra work there) without reaching for persistence extensions or creating DateTime items.

(The mermaid diagram is here for anyone wanting to make changes)

@J-N-K
Member

J-N-K commented Apr 9, 2023

@splatch Can't fix stupid. You can already create rules that run endlessly and also create two rules that always call each other. IMO applying a profile that does such things just makes no sense (while a profile that converts values or data types makes sense and is not critical).

@henfiber Conceptually it would make more sense to couple the timestamp with the state rather than with the event, especially since we then would not need any additional methods. States without timestamps are considered to be "now" (like it is at the moment) and everyone not interested in future or past events can simply discard every event that has a timestamp. Items would only set states that have no timestamp.

That aside: We don't need new events. Since #3141 we differentiate between

  • item shall update: ItemStateEvent
  • item has updated: ItemStateUpdatedEvent
  • item has changed: ItemStateChangedEvent

so using ItemStateEvent would be no issue. The item can then decide whether it uses the event to update its state (like it does now, if the state can be converted) or just passes it to the listeners (like persistence services).

@kaikreuzer Bulk updates are another issue; we currently have no means to do them (although I see a benefit in that). This should really be a new event type, but there is another issue: if we don't couple timestamps with states but with events, we can't transport states for different timestamps in one event.

@jlaur
Contributor

jlaur commented Apr 9, 2023

I don't think it's possible to leave events out here, because events are the way the persistence manager gets the data it passes to the persistence services.

The implementation in this PR actually shortcuts the persistence manager and reaches out directly to the modifiable persistence services. I'm not yet decided whether this is good or bad, but it raises the general question of whether we should consider persistence configurations for such "binding provided" time-bound states. What if the user has two modifiable persistence services configured, but only wants to use one of them for future prices?

I'm not sure if I understand this correctly, but shouldn't this work pretty much like now, i.e. the persistence service configured for item price (which is a future price if timestamp is in the future) will receive those time-bound states?

My doubt is if we are considering getting anything back from persistence, i.e. like restoreOnStartup? I would assume restoreOnStartup to be able to push back time-bound states as well, similar to what bindings would do.

But besides that, shouldn't we be able to handle time-bound states completely independent of persistence services? If a binding would update an item not persisted with a state one hour into the future, this should work. But what will happen in an hour, will there be an event that the (current) item value changed? If not, all bindings would have to implement this by themselves. And if yes, then how will it work? Will it be feasible to keep queued future states in memory, or do we need persistence?

The persistence we might need here I see as something different from the persistence the user has configured for items. This is persistence needed by the event bus to keep track of future state updates, and only to preserve memory. Also it should work for all items whether persisted or not by persistence services. When openHAB is restarted, we can flush all this. It may be repopulated by bindings or by persistence services when using restoreOnStartup, just like the current states are repopulated or restored today.

Sorry if I misunderstood something, just want to understand the thinking here and be sure we are not coupling existing persistence services too much with the need for the event bus to be updated when future states become present.

@J-N-K
Member

J-N-K commented Apr 10, 2023

A possible way would be that the item keeps track of future states and schedules its own update for the time that the state is for. When restoreOnStartup is used, we could then also restore the future states from persistence (which is easy, because it is a HistoricItem that already contains a timestamp).

However, for calculations like "when is the best time to start the dish washer" we would still probably need persistence.
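Roughly sketched (illustration only, not the actual implementation; class and method names are invented), the scheduling part could look like this:

import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Illustration only: an item-side helper that remembers future states and applies
// each one once its timestamp is reached (in the real framework this would call
// something like GenericItem#setState).
class FutureStateScheduler {

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Consumer<String> applyState;

    FutureStateScheduler(Consumer<String> applyState) {
        this.applyState = applyState;
    }

    void schedule(Instant timestamp, String state) {
        long delayMillis = Math.max(0L, Duration.between(Instant.now(), timestamp).toMillis());
        scheduler.schedule(() -> applyState.accept(state), delayMillis, TimeUnit.MILLISECONDS);
    }
}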

@masipila

masipila commented Apr 10, 2023

A possible way would be that the item keeps track of future states and schedules its own update for the time that the state is for. When restoreOnStartup is used, we could then also restore the future states from persistence (which is easy, because it is a HistoricItem that already contains a timestamp).

This would be an ideal solution in my opinion. It would allow, for example, using the same spot price Item for a "price right now" UI element. At the moment, I have an hourly Rule that reads the price for the current hour from my database and updates the "spot price" Item.

However, for calculations like "when is the best time to start the dish washer" we would still probably need persistence.

Yes, I think so. At least I am very curious to do analysis of the "control points" vs. "actuals".

@jlaur
Contributor

jlaur commented Apr 10, 2023

A possible way would be that the item keeps track of future states and schedules its own update for the time that the state is for.

Exactly. This is similar to what I currently (have to) do manually in the EDS binding:
https://github.com/openhab/openhab-addons/pull/14376/files#diff-ba1eee4670519df1d485d8cc46217096018fb204bd597ea8d6237ad27ef457e7R492-R497

When restoreOnStartup is used, we could then also restore the future states from persistence (which is easy, because it is a HistoricItem that already contains a timestamp).

Yes, so they will then also keep track of future states and schedule updates after being restored.

However, for calculations like "when is the best time to start the dish washer" we would still probably need persistence.

Definitely. Comments above were meant for the event bus. Here we need access to persistence from the Energy Management Service.

@J-N-K
Member

J-N-K commented Apr 10, 2023

This PR already contains parts of what we discussed, but there is also a lot missing (e.g. the scheduling and restoring part). And of course it should not be "Historic" if future states are also processed. I'm willing to contribute parts of the missing code, but ONLY if there is agreement on the design. I won't again put something up for discussion, implement the result and then rewrite everything because the overall design is questioned.

@henfiber

henfiber commented Apr 11, 2023

@J-N-K it seems that many people have an interest in this initiative from slightly different angles, so I think you're right to ask for some consensus before spending time on this. I will be happy to contribute (testing, updating docs, code if possible, etc.) regardless of the decisions. That said, I have some comments.

Let me summarize what I have understood as your current proposal:

  • A timestamp will be coupled to state. We are going to have a new TimeBoundState (or similarly-named) object having both a state and a dateTime parameter.
  • The timestamp may also be in the future, so we will not use the name "Historic".
  • No new event types will be created. ItemStateEvent will play the role of notifying consumers that a new State was received even if the Item state was not updated.

Please correct me if I have misunderstood something.

My questions/comments on that:

1. Will a binding be able to send a timestamp with every state and make the items (indirectly) updated with this timestamp-state pair?

From your previous comment, I understand that this won't be the case:

States without timestamps are considered to be "now" (like it is at the moment) and everyone not interested in future or past events can simply discard every event that has a timestamp. Items would only set states that have no timestamp.

In my view, Items should also set states that have a timestamp. With the condition that the timestamp is newer than the previously received timestamp (stored in TimeBoundState) and is not in the future.

If Items only set states without a timestamp, then:

  • A binding/rule that wants to send N timestamped measurements for the last hour to an openHAB item will have to send N-1 measurements with a timestamp and the last one without a timestamp in order to set it as the current state.
    • This leads to inconsistency, since the N-1 measurements will be persisted with the device time and the last measurement will be persisted with the openHAB or database server time.
  • Or the binding can send all N timestamped measurements and also send the latest (current) state without a timestamp so that the Item state is also updated.
    • This will introduce an erroneous duplicate in the database.
  • If the binding receives and forwards timestamped states one by one to OH, the item will not be updated unless it discards the timestamp or sends duplicate states (with and without a timestamp).

So, it will not be possible to both send timestamped measurements and set the current state at once without introducing inconsistency or duplicates.

2. Will the timestamp(s) be available to SSE consumers?

If the timestamp will only be coupled with the State and not with events, will SSE consumers have a way to receive the timestamp with the stream of state updates?

For a Java binding it may not be an issue to get the timestamp from the registry, but it may be an issue for SSE consumers like Habapp or the UI if they need to make an extra HTTP request to get the timestamp on every change.

I have read your concerns regarding event timestamps breaking the strict order in the SSE stream, but I don't see why OpenHAB should change the order. It should continue emitting the events in the incoming order. The timestamps would be just reported as an extra attribute. It is up to the SSE consumer to re-order the events (by introducing latency/buffering) if an application has such a requirement.

3. Should we save the timestamp only when set by a binding?

I would argue that it would be useful to store the timestamp even if a binding did not include one with the state. In this case, it would be the ingestion time, i.e. the system clock time ("now") when the new state was received.

With the suggested (and current) approach the ingestion time is not stored at all, so we have to use the consumer's time. Persistence services are one of the consumers, and some of them use their own now() function (not even the consumer's Java time). Another use case where this is not optimal is remote openHAB instances, where in the absence of a stored ingestion timestamp states will always be reported as "now" on reconnects.

An even greater issue is when someone opens the openHAB dashboard after some time or makes a REST API call to /items: there will be no way to get the information when the item state was updated. One could argue that you can get the timestamps from persistence, but there are cases where this will be sub-optimal at best. For instance, if someone wants to make a "monitoring" page listing all of their items with the last update, they would have to make persistence queries for each one (maybe hundreds). The same goes for rules with a similar role that run on a schedule.

Finally, the absence of a stored ingestion timestamp breaks "idempotence". i.e. the state cannot be uniquely identified by its value and we cannot enforce at-most-once or exactly-once semantics if requested or forwarded multiple times. The need for idempotence arises in distributed scenarios such as syncing with remote OH instances or MQTT brokers, or if we added for instance a streaming/queue persistence service such as Kafka.

In contrast, if the timestamp was always saved (in the existing State object or elsewhere) it could be always made available to any interested consumer.


Now, all that said (and apologies for the long comment), even if this is implemented as suggested, it will be an improvement over the existing situation.

Just wanted to share some concerns in case others have similar use cases.

@kaikreuzer
Member

For me it feels like this discussion is going in a fairly wrong direction, and I do not agree with the summary of @henfiber, sorry.
Wrt @jlaur's question on whether or not we need persistence: The PR title says "Enable binding to store historic states" and its description mentions "logic to route [...] through the core to the persistence layer and any ModifiablePersistenceService". This is imho a fairly "simple" requirement. Changing this to "an item must not only have a state, but needs to keep its own history (both past and future) in memory" is imho a very different kind of beast and I would try to avoid opening such a can of worms. So my answer to your question is simple: Yes, we need a ModifiablePersistenceService in order to provide the feature.

We already had different options discussed above and for various reasons, I still prefer option 3.

For the requirement that this PR wants to address, I do not see the need to have timestamps added to existing state update events. This would introduce many issues that were already mentioned above (like ordering, measurement time vs. processing time, the fact that persistence services persist states, but not events, bulk update handling, ...) and I believe this is a fully independent discussion that should not be in scope here.

Please also note that imho the requirement in this PR is also primarily about bulk updates of history. Be it missing history that is recovered and added in bulk from a cloud service, the prices of electricity for every hour of the next day, or the weather forecast for the next 7 days. In all these use cases, single events do not make sense on their own; what would be needed is an event that a certain historic (past or future) time range has just been updated - that's why I suggested a new event type that is issued by the persistence manager.

@J-N-K
Member

J-N-K commented Apr 13, 2023

I also don't think we need timestamps on events. I think we need to find a way to couple timestamps and states, and probably also (as @kaikreuzer correctly points out) a way to group several of these "combined states" for bulk updates. None of that requires timestamps for events, and if I understood correctly what @kaikreuzer is saying, I also agree that we should probably not re-use the existing events for that, and I'm fine with using "option 3" if the item and persistence manager are not bypassed by these methods.

But I think we cannot separate the discussion here from the "full picture" (maybe it would have been better to start a discussion independent of the code here). Historic data and forecasts are quite similar in that they don't represent the current state of an entity but the state of that entity at a different point in time. Inserting them into persistence is only half the way to using them (it's fine for historic events, but not sufficient for forecasts). I'm not saying that the "second half" needs to be addressed in this PR (or even in the very near future), but we should have a picture of what the design should look like, so we don't introduce something now that we have to revert or break because we find out that we forgot something very important that we need to use the forecasts for our energy management system.

We already have a coupling of states and timestamps, the HistoricItem (which is a bit weird in naming when we talk about forecasts, but we can probably live with that). What we would need is a HistoricItemFactory (which we currently don't have) that the BaseThingHandler can use. We can then use a postUpdate(ChannelUID, List<HistoricItem>) method to inform the callback about these values. The callback should then route this list in the same direction as the state from postUpdate(ChannelUID, State). The CommunicationManager should pass it to the item. We can add a default method to the StateProfile that applies the profile to all values of the list before passing them to the item. A new method setTimedStates(List<HistoricItem>) is added to the GenericItem that in a first step only notifies listeners and thereby puts the values into persistence (because the persistence manager is a listener to the items).

This is what should be covered here. In a follow-up PR we can add filters to persistence services so that different persistence series attached to the same item can be used for current states and the timed states and (if we agreed that it is desirable) an update/restore logic for future values for the item.

@spacemanspiff2007
Contributor

But I think we cannot separate the discussion here from the "full picture" (maybe it would have been better to start a discussion independent of the code here). Historic data and forecasts are quite similar in that they don't represent the current state of an entity but the state of that entity at a different point in time. Inserting them into persistence is only half the way to using them (it's fine for historic events, but not sufficient for forecasts). I'm not saying that the "second half" needs to be addressed in this PR (or even in the very near future), but we should have a picture of what the design should look like, so we don't introduce something now that we have to revert or break because we find out that we forgot something very important that we need to use the forecasts for our energy management system.

I think that's a good idea.
Maybe take a step back and describe how everything should work from a feature/user perspective and then see what needs to be done.

It's trivial to have a binding generate a timestamp -> value table. But then what?
I'd expect this to go through the event bus somehow (and consequently the RestAPI consumers).
Then an item will receive this table. What will the state of the item be?
Will the state of the item automatically change according to the table as time moves on?
If persistence shall override already existing values there will be code changes needed for almost all persistence services.
How many values from the future will be restored for the item state? The whole table?
Can I feed the databases future values myself and have openHAB automatically restore them?
Will the item automatically restore from the future states in the database or will there be the need for a rule for every item?

Or will this still behave like a normal item, and will there be a function to query the current value from persistence which can only be used in rules?

@kaikreuzer
Member

Inserting them into persistence is only half the way to using them (it's fine for historic events, but not sufficient for forecasts)

Where do you see a difference here? (Btw, ChatGPT suggested using the expression "time-series data", as it covers historic data and forecasted data equally well.)
My idea was to have the new "time-series update" event be published by the persistence manager, and this could provide a time range within which data has been changed. This could be past or future or even a range from past to future, so there wouldn't be a need to differentiate at all. Wrt potential use cases, I would also think that this should be what is needed to inform anyone interested. If we think in the future that we additionally need the concrete updated values to be sent with the event as well, this is something that can be added at a later point, and I do not see the risk that we would have to revert anything.

What we would need is a HistoricItemFactory (which we currently don't have) that the BaseThingHandler can use.

To avoid the "weird" name, we might also think about a TimeSeriesFactory, which could create structs that directly relate many time-based values to a single item. The call postUpdate(ChannelUID, TimeSeries) would then unambiguously refer to the (or even multiple) item(s) that are linked to the channel and not leave room for passing updates for different items in a single call (noting that the handler itself should not even know about the items that are linked to the channel).
The method GenericItem.setTimedStates(List<HistoricItem>) would then likewise become GenericItem.setTimeSeries(TimeSeries).

In general, I like your proposal to pass it this way through the CommunicationManager, the GenericItem and through a listener to the PersistenceManager - it feels to me as if we are getting closer to a solution that shouldn't be too hard to implement and that fits into the existing architecture.

@J-N-K
Member

J-N-K commented Apr 14, 2023

TimeSeries sounds like a good name and I agree that it would be better to use something like that than a List<Something> because we can then change the internals whenever we like. We don't need a factory then, we can just use the TimeSeries constructor. The class could look similar to this (with proper .equals and .toString methods):

public class TimeSeries {
    private final List<Entry> states = new ArrayList<>();

    public void add(Instant timestamp, State state) {
        states.add(new Entry(timestamp, state));
    }
    
    public Stream<Entry> getStates() {
        return List.copyOf(states).stream();
    }

    public record Entry(Instant timestamp, State state) {
    }
}

WDYT?

Regarding the difference: for historic data we have good coverage of everything we need in the PersistenceExtensions; we can retrieve and analyze data, and the state with the latest timestamp is always the state that needs to be restored on startup.

If we allow future states in the persistence, how do we determine the state that we restore, if we want to automatically update the item when we reach the respective time? None of this would affect the persistence itself, but I would like to discuss as much of the design as possible before we add code, since there might be something that you or me don't think of at the moment which needs fundamental changes - even though I don't really expect that.

So my summary would be:

  1. We add a new class TimeSeries that can be used to transport a list of timestamp/state tuples.
  2. We route this via new methods in the ThingHandlerCallback and the CommunicationManager to the GenericItem which then informs its listeners after applying whatever it would do with a simple State.
  3. We add a new ItemTimeSeriesUpdatedEvent (with similar content to the ItemStateUpdatedEvent, with converted units, ...).

Open questions:

  1. Should we add a new type of listener (TimeSeriesListener), or do we just extend the StateChangeListener? I think a new listener would be better, because the StateChangeListener really deals with single states that are updated and changed.
  2. Do we need another type of event (ItemTimeSeriesEvent) to allow the REST API or websockets to send a TimeSeries? I would say yes.
  3. a) Which value shall be restored if there are historic and future values? I think the last value that is before now is correct. b) Shall this be handled by the persistence manager or by the individual persistence services?
  4. Shall the item auto-update its state according to the persisted forecast? IMO yes, because using the same item for forecast and actual value makes no sense.

The actual implementation of 4 can be discussed in the future. If the answer to 3b is "the persistence service" then it needs more changes in the services.
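For illustration, a producer could then hand such a series to the framework roughly like this (a fragment; the callback method name, channel and values are invented, and only the TimeSeries class above is taken from the proposal):

// assumes java.time.Instant, java.time.temporal.ChronoUnit and
// org.openhab.core.library.types.DecimalType are imported
TimeSeries prices = new TimeSeries();
Instant nextFullHour = Instant.now().truncatedTo(ChronoUnit.HOURS).plus(1, ChronoUnit.HOURS);
for (int hour = 0; hour < 24; hour++) {
    // one made-up forecast value per hour for the next day
    prices.add(nextFullHour.plus(hour, ChronoUnit.HOURS), new DecimalType(0.25 + 0.01 * hour));
}
// hand over to the framework; "sendTimeSeries" is an assumed name for the new callback method
callback.sendTimeSeries(channelUID, prices);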

@kaikreuzer
Member

WDYT?

TimeSeries class looks good. 👍

how do we determine the state that we restore, if we want to automatically update the item when we reach the respective time

I would see this as a responsibility of the PersistenceManager - it could query whether there are any future states stored for an item and do a query for the very first of them in order to keep this in memory for sending a state update event when the time has come. If new future states are stored, the PersistenceManager is the first to know and can update its internal memory map.

So my summary would be

Sounds good to me.

Open questions:

  1. I'd also vote for a new listener interface.
  2. Yes
  3. a) Clearly always the latest before now. That's the whole idea about this persistence strategy.
    b) Not sure if much needs to be changed about the current implementation. We can have a closer look and decide.
  4. Yes. That's actually what I described above.

@J-N-K
Member

J-N-K commented Apr 14, 2023

Looks like we have the same picture now, nice. @altaroca Do you also agree and would you be willing to adapt the PR accordingly?

@henfiber

@J-N-K, @kaikreuzer I understand that the discussion regarding the implementation of this feature is nearing its final decision, and I respect the expertise and knowledge that you and the other core maintainers bring to the table.

However, I would like to kindly request a reconsideration of the concerns I raised in my last comment, as I believe they could lead to a more robust and effective solution. I will rephrase and shorten them a bit here:

  1. Current State should also be affected by a postUpdate(ChannelUID, TimeSeries) call (under conditions listed in my previous comment) if we want to have a consistent Persistence service without mixed clock sources, ignored timestamps or duplicates.
  2. External Event Listeners may be interested in getting this new information as well.
  3. Current State is not special; it is actually a specific Historic state which should have a unique recorded timestamp replicated in persistence.

If I should rank their relative importance I would go with 1 > 3 > 2. So, we may ignore the 2nd one for now, i.e. attaching timestamps to events. The 3rd one can also be addressed separately although I think it is closely related to what we are discussing here.

Regarding the 1st concern though, I have listed a common case which I believe cannot be handled effectively if the GenericItem ignores the TimeSeries payload and just passes it to Persistence. So, I would greatly appreciate it if you could review this and check whether the proposed implementation addresses it.

I apologize if I come across as overly persistent. I am genuinely interested in this feature, so I am trying to provide feedback in the spirit of what @J-N-K mentioned:

don't introduce something now that we have to revert or break because we find out that we forgot something very important

discuss as much of the design as possible before we add code, since there might be something that you or me don't think of at the moment which needs fundamental changes

@J-N-K
Member

J-N-K commented Apr 15, 2023

@henfiber

  1. What kind of state change would you expect if the data is only for the past? And what if it is only for the future? We already agreed on setting the item to the correct state when the time that is associated with the state has come.
  2. What event listeners are you thinking of? With the ItemTimeSeriesUpdatedEvent the information is on the event bus. Whoever is interested can subscribe to that event and should get the information.
  3. I don't understand what you mean? We already set the state if it is the current state and we also persist it - if configured.

@henfiber

henfiber commented Apr 18, 2023

@J-N-K

  2. What event listeners are you thinking of? With the ItemTimeSeriesUpdatedEvent the information is on the event bus. Whoever is interested can subscribe to that event and should get the information.

If the ItemTimeSeriesUpdatedEvent will be made available to external consumers (e.g. Habapp through SSE), I am already covered; that would be great.

  1. What kind of state change would you expect if the data is only for the past? And what if it is only for the future? We already agreed on setting the item to the correct state when the time that is associated with the state has come.

I think you have covered the future states situation nicely.

What I am more concerned about is the "gray" area between what is considered "past" and what "current" state. For me the "current" (virtual) state is the latest state we have received before "now" (with or without timestamps attached).

EDIT: Just noticed that in your previous comment you mentioned something similar for restoreOnStartup:

3.a) Which value shall be restored if there are historic and future values? I think the last value that is before now is correct.

This is exactly my point: the last value before "now" should be the state, not only when restoring state from persistence, but also when updating state from TimeSeries updates.

I will explain below my perspective about the issues arising when Items only accept states without timestamps.

How historical state is usually retrieved

Devices or services that keep a local history of measurements, do not usually separate between "past" and "current" data. They forward, export or return the last measurements in one of the following ways:

  • Return the X last measurements for the <last Y period> in batch mode (including the current/latest state, i.e. the latest sample)
  • Queue up to X last measurements when communication becomes unavailable and send them in a FIFO manner when communication is restored. This includes the latest ("current") state.

I don't know of any device that sends only "past" data (i.e. excluding the current, latest state), unless it has an API where you can explicitly add a filter argument to do so.

Specific example: OwnTracks

Let's look at a specific example, OwnTracks, and see how it operates and what happens depending on how we handle historical states.

The OwnTracks json payload includes a tst field which contains the UNIX epoch timestamp of the GPS fix. As mentioned in the documentation:

In the event that the endpoint is unreachable, the payload will be queued and posted at a later time.
we queue messages many thousands on the app until connectivity can be established

Additionally:

Outgoing messages may only be sent in batched intervals. Location updates or event messages may be delayed.

So,

  • OwnTracks queues messages and sends them in a FIFO manner when communication becomes available.
  • the GPS-fix timestamps are important since you cannot assume that a received location update occurred "now".

Let's see the sequence of events:

We have an OpenHAB item Bob_Location which is indirectly mapped through a binding to an OwnTracks device tracker.

OwnTracks detects a location update (at 2023-04-18 13:38:25) and queues the message due to a lack of connectivity:

{
  "_type": "location",
  "tid": "BOB",
  "lat": 48.85833,
  "lon": 2.29513,
  "tst": 1681825105
}

OwnTracks detects a second location update (at 2023-04-18 13:39:25):

{
  "_type": "location",
  "tid": "BOB",
  "lat": 48.85950,
  "lon": 2.29640,
  "tst": 1681825165
}

OwnTracks now has connectivity, so it first sends the previously queued message and then the latest one (FIFO).

What should be the in-memory and persisted states of Bob_Location after these two messages?

I would expect that:

  • The Bob_Location Item has the state 48.85950, 2.29640 since this is the most recent location update.
  • There are two new records in persistence with the timestamps and coordinates of the two messages.

[Diagram: oh-historical-persistence-from-bindings-desired-situation]


What will be the resulting OH state with currently accepted implementation plan?

From my understanding:

“Items would only set states that have no timestamp”.

i.e. TimeSeries updates are only concerned with Past or Future states but not the current state.

So we end up with this situation:

[Diagram: oh-historical-persistence-from-bindings-proposed-no-state-update]

The Bob_Location Item is not updated and does not match the latest state we received from OwnTracks.


Since it does not make sense to leave the Bob_Location item unchanged (the location has changed), an extra postUpdate call without a timestamp has to be made, which will lead to the following situation with duplicate persisted states and mixed clocks:

[Diagram: oh-historical-persistence-from-bindings-proposed-plus-extra-call]


If we want to avoid the duplicates, then we somehow need to separate the last ("current") value from the "past" (queued) values.

So we end up with this situation:

[Diagram: oh-historical-persistence-from-bindings-proposed-n-minus-one]

Unfortunately, this is not possible for FIFO (non-batch) updates since the binding cannot know which update is the latest. And is not ideal from a consistency perspective since we end up with mixed clock sources.


Finally, we can completely ignore the timestamps and only send postUpdate calls w/o timestamps which defeats the purpose of this feature since we are coming back to the present situation:

[Diagram: oh-historical-persistence-from-bindings-current-situation]


Therefore, what I advocate for, is the situation in the first diagram which is accomplished by updating the state of the Bob_Location item, not only from states without timestamps but also from States with timestamps, when the msg.timestamp > State.timestamp && msg.timestamp < now() (the condition is also shown in the flow chart in one of my previous comments) .

Which seems a reasonable expectation to me since an OpenHAB item's state represents the most recent received state.
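Expressed as a minimal sketch (names are illustrative, not openHAB API), the acceptance rule described above would be:

// Apply an incoming timestamped state to the item only if it is newer than the
// item's last update and not in the future; older or future samples would go to
// persistence only.
boolean shouldUpdateItemState(Instant incoming, Instant lastUpdate, Instant now) {
    return incoming.isAfter(lastUpdate) && !incoming.isAfter(now);
}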


Appendix: Devices that may operate in a similar manner to OwnTracks

  1. Environmental monitoring sensors (temperature, humidity, air quality, soil moisture etc) which often operate on battery power and need to conserve energy. They may wake up periodically, take readings, timestamp them, and store the data in a local buffer. When communication is available, they send the buffered data in batch mode or one-by-one (FIFO).
  2. Wearables: Devices like fitness trackers, smartwatches (heart rate, step count etc.) buffer the data with timestamps and synchronize with a cloud service or mobile device when communication is available, either through Bluetooth, Wi-Fi, or other means.
  3. Asset tracking devices for vehicles or other objects which may store the location data along with timestamps in a local buffer and transmit the buffered data when in range of a communication network, such as cellular or Wi-Fi.
  4. Some Smart home devices such as smart thermostats (e.g. the Nest Learning thermostat which keeps local history), Air Conditioners which keep consumed energy history (see Panasonic Comfort Cloud binding), and Alarm panels which have local memory.
  5. DIY sensors or wearables, such as a ESP32-based wearable device with a heart rate sensor (e.g., MAX30102) which can be programmed to operate in this way.
  6. Mobile apps such as OwnTracks, which I described above. In general, any moving IoT devices which may temporarily lose connectivity.
  7. Other specific devices and services mentioned in the forum such as GROHE ONDUS and the E3DC binding which triggered this Pull Request.

Regarding (3)

Current State is not special; it is actually a specific Historic state which should have a unique recorded timestamp replicated in persistence.
I don't understand what you mean? We already set the state if it is the current state and we also persist it - if configured.

I will try to explain it better in a subsequent comment.


Apologies again for my long comment, hopefully you will find it helpful.

@J-N-K
Member

J-N-K commented Apr 20, 2023

I don't think this is an issue. The PersistenceManager can check whether the timestamp of the persisted value closest to now (in the past) is newer or older than an element in the time series and issue an ItemStateEvent accordingly; we only have to find a way to veto persistence for this value, so it isn't duplicated in the persisted data.

@masipila

masipila commented Jul 2, 2023

Hi all,

I thought I'd drop a note to the others in this thread that I'm most likely not able to contribute much to the energy optimizations after #3597 lands, even though I was earlier showing strong interest in contributing and generalizing these things to be as easy as possible for users without a developer background.

We got fantastic news a couple of weeks ago that the Finnish Olympic Committee will support our curling team next winter season on our run to Milano Cortina 2026. This will mean that I will work less to be able to curl more, which means that when I'm home I want to spend that time with the family.

Cheers,
Markus

@kaikreuzer
Member

Hey @masipila, that's an awesome opportunity for you! All the best for your preparations, and whenever you feel you need a break from curling and want to dive into some code instead, please think of us!

@openhab-bot
Collaborator

This pull request has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/control-a-water-heater-and-ground-source-heat-pump-based-on-cheap-hours-of-spot-priced-electricity/136566/255

@openhab-bot
Collaborator

This pull request has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/entso-e-binding-for-nordpool-spot-prices/143833/19

@J-N-K
Member

J-N-K commented Nov 12, 2023

Superseded by #3597

@J-N-K J-N-K closed this Nov 12, 2023
Labels: bounty, enhancement (An enhancement or new feature of the Core), persistence

Successfully merging this pull request may close these issues.

Historic state updates from bindings