Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

start_time broken with option hd_time_flags=1 #756

Closed
chr-sy opened this issue Aug 24, 2022 · 15 comments
Closed

start_time broken with option hd_time_flags=1 #756

chr-sy opened this issue Aug 24, 2022 · 15 comments

Comments

@chr-sy
Copy link
Contributor

chr-sy commented Aug 24, 2022

I think there is another problem with start times which renders this function kind of broken.

The problem occurs for cases with hd_time_flags=1 (i.e. "local time") set.

According to the MDF spec, activating this bit means that "the start time stamp in nanoseconds represents the local time instead of the UTC time".
In contrast, asammdf produces the respective value (HeaderBlock.abs_time) by means of the datetime.datetime.timestamp() function (within the start_time setter function). This function produces an UTC timestamp (i.e. ns from Unix epoch in UTC) from the start time (which contradicts the MDF spec!)
During asammdf read, this value is processed via the datetime.datetime.fromtimestamp() function, which takes UTC timestamps and converts the resulting datetime to the local time zone.

Thus, times are processed correctly as long as you only use asammdf for read and write operations. Problems arise once you read files generated from 3rd party tools with asammdf. If these tools implement the MDF spec correctly, they write "local" timestamps to the file, which are interpreted as UTC timestamps by asammdf (or rather the underlying standard libs). As a result, time zone differences are added twice resulting in wrong times (as can be seen in the screenshots above).

Now how to resolve this issue?

I'd suggest to add some simple steps for cases with hd_time_flags=1:

  1. start_time setter should interpret all times as UTC (i.e. replace the tzinfo by utc without conversion so that it is not converted automatically) before calling the timestamp() function.
  2. start_time getter should call the fromtimestamp() function with parameter tz=utc and replace the tzinfo of the resulting datetime by the local timezone without conversion (EDIT: In fact, it would be more sincere to remove it completely).

Originally posted by @chr-sy in #675 (comment)

@chr-sy
Copy link
Contributor Author

chr-sy commented Oct 25, 2022

At first, thank you for taking up this issue!
As far as I understand, this commit does not really affect the underlying problem. As I tried to figure out above, there is a contradiction between the mdf spec requesting time stamps in "local time" for hd_time_flag=1 and the asammdf code providing time stamps in utc instead. This is still unsolved...

@danielhrisca
Copy link
Owner

I can't see what else could be done

@chr-sy
Copy link
Contributor Author

chr-sy commented Apr 26, 2023

Sorry for the late reaction, I was busy with other projects...

I just tried the topic described above with version asammdf-7.3.12. The problem still exists:

Please try to run the following code:

import asammdf
import datetime
mdf_file = MDF(version="4.10")
start_time = datetime.datetime(2023, 4, 26, 8, 0, 0)
mdf_file.header.start_time = start_time
mdf_file.save("./local_time_test.mf4", overwrite=True)

According to https://www.epochconverter.com I would expect start_time to be 1682496000 s for the resulting file (tz-naive conversion into timestamp as defined in the MDF spec).
However, asammdf converts the time to 1682488800 s and thus violates the MDF spec.

The reason for this behaviour is (as described above) that asammdf converts the start time from the (guessed) local timezone to UTC in the start_time setter and back from UTC to the (once again guessed) local timezone in the start_time getter. This is not a problem as long as files are written and read only via asammdf (and only written and read in the same timezone) but may fail for 3rd party files which comply with the MDF spec.

The solution would be to write the following timestamp to the header to suppress timezone conversion for tz-naive start_times:
datetime.datetime.timestamp(pytz.utc.localize(start_time))

Thank you very much for your efforts in writing and maintaining this package!

@danielhrisca
Copy link
Owner

danielhrisca commented Apr 26, 2023

When I run this on my PC

import asammdf
import datetime
mdf_file = asammdf.MDF(version="4.10")
start_time = datetime.datetime(2023, 4, 26, 8, 0, 0)
print(start_time.timestamp())
mdf_file.header.start_time = start_time
print(mdf_file.header.start_time.timestamp())
print(mdf_file.header.abs_time)
mdf_file.save(r"D:\TMP\local_time_test.mf4", overwrite=True)

I get the following output

1682485200.0
1682485200.0
1682485200000000000

and the site gives this

image

@chr-sy
Copy link
Contributor Author

chr-sy commented Apr 26, 2023

That is exactly what I wanted to show you. Timestamps are always UTC!

You convert your local datetime.datetime(2023, 4, 26, 8, 0, 0) to 1682485200 which is 2023-04-26 05:00 UTC (so I assume that you are in UTC+3h).

If I opened your mf4 file, this timestamp would give me a local datetime.datetime(2023, 4, 26, 7, 0, 0) because I am in UTC+2h.

As I read the MDF spec, this is NOT the desired behaviour:
grafik
For me, that means that start_time contains a timestamp which is read "as is", independent of the local time zone (more or less like a string date) and always gives the same result. Therefore, tz-naive datetimes would have to be handled as UTC datetimes because that's the only way to avoid any conversion.

@danielhrisca
Copy link
Owner

I can't wrap my head around this UTC/localtime conversion shit. Can you open a PR with a fix?

@chr-sy
Copy link
Contributor Author

chr-sy commented Apr 27, 2023

I will try...

@chr-sy
Copy link
Contributor Author

chr-sy commented Apr 27, 2023

Just created the PR (#857). Maybe there are more locations which need adjustment (file history, string representation, ...).

EDIT: just added another commit with changes to file history time_stamp getter and setter and start_time_string function.

If you want to see the effect, please open mf4 files from other sources than your asammdf lib (e.g. from the Vector MDF4 lib) which use hd_flags = 1.

@markwaterbury
Copy link

I am also seeing this behavior.

My understanding is that any file with time_flags == 1 will have a naive time object (ie. no timezone/DST information included), but the timezone should be interpreted as the local timezone. The MDF Validator displays this in a comprehensible way:
image
Timezone and DST are unknown, but the timezone is assumed to be the local timezone. The time should be represented as 08:52:55 with the timezone being the local timezone.

However, asammdf takes this naive time, and converts it from an assumed UTC timezone to the local time, offsetting the time to 03:52:55, which is not correct, although it now has the local timezone, which is correct.
image

As a workaround, I have to do the following somewhat convoluted method to get this time into the proper format from what asammdf returns:
image
.astimezone(timezone.utc) offset the time back to its original value
.replace(tzinfo=None) switch back to a naive time object
.astimezone() add the local timezone to the object without changing the value

I believe the proper way to handle this type of time (with time_flags == 1) is to simply take the naive time, and add the local timezone to it with .astimezone()
Example:
datetime.now() returns a naive time
datetime.now().astimezone() returns the same time value, but with the local timezone set on the object

Hopefully this helps to clarify what needs to be fixed.

@chr-sy
Copy link
Contributor Author

chr-sy commented Mar 25, 2024

@markwaterbury: Did you see PR #857? There I tried to implement exactly the same:
In my understanding of the mdf spec, the use of hd_time_flags = 1 should be limited to very special cases ("Should only be used if UTC time is unknown", i.e. your code/system/... doesn't know your local time zone, i.e. you are working with tz-naive time information). Thus, I tried to implement the start_time getter and setter functions to accept/provide tz-naive datetime objects, respectively. Converting those into any timezone should be done outside of asammdf in my opinion.

@markwaterbury
Copy link

I did see your PR. It seemed like there might have still been some confusion from @danielhrisca, so I was just trying to provide additional context for this issue (which seems like it should still be open).

@danielhrisca
Copy link
Owner

Hello @chr-sy @markwaterbury and sorry for the long delay. I've merged the PR into the development branch. Thanks!

@markwaterbury
Copy link

This is working from the standpoint of not offsetting the time from UTC now. However, I just saw the following from above:

I tried to implement the start_time getter and setter functions to accept/provide tz-naive datetime objects

and I see that in the development branch, there is no timezone data on the start_time when time_flags==1 now. I think the behavior it had before (tzinfo==tzlocal()) should be retained.

Example, time should be 08:52:55 with the local timezone assumed:
7.4.2 version (tz correct, time wrong):
datetime.datetime(2024, 2, 7, 3, 52, 55, tzinfo=tzlocal())
Development branch (time correct, but tz missing):
datetime.datetime(2024, 2, 7, 8, 52, 55)
start_time should return:
datetime.datetime(2024, 2, 7, 8, 52, 55, tzinfo=tzlocal())

@chr-sy
Copy link
Contributor Author

chr-sy commented May 22, 2024

Sorry @markwaterbury, I disagree about that!

If you have any way to get a valid tzlocal, why should you use time_flags=1? Please bear in mind that, according to the MDF spec, "(time_flags=1) should only be used if UTC time is unknown" (see Table 14: Structure of HDBLOCK). In my understanding, it is pretty straightforward to get a tz naive timestamp in such a case because a tz naive input is the only scenario to use time_flags=1.

@markwaterbury
Copy link

Bit 0: Local time flag
If set, the start time stamp in nanoseconds represents the local time instead of the UTC time

When the file was created, there was no way to get the local timezone, because it (and UTC time) were unknown. When I read the above from the spec, it seems that local time would have to be assumed whenever or wherever the file is read, even if that is different than the (unknown) timezone in which the file was created. I thought the best way to do this would be to assign tzlocal(). This seems to be how MDFValidator handles this situation as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants