Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extractor: fix reading of null-terminated strings #157

Closed

Conversation

Fedr
Copy link

@Fedr Fedr commented Oct 18, 2024

If a string field has some noise after zero-termination, e.g.

Noise after zero

the size of string is determined incorrectly, and the noise is included in it.

This pull-request resolves it.

@abellgithub
Copy link
Collaborator

If you have found a bug in decoding/encoding LAZ files, please submit an issue for that.

The description of the function that you're wanting to change says that it trims trailing null bytes. It's perfectly fine to have a c++ string containing embedded null bytes.

@Fedr
Copy link
Author

Fedr commented Oct 18, 2024

The description of the function that you're wanting to change says that it trims trailing null bytes. It's perfectly fine to have a c++ string containing embedded null bytes.

The screenshot from the debugger above shows the issue in a real LAS file: after null-byte there are some not-null noise bytes, which are not removed currently.

C++ string indeed can be contain zero-chars inside. But that string will not be equal to "LASF_Spec". So the search for VLR by that string fails. This is the real issue.

@abellgithub abellgithub reopened this Oct 18, 2024
@abellgithub
Copy link
Collaborator

abellgithub commented Oct 18, 2024

What you seem to be showing is that your LAS file is invalid and does not meet the LAS specification IF your expectation is that the User ID is "LASF_Spec":

The User ID field is ASCII character data that identifies the user that created the variable length
record. It is possible to have many Variable Length Records from different sources with different
User IDs. If the character data is less than 16 characters, the remaining data must be null.

You should correct the file and perhaps contact your data provider and report the error.

@Fedr
Copy link
Author

Fedr commented Oct 18, 2024

You should correct the file and perhaps contact your data provider and report the error.

We received LAS file from our customer, so some scanner software in the wild does save such files.

What you seem to be showing is that your LAS file is invalid and does not meet the LAS specification IF your expectation is that the User ID is "LASF_Spec".

Most probably you are right and the file is invalid according to the specifications. But other LAS-processing software, e.g. CloudCompare can open such files and read corrupted "LASF_Spec" records.

So we had to fork your repo and introduce the fix there. Thanks for looking at it!

@abellgithub
Copy link
Collaborator

abellgithub commented Oct 18, 2024

Reading that VLR as a "LASF_Spec" User ID is incorrect according to the specification. That CloudCompare reads it as you expect doesn't make the file or CloudCompare correct. We have seen lots of strange things in these text strings and people may well have expectations that their User ID that contains an embedded null is properly read and written.

It's quite easy to correct the file to meet your expectations.

The file should have a "generating software" string in the header that says which software wrote the file. I would contact the vendor and report the issue.

@Fedr
Copy link
Author

Fedr commented Oct 18, 2024

Thanks for the suggestion. generating_software = "LASzip DLL 3.4 r3 (191111)" in that LAS file.

@abellgithub
Copy link
Collaborator

:(
They have been making changes in LAStools recently and some bugs have crept in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants