More debugging tools/information for dealing with issues in bytes processing #102

Open
achaikou opened this issue May 3, 2019 · 1 comment

@achaikou
Contributor

achaikou commented May 3, 2019

Let's say DLISIO throws an error when opening a file.
A user who wants to dig into the binary and figure out where the problem is might appreciate more information than is currently provided.

  • It would be nice if the user knew the exact byte position where the failure happened (or at least the values of the neighbouring bytes, so they can locate the failure position in the file on their own).
  • It would be beneficial to have easy access to the attributes already processed in the same set, because the actual failure might occur many bytes before the reported place. (Imagine some bytes were incorrectly interpreted as a 100-byte-long ident: the failure would be thrown at least 100 bytes after the point where the actual error occurred.)

Hence it might be worth considering extending support for binary debugging.
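
As a rough illustration of the two points above, here is a minimal Python sketch of an error type that carries the failure offset, a few neighbouring bytes, and whatever was parsed before the failure. The class `ParseError` and its parameters are hypothetical and are not part of dlisio's API.

```python
class ParseError(Exception):
    """Hypothetical error type that remembers where parsing failed."""

    def __init__(self, message, offset, data, parsed=(), width=8):
        self.offset = offset
        # Keep a few bytes on either side of the failure point so the
        # position can be located by hand in a hex editor.
        start = max(0, offset - width)
        self.before = bytes(data[start:offset])
        self.after = bytes(data[offset:offset + width])
        # Attributes decoded before the failure; the real mistake may
        # sit in one of these rather than at the reported offset.
        self.parsed = list(parsed)
        super().__init__(
            f"{message} at byte {offset} (0x{offset:x}); "
            f"bytes before: {self.before.hex()}, bytes after: {self.after.hex()}"
        )


# Usage: report a failure at offset 4 of a small buffer.
data = bytes.fromhex("0014ff0100108000")
err = ParseError("unexpected value", 4, data, parsed=["attribute-1"])
print(err)
print(err.parsed)
```

Carrying the already parsed attributes on the exception means the caller can inspect them without re-running the parser under a debugger.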

@achaikou
Contributor Author

More useful points for investigating issues:

  1. Include the failure offset in error messages.
    Make all the main methods (findoffsets, stream.at, read_fdata, the parse functions, etc.) return the offset of the failure point. When figuring out what exactly is wrong with the data, this is the main starting point for manual investigation (one needs to jump to the correct position in the file).
    The major problem, however, is that this information is lost at the higher levels of processing.

  2. Fail early if the visible record header doesn't contain the bytes "FF01".
    Currently dlis_vrl completely ignores the 'FF' byte and mostly ignores the version (01). If we checked these bytes, we could fail early when they are not as expected (meaning this is not the start of a new VR). Right now we do nothing with them, probably to allow for other possible versions of DLIS. However, in all files seen so far these bytes were always "FF01", so it might be beneficial to drop that consideration altogether.

  3. Introduce a check in findoffsets that the current logical record length (len) is not bigger than the data remaining in the VR (remaining).
    Without this check we often end up with a negative value in the remaining variable, which is already a clear indication that something went wrong.
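
The three points above could be combined roughly as in the following Python sketch. It is not dlisio's findoffsets: the names `OffsetError` and `find_offsets` are hypothetical, and the layout is an assumption (a 4-byte visible record header made of a big-endian 2-byte length followed by FF 01, and logical record segments that start with a big-endian 2-byte length which includes their own 4-byte header).

```python
import struct


class OffsetError(ValueError):
    """Hypothetical error that keeps the failing byte offset (point 1)."""

    def __init__(self, message, offset):
        super().__init__(f"{message} (at byte offset {offset})")
        self.offset = offset


def find_offsets(data, start=0):
    """Return the start offset of every logical record segment in data."""
    offsets = []
    pos = start
    while pos < len(data):
        if len(data) - pos < 4:
            raise OffsetError("truncated visible record header", pos)
        vr_len, fmt, major = struct.unpack_from(">HBB", data, pos)
        # Point 2: fail early unless the header carries FF01.
        if (fmt, major) != (0xFF, 0x01):
            raise OffsetError(
                f"expected FF01 in visible record header, got {fmt:02X}{major:02X}",
                pos + 2,
            )
        if vr_len < 4:
            raise OffsetError("visible record shorter than its header", pos)
        remaining = vr_len - 4
        pos += 4
        while remaining > 0:
            if remaining < 4 or pos + 4 > len(data):
                raise OffsetError("truncated segment header", pos)
            (seg_len,) = struct.unpack_from(">H", data, pos)
            if seg_len < 4:
                raise OffsetError("segment shorter than its header", pos)
            # Point 3: the segment must fit in what is left of the VR,
            # otherwise remaining would go negative below.
            if seg_len > remaining:
                raise OffsetError(
                    f"segment length {seg_len} exceeds the {remaining} bytes "
                    "remaining in the visible record",
                    pos,
                )
            offsets.append(pos)
            pos += seg_len
            remaining -= seg_len
    return offsets


# One 20-byte visible record holding two 8-byte segments.
good = bytes.fromhex("0014ff01" "00088000aabbccdd" "0008800011223344")
print(find_offsets(good))  # [4, 12]
```

Because the offset travels on the exception itself, higher-level code can re-raise or wrap the error without losing the position in the file.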
