Simpler chunk length check, avoid recursion limit crash #803
Conversation
def __new__(cls, *args, **kwargs):
    """Mandatorily wrap _read_chunk in a check_chunk_n decorator."""
    if "_read_chunk" in cls.__dict__:
        method = getattr(cls, "_read_chunk")
Thanks @JelleAalbers for the improvement. I understand the request from Carlo and your new implementation. I implemented #758 to force-check the chunk size. Can we do something like `hasattr(cls._read_chunk, '__closure__') and cls._read_chunk.__closure__ is not None` here to test whether `_read_chunk` is decorated? Or use some method like https://stackoverflow.com/a/68583930.
Because I wanted to make sure the chunks are really checked, I applied the wrapper at the lowest level, before `strax.load_file`.
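A quick way to see why the `__closure__` test works: an undecorated function defined at class level has `__closure__ is None`, while the wrapper a decorator returns closes over the original function, so its `__closure__` is a non-empty tuple. A minimal sketch (the decorator and classes here are hypothetical stand-ins, not strax's actual `check_chunk_n`):

```python
import functools

def check_chunk_n(func):
    # Hypothetical stand-in for strax's chunk-length-checking decorator.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

class PlainBackend:
    def _read_chunk(self):
        return []

class WrappedBackend:
    @check_chunk_n
    def _read_chunk(self):
        return []

def is_decorated(cls):
    # A plain class-level function has __closure__ == None; a decorator's
    # wrapper closes over the wrapped function, giving a non-empty closure.
    return getattr(cls._read_chunk, "__closure__", None) is not None

print(is_decorated(PlainBackend))    # False
print(is_decorated(WrappedBackend))  # True
```

Note that this detects *a* closure, not specifically the chunk check: any decorator that closes over the method would pass, which is one reason such a structural test is fragile.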
Sure, I understand the check is important, and the storage class hierarchy is sufficiently confusing that I understand why you reached for decorators. There would be a way to fix the decorator solution with more metaprogramming (at the risk of creating more hard-to-spot bugs).
I think `_read_and_format_chunk` is the better place to do this check. The `_read_chunk` method returns a numpy array, not a strax Chunk: it is `_read_and_format_chunk`'s job to turn that into a chunk. Creating a chunk triggers checks, like whether the data matches the time range and whether the start and end times are integers. So it makes sense to check that the chunk length matches the metadata at the same time.
There is no code that calls `_read_chunk` directly -- at least not in the AxFoundation or XENONnT GitHub organizations. The only reason I'd write something like that is to avoid all the checks that come with making a strax chunk, e.g. when trying to read data whose metadata is corrupted. Then you don't want the chunk length check either.
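A sketch of what moving the check into `_read_and_format_chunk` could look like. The class shape, the `chunk_info["n"]` metadata key, and the list-based data are simplified assumptions for illustration, not strax's actual code (which builds and returns a `strax.Chunk`):

```python
class StorageBackend:
    def _read_chunk(self, chunk_info):
        # Overridden by concrete backends; returns the raw data
        # (a plain list here, a numpy array in strax).
        raise NotImplementedError

    def _read_and_format_chunk(self, chunk_info):
        data = self._read_chunk(chunk_info)
        # _read_chunk has a single call site, so a length check here
        # covers every backend without any decorator machinery.
        expected = chunk_info["n"]  # assumed metadata key for the item count
        if len(data) != expected:
            raise ValueError(
                f"Chunk has {len(data)} items, but metadata says {expected}")
        return data  # the real method turns this into a strax.Chunk

class DummyBackend(StorageBackend):
    def _read_chunk(self, chunk_info):
        return [0] * chunk_info["n"]

class BrokenBackend(StorageBackend):
    def _read_chunk(self, chunk_info):
        return [0] * (chunk_info["n"] - 1)  # one item short

print(DummyBackend()._read_and_format_chunk({"n": 3}))  # [0, 0, 0]
```

Reading via `BrokenBackend` raises the `ValueError`, so the check from #758 still fires for every subclass.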
@FaroutYLq did you check the chunk size manually in outsource but not using
No I was using
No not really.
What is the problem / what does the code in this PR do
This fixes a crash that @cfuselli always sees after restrax has been running for some time.
Can you briefly describe how it works?
In #758 we made StorageBackend check that the chunk length matches the length specified in the metadata. We wrap the class attribute `_read_chunk` in `__new__`. But `__new__` runs every time we make a new instance. Applications that make backends many times (like a running restrax session) end up with many layers of wrapping, and eventually hit Python's recursion limit.
This keeps the check (and its unit test) but implements it more simply. Since `_read_chunk` is only used in `StorageBackend._read_and_format_chunk`, we can just add the check there. It will still apply to every storage backend.
@dachengx please let me know if I missed a specific reason behind the implementation in #758!
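To illustrate the failure mode: re-wrapping a class attribute inside `__new__` stacks one more wrapper per instantiation, so a long-lived process accumulates call depth until it exceeds Python's recursion limit. A minimal strax-free sketch (the class and wrapper names are invented for the demonstration):

```python
class Backend:
    def _read_chunk(self):
        return "data"

    def __new__(cls, *args, **kwargs):
        # Re-wrap whatever is currently on the class -- including wrappers
        # added by earlier instantiations -- so layers pile up.
        method = cls.__dict__["_read_chunk"]

        def wrapper(self, *a, **k):
            return method(self, *a, **k)

        cls._read_chunk = wrapper
        return super().__new__(cls)

def wrap_depth(func):
    # Each wrapper closes over the previous layer; walk the closure chain
    # until we reach the original (closure-free) function.
    depth = 0
    while func.__closure__ is not None:
        func = func.__closure__[0].cell_contents
        depth += 1
    return depth

backends = [Backend() for _ in range(100)]
print(wrap_depth(Backend.__dict__["_read_chunk"]))  # 100
```

After 100 instantiations every call to `_read_chunk` traverses 100 nested frames; make enough backends and a single call blows past the default recursion limit (about 1000), which matches the crash seen under restrax.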