fix: invalidate metadata cache when dataset fails to load #87
Conversation
Codecov Report
@@ Coverage Diff @@
## main #87 +/- ##
==========================================
+ Coverage 71.82% 71.83% +0.01%
==========================================
Files 126 126
Lines 10086 10106 +20
==========================================
+ Hits 7244 7260 +16
- Misses 2842 2846 +4
So the motivation for this change is to ensure the cache provides the latest data if/when an Explorer URL-to-S3 Path mapping is no longer valid for a dataset? When is this occurring in practice?
Overall, looks correct! Have Q's and comments, but can approve if you don't find any of them to be a concern.
server/app/app.py
Outdated
create_data_function=MatrixDataLoader(
    location=dataset_metadata["s3_uri"], url_dataroot=url_dataroot, app_config=app_config
).validate_and_open,
create_data_args={},
)
def expire_metadata_cache(url_dataroot: str = None, dataset: str = None):
suggest evict_dataset_from_metadata_cache, just so it doesn't sound like the entire cache is being expired/cleared.
server/app/app.py
Outdated
@@ -143,6 +147,7 @@ def wrapped_function(self, dataset=None):
    data_adaptor.set_uri_path(f"{self.url_dataroot}/{dataset}")
    return func(self, data_adaptor)
except (DatasetAccessError, DatasetNotFoundError, DatasetMetadataError) as e:
    expire_metadata_cache(self.url_dataroot, dataset)
Can this call itself raise an exception that should be handled? Or is it reasonable to generate a 500 server error in this case?
I see that cache items will only be evicted if the write lock can be acquired. Not sure if that's a practical concern, but might the cache need to delay eviction until after the write lock has been released? (e.g. marking the entry as evict_requested and using that upon lock release)
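For illustration, a rough sketch of that deferred-eviction idea; the class names, the evict_requested flag, and the release hook are hypothetical, not taken from this codebase:

import threading

class MetadataCacheEntrySketch:
    def __init__(self, value):
        self.value = value
        self.lock = threading.RLock()
        self.evict_requested = False

class MetadataCacheSketch:
    def __init__(self):
        self.data = {}

    def evict_by_key(self, cache_key):
        entry = self.data.get(cache_key)
        if entry is None:
            return
        if entry.lock.acquire(blocking=False):
            # Lock is free: evict immediately.
            try:
                self.data.pop(cache_key, None)
            finally:
                entry.lock.release()
        else:
            # Lock is held: mark the entry and let the holder finish the
            # eviction when it releases the lock.
            entry.evict_requested = True

    def release(self, cache_key):
        # Called by the lock holder once it is done with the entry.
        entry = self.data.get(cache_key)
        if entry is not None and entry.evict_requested:
            self.data.pop(cache_key, None)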
If we get here, it means the dataset could not be loaded, and thus the other cache, matrix_data_cache_manager, should also not have this data cached. Maybe add a comment to that effect, unless you think it's brutally obvious. :)
Out-of-scope for this PR, but might a tombstoned dataset be removed from the matrix_data_cache_manager? Any need for maintaining its data? I assume it's worth keeping the metadata cached for tombstoned datasets.
Re tombstoned datasets: we want to store that information for redirects back to the data portal, in case someone finds the URL for a deleted dataset through an external site (a publication or other non-data-portal source).
Re eviction with the lock: the metadata cache only stores the path to the dataset, so it is unlikely to be locked. If by some chance it is, this code path is used by every endpoint, so eviction will effectively be retried immediately afterwards. But let me know if you think this is an antipattern or has the potential to cause issues.
I don't think the call should raise anything, but I can wrap it in a try/except and log any errors just in case.
Thanks for the explanations, no further concerns; adding the extra try/except sounds good.
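For reference, a minimal sketch of what that extra try/except could look like; the helper name and logger setup are assumptions, while expire_metadata_cache is the function added in this PR:

import logging

logger = logging.getLogger(__name__)

def expire_metadata_cache_safely(url_dataroot: str = None, dataset: str = None):
    # Best-effort eviction: a failure here should be logged, not allowed to
    # mask the original dataset-load error that triggered it.
    try:
        expire_metadata_cache(url_dataroot, dataset)
    except Exception:
        logger.exception(
            "Failed to evict metadata cache entry for %s/%s", url_dataroot, dataset
        )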
def evict_by_key(self, cache_key: str):
    evict = self.data.get(cache_key, None)
    if evict:
        self.evict_data(to_del=[(cache_key, evict)])
Just curious, why does evict_data need anything more than the cache_key?
It could probably be refactored to only take the cache_key (and use that to look up the cache item), but currently it calls the attempt_delete function on the cache item in order to run the delete/cleanup on the datasets.
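To illustrate why the item (and not just its key) is needed, a simplified, hypothetical stand-in for the cache manager; attempt_delete is the per-item cleanup mentioned above, but the class shape here is a sketch, not the real implementation:

class DataCacheManagerSketch:
    def __init__(self):
        self.data = {}

    def evict_data(self, to_del):
        # Each cache item owns its own delete/cleanup, so eviction needs the
        # item itself in order to call attempt_delete on it.
        for cache_key, cache_item in to_del:
            cache_item.attempt_delete()
            self.data.pop(cache_key, None)

    def evict_by_key(self, cache_key: str):
        # A key-only entry point is still convenient for callers: the
        # (key, item) pair is reconstructed here before delegating.
        evict = self.data.get(cache_key, None)
        if evict:
            self.evict_data(to_del=[(cache_key, evict)])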
bad_response = self.client.get(url)
self.assertEqual(bad_response.status_code, 404)
self.assertEqual(mock_expire.call_count, 1)
response_body_good = {
Is this happy-path test, below, not already covered elsewhere?
nit: add a newline above this to separate the logical tests, or introduce a new test method for clarity
I wanted to check that the metadata API is called again after expiring the cache. This code path is probably tested elsewhere, but I wanted to be explicit about the expected behavior when an error causes the cache entry to be expired.
This happens when a dataset is replaced or deleted in a revision and that revision is published.
Force-pushed from 70e4c55 to 5d8fb58
Changes
Update tests to use symlinks to allow for dataset cache key change