This repository has been archived by the owner on Aug 4, 2023. It is now read-only.
Fixes
We received an error during a recent run of the Science Museum DAG:
Description
We currently need to split Science Museum ingestion into small sections by date range, to work around an issue where we receive unexpected results when querying a dataset larger than 50 pages. The above error message was added to warn us when one of our year ranges exceeds 50 pages, so that we can narrow the range.
For this PR, I did a quick audit of the page count for each of our existing year ranges and adjusted them so that every range has fewer than 40 pages of results. This should hopefully prevent us from needing to tweak this again.
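The audit described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the page size of 100, the record counts, and the helper names `pages_needed` and `ranges_over_limit` are all assumptions made for the example, and the counts would in practice come from querying the API for each year range.

```python
import math

# Assumed values for illustration only; the real page size is not stated here.
PAGE_SIZE = 100
MAX_PAGES = 40  # target from this PR: every range should have fewer than 40 pages


def pages_needed(total_records, page_size=PAGE_SIZE):
    """Number of pages required to hold total_records results."""
    return math.ceil(total_records / page_size)


def ranges_over_limit(counts_by_range, max_pages=MAX_PAGES, page_size=PAGE_SIZE):
    """Return {year_range: page_count} for ranges with max_pages or more pages."""
    flagged = {}
    for year_range, total in counts_by_range.items():
        pages = pages_needed(total, page_size)
        if pages >= max_pages:
            flagged[year_range] = pages
    return flagged


# Hypothetical record counts per (start_year, end_year) range.
sample_counts = {
    (1900, 1925): 3950,
    (1925, 1950): 4001,
    (1950, 1960): 2500,
}

for year_range, pages in ranges_over_limit(sample_counts).items():
    print(f"{year_range} has {pages} pages; consider narrowing it")
```

Any range this flags would be a candidate for splitting into smaller ranges until each falls under the target.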
Testing Instructions
`just test` should be sufficient. If you want to repeat my check that each range has fewer than 40 pages, you can locally update `get_batch_data`:
Checklist
Developer Certificate of Origin