DEV-1125: duplicate holdings cleanup #302
README.md
@@ -18,7 +18,7 @@ bash bin/setup/setup_dev.sh

 ## Running the tests

-`docker-compose run --rm dev bundle exec rspec`
+`docker-compose run --rm test`
Might as well make it hyphenless: `docker compose`.
I'm mostly on board with the functional changes, but I don't understand how you would execute the script as written.
I was confused to see some (all?) of the small, unrelated (?) changes I already approved in PR #300. Could those have been rebased away?
Not going to let my confusion stand in the way of an APPROVE, though.
@@ -0,0 +1,68 @@
+# frozen_string_literal: true
Zero comments here. How do I run this? `bundle exec ruby bin/cleanup_duplicate_holdings.rb` does not work. It does run fine from its spec.
Given that it does not require Services or Cluster, should this be run from inside `phctl pry`?
Right, there's probably some missing stuff here that the specs hide. I'll address that and add a comment as to the purpose here.
Addressed.
@@ -0,0 +1,140 @@
+# frozen_string_literal: true
`bundle exec rspec spec/cleanup_duplicate_holdings_spec.rb` runs well.
I don't see any missing tests.
Right, this needs to be rebased...
Iterates through clusters and cleans them up one by one
* add brief documentation
* ensure it runs
2cee8ac to 96097f3
Rebased against main and added some fixes in the bin script for cleaning up duplicate holdings. Will merge after tests pass.
Context
Many clusters have duplicate holdings for reasons that have previously been discussed; see DEV-1125.
Description
This cleans up those clusters by iterating through them one by one, grouping each cluster's holdings by update key the same way that `find_old_holdings` does (https://github.com/hathitrust/holdings-backend/blob/main/lib/clustering/cluster_holding.rb#L86-L91), and retaining the most recent holding in each equality group.

@mwarin Mainly I'm interested in whether you can think of any other tests we should be doing to make sure this does what we want.
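
For reviewers who want the idea in miniature, here is a rough sketch of the group-and-keep-latest step. It is not the code in this PR: the `Holding` struct, its fields, the `update_key` definition, and `dedupe_holdings` are all hypothetical stand-ins, and it assumes "most recent" means the latest `date_received`. The real update key is whatever `cluster_holding.rb` defines.

```ruby
require "date"

# Hypothetical stand-in for a holding record; the real model has more fields.
Holding = Struct.new(:organization, :local_id, :date_received, keyword_init: true) do
  # Assumed update key for illustration only; see cluster_holding.rb for the real one.
  def update_key
    [organization, local_id]
  end
end

# Group holdings by update key and keep only the most recently received
# holding from each group (hypothetical helper, not part of this PR).
def dedupe_holdings(holdings)
  holdings
    .group_by(&:update_key)
    .values
    .map { |group| group.max_by(&:date_received) }
end
```

Under these assumptions, two holdings that share an update key collapse to the one with the later `date_received`, and holdings with distinct keys pass through untouched.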
This can be reviewed now; once approved, we should wait to merge it until after #301.