Currently, if a preservation bag already exists on preservation storage, ReBACH detects that the bag being processed differs from it only when the Figshare data or metadata has changed, because the comparison relies on the hash in the bag name.
This means that if any other part of the bagged content changes, i.e., content that does not come from Figshare (e.g., curation metadata), ReBACH reports that the bag being created is a duplicate of an existing bag and does not upload it to preservation storage. This is sometimes undesirable, since curation files may be added or updated later. However, replacing the existing bag whenever curation data changes is not always desirable either, since the change could be the result of an error.
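The behavior described above can be illustrated with a minimal sketch. The naming scheme, the `bag_name` helper, and the sample metadata are all hypothetical and are not taken from the ReBACH code; the only point is that when the hash in the name is computed from Figshare-side content alone, a change to curation files leaves the name unchanged.

```python
import hashlib
import json


def bag_name(article_id, version, figshare_metadata):
    """Build a bag name whose hash covers only Figshare-sourced metadata.

    Hypothetical naming scheme, used here for illustration only.
    """
    payload = json.dumps(figshare_metadata, sort_keys=True).encode("utf-8")
    digest = hashlib.md5(payload).hexdigest()
    return f"{article_id}_v{version:02d}_{digest}"


figshare = {"title": "Example dataset", "files": ["data.csv"]}

# Name of the bag that is already on preservation storage.
name_before = bag_name(12345, 1, figshare)

# Curation files change later, but they are not an input to the hash.
curation_before = ["review_v1.docx"]
curation_after = ["review_v1.docx", "deposit_agreement.pdf"]

# Name of the bag being processed now: identical, so it looks like a duplicate.
name_after = bag_name(12345, 1, figshare)
is_duplicate = name_before == name_after
```

Under these assumptions `is_duplicate` is true even though `curation_after` differs from `curation_before`, which is the situation this issue describes.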
Suggested Implementation
Implement in two phases:
1. If the hash in the bag name is the same, check whether the bag to be uploaded is a different size than the one in preservation storage. Display a warning if the sizes differ (so that the logs can be checked).
2. Add a config and/or command-line flag to enable overwriting existing bags with the same name.
Edit: Phase 1 isn't possible because DART handles both bag creation and upload, so there is no easy way to check the bag size before it is uploaded. Therefore, the only way to upload updated curation files is to overwrite the bag without the check (overwriting is already possible by setting the flag in the bagger config).
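For discussion, the phase 2 decision could look roughly like the sketch below. It is not ReBACH's actual code: the `--overwrite-existing-bags` flag, the `should_upload` helper, and the set of existing bag names are hypothetical, and the existing overwrite flag in the bagger config mentioned above is not modeled here.

```python
import argparse


def should_upload(bag_name, existing_bag_names, overwrite):
    """Decide whether a bag is uploaded, and explain why.

    Returns a (upload, reason) tuple. All names here are hypothetical.
    """
    if bag_name not in existing_bag_names:
        return True, "new bag"
    if overwrite:
        # Same name (same Figshare hash), but the operator asked to replace it.
        return True, "duplicate name; overwriting because overwrite is enabled"
    return False, "duplicate name; skipping upload"


def parse_args(argv=None):
    """Parse a hypothetical command-line flag that enables overwriting."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--overwrite-existing-bags",
        action="store_true",
        help="Upload a bag even if one with the same name already exists.",
    )
    return parser.parse_args(argv)
```

A caller could pass `parse_args().overwrite_existing_bags` as the `overwrite` argument and log the returned reason, so that an overwrite is always a deliberate and visible choice rather than a silent default.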