Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch Export #1140

Merged
merged 9 commits into from
Feb 4, 2022
Merged

Batch Export #1140

merged 9 commits into from
Feb 4, 2022

Conversation

BryonLewis
Copy link
Collaborator

@BryonLewis BryonLewis commented Jan 27, 2022

fixes #812
fixes #925

Updated 20220131:

  • Removed the GET dive_annotation/{id}/export for an endpoint GET dive_annotation/export where an array of folderIds is passed in. Endpoint will now load the multiple folders and pass them to an updated export_datasets_zipstream. If there are multiple folders it names the output batch_export.zip if not it will use the single folder's name.
  • updated export_datasets_zipstream to export_datasets_zipstream to support multiple girder folders. It will then use the settings and zip up the multiple folders instead of a single folder. Just a minor restructuring of the function to support multiple datasets.
  • Client side platform/web-girder/views/Export.vue will now take in an array of folderIds instead of a single one.
  • Behavior for a single dataset is the same using a new Ref singleDatasetId for multiple datasets there is only the option to export everything.
  • Updated all references to the web-girder/Export.vue to pass in an array instead of a single datasetId.
  • Added some integration tests that will test the regular export zip and batch export. I avoided the testCharacters user because the names of the exported files don't match-up properly.

@subdavis
Copy link
Contributor

duplicate all of our existing export options (configuration, annotation, media, everything) for multiple datasets Or only allow full export when selecting multiple datasets?

full export only, IMO

If you select multiple datasets you have the same options as before except all data will be downloaded into a single zip file instead of just downloading configuration files or oannotations.

Yep

Behavior for selecting a single dataset should remain the same as it was before

Yes

Selecting a non-folder dataset will git the standard https://viame.kitware.com/api/v1/resource/download endpoint

Out of scope, IMO. We don't support this now and shouldn't add it. The thing it would download is almost certainly not what the user wants.

New endpoint

Yeah, just change the params of the existing one to accept an array, good call. I don't even think you need to rename it.

@BryonLewis
Copy link
Collaborator Author

I still believe that we need to have separate endpoints for exporting a single dataset vs multiple. Mostly because I like the clarity of GET /dive_dataset/{id}/export and then having an additional one with less options GET /dive_dataset/batch_export which takes an array of folders but doesn't have the other options.

I think in implementation I would modify the export_dataset_zipstream to export_datasets_zipstream where instead of a single girder folder it takes an array of folders and adds them to a single zip file.

@BryonLewis BryonLewis marked this pull request as ready for review January 31, 2022 13:43
Copy link
Contributor

@subdavis subdavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's another small edge case:

Batch export for clones where the source media is missing will now fail in a confusing/unclear way.

Before, if you tried to export a dataset where the clone chain is broken it looked like this.

Screenshot from 2022-01-31 11-56-22

But bulk export will let you include a broken clone (expected). I think the resulting ZIP needs to have a "failures.txt" file or the failed dataset's folder needs to have a placeholder file that says what happened If you have a better idea, that's cool too.

client/platform/web-girder/views/Home.vue Show resolved Hide resolved
server/dive_server/views_dataset.py Outdated Show resolved Hide resolved
server/dive_server/crud_dataset.py Outdated Show resolved Hide resolved
subdavis
subdavis previously approved these changes Feb 4, 2022
Copy link
Contributor

@subdavis subdavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works perfectly.

One nit to remove some dead code.

Comment on lines -92 to -94
exportTargetId() {
return this.exportTarget?._id || null;
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was the last reference of exportTarget, so that can be removed as well.

@BryonLewis BryonLewis merged commit cf6222a into main Feb 4, 2022
@BryonLewis BryonLewis deleted the batch-export branch February 4, 2022 18:27
BryonLewis added a commit that referenced this pull request Feb 24, 2022
* templating batch export

* initial adding of batch download features

* making clear why not using testCharacters

* fix linting

* minor fixes

* mend

* fixing failed data

* addressing comments and adding in tests

* removing last exportTarget reference
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants