Trying to find data to download. #1

Open
sofroniewn opened this issue Mar 5, 2020 · 3 comments

@sofroniewn

I was trying to explore this after finding it in zarr-developers/zarr-specs#50 (comment), but I haven't used Globus before, and I encountered this screen after following the link and making an account:

image

If I poke around into the folders I just see h5 files not all the tiffs that the notebook seems to load. Are they in there too? Also do you know how big the data is (i.e. is it too big to download onto my laptop)?

@thewtex
Owner

thewtex commented Mar 6, 2020

Hey @sofroniewn !

Globus can be a bit tricky to set up and also flaky in transfer.

If I poke around into the folders I just see h5 files not all the tiffs that the notebook seems to load.

It is a little deceiving -- the dataset I worked with is a folder that ends in .h5, not actually an HDF5 file -- I think it is an artifact of how the data is generated. Another example of why it is better to put metadata in the file metadata and not the file name ;-). Inside the folder are a series of TIFF files.
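
A minimal sketch of checking this by content rather than by name, in case it helps when browsing a local copy. The function names are hypothetical, and the check only looks for the HDF5 signature at byte offset 0, so it assumes the file has no user block:

```python
import os

# First 8 bytes of an HDF5 file that has no user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def classify_path(path):
    """Return 'directory', 'hdf5', or 'other' based on content, not the name."""
    if os.path.isdir(path):
        return "directory"
    with open(path, "rb") as f:
        return "hdf5" if f.read(8) == HDF5_SIGNATURE else "other"


def list_tiffs(folder):
    """Return the sorted TIFF file names directly inside `folder`."""
    return sorted(
        name
        for name in os.listdir(folder)
        if name.lower().endswith((".tif", ".tiff"))
    )
```

For a folder whose name ends in .h5, `classify_path` returns `'directory'`, and `list_tiffs` then lists the TIFF slices inside it.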

Also do you know how big the data is (i.e. is it too big to download onto my laptop)?

The entire Globus dataset is on the order of a terabyte. Just that volume is around 10 gigabytes.

However, the zarr-ified data is uploaded and available! 🎉

https://fiber-bed-zarr.netlify.com/

Here is a notebook that demonstrates how to load it with Xarray, and how easy it is to get the corresponding xarray.Dataset, xarray.DataArray, dask.array, and numpy.ndarray:

https://github.com/thewtex/fiber-bed-zarr/blob/master/LoadAndView.ipynb

If you know if/how napari can handle xarray DataArrays, multiresolution pyramids, correctly handle the changing locations of pixels in pyramids, etc., it would be awesome to demo that in the notebook.

@sofroniewn
Author

Just that volume is around 10 gigabytes.
However, the zarr-ified data is uploaded and available! 🎉
https://fiber-bed-zarr.netlify.com/

Perfect! Downloading 10 GB is doable. When I click on that link, though, I see

image

Should a download have started automatically?

If you know if/how napari can handle xarray DataArrays, multiresolution pyramids, correctly handle the changing locations of pixels in pyramids, etc., it would be awesome to demo that in the notebook.

Definitely want to give this a try!!

@thewtex
Owner

thewtex commented Mar 9, 2020

Should a download have started automatically?

Nope, that is just a note to provide context. Download the metadata file with, e.g.:

wget https://fiber-bed-zarr.netlify.com/rec20160318_191511_232p3_2cm_cont__4097im_1500ms_ML17keV_6.zarr/.zmetadata

Thanks to Zarr ❤️ , there is not one big file, and we can just download what we need. :-D. The notebook shows how we can get a Python interface in just a few lines of code.
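
To make the "download what we need" point concrete, here is a rough sketch of what can be done with the downloaded .zmetadata file using only the standard library. It assumes the Zarr v2 consolidated-metadata layout (a top-level "metadata" object keyed by store path) and the default "." chunk-key separator. The array name, shapes, and example.org URL in SAMPLE are made up for illustration and are not the contents of the real store; the function names are hypothetical too.

```python
import json
import math

# Hypothetical stand-in for the text of a downloaded .zmetadata file.
SAMPLE = json.dumps({
    "zarr_consolidated_format": 1,
    "metadata": {
        ".zgroup": {"zarr_format": 2},
        "level_0/volume/.zarray": {
            "zarr_format": 2,
            "shape": [4, 100, 100],
            "chunks": [2, 64, 64],
            "dtype": "<f4",
        },
    },
})


def summarize_arrays(zmetadata_text):
    """Map each array path to its shape, chunk shape, dtype, and chunk count."""
    entries = json.loads(zmetadata_text)["metadata"]
    summary = {}
    for key, meta in entries.items():
        if key == ".zarray" or key.endswith("/.zarray"):
            path = key[: -len(".zarray")].rstrip("/")
            per_axis = [math.ceil(s / c) for s, c in zip(meta["shape"], meta["chunks"])]
            summary[path] = {
                "shape": meta["shape"],
                "chunks": meta["chunks"],
                "dtype": meta["dtype"],
                "n_chunks": math.prod(per_axis),
            }
    return summary


def chunk_url(store_url, array_path, chunk_index):
    """URL of one chunk, assuming the default '.' separator of Zarr v2."""
    parts = [store_url.rstrip("/")]
    if array_path:
        parts.append(array_path)
    parts.append(".".join(str(i) for i in chunk_index))
    return "/".join(parts)


arrays = summarize_arrays(SAMPLE)
first = chunk_url("https://example.org/store.zarr", "level_0/volume", (0, 0, 0))
```

Each chunk is a separate object at its own URL, so a client only fetches the chunks that overlap the region it reads. The notebook does this through Xarray rather than by hand.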

Definitely want to give this a try!!

Awesome!
