How to handle data with mixtures of Grib 1 and Grib 2? #244
Comments
@alxmrs, as you're probably aware, pangeo-forge-recipes/pangeo_forge_recipes/recipes/xarray_zarr.py Lines 293 to 297 in e6fdf87
but currently these kwargs are applied uniformly across all inputs. (So no way to vary I don't have first hand experience with pangeo-forge-recipes/pangeo_forge_recipes/recipes/xarray_zarr.py Lines 657 to 658 in e6fdf87
that is applied to every input pangeo-forge-recipes/pangeo_forge_recipes/recipes/xarray_zarr.py Lines 305 to 306 in e6fdf87
Do you think there's a way to get your desired filtering via something like

```python
import xarray as xr

def filter_grib(ds: xr.Dataset, filename: str) -> xr.Dataset:
    vars_to_drop = dict(
        grib_1=[],  # iterable of vars to drop if input file is GRIB1 format
        grib_2=[],  # iterable of vars to drop if input file is GRIB2 format
    )
    # `some_grib_1_identifier` / `some_grib_2_identifier` are placeholders to fill in
    if some_grib_1_identifier in ds.attrs:
        ds = ds.drop_vars(vars_to_drop["grib_1"])
    elif some_grib_2_identifier in ds.attrs:
        ds = ds.drop_vars(vars_to_drop["grib_2"])
    else:
        raise ValueError("GRIB version not identifiable from `ds.attrs`")
    return ds  # the processed dataset must be handed back to the recipe

recipe = XarrayZarrRecipe(..., process_input=filter_grib, ...)
```

? Depending on how many inputs you have and/or the information encoded in their filenames, rather than inferring the GRIB version from
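As a side note on identifying the GRIB version: a minimal sketch, not part of pangeo-forge-recipes, that reads the edition number directly from the raw bytes. It relies on the fact that in both editions a message starts with the ASCII marker `GRIB` and stores the edition number in octet 8; GRIB1 stores the total message length in octets 5 to 7, GRIB2 in octets 9 to 16. The function name `grib_editions` is made up for illustration, and the sketch assumes well-formed messages (some producers encode very large GRIB1 messages with a non-standard length, which this does not handle).

```python
from typing import List


def grib_editions(data: bytes) -> List[int]:
    """Return the edition number (1 or 2) of each GRIB message found in `data`."""
    editions: List[int] = []
    pos = 0
    while True:
        pos = data.find(b"GRIB", pos)
        # Stop when no further marker exists or the header would be incomplete.
        if pos < 0 or pos + 16 > len(data):
            break
        edition = data[pos + 7]  # octet 8 holds the edition in GRIB1 and GRIB2
        if edition == 1:
            length = int.from_bytes(data[pos + 4 : pos + 7], "big")  # octets 5-7
        elif edition == 2:
            length = int.from_bytes(data[pos + 8 : pos + 16], "big")  # octets 9-16
        else:
            raise ValueError(f"Unexpected GRIB edition {edition} at byte {pos}")
        editions.append(edition)
        # Skip past this message; always advance so a bad length cannot loop forever.
        pos += max(length, 8)
    return editions
```

Calling `set(grib_editions(open(path, "rb").read()))` on a file would then show whether it mixes both editions, at the cost of reading the whole file into memory.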
Quick note:
Some datasets cannot be loaded at all, because the different parts conflict in their coordinate definitions. Maybe that doesn't apply in this case, but I've certainly seen it.
The PR I just started in #245 should allow you to handle this use case by providing a custom "Opener" which would dispatch the correct options depending on the filename or any other information passed from the FilePattern.
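To illustrate the dispatch idea only: a rough sketch of choosing open options from the filename. This is not the API proposed in #245; `OPEN_KWARGS_RULES`, `kwargs_for`, and the filename patterns are all hypothetical, and the `edition` filter key is an assumption about what the data needs.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical rules: (filename pattern, keyword arguments for xr.open_dataset).
OPEN_KWARGS_RULES: List[Tuple[str, Dict[str, Any]]] = [
    ("*_grib1.grib", {"backend_kwargs": {"filter_by_keys": {"edition": 1}}}),
    ("*_grib2.grib", {"backend_kwargs": {"filter_by_keys": {"edition": 2}}}),
]


def kwargs_for(
    filename: str,
    rules: List[Tuple[str, Dict[str, Any]]] = OPEN_KWARGS_RULES,
    default: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return the open kwargs of the first rule whose pattern matches `filename`."""
    for pattern, kwargs in rules:
        if fnmatchcase(filename, pattern):
            return dict(kwargs)  # copy so callers cannot mutate the rule table
    if default is None:
        raise ValueError(f"No open options configured for {filename!r}")
    return dict(default)
```

A custom opener could then call something like `xr.open_dataset(path, engine="cfgrib", **kwargs_for(path))`, with the first matching rule winning.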
That's exactly the case that I'm running into – and it is common with grib. #245 would definitely solve this issue! With that, we could prevent these kinds of errors by using
I'm running the `XarrayZarrRecipe` on an internal Era 5 dataset. I just found out it uses a mixture of Grib 1 and Grib 2 standards within the same files. The simple way I can convert the corpus to Zarr would involve filtering out some of the data (e.g. ecmwf/cfgrib#2): the way `cfgrib` works with `xarray` is that, to get all the variables, we have to call `open_dataset` on the same file several times with different `filter_by_keys` arguments.

Is there a clean way to work with mixed variable grib files today with pangeo-forge? If not, do we update the recipe to handle this use case?
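For readers unfamiliar with the pattern described above, here is a minimal sketch of opening one file once per filter. The helper name `open_grib_parts` and the `EDITION_FILTERS` list are made up for illustration; splitting on the `edition` key is an assumption about this dataset, and other keys (for example `typeOfLevel`) may be needed in practice.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence


def open_grib_parts(
    path: str,
    filters: Sequence[Dict[str, Any]],
    open_dataset: Optional[Callable[..., Any]] = None,
) -> List[Any]:
    """Open one GRIB file once per filter and return the resulting datasets."""
    if open_dataset is None:
        # Imported lazily so the helper can be imported without xarray installed.
        import xarray as xr

        open_dataset = xr.open_dataset

    return [
        open_dataset(
            path,
            engine="cfgrib",
            backend_kwargs={"filter_by_keys": dict(keys)},
        )
        for keys in filters
    ]


# Assumed filters: one pass per GRIB edition.
EDITION_FILTERS = [{"edition": 1}, {"edition": 2}]
```

Whether the returned datasets can then be combined with `xr.merge` depends on their coordinates being compatible, which is not guaranteed for mixed files.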
xref:
CC: @rabernat @cisaacstern