Error in renameDimension #1357

Open
veenstrajelmer opened this issue Aug 7, 2024 · 7 comments

Comments

@veenstrajelmer

veenstrajelmer commented Aug 7, 2024

  • netcdf version 1.6.5
  • Windows 10

Based on #817, but with the rename ordering switched just to be safe. There seems to be a bug in renameDimension:

import netCDF4 as nc

with nc.Dataset('test.nc', 'w') as fp:
    fp.createDimension('x', 3)
    ncvar = fp.createVariable('x', float, ('x',))
    ncvar[:] = [1.1, 2.2, 3.3]

with nc.Dataset('test.nc', 'r+') as fp:
    print(fp.variables['x'][:])
    fp.renameVariable('x', 'lon')
    fp.renameDimension('x', 'lon')
    print(fp.variables['lon'][:])

# Reopen read-only; use `fp` so the handle does not shadow the `nc` module
with nc.Dataset('test.nc', 'r') as fp:
    print(fp.variables['lon'][:])

Prints:

[1.1 2.2 3.3]
[1.1 2.2 3.3]
[-- -- --]

It seems that the data is corrupted when the file is saved. I would expect to be able to rename a dimension without losing the data. My use case can be found here: https://forum.ecmwf.int/t/new-time-format-in-era5-netcdf-files/3796/5?u=jelmer_veenstra

This only happens when the variable name is equal to the dimension name, and both have to be renamed. If we comment one of the rename actions, the data is preserved.
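The condition described above can be checked before renaming anything. The following is a minimal sketch, not part of netCDF4-python: `names_at_risk` is a hypothetical helper, and it assumes `ds` exposes dict-like `dimensions` and `variables` attributes, as an open `netCDF4.Dataset` does.

```python
def names_at_risk(ds, renames):
    """Return the old names that are both a dimension and a variable.

    `ds` is assumed to have dict-like `dimensions` and `variables`
    attributes (as netCDF4.Dataset does). `renames` maps old names to
    new names. Only names present in BOTH mappings match the failing
    case reported in this issue.
    """
    return sorted(
        old for old in renames
        if old in ds.dimensions and old in ds.variables
    )
```

With the reproducer above, calling `names_at_risk(fp, {'x': 'lon'})` would flag `'x'`, since it is both a dimension and a variable.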

@jswhit
Collaborator

jswhit commented Aug 8, 2024

possibly related to Unidata/netcdf-c#597

@veenstrajelmer
Author

You are linking to an unresolved issue from 2017 that describes a fundamental problem. That is a bit unexpected to me for a well-known and widely-used package like this. It seems that if both dimensions and variables/coords are renamed at once, the issue does not appear. Is that possible via the python API or is there another workaround?

@jswhit
Collaborator

jswhit commented Aug 9, 2024

maybe save the data from the variable before renaming, then copy the data back to the renamed variable?
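A rough sketch of that suggestion, under stated assumptions: `rename_dim_and_var` is a hypothetical helper name, `ds` is assumed to be a `netCDF4.Dataset` opened in `'r+'` mode, and only the coordinate variable that shares the dimension's name is handled. Whether rewriting the values actually avoids the underlying netcdf-c problem has not been verified in this thread.

```python
def rename_dim_and_var(ds, old, new):
    """Rename a variable and its same-named dimension, then rewrite the data.

    `ds` is assumed to be a netCDF4.Dataset opened in 'r+' mode.
    Returns the values that were read before renaming.
    """
    # Read the values into memory before anything is renamed
    saved = ds.variables[old][:]

    # Same order as in the reproducer: variable first, then dimension
    ds.renameVariable(old, new)
    ds.renameDimension(old, new)

    # Write the saved values back under the new name
    ds.variables[new][:] = saved
    return saved
```

For the reproducer above, this would be called as `rename_dim_and_var(fp, 'x', 'lon')` inside the `'r+'` block, followed by reopening the file to confirm the values survived.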

@jswhit
Collaborator

jswhit commented Aug 9, 2024

to answer your question, there's nothing in the C API that allows you to rename a dimension and a variable both at the same time.

@veenstrajelmer
Author

veenstrajelmer commented Aug 9, 2024

OK, for me it does not have to happen at the same time, as long as the dataset is not messed up. Your suggestion would be a bit cumbersome. An alternative would be to do the renaming with xarray, but this requires me to save the result into a separate dataset, as xarray does not change netCDF files in place. Either way, I think it would still be valuable if this bug were fixed. Is that possible, or should I not expect it?

@jswhit
Collaborator

jswhit commented Aug 9, 2024

the other workaround mentioned in the issue is to convert the file to netcdf3, then do the renaming, and convert back. Very cumbersome for sure. Unfortunately, this is not something we can fix here since it's not happening in the python API, but in the underlying C library. I would suggest you contribute your example to the netcdf-c issue and ask for a progress update from the developers.

@veenstrajelmer
Author

Thanks, I have posted a reply in the issue you linked before. I was not paying attention earlier and did not notice that the linked issue was about netcdf-c. Thanks for looking it up. It is understandable that if this does not work in netcdf-c, it cannot work in netcdf4-python either.
