-
In configure/CMake we need to check for the zstandard library. If --enable-zstandard is used and the library is not found, that's a configure error. Here's the CCR code, which turns on zstandard by default, but lets users build without it with --disable-zstd:
The HDF5 filter code is shipped with netCDF, compiled, and installed in the HDF5 filter directory (which can be overridden at configure time). A copy of the code (one C file) can be found here:
We add a new set of functions to the dispatch table. Here's what works in CCR:
These have to be called after def_var and before enddef. The F77 API gets similar functions as wrappers. The F90 API gets one additional argument in def_var and two in inq_var. Here's the implementation from CCR. This is working code that was recently tested extensively at NOAA. @DennisHeimbigner may wish to implement this in the Zarr layer as well.
And BTW I see that this is all Charlie's code! Thanks @czender!
-
Once we start down this road, there is no reason to stop at zstandard. I plan to also add blosc
-
I agree that blosc is more complex and needs a second discussion. That's
why I proposed zstandard - it's a good fit for netCDF and will help a lot,
right now. It's the low-hanging fruit.
Once zstandard is in place, we can discuss blosc - as Ryan points out, that
may be more complex and need more work, for a smooth integration. Or that
may not seem like a good idea, once we have examined it.
WRT the nc_def_var_zstd()/nc_inq_var_zstd() functions, I believe they are
still necessary, for two reasons:
1 - putting together the correct filter arguments is complex and something
most science programmers will get wrong before they get right.
2 - the F77/F90 APIs really benefit from having simple C functions to wrap.
So I think we want a nc_def_var_zstandard()/nc_inq_var_zstandard()
function. It does not have to be in the dispatch table, since the filter
functions in the dispatch table already can be used. But it should be in
the API and the documentation.
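To make point 1 concrete, here is a minimal sketch of what such a convenience function would hide from the user. It is not the CCR or netCDF implementation. The names `sketch_filter_spec` and `sketch_zstandard_spec` are hypothetical, and the accepted level range of 1 to 22 is an assumption made for the sketch. The only value taken from outside this thread is 32015, the registered HDF5 filter ID for Zstandard.

```c
#include <assert.h>
#include <stddef.h>

/* Registered HDF5 filter ID for Zstandard. */
#define SKETCH_ZSTD_FILTER_ID 32015U

/* Hypothetical container for the arguments of a generic filter call. */
typedef struct {
    unsigned int id;         /* filter ID */
    size_t nparams;          /* number of entries used in params */
    unsigned int params[1];  /* one parameter here: the compression level */
} sketch_filter_spec;

/* Hypothetical helper: fill in the filter arguments for a Zstandard level.
 * Returns 0 on success, -1 if the level is outside the range assumed here. */
int sketch_zstandard_spec(int level, sketch_filter_spec *spec)
{
    assert(spec != NULL);

    /* Assumption for this sketch: accept only levels 1..22. */
    if (level < 1 || level > 22)
        return -1;

    spec->id = SKETCH_ZSTD_FILTER_ID;
    spec->nparams = 1;
    spec->params[0] = (unsigned int)level;
    return 0;
}
```

A wrapper such as nc_def_var_zstandard() could then forward `spec.id`, `spec.nparams`, and `spec.params` to the generic filter function (nc_def_var_filter() in netCDF-C), so the caller only ever supplies a level.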
On Fri, Feb 11, 2022 at 11:38 AM Ryan May wrote:
Blosc needs to have the same consideration as zstandard, so I would
caution to go a little more slowly than "I plan to also add blosc as well".
The technical burden for zstandard was light due to the availability of a
native Java implementation. Last we discussed, I believe the Java
implementation of Blosc relies on JNI, which means cross-platform
availability is an open question.
-
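I guess I was not clear; I agree about having nc_def/inq_var_zstandard (or should it be called zstd to match NumCodecs?). I already have blosc implemented for netcdf-c. So perhaps we need to decide if always having a Java version is the limiting factor. For example, is there a native Java SZIP implementation?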
-
OK, quick work!
The Java requirement has never been the case previously, and indeed
netcdf-java can read many formats that netcdf-c cannot. ;-)
However, the most important thing, IMO, is to get some advanced compression
into netcdf-c/netcdf-fortran. If zstandard has achieved consensus, let's go
for that. If blosc needs more discussion, let's not hold up zstandard -
let's get it out there to the users, and work out the question of what to
do with blosc.
On Fri, Feb 11, 2022 at 1:18 PM Dennis Heimbigner wrote:
I guess I was not clear; I agree about having nc_def/inq_var_zstandard (or
should it be called zstd to match
NumCodecs?).
I already have blosc implemented for netcdf-c. So perhaps
we need to decide if always having a Java version is the limiting factor.
For example, is there a native Java SZIP implementation?
-
I certainly understand that blosc raises some serious issues, which is why
the current proposal focuses on zstandard.
Once zstandard is in, I would be happy to help with blosc in any way, if
and when it is considered suitable. There are some Java questions that no
doubt need to be sorted out. It may be that blosc also has an easy Java
solution, as zstandard does.
On Fri, Feb 11, 2022 at 2:16 PM haileyajohnson wrote:
I think blosc is going to be a necessary headache for the java library,
the feedback from the Zarr community is generally that the Zarr support
doesn't amount to much without blosc.
-
Lack of development effort on netcdf-java should not hold back C
development, which is where the data producers are. And it never has - when
netcdf-4 was first developed, netcdf-java could not read those files. It
took John some time to implement that code, but that was after netcdf-4
already had proved itself useful and data producers were switching to it.
I am happy to hear of efforts to bring zstandard to netcdf-java, and that
is a huge step forward. Let's not get hung up in arguments about possible
future additions - let's instead move forward on the consensus we have
achieved around zstandard, and get that out to the users. The inclusion of
zstandard will give us time to further examine blosc, including how
netcdf-java may handle it.
On Fri, Feb 11, 2022 at 2:09 PM Ryan May wrote:
If netcdf-java could write files that netcdf-c was unable to read, that
would be bad and I'd imagine many people here would have strong opinions
regarding that.
I'm always surprised that I have to sell the fact that netcdf-c
consciously making files that netcdf-java is unable to read is a bad thing.
-
Ed is correct in that you (Ryan) are adding a new constraint to what filters we provide.
-
Ed, on a different note: do we need an option to disable zstandard? Is there really any use case for this?
-
If you mean, do we need a --disable-zstandard/--enable-zstandard option
with configure, we don't really.
Instead, we could require zstandard, the same way zlib is required.
On Fri, Feb 11, 2022 at 3:41 PM Dennis Heimbigner wrote:
Ed, On a different note.
Do we need an option to disable zstandard?
Is there really any use case for this?
-
So once we know that libzstd is installed, we need to have the HDF5/NCZarr wrapper for it.
-
The implementation I have created automatically disables zstandard if libzstd is not found.
-
Ryan, perhaps a good way forward would be to focus on what you can do to
help netcdf-java keep up. Do you have a plan to incorporate zstandard
compression in netcdf-java? How long will it take? (Or has this already
been done?) It should be no more work in Java than it was in C/Fortran - it
took me a few days' work at most.
What's the progress on netcdf-java with zstandard?
-
WRT enable/disable options.
1 - Dennis, it is not good to automatically detect the zstandard library
and add support for zstandard if the library is found, and not if not
found. This will hide the absence of zstandard from the builder of the
software. We need the builder to take responsibility one way or another.
2 - Should it be always enabled, as zlib is? That would be useful just as
it's very useful to know that zlib is available for all netCDF builds. I
think this is the best choice. If we don't find the zstandard library, we
stop the build and demand that the user install it. This is what we do for
zlib - we require it in all cases. There is no --disable-zlib option.
3 - If it cannot be always required, it should be enabled by default. This
would mean that if the zstandard library is not present, the build would
stop. The user must either install it, or rebuild with --disable-zstandard.
This way, the builder takes responsibility for building without zstandard,
and knows that.
4 - The least desirable option, IMO, is to have zstandard disabled by default,
as szip is. In this case, most users will not enable it, leading to all
kinds of support questions when those users try to open a file compressed
with zstandard.
Since zstandard is so widely available, it makes sense to assume it can
always be used.
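The difference between option 1 (auto-detect) and option 3 (enabled by default) can be written down as two small decision functions. This is only a model of the build logic expressed in C, not configure or CMake code. All names are hypothetical, and the flags are assumed to be plain 0/1 values.

```c
#include <assert.h>

/* Possible results of the configure step in this model. */
typedef enum {
    BUILD_WITH_ZSTD,
    BUILD_WITHOUT_ZSTD,
    CONFIGURE_ERROR
} sketch_outcome;

/* Option 1: auto-detect. A missing library silently produces a build
 * without zstandard, so the builder may never notice. */
sketch_outcome sketch_autodetect(int lib_found)
{
    assert(lib_found == 0 || lib_found == 1);
    return lib_found ? BUILD_WITH_ZSTD : BUILD_WITHOUT_ZSTD;
}

/* Option 3: enabled by default. A missing library stops the build unless
 * the builder explicitly asked for a build without zstandard. */
sketch_outcome sketch_default_on(int user_disabled, int lib_found)
{
    assert(user_disabled == 0 || user_disabled == 1);
    assert(lib_found == 0 || lib_found == 1);

    if (user_disabled)
        return BUILD_WITHOUT_ZSTD;
    return lib_found ? BUILD_WITH_ZSTD : CONFIGURE_ERROR;
}
```

With the library missing and no flag given, `sketch_autodetect(0)` yields a build without zstandard, while `sketch_default_on(0, 0)` yields a configure error. That difference is the "builder takes responsibility" argument in points 1 and 3.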
-
NOAA will probably be switching to zstandard sometime this year, in UFS R&D
work. In the course of time, this will become operational code. So
zstandard data files are coming, on their own schedule. (This can be done
with current releases of netCDF, HDF5, and CCR. See my AGU
extended abstract for details:
https://www.researchgate.net/publication/357001251_Quantization_and_Next-Generation_Zlib_Compression_for_Fully_Backward-Compatible_Faster_and_More_Effective_Data_Compression_in_NetCDF_Files?_sg%5B0%5D=WV1a3JRFVw4ijMCIZ_ArqVNDxV5a2BIyoF-JzFvPdibjT8risKO_vIvnjcgpR1YTPM4RDkhUklSyZPJoRGdJmCp2RWn4XvhvutOl0Qm7.paT902AXrqzvj6m7H-Tgii52biW3w6__E2OxZbCACTSIV-6ZbfR2-YNsTkfnJJBIdxskEae7MiiSxy_Wbuc2mA
)
As Ryan points out, if something better comes along, like BLOSC, that may
also be used. Once again, immediate operational requirements will drive
decisions. Ryan, I'm happy to hear you will be advocating for even better
compression, because that opportunity is probably coming.
However, *this* discussion is about adding zstandard, not those future,
perhaps even better, compression libraries. Perhaps someone should start a
separate discussion about BLOSC.
-
Dennis, from your remarks about the dispatch table, I gather that the same
filter commands that turn on (for example) zstandard for netCDF/HDF5 files
will also turn it on for Zarr? If so, very neat.
That means that yes, the filter available call needs to be in the dispatch
table (maybe nc_inq_filter_avail()?). That's a great idea to avoid having
to change the dispatch table, but still being able to add convenience
functions like nc_def_var_zstandard()/nc_inq_var_zstandard(), which will
work for both netCDF/HDF5 files and Zarr. Clever!
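As I understand the idea, it could look roughly like the toy model below: the per-format table carries only the generic filter entries, and the convenience function is written once on top of them. This is not netCDF's actual dispatch table. Every `toy_` name is hypothetical, the return convention (0 for success, -1 for failure) is an assumption, and the stub backends exist only so the sketch compiles without the library.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ZSTD_ID 32015U  /* registered HDF5 filter ID for Zstandard */

/* Toy dispatch table: one instance per storage format. */
typedef struct {
    const char *name;
    int (*inq_filter_avail)(unsigned int id);  /* 0 means available */
    int (*def_var_filter)(int varid, unsigned int id,
                          size_t nparams, const unsigned int *params);
} toy_dispatch;

/* Format-independent convenience function. It calls only through the
 * generic table entries, so the table needs no zstandard-specific slot. */
int toy_def_var_zstandard(const toy_dispatch *d, int varid, unsigned int level)
{
    int status;

    assert(d != NULL);
    status = d->inq_filter_avail(TOY_ZSTD_ID);
    if (status != 0)
        return status;
    return d->def_var_filter(varid, TOY_ZSTD_ID, 1, &level);
}

/* Stub backends for illustration; a real backend would talk to HDF5 or Zarr. */
unsigned int toy_last_level = 0;  /* records the last level passed down */

int toy_avail_yes(unsigned int id) { return id == TOY_ZSTD_ID ? 0 : -1; }
int toy_avail_no(unsigned int id) { (void)id; return -1; }

int toy_record_filter(int varid, unsigned int id,
                      size_t nparams, const unsigned int *params)
{
    (void)varid;
    (void)id;
    toy_last_level = (nparams > 0) ? params[0] : 0;
    return 0;
}

const toy_dispatch toy_hdf5 = { "hdf5", toy_avail_yes, toy_record_filter };
const toy_dispatch toy_zarr = { "zarr", toy_avail_yes, toy_record_filter };
const toy_dispatch toy_nozstd = { "no-zstd", toy_avail_no, toy_record_filter };
```

The same `toy_def_var_zstandard()` call works against `toy_hdf5` and `toy_zarr`, and it fails cleanly against `toy_nozstd` because the availability check runs first.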
-
I do not like the idea of making zstandard required in order to deploy netcdf-c.
-
I have started separate discussions for LZ4 and BLOSC, so that this discussion can remain focused on Zstandard.
-
Possibly this discussion should be closed, since support for zstandard was included in the 4.9.0 release...
-
Creating this discussion thread to follow on from #2173.