NetCDF 4.9.0: nc_def_var_deflate fails with string variables: Filter error: bad id or parameters or duplicate filter #2480

Closed
Alexander-Barth opened this issue Aug 19, 2022 · 2 comments


Alexander-Barth commented Aug 19, 2022

I got a bug report from a Julia user that deflate no longer works with string variables in NetCDF 4.9.0:
JuliaGeo/NCDatasets.jl#186

  • environmental information (i.e. Operating System, compiler info, java version, python version, etc.)

Linux 5.15.0, gcc 5.2.0

  • a description of the issue with the steps needed to reproduce it

The following C code (adapted from your test suite) reproduces this error independently of Julia:

#include <netcdf.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define ERR do {                                                        \
        fflush(stdout); /* Make sure our stdout is synced with stderr. */ \
        fprintf(stderr, "Sorry! Unexpected result, %s, line: %d\n",     \
                __FILE__, __LINE__);                                    \
        fflush(stderr);                                                 \
        return 2;                                                       \
    } while (0)

#define FILE_NAME "tst_strings.nc"
#define DIM_NAME "line"
#define VAR_NAME "measure_for_measure_var"
#define NDIMS 1
#define MOBY_LEN 16

int
main(int argc, char **argv)
{
     int ncid, varid, dimids[NDIMS];
      char *data[] = {"Perhaps a very little thought will now enable you to account for ",
		      "those repeated whaling disasters--some few of which are casually ",
		      "chronicled--of this man or that man being taken out of the boat by ",
		      "the line, and lost.",
		      "For, when the line is darting out, to be seated then in the boat, ",
		      "is like being seated in the midst of the manifold whizzings of a ",
		      "steam-engine in full play, when every flying beam, and shaft, and wheel, ",
		      "is grazing you.",
		      "It is worse; for you cannot sit motionless in the heart of these perils, ",
		      "because the boat is rocking like a cradle, and you are pitched one way and ",
		      "the other, without the slightest warning;",
		      "But why say more?",
		      "All men live enveloped in whale-lines.",
		      "All are born with halters round their necks; but it is only when caught ",
		      "in the swift, sudden turn of death, that mortals realize the silent, subtle, ",
		      "ever-present perils of life."};
      int status; /* return code of nc_def_var_deflate */

      if (nc_create(FILE_NAME, NC_NETCDF4, &ncid)) ERR;
      if (nc_def_dim(ncid, DIM_NAME, MOBY_LEN, dimids)) ERR;
      if (nc_def_var(ncid, VAR_NAME, NC_STRING, NDIMS, dimids, &varid)) ERR;

      status = nc_def_var_deflate (ncid, varid, NC_NOSHUFFLE, 1, 4);
      if (status != NC_NOERR) {
        fprintf(stderr, "%s\n", nc_strerror(status));
        exit(-1);
      }

      if (nc_put_var_string(ncid, varid, (const char **)data)) ERR;
      if (nc_close(ncid)) ERR;
}

Here is how it is compiled and run, together with the resulting error:

$ gcc tst_strings_deflate.c $(nc-config --cflags --libs) && ./a.out 
NetCDF: Filter error: bad id or parameters or duplicate filter

NetCDF is configured as:

./configure --prefix=/workspace/destdir --build=x86_64-linux-musl --host=x86_64-linux-gnu --enable-shared --disable-static --disable-dap-remote-tests --disable-plugins

Removing the option --disable-plugins did not make a difference. This is the output of nc-config --all:

This netCDF 4.9.0 has been built with the following features: 

  --cc            -> cc
  --cflags        -> -I/workspace/destdir/include -I/workspace/destdir/include
  --libs          -> -L/workspace/destdir/lib -lnetcdf
  --static        -> -lhdf5_hl -lhdf5 -lm -lz -ldl -lxml2 -lcurl 

  --has-c++       -> no
  --cxx           -> 

  --has-c++4      -> no
  --cxx4          -> 

  --has-fortran   -> no
  --has-dap       -> yes
  --has-dap2      -> yes
  --has-dap4      -> yes
  --has-nc2       -> yes
  --has-nc4       -> yes
  --has-hdf5      -> yes
  --has-hdf4      -> no
  --has-logging   -> no
  --has-pnetcdf   -> no
  --has-szlib     -> no
  --has-cdf5      -> yes
  --has-parallel4 -> no
  --has-parallel  -> no
  --has-nczarr    -> yes
  --has-zstd      -> no
  --has-benchmarks -> no

  --prefix        -> /workspace/destdir
  --includedir    -> /workspace/destdir/include
  --libdir        -> /workspace/destdir/lib
  --version       -> netCDF 4.9.0

Deflate did work on strings in NetCDF 4.7 and 4.8.

Issue #2447 about compression of vlen arrays seems to be related, but it appears to me that #2447 is about the wording of the error message rather than a regression. Or was the compression simply ignored silently in previous versions?

@DennisHeimbigner
Collaborator

See PR #2231.
Compression of vlens or strings never actually worked correctly, even when the library did not complain. The problem is that a chunk of variable-length items is really a chunk of pointers to the actual data. So unless a filter handles this case specifically, it ends up compressing the pointers rather than the data.
The PR mentioned above makes it illegal to compress a variable of variable-length type.

@Alexander-Barth
Contributor Author

OK, thank you for confirming. I added a note to the documentation of my Julia package that strings and vlen arrays cannot be compressed.
