ZFP cuda produces 0 decompressed data #105
Comments
Dear Shelton,
This is a rather large data set. The uncompressed data is 5.2 GB and the compressed data is another 2.6 GB. zfp should definitely report an error if there is not enough GPU memory, but I think there may be sections of the CUDA implementation that assume that the uncompressed data can be addressed using only 32 bits (4 GB) and that can cause silent errors. We will revisit the CUDA implementation in October to address any such issues.
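As a rough check on the sizes quoted above, the uncompressed byte count can be compared against the 32-bit limit. This sketch assumes single-precision (4-byte) values, as implied by the -f flag, and the 150 x 5850 x 1601 dimensions used in the commands in this thread. It only illustrates the arithmetic and does not show where, or whether, the CUDA code actually overflows.

```shell
#!/bin/sh
# Dimensions taken from the zfp command lines in this thread.
nx=150; ny=5850; nz=1601
bytes_per_value=4                 # assumption: -f means 4-byte floats

# Needs a shell with 64-bit arithmetic (the case on typical 64-bit systems).
total=$((nx * ny * nz * bytes_per_value))
limit=4294967296                  # 2^32, the 4 GB boundary mentioned above

echo "uncompressed bytes: $total"
if [ "$total" -gt "$limit" ]; then
  echo "exceeds the 32-bit addressable range"
fi
```

The total comes to roughly 5.6e9 bytes (about 5.2 GiB), which is consistent with the 5.2 GB figure and is above the 4 GB boundary.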
For now, can you try compressing the data in two or more pieces and see if that works? The easiest would be to partition the data along z into slabs that are 800+801 or even 400+400+400+401 elements wide. You can perform such partitioning using the Unix dd command and pipe the output of dd to the input of zfp to avoid temporary files, e.g.,
dd if=inputdata.dat bs=3510000 count=401 skip=1200 | zfp -i - -z output.comp -r 16 -x cuda -f -3 150 5850 401
would compress the last 401 "layers" of elements.
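The same dd pattern can be repeated for every slab. The following is a minimal dry-run sketch, not a tested workflow: it prints one dd | zfp pipeline per slab rather than executing it. The slab width of 401 layers and the per-slab output names (output.SKIP.comp) are illustrative assumptions; the dimensions, block size, and zfp flags are the ones used above.

```shell
#!/bin/sh
# Dimensions from the thread: 150 x 5850 x 1601 single-precision values.
nx=150; ny=5850; nz=1601
layer_bytes=$((nx * ny * 4))      # 3510000 bytes per z-layer, the bs value above

# Print "skip count" pairs for slabs of at most $1 layers each.
slabs() {
  step=$1
  z=0
  while [ "$z" -lt "$nz" ]; do
    count=$((nz - z))
    [ "$count" -gt "$step" ] && count=$step
    echo "$z $count"
    z=$((z + count))
  done
}

# Dry run: print the command for each slab (output file names are hypothetical).
slabs 401 | while read -r skip count; do
  echo "dd if=inputdata.dat bs=$layer_bytes count=$count skip=$skip |" \
       "zfp -i - -z output.$skip.comp -r 16 -x cuda -f -3 $nx $ny $count"
done
```

Because dd counts skip and count in units of bs, setting bs to the size of one z-layer lets both values be expressed directly in layers.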
Best,
Peter
--
Peter Lindstrom . http://people.llnl.gov/pl . 925-423-5925
________________________________
From: Shelton Ma
Sent: Monday, August 31, 2020 1:06 PM
To: LLNL/zfp
Cc: Subscribed
Subject: [LLNL/zfp] ZFP cuda produces 0 decompressed data (#105)
Dear ZFP developers,
I tested zfp with the cuda option to compress and decompress a dataset of around 5 GB, but the decompressed dataset contains all zeros. Here are the commands I used:
zfp -i inputdata.dat -z output.comp -r 16 -x cuda -f -3 150 5850 1601
zfp -z output.comp -o output.decomp -r 16 -x cuda -f -3 150 5850 1601
output.decomp contains all zeros. It produced the same result on both an IBM Power8 node (P100 GPU) and a Dell x86_64 node (V100 GPU). However, it runs correctly without the "-x cuda" option (that is, when running on the CPU). Here is my environment:
compiler: gcc/7.3.0
CUDA: 10.1
zfp version: 0.5.5
Is there anything missing in my case?
Thanks in advance!
Best
Shelton Ma
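One way to confirm an "all zeros" result, for either the decompressed output or the original input, is to test whether a file contains any nonzero byte. This is a minimal sketch using only standard utilities; the function name is_all_zero is made up for illustration. Note that an empty file also counts as all zeros here.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the file named by $1 contains only NUL bytes.
is_all_zero() {
  # Delete every NUL byte, then count whether at least one byte remains.
  # The substitution is left unquoted so padding from wc is dropped.
  [ $(tr -d '\000' < "$1" | head -c 1 | wc -c) -eq 0 ]
}

# Usage, with the file name from the commands above:
#   is_all_zero output.decomp && echo "output.decomp is all zeros"
```

When the file does contain data, head exits after the first nonzero byte, so the check usually returns quickly even on multi-gigabyte files.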
Dear Peter, Thank you very much for your suggestions.
The compressed and decompressed data sizes are now 750 MB and 1.4 GB, respectively. Best,
Maybe a dumb question, but are you certain that the input data actually has nonzero values?
Note that zfp assumes that the leftmost index varies fastest (also known as Fortran order). To partition the data along x like you've done, you would have had to piece together noncontiguous chunks of data. Partitioning along z (as in the example I gave) would be far easier. And given your choice of partitioning, I suspect that you may have transposed the dimensions (see this discussion). Such accidental transposition can lead to a nearly random sequence of values that is difficult to compress. That shouldn't result in all zeros, but could still lead to unusually large errors in the reconstructed field.
Before we speculate any further on what's causing this issue, may I suggest that you check out the develop branch and run the CUDA tests just to make sure that the CUDA implementation is working correctly on smaller data:
Dear Peter, Thank you very much for pointing out the order in which to specify the data dimensions. The following commands
now produce correct results. As you mentioned earlier, it could not handle my 5 GB data example. Best
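The ordering convention discussed above (leftmost index varies fastest) can be made concrete by computing the linear element index for a point (x, y, z). This is only a sketch of that convention using the dimensions from this thread; the helper name index is hypothetical.

```shell
#!/bin/sh
# Dimensions as passed to zfp in this thread: -3 nx ny nz.
nx=150; ny=5850; nz=1601

# Linear element index of (x, y, z) when x varies fastest.
index() {
  echo $(( $1 + nx * ($2 + ny * $3) ))
}

index 1 0 0    # one step in x: prints 1
index 0 1 0    # one step in y: prints 150
index 0 0 1    # one step in z: prints 877500
```

One step in z advances by a full nx * ny layer, so a range of z-layers is a single contiguous byte range in the file. That is why slabs along z can be cut with dd, whereas a cut along x would have to gather many noncontiguous pieces.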
I'm glad to hear this is working, though we need to look into what's causing the failure for the larger data set and why zfp is not reporting an error. I will keep this issue open until we've had time to take a closer look.
@sheltongeosx Sorry for taking so long to get back to you regarding this issue. We're finally at a point where we have time to go over the CUDA implementation to make sure it's bug free. We fixed a related issue (#121) on the
@sheltongeosx was this fixed for you? I've run some recent tests against our
Going to close this for now, feel free to re-open if you are still seeing issues.