Different result of rangenet_lib #9
Hi @TT22TY, we are currently investigating the issue. So far we only know that it might occur on Nvidia RTX graphics cards. Do you also have an RTX model? (Sorry for the late reply.)
Thanks for your reply. :) @jbehley
Okay, this really seems to be an issue with the GeForce RTX line. I experienced similar problems with my RTX 2070, so we have to investigate the reason. The non-TensorRT part works under PyTorch with an RTX 2080 Ti, since this is the card we used for our experiments.
Thank you very much, @jbehley! In fact, I use TensorRT 5.1.5.0, so I wonder which version you installed. Thank you!
Hi @TT22TY, I've tested on three different setups.
I hope it helps.
Hi @TT22TY, I run it under Ubuntu 18.04 using the following combination:
For TensorRT I downloaded: nv-tensorrt-repo-ubuntu1804-cuda10.1-trt5.1.5.0-ga-20190427_1-1_amd64.deb
@jbehley @Chen-Xieyuanli, thank you very much. I am not sure which part leads to the wrong result; for TensorRT I downloaded TensorRT-5.1.5.0.Ubuntu-16.04.5.x86_64-gnu.cuda-10.0.cudnn7.5.tar.gz. Do you have any suggestions about which one I should update? I will try it again. Thank you very much! :)
Hi @TT22TY, since you have an RTX graphics card, I would suggest following @jbehley's setup. Please make sure that you build the system against the expected setup, because different versions of CUDA, cuDNN, or TensorRT can coexist on the same machine; when compiling the system, it might still link against the wrong versions. A quick way to check which versions the dynamic linker can actually see is sketched below.
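For reference, here is a minimal sketch (Linux only, assuming Python 3 is available) that lists which CUDA, cuDNN, and TensorRT shared libraries the dynamic linker currently resolves, which helps to spot coexisting installations. The library name patterns are just the usual SONAMEs, nothing specific to this repo:

```python
# Hedged sketch: list the CUDA / cuDNN / TensorRT libraries visible to the
# dynamic linker, to spot multiple coexisting installs (Linux only).
import subprocess

def visible_libs(pattern: str) -> list:
    """Return the ldconfig cache entries whose name contains `pattern`."""
    out = subprocess.run(["ldconfig", "-p"], capture_output=True, text=True)
    return [line.strip() for line in out.stdout.splitlines() if pattern in line]

for name in ("libcudart", "libcudnn", "libnvinfer"):
    entries = visible_libs(name)
    print(f"{name}: {len(entries)} entries found")
    for entry in entries:
        print("  ", entry)
```

If more than one version of a library shows up, the build may be picking up a different one than you expect.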
@Chen-Xieyuanli Thank you very much, I will try. :)
In reference to issue #6, below is my hardware setup. I just wanted to know whether this setup is compatible, or whether I should switch to another setup.
We have currently only experienced problems with RTX models. However, I would suggest using TensorRT 5.1, since this is the version we have tested the most and have running on many other systems. If you go for TensorRT 6.0, we cannot guarantee that everything works as expected.
Thanks for the update. I tried running with TensorRT 5.1 but am still getting the same result. I also tried the .bin file provided in the example folder and had no issues getting the expected result. It would be great if you could try this pcd file and see whether you get the same results as mine.
Hi, besides that, I am wondering when you will release the ROS interface for lidar-bonnetal. Thank you very much!
Hi,
I have also tested on driver 418.87 (everything else the same as above), but I run into the runtime error from #15 again... If I train a new model from scratch on SemanticKITTI in the current environment, will this issue be solved? Thanks a lot!
You can always run the model without TensorRT (see lidar-bonnetal for the PyTorch models), since this worked reliably on all our systems with different GPUs. We currently cannot do anything to solve this or give advice, since it seems to be a problem with the RTX cards and specific versions. I don't know whether you can turn off some optimizations (fp16, etc.) and whether this hurts the result. @TT22TY: Good to hear that it works partially. You can also try to open an issue with Nvidia and TensorRT (https://github.com/NVIDIA/TensorRT/issues). They might have some suggestions.
Hi @TT22TY, I'm glad that it works for you now, and we now also know that it's quite sensitive to the GPU and TensorRT versions. Andres @tano297 may later release a better version of the C++ and ROS interface for LiDAR-bonnetal. Hi @LongruiDong, I'm sorry that this repo doesn't work properly for you. A more complete development version with different model formats may be released later by Andres @tano297. Since we are now fairly sure that the problems are caused by the GPUs and TensorRT versions, which we cannot do anything about, I will close this issue. If there are other problems relating to this, please feel free to ask me to reopen it.
Hi. We tested this configuration on a fresh machine, exactly following your config. There is output, but it is not a proper segmentation result. Is there anything else I am missing? Thanks for the help.
Hi, @Claud1234
Hey @kuzen, thank you very much for your feedback. We now have a new solution to the incompatibility problem! :-)
Hi @kuzen, thank you for sharing your solution. Could you please also share your hardware setup, for example the CUDA and TensorRT versions? Thank you very much!
Hi, this is my software version
Thanks.
@kuzen Thank you very much. But it does not work for me, and I wonder whether it works for you, @Claud1234 @LongruiDong.
@kuzen @TT22TY I have tried the approach of converting the opset version and optimizing the ONNX model. But a very weird thing is that I only succeeded in getting the correct result ONE time! After I delete the '.trt' file and retry, the results are always the same and wrong! I did not change anything in the dependencies at all. @kuzen, would you please retry or explain more about the whole procedure? Do you get the correct result every time after deleting the '.trt' file? I have unified the versions of libcublas and CUDA as you recommended, but I am not sure whether this is necessary.
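For readers who want to try the same approach, here is a minimal sketch of what "convert the opset version and optimize the ONNX model" could look like. It assumes the onnx and onnx-simplifier (onnxsim) Python packages; the file names and the target opset are placeholders, not values confirmed in this thread:

```python
# Hedged sketch of "convert the opset version and optimize the ONNX model".
# File names and the target opset (11) are placeholders.
import onnx
from onnx import version_converter
from onnxsim import simplify  # pip install onnx-simplifier

model = onnx.load("model.onnx")

# Re-emit the graph at a different opset version.
converted = version_converter.convert_version(model, 11)

# Constant-fold and simplify the graph before the TensorRT ONNX parser sees it.
simplified, ok = simplify(converted)
if not ok:
    raise RuntimeError("onnx-simplifier could not validate the simplified model")

onnx.checker.check_model(simplified)
onnx.save(simplified, "model_opset11_simplified.onnx")
```

The simplified model is then what you point rangenet_lib at before it builds the '.trt' engine.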
@balajiravichandiran Thank you very much for the feedback!
@balajiravichandiran thank you very much, it works! Amazing!
@balajiravichandiran @Chen-Xieyuanli @kuzen Hello, I can use the pre-trained model to get correct prediction results, but when I test with my own model, the problems above still occur. I tried the two methods you provided, but they didn't work for me. Do you have any suggestions? Thank you very much.
Hey @balajiravichandiran, thanks for using our code. Are the range image results also from your method, or from RangeNet++? They look good. This issue seems to be visited very frequently, so I will keep it open for others to join the discussion.
The left side of Figure 2 is the result of the rangenet_lib test, and the right side is the prediction result output by lidar-bonnetal during training. Figure 3 is the rangenet_lib result with the model I trained on KITTI myself; it looks terrible. So I don't know why the pre-trained model gives good results while the model I trained myself performs so badly.
Okay, I got it. The problem is that the model trained by you is not as good as the one trained by us, so it is not a problem of rangenet_lib. You may check your setup again and raise an issue in the RangeNet++ repo.
@Chen-Xieyuanli Hi, I met the same problem (different result of rangenet_lib). This is my strategy: because FP16 has an insufficient dynamic range, some intermediate layer outputs overflow or underflow when represented in FP16 precision. I found the affected layer and forced it to FP32, which works around the problem. (PS: my code is refactored for Ubuntu 20.04 and TensorRT 8.2, but I think the TensorRT version is not a big deal.) A rough sketch of this kind of per-layer precision pinning is shown below.
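The following is a minimal sketch of that workaround using the TensorRT 8.x Python API (the C++ builder API has equivalent calls); the ONNX path and the layer name are placeholders, and the problematic layer still has to be identified by inspecting intermediate outputs, as described above:

```python
# Hedged sketch: keep FP16 enabled but pin one overflowing layer to FP32.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)
# Make the builder honour the per-layer precision set below (TensorRT >= 8.2).
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.name == "problematic_layer":  # placeholder: the layer you identified
        layer.precision = trt.float32
        for j in range(layer.num_outputs):
            layer.set_output_type(j, trt.float32)

plan = builder.build_serialized_network(network, config)
with open("model_fp16_pinned.trt", "wb") as f:
    f.write(plan)
```

Dropping the FP16 flag entirely (building the whole engine in FP32) is the blunter variant of the same idea, at the cost of inference speed.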
Hey @Natsu-Akatsuki, thanks a lot for your feedback!
Hello, I have tested rangenet_lib and the result is as follows, which is different from the demo on the website (https://github.com/PRBonn/rangenet_lib). I used the pre-trained model provided on the website, and I wonder why the result is different and wrong. Could you please give some suggestions to help find out the reason? Besides, I also opened an issue under SuMa++ (https://github.com/PRBonn/semantic_suma/issues/6#issue-525720509).
Thank you very much.