How to use the AIMET SDK to deploy a quantized model in the Qualcomm Neural Processing SDK #168
Hi @HaihuaQiu. Please look at the documentation for the Snapdragon Neural Processing SDK for the detailed options. At a high level, you use AIMET to optimize the model; you can also use AIMET QuantSim to simulate on-target accuracy and to fine-tune the model to make it better. Once done, you export from QuantSim, which results in a modified but still FP32 model. You can import this model into the Snapdragon Neural Processing SDK like a regular model, using the dlc-convert and dlc-quantize tools. Hope that helps. |
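For reference, a minimal sketch of that QuantSim flow with the AIMET PyTorch API (the model, input shape, and calibration loop are placeholders, and argument names have shifted between AIMET releases, so check the docs for your version):

```python
import torch
from aimet_torch.quantsim import QuantizationSimModel

model = ...  # your trained FP32 PyTorch model, in eval mode
dummy_input = torch.randn(1, 3, 224, 224)  # placeholder input shape

# Wrap the model with simulated 8-bit quantization ops
sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)

def pass_calibration_data(sim_model, _):
    # Feed representative samples so AIMET can collect activation
    # ranges (placeholder single batch here).
    with torch.no_grad():
        sim_model(dummy_input)

sim.compute_encodings(forward_pass_callback=pass_calibration_data,
                      forward_pass_callback_args=None)

# Evaluate sim.model to estimate on-target accuracy, optionally fine-tune
# (QAT), then export a modified FP32 model plus encodings files.
sim.export(path="./export", filename_prefix="model", dummy_input=dummy_input)
```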
Thanks, you mean that we import the ONNX model created by AIMET QuantSim into the SNPE SDK to indirectly achieve quantization optimization? |
Yes. For example, you can
We have achieved pretty good results with the above recipe. |
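The concrete steps were lost from the comment above, but a recipe consistent with the rest of this thread looks like the sketch below. Paths are placeholders; the snpe-dlc-quantize flags match the invocation quoted later in this thread, while the converter flag names can vary across SNPE releases, so verify them with --help:

```sh
# 1. Convert the FP32 ONNX model exported by AIMET QuantSim into a DLC
snpe-onnx-to-dlc --input_network model.onnx --output_path model.dlc

# 2. Quantize the DLC using representative raw inputs listed in raw_list.txt
snpe-dlc-quantize --input_dlc model.dlc --input_list raw_list.txt \
                  --output_dlc model_quantized.dlc
```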
Thanks for your answer! |
@HaihuaQiu closing this issue for now. Please reopen it if the issue persists. |
I found that AIMET QuantSim finally exports a float model and a JSON file with quantization encodings. However, there is no way to feed the JSON file into the dlc-quantize tool of SNPE. So how can I use SNPE to quantize the model with the JSON file derived from AIMET? |
The short answer is that you don't need to feed the JSON file to SNPE (Snapdragon Neural Processing SDK). SNPE will calculate encodings equivalent to AIMET's via its dlc-quantize tool. For some future use cases there may be a need to import AIMET encodings into SNPE (and there is an option to do that with the latest tool), but for now you don't need it. Hope that answers your question. |
Hi, have you ever met such an error? It's from a test of QuantizationSimModel. Thanks. TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
|
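As later comments in this thread suggest, this kind of constructor-argument TypeError often means the compiled libpymo bindings don't match the Python layer (for example, a source build against a different Python version). A quick diagnostic sketch, assuming the AIMET 1.x module layout (the import path may differ in other releases):

```python
# Print which compiled extension is actually being loaded; a stale or
# mismatched libpymo.so is a common cause of constructor-argument errors.
import aimet_common.libpymo as libpymo
print(libpymo.__file__)
```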
Trying to quantize a YOLOv5 model using AIMET, but not successfully. Reproduction code: |
@tsangz189 Could you please share the full code you used for quantizing the model, before the export is performed? (details of attempt_load()) |
Hi, I realised that this error occurs because StaticGridQuantWrapper does not implement the following parameters that the yolo parser expects the model to have (thrown on line 145 of yolo.py). I'm not too sure if amending the hasattr() checks or modifying StaticGridQuantWrapper to contain this information is the right way to go. From yolo.py: |
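One generic workaround, as a sketch only (not an official AIMET API; the `_module_to_wrap` attribute name is an assumption based on some AIMET releases), is to run those attribute checks against the wrapped module rather than the wrapper:

```python
import torch.nn as nn

def unwrap(module: nn.Module) -> nn.Module:
    """Return the original layer if `module` is an AIMET quant wrapper.

    Assumes the wrapper stores the original layer in `_module_to_wrap`;
    falls back to returning the module itself otherwise.
    """
    return getattr(module, "_module_to_wrap", module)

# Example: apply yolo.py-style hasattr() checks to the unwrapped Detect
# head instead of the StaticGridQuantWrapper around it.
# detect = unwrap(sim.model.model[-1])
```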
Sorry to bother you. After training according to this process, I fed the JSON file into the dlc-quantize tool of SNPE, but the results are very poor, completely different from the results of the AIMET model. What is the reason? |
I found the activation encodings calculated by the dlc-quantize tool of SNPE are quite different from those of AIMET. The accuracy of the detection model I trained is always several percentage points lower than that of the floating-point model. Could you please help me with that? I chose part of the encodings, shown below. This is selected from "QAT.encodings.yaml", which is calculated by AIMET:
And the following is calculated by the dlc-quantize tool. |
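The encoding values themselves were lost from the comment above. For orientation, an AIMET-exported encodings file records, per tensor, fields along these lines (an illustrative sketch with made-up numbers; exact field names and layout depend on the AIMET version):

```yaml
# Hypothetical excerpt in the spirit of QAT.encodings.yaml
activation_encodings:
  conv1_output:
    - bitwidth: 8
      min: -0.4713
      max: 3.2189
      scale: 0.01447
      offset: -33
```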
Can you share the flags you passed to SNPE? You want to select the options such that the encodings already determined by AIMET are imported into SNPE. SNPE will not recalculate those then. |
The flags I used are: snpe-dlc-quantize --input_dlc xx/QAT.dlc --input_list xx/raw_list.txt --use_enhanced_quantizer --output_dlc xx/QAT_quantized.dlc. And I found the min & max values are basically the same after activation layers between the AIMET calculation and the SNPE calculation, but are very different in other layers. Besides, I trained a detection model; the accuracy of online evaluation of the AIMET QuantSim model is almost the same as the float model, but the accuracy of the post-static quantized model is always several percentage points lower than that of the floating-point model. I got the outputs using snpe-net-run and then parsed the output to calculate the mAP. Could you help me with that? |
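Note that the invocation above has snpe-dlc-quantize recompute encodings from the raw input list instead of importing AIMET's. To import them, recent SNPE documentation describes passing the AIMET-exported JSON to the converter as quantization overrides and telling the quantizer to keep those values; a hedged sketch (flag names may differ by SDK version, so check your release's docs):

```sh
# Pass AIMET's exported encodings to the converter as quantization overrides
snpe-onnx-to-dlc --input_network QAT.onnx --output_path QAT.dlc \
                 --quantization_overrides QAT.encodings

# Ask the quantizer to apply the overridden parameters rather than recompute
snpe-dlc-quantize --input_dlc QAT.dlc --input_list raw_list.txt \
                  --output_dlc QAT_quantized.dlc --override_params
```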
Actually, I find that AIMET quantization accuracy using the simulation model is higher than that of the post-static quantized model using SNPE, by maybe 2~3 percentage points of mAP on a detection model. |
@ibei Hi, did you solve the problem? It's so weird... |
@xmfbit Hey, let me know if you could solve this issue, we are facing the same error |
@benv2k6 - can you share how exactly you are instantiating the QuantizationSimModel? And share the stack trace. I think we should create a separate ticket for this. |
Hi @quic-akhobare, thank you for your quick response. The code to reproduce the error was:
I had this error when building the library myself, since we use Python 3.8. Eventually those issues were resolved yesterday with your release of Python 3.8 builds. But from this experience it seems that, currently, building from source is really specific to certain versions of Python, Ubuntu system libraries, and C++ toolchains. |
@hasuoshenyun @quic-akhobare I also met the same error: "the min & max values are basically the same after activation layers between the AIMET calculation and the SNPE calculation, but are very different in other layers." Please let me know if you have any advice. |
Hi, I'm not sure I remember the details correctly, but please make sure you pass the correct flag to the dlc converter and the dlc quantizer. When you use AIMET, you need to pass a flag to the dlc converter. In my question I asked about the quantizer, which might be the reason for the issue. Hope it helps, Ben |
Have you fixed this? I met the same problem, and I also built AIMET myself.
I met the same problem when I used AIMET built from source (the release-aimet-1.26 branch with Python 3.8). Could you tell me how to compile the AimetTensorQuantizer class into libpymo.so, as you mentioned? Thank you! |
@WithFoxSquirrel Hi, the problem solved itself when I used the release of AIMET with Python 3.8, really just a few days after I posted the question. So essentially I didn't have to compile it myself. |
I think this project is very nice, and I know how to quantize an FP32 model in PyTorch, but I do not know how to use this SDK to deploy the quantized model through the Qualcomm Neural Processing SDK to a Qualcomm DSP phone, because the Qualcomm Neural Processing SDK supports its own quantize tool.
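For the deployment step: once you have a quantized DLC, you can execute it on the DSP with snpe-net-run; a minimal sketch (file names are placeholders, and the runtime flag is the one documented for recent SNPE releases):

```sh
# Run the quantized model on the Hexagon DSP runtime
snpe-net-run --container model_quantized.dlc --input_list raw_list.txt --use_dsp
```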