Initial int8 quantize inference implementation #487

Closed


@BUG1989 (Contributor) commented Jul 20, 2018

ncnn-int8 pull request README

Int8 inference supported layers

| Platform | Layers |
| --- | --- |
| x86 | conv1x1s1, conv1x1s2, conv3x3s1, conv3x3s2, dwconv3x3s1, dwconv3x3s2 |
| armv7a | conv1x1s1 |

Summary of changed files

| Function | Changed or added files |
| --- | --- |
| Example of int8 | examples/CMakeLists.txt, examples/squeezenet-int8.cpp, examples/squeezenet_v1.1.table |
| Parse the int8 calibration table | src/net.h, src/net.cpp, src/layer.h, src/layer.cpp, src/quantize.h |
| Int8 implementation for x86 | src/layer/convolution_quantize.h, src/layer/convolution.h, src/layer/convolution.cpp, src/layer/x86/convolution_x86.cpp, src/layer/convolutiondepthwise.h, src/layer/convolutiondepthwise.cpp, src/layer/x86/convolutiondepthwise_x86.cpp |
| Int8 & fused ReLU implementation for armv7a | src/layer/arm/convolution_quantize_arm.h, src/layer/arm/convolution_arm.cpp, src/layer/arm/convolution_1x1_int8.h |
| Adapt the framework to int8 | src/mat.h, src/layer/concat.cpp, src/layer/eltwise.cpp, src/layer/arm/eltwise_arm.cpp, src/layer/pooling.cpp, src/layer/arm/pooling_arm.cpp, src/layer/arm/innerproduct_arm.cpp |
| Tools | tools/calibration2mem.cpp |
| Build files | src/CMakeLists.txt, examples/CMakeLists.txt, tools/CMakeLists.txt |

How to generate the Calibration table file

We provide a tool for generating the Int8 calibration table file:

caffe-int8-convert-tools
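
The converter runs against the original Caffe model. A rough, hypothetical invocation is sketched below; the script name and flag names are recalled from the caffe-int8-convert-tools README and may not match the current version, so check that repository for the exact usage.

    # hypothetical invocation; verify the script name and flags in caffe-int8-convert-tools
    python caffe-int8-convert-tool.py \
        --proto=squeezenet_v1.1.prototxt \
        --model=squeezenet_v1.1.caffemodel \
        --mean 104 117 123 \
        --images=calibration_images/ \
        --output=squeezenet_v1.1.table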

How to use Int8 inference

By default, inference runs in Float32 mode. If you want to switch inference to Int8 mode, you only need to add two lines of code; for more details please see the examples/squeezenet-int8.cpp file.

    ......
    ncnn::Net squeezenet;
    squeezenet.set_conv_model(CONV_INT8);               // set the Int8 mode
    squeezenet.load_param("squeezenet_v1.1.param");
    squeezenet.load_scale("squeezenet_v1.1.table");     // parse the Int8 calibration table (the quantize scale values)
    squeezenet.load_model("squeezenet_v1.1.bin");
    ......

Note: the API calls above must be made in exactly this order!
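
For context, here is a minimal end-to-end sketch of an Int8 inference pass that combines the two new calls with ncnn's usual extractor API. The preprocessing values, the 227x227 input size and the "data"/"prob" blob names are taken from ncnn's float32 squeezenet example and may differ from the actual examples/squeezenet-int8.cpp in this PR; treat it as an illustration rather than the shipped code.

    #include "net.h"
    #include <opencv2/opencv.hpp>

    int main()
    {
        cv::Mat bgr = cv::imread("cat.jpg", 1);

        // resize to the 227x227 input SqueezeNet v1.1 expects
        ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data, ncnn::Mat::PIXEL_BGR,
                                                     bgr.cols, bgr.rows, 227, 227);

        // mean subtraction, same values as ncnn's float32 squeezenet example
        const float mean_vals[3] = {104.f, 117.f, 123.f};
        in.substract_mean_normalize(mean_vals, 0);

        ncnn::Net squeezenet;
        squeezenet.set_conv_model(CONV_INT8);            // switch conv layers to Int8 mode
        squeezenet.load_param("squeezenet_v1.1.param");
        squeezenet.load_scale("squeezenet_v1.1.table");  // per-layer quantize scales
        squeezenet.load_model("squeezenet_v1.1.bin");

        ncnn::Extractor ex = squeezenet.create_extractor();
        ex.input("data", in);

        ncnn::Mat out;
        ex.extract("prob", out);   // out now holds the 1000 class scores

        return 0;
    }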

How to manually control the int8 conv layer on/off

This implementation is naive and crude: you need to modify the ncnn.param file by hand.

  1. Edit ncnn.param and add the parameter "7=1" to a Convolution layer, or "8=1" to a ConvolutionDepthWise layer; the default value is "0", and the int8 conv is enabled by default.

    Convolution      conv1            1 1 data conv1 0=8 1=3 2=1 3=2 4=1 5=1 6=216 7=1
    ConvolutionDepthWise conv2_depthwise  1 1 conv1 conv2_depthwise 0=8 1=3 2=1 3=2 4=1 5=1 6=72 7=8 8=1
    Convolution      conv2_pointwise  1 1 conv2_depthwise conv2_pointwise 0=16 1=1 2=1 3=1 4=0 5=1 6=128 7=1
    ConvolutionDepthWise conv3_depthwise  1 1 conv2_pointwise conv3_depthwise 0=16 1=3 2=1 3=1 4=1 5=1 6=144 7=16 8=1
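
For orientation, the numeric ids in these lines follow ncnn's standard convolution parameters (not something introduced by this PR): in the first line, 0=8 is num_output, 1=3 the kernel size, 3=2 the stride and 6=216 the weight data size (8x3x3x3), while the trailing 7=1 is the hand-added int8 switch; ConvolutionDepthWise already uses id 7 for the group count, which is why its int8 switch moves to id 8.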

Result Accuracy

| Type | Note |
| --- | --- |
| Calibration dataset | ILSVRC2012_img_test (1k images) |
| Test dataset | ILSVRC2012_img_val (5k images) |
| Framework | ncnn-int8 |
| Supported layers | conv3x3, conv1x1, convdw3x3 |

| NETWORK | FP32 Top1 | FP32 Top5 | INT8 Top1 | INT8 Top5 |
| --- | --- | --- | --- | --- |
| SqueezeNet v1.1 | 57.86% | 79.86% | 57.36% | 79.84% |
| MobileNet v1 | 67.78% | 87.62% | 64.92% | 85.22% |
| MobileNet v2 | 70.20% | 89.20% | 68.94% | 87.90% |
| GoogleNet v1 | 67.70% | 88.32% | 67.64% | 88.26% |
| ResNet-18 | 65.50% | 86.46% | 65.48% | 86.44% |
| ResNet-50 | 71.68% | 89.94% | 71.38% | 89.52% |

| NETWORK | FP32 Top1 | FP32 Top5 | Diff Top1 | Diff Top5 |
| --- | --- | --- | --- | --- |
| SqueezeNet v1.1 | 57.86% | 79.86% | 0.50% | 0.02% |
| MobileNet v1 | 67.78% | 87.62% | 2.86% | 2.40% |
| MobileNet v2 | 70.20% | 89.20% | 1.26% | 1.30% |
| GoogleNet v1 | 67.70% | 88.32% | 0.06% | 0.06% |
| ResNet-18 | 65.50% | 86.46% | 0.02% | 0.02% |
| ResNet-50 | 71.68% | 89.94% | 0.30% | 0.32% |

Result Performance

| Type | Note |
| --- | --- |
| Hardware | Hisi3519 ([email protected]) |
| Test app | ncnn-int8 benchncnn |
| Unit | ms |

| Models | Float32 (winograd on) | Int8 | Ratio |
| --- | --- | --- | --- |
| SqueezeNet_v1.1 | 315 | 242 | x1.30 |
| MobileNet_v1 | 523 | 369 | x1.41 |
| MobileNet_v2 | 401 | 311 | x1.28 |
| MobileNet_v1-SSD | 1034 | 701 | x1.47 |

TODO

  • Support conv3x3s2 int8 on arm
  • Support conv1x1s2 int8 on arm
  • Support convdw3x3 int8 on arm
  • Support fused ReLU on the x86 platform
  • Support Int8 and fused ReLU on the aarch64 platform
  • Optimize the pipeline
  • Add accuracy results for detection tasks

Thanks

The original author of the int8 code: fu1899

The original author of the algorithm code: JansonZhu


@nihui (Member) commented Aug 1, 2018

framework int8 a169cec

x86 int8 4be27a0

armv7 int8 e34aa77

nihui closed this Aug 1, 2018

@BUG1989 (Author) commented Feb 21, 2019

A new int8 implementation is a work in progress, with better accuracy and speed:
new int8 PR
