Unexpected ncnn benchmark performance #281

Closed
llhe opened this issue Mar 1, 2018 · 5 comments


llhe commented Mar 1, 2018

Hardware: Snapdragon 625 (similar result with Snapdragon 835)
Build:

cmake -DCMAKE_TOOLCHAIN_FILE=../android.toolchain.cmake -DANDROID_ABI="armeabi-v7a" -DANDROID_ARM_NEON=ON -DANDROID_PLATFORM=android-14 ..
make -j4
make install

Result:

# adb shell "cd /data/local/tmp/ncnn/ && ./benchncnn 10 8 0"
loop_count = 10
num_threads = 8
powersave = 0
      squeezenet  min =  180.00  max =  188.97  avg =  182.44
       mobilenet  min =  269.38  max =  286.74  avg =  274.53
    mobilenet_v2  min =  258.27  max =  320.87  avg =  267.85
      shufflenet  min =   93.65  max =   99.58  avg =   95.68
       googlenet  min =  575.78  max =  590.70  avg =  583.12
        resnet18  min =  716.49  max =  725.30  avg =  720.36
         alexnet  min =  339.43  max =  351.07  avg =  342.73
           vgg16  min = 4108.81  max = 4238.54  avg = 4197.91
  squeezenet-ssd  min =  339.44  max =  347.57  avg =  342.97
   mobilenet-ssd  min =  320.38  max =  332.81  avg =  324.91

However, this article on Zhihu reports:

Platform: [email protected], Single-Thread, SqueezeNet-v1.1, Float32
ncnn: 73 ms (after slightly adjusting the framework structure; 91 ms before optimization)

There is a huge difference. What could be the cause?
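
One way to narrow this down is to rerun the benchmark at several thread counts, since the quoted figure is single-thread while the run above uses 8 threads. The sketch below only loops over the same `benchncnn` invocation shown above (arguments: loop count, thread count, powersave). `sweep_threads` is a hypothetical helper name, not part of ncnn, and the device path is assumed to match the one in this report.

```shell
# Hypothetical helper: run benchncnn once per requested thread count.
# Assumes benchncnn was pushed to /data/local/tmp/ncnn/ as in the report.
sweep_threads() {
    for t in "$@"; do
        # Same arguments as above: loop_count=10, num_threads=$t, powersave=0
        adb shell "cd /data/local/tmp/ncnn/ && ./benchncnn 10 $t 0" || return 1
    done
}
```

For example, `sweep_threads 1 2 4 8` would print four result tables, one per thread count.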


llhe commented Mar 1, 2018

After switching from armeabi-v7a to armeabi-v7a with NEON:

adb shell "cd /data/local/tmp/ncnn/ && ./benchncnn 10 8 0"
loop_count = 10
num_threads = 8
powersave = 0
      squeezenet  min =   64.70  max =   67.36  avg =   65.85
       mobilenet  min =   90.11  max =  104.90  avg =   92.96
    mobilenet_v2  min =  114.83  max =  125.85  avg =  117.29
      shufflenet  min =   40.93  max =   42.20  avg =   41.38
       googlenet  min =  171.80  max =  181.18  avg =  175.35
        resnet18  min =  197.73  max =  203.97  avg =  199.43
         alexnet  min =  175.38  max =  192.03  avg =  178.28
           vgg16  min = 1617.39  max = 1696.84  avg = 1647.92
  squeezenet-ssd  min =  121.85  max =  131.11  avg =  124.26
   mobilenet-ssd  min =   96.02  max =  106.27  avg =   98.15
adb shell "cd /data/local/tmp/ncnn/ && ./benchncnn 10 1 0"
loop_count = 10
num_threads = 1
powersave = 0
      squeezenet  min =  237.79  max =  241.50  avg =  238.94
       mobilenet  min =  362.04  max =  365.23  avg =  363.43
    mobilenet_v2  min =  352.78  max =  358.15  avg =  354.49
      shufflenet  min =  134.91  max =  135.85  avg =  135.32
       googlenet  min =  831.19  max =  836.29  avg =  833.26
        resnet18  min = 1100.67  max = 1119.56  avg = 1108.67
         alexnet  min = 1134.11  max = 1140.10  avg = 1137.29
           vgg16  min = 4221.63  max = 4272.38  avg = 4250.14
  squeezenet-ssd  min =  460.39  max =  465.86  avg =  463.41
   mobilenet-ssd  min =  406.68  max =  411.81  avg =  408.93

The single-thread (ST) result still shows a gap, although the A53 is admittedly a lower-end core than the A57.
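
To put numbers on the differences between two runs, the reported averages can be divided model by model. The following is a rough sketch, assuming each run's output was saved to a log file. `bench_speedup` is a hypothetical helper, not an ncnn tool, and it relies only on the result-line layout shown in this thread.

```shell
# Hypothetical helper: for every model present in both logs, print how many
# times faster the second log is than the first, based on the avg column.
# Usage: bench_speedup baseline.log candidate.log
bench_speedup() {
    awk '
        # Result lines look like:
        #   squeezenet  min =  180.00  max =  188.97  avg =  182.44
        # so $1 is the model name and the last field is the average in ms.
        $2 == "min" && $(NF-2) == "avg" {
            if (FILENAME == ARGV[1]) { base[$1] = $NF; next }
            if ($1 in base) printf "%-16s %.2fx\n", $1, base[$1] / $NF
        }
    ' "$1" "$2"
}
```

With the squeezenet figures above, the 8-thread average drops from 182.44 ms to 65.85 ms, about 2.77x, and going from 1 thread (238.94 ms) to 8 threads (65.85 ms) is about 3.63x.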


nihui commented Mar 1, 2018

According to the DMIPS/MHz values in [1], the A53 is over 50% slower than the A57.
In addition, you could try the aarch64 build, which may be over 20% faster than the armv7 build.

[1] https://en.wikipedia.org/wiki/Comparison_of_ARMv8-A_cores
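
An aarch64 build could look roughly like the sketch below. It reuses the cmake invocation from the original report with the ABI switched to `arm64-v8a`. The `android-21` platform level is an assumption (it is the first Android API level with 64-bit support), `-DANDROID_ARM_NEON` is dropped because NEON is always available on arm64-v8a, and `build_ncnn_arm64` is only an illustrative wrapper name.

```shell
# Sketch only: run from a fresh build directory inside the ncnn checkout,
# with the same toolchain file location as in the original report.
build_ncnn_arm64() {
    cmake -DCMAKE_TOOLCHAIN_FILE=../android.toolchain.cmake \
          -DANDROID_ABI="arm64-v8a" \
          -DANDROID_PLATFORM=android-21 .. &&
    make -j4 &&
    make install
}
```

Using a separate build directory for each ABI avoids mixing cached cmake settings from the armeabi-v7a build.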


llhe commented Mar 1, 2018

Thanks.

@helloearth012

Quoting llhe: "After switch from armeabi-v7a to armeabi-v7a with NEON:"

How do you make this switch? Your original command already had -DANDROID_ARM_NEON=ON, didn't it? What other option needs to be added? Thanks.

@helloearth012

Quoting llhe's comment above in full ("After switch from armeabi-v7a to armeabi-v7a with NEON:", followed by the 8-thread and 1-thread benchncnn results):

@llhe Did you make this switch by modifying the makefile?
