This is a training / prediction system for vision networks, originally based on imagenet-multiGPU.torch
- includes prediction code with threading
- includes code for loading VGG-16 from the Caffe model zoo (with loadCaffe)
- includes code for Kaiming initialization (Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification)
- includes the residual-learning idea applied to inception-v3 (Deep Residual Learning for Image Recognition)
- includes absorbing BN parameters into the convolutional parameters (at prediction time, all of the nn.(Spatial)BatchNormalization layers are removed, which significantly reduces inference time). How does it work?
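The Kaiming initialization mentioned above draws each weight from a zero-mean Gaussian whose standard deviation depends on the layer's fan-in, which keeps activation variance stable through ReLU layers. A minimal sketch in plain Python (the repo's own code is Lua/Torch; `kaiming_normal` is a hypothetical helper name, not from the repo):

```python
import math
import random

def kaiming_normal(fan_in, n):
    """Draw n weights from N(0, sqrt(2/fan_in)), the ReLU-aware
    initialization from 'Delving Deep into Rectifiers'.

    For a conv layer, fan_in = in_channels * kernel_h * kernel_w.
    The factor 2 compensates for ReLU zeroing half the activations.
    """
    std = math.sqrt(2.0 / fan_in)
    return [random.gauss(0.0, std) for _ in range(n)]

# Example: a 3x3 conv with 64 input channels has fan_in = 64 * 3 * 3 = 576.
weights = kaiming_normal(576, 1000)
```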
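The BN absorption works because batch normalization at inference time is an affine map per channel: bn(y) = gamma * (y - mean) / sqrt(var + eps) + beta. Composing it with the preceding convolution gives a new convolution with scaled weights and a shifted bias, so the BN layer can be deleted. A minimal sketch of the arithmetic in plain Python (hypothetical helper `fold_bn_into_conv`; the repo's actual implementation is Lua/Torch):

```python
import math

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time BatchNorm statistics into conv parameters.

    w: per-output-channel lists of conv weights, b: per-channel biases;
    gamma/beta/mean/var: per-channel BN parameters and running stats.
    After folding, conv'(x) == bn(conv(x)) for every input x.
    """
    w_folded, b_folded = [], []
    for i in range(len(b)):
        scale = gamma[i] / math.sqrt(var[i] + eps)  # per-channel scale
        w_folded.append([scale * wi for wi in w[i]])
        b_folded.append(scale * (b[i] - mean[i]) + beta[i])
    return w_folded, b_folded
```

Since the folded layer is just an ordinary convolution, the per-element normalization work of every BN layer disappears from the forward pass, which is where the speedup comes from.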
- In our experiments, the best accuracy reached ~75% top-1 on the ILSVRC2012 validation set with a single crop, a single model, and the resception-net
- The Google Brain team's experimental results on Inception-ResNet: OpenReview, ICLR 2016
- supports cudnn-v4 (in our case, convolutions run 1.6x faster than cudnn-v3's with cudnn.fastest=true and cudnn.benchmark=true)
- supports cudnn-v5
- We think Google's Inception-style net is more efficient than MSRA's residual-shortcut net in terms of both processing time and memory consumption; their representational power is nearly a tie.
- Soumith's great work: imagenet-multiGPU.torch
- Elad Hoffer's great work: ImageNet-Training
- e-lab @ Purdue Univ.: torch-toolbox
- cudnn-v4: cudnn.torch
- FAIR's ResNet (multi-GPU, cudnn-v4)