Fractional Max Pooling

INTRODUCTION

  • Link To Paper
  • Here is the implementation in Theano.
  • Convolutional networks have evolved over time: much research effort has gone into designing different kinds of activation layers, different sizes of convolution layers, and ways to reduce overfitting such as Dropout and Batch Normalization.
  • However, very little thought has been put into updating the traditional MaxPooling layer.
  • Pooling layers are basic building blocks of a CNN.
  • A pooling layer reduces the spatial dimensions of the data: applying a MaxPool layer to an Nin x Nin matrix shrinks it to Nout x Nout, where the reduction factor is α = Nin / Nout (e.g., 2 x 2 max pooling maps 32 x 32 to 16 x 16, so α = 2).

Traditional MaxPool Layer

  • Traditionally, a 2 x 2 MaxPool layer has been used for spatial pooling, as in the sketch below.
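As a baseline, here is a minimal NumPy sketch of the traditional non-overlapping 2 x 2 MaxPool; the function name max_pool_2x2 and the example values are mine, for illustration only.

```python
import numpy as np

def max_pool_2x2(x):
    """Non-overlapping 2 x 2 max pooling over an Nin x Nin array.

    Assumes Nin is even; the output is (Nin/2) x (Nin/2), i.e. alpha = 2.
    """
    n_out = x.shape[0] // 2
    # View the input as Nout x Nout blocks of shape 2 x 2, then take
    # the maximum inside each block.
    blocks = x.reshape(n_out, 2, n_out, 2)
    return blocks.max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(x))  # 4 x 4 input -> 2 x 2 output: [[5, 7], [13, 15]]
```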

Advantages

  • Fast; reduces the size of the hidden layers quickly.
  • Encodes a degree of invariance with respect to translations and elastic distortions.

Disadvantages

  • The disjoint nature of the pooling operation can reduce generalization.
  • MaxPooling reduces spatial size so quickly that building a deep network requires stacking many convolution layers between pooling layers.

Alternatives Proposed Before

Fractional Max Pooling

  • Reduces the spatial size of the image by a factor of α, where 1 < α < 2.
  • Introduces randomness, as in stochastic pooling.
  • Allows overlapping pooling regions.

How to design it?

  • Input: Nin x Nin, Output: Nout x Nout, reduction factor α = Nin / Nout.
  • The general idea is to divide the Nin x Nin square into Nout^2 pooling regions P_{i,j}.
  • Output_{i,j} = max of Input_{k,l} over (k,l) ∈ P_{i,j}.
  • To do this, generate two increasing sequences (a_i) and (b_i), 0 <= i <= Nout, starting at 1, ending at Nin, with increments of 1 or 2.
  • Now we can generate two kinds of pooling regions:
    • Disjoint: P_{i,j} = [a_{i-1}, a_i - 1] x [b_{j-1}, b_j - 1]
    • Overlapping: P_{i,j} = [a_{i-1}, a_i] x [b_{j-1}, b_j]
  • The integer sequences can be generated in two different ways:
    • random: the increments are a random permutation of the appropriate number of 1s and 2s
    • pseudorandom: a_i = ceiling(α(i + u)), with α ∈ (1, 2) and u ∈ (0, 1)
  • Each time a CNN with FMP is applied, during training or testing, different integer sequences can be generated; averaging the outputs of several test-time passes then yields an inexpensive ensemble. A sketch of the whole recipe follows this list.
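Below is a minimal NumPy sketch of this recipe. It is an illustration under the definitions above, not the paper's reference code (see the linked Theano implementation); the names random_sequence, pseudorandom_sequence, and fmp are my own.

```python
import math
import numpy as np

def random_sequence(n_in, n_out, rng):
    """Random variant: a_0 = 1, a_Nout = Nin, with the Nout increments
    (summing to Nin - 1) a shuffled mix of 1s and 2s."""
    n_twos = n_in - 1 - n_out            # number of 2-steps
    assert 0 <= n_twos <= n_out, "need Nout <= Nin - 1 <= 2 * Nout"
    incs = np.array([2] * n_twos + [1] * (n_out - n_twos))
    rng.shuffle(incs)
    return np.concatenate(([1], 1 + np.cumsum(incs)))

def pseudorandom_sequence(n_in, n_out, rng):
    """Pseudorandom variant: a_i = ceil(alpha * (i + u)), u ~ U(0, 1).
    Clipping into [1, Nin] is one simple boundary-handling choice (an
    assumption of this sketch); pair it with overlapping regions, since
    clipping can make the last *disjoint* region empty."""
    alpha = n_in / n_out                 # 1 < alpha < 2
    u = rng.uniform(0.0, 1.0)
    a = np.array([math.ceil(alpha * (i + u)) for i in range(n_out + 1)])
    return np.clip(a, 1, n_in)

def fmp(x, n_out, overlapping=True, seq_fn=random_sequence, rng=None):
    """One fractional max-pooling layer on a square Nin x Nin array."""
    rng = rng or np.random.default_rng()
    n_in = x.shape[0]
    a = seq_fn(n_in, n_out, rng)         # row boundaries a_0 .. a_Nout
    b = seq_fn(n_in, n_out, rng)         # column boundaries b_0 .. b_Nout
    out = np.empty((n_out, n_out), dtype=x.dtype)
    for i in range(1, n_out + 1):
        for j in range(1, n_out + 1):
            # Overlapping: P = [a_{i-1}, a_i]     x [b_{j-1}, b_j]
            # Disjoint:    P = [a_{i-1}, a_i - 1] x [b_{j-1}, b_j - 1]
            r_end = a[i] if overlapping else a[i] - 1
            c_end = b[j] if overlapping else b[j] - 1
            # 1-based inclusive bounds -> 0-based half-open slices.
            out[i - 1, j - 1] = x[a[i - 1] - 1:r_end, b[j - 1] - 1:c_end].max()
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((25, 25))
y = fmp(x, n_out=18, rng=rng)            # alpha = 25 / 18 ~ 1.39
print(y.shape)                           # (18, 18)
```

Because the pooling regions are resampled on every call, averaging the outputs of several forward passes at test time directly implements the ensembling described above.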

Which limitations of traditional MP does it overcome?

  • Allows disjoint as well as overlapping pooling regions.
  • Includes randomness, as in stochastic pooling.
  • The reduction factor is relaxed to α ∈ (1, 2), so spatial size shrinks slowly and a deep network can be built without stacking many convolution layers between each pooling layer.

Key Observations

  • Random Fractional Max Pooling may underfit when combined with Dropout.
  • The improvement over traditional MP is substantial.
  • Overlapping FMP performs better than disjoint FMP.

Notable Results

Further possible improvement

  • The distortions introduced by FMP decompose independently along the x and y directions. Could pooling regions different from those given by the equations above be explored?