data for convolutional networks
grid-like topology (1D time series and 2D images)
distinguishing feature of convolutional networks
CNNs use convolution (rather than general matrix multiplication) in at least one of their layers
convolution function
integral of the product of two functions (after one is reversed and shifted)
(f * g)(t) = ∫ f(a)g(t-a) da
think of f as a measurement and g as a weighting function that emphasizes the most recent measurements
parts of convolution
main function: input, an n-dimensional array of data
weighting function: kernel, an n-dimensional array of learnable parameters
output: feature map
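the discrete form of the definition above can be sketched in a few lines of NumPy (a minimal illustration; the variable names `input_arr` and `kernel` mirror the terms in these notes):

```python
import numpy as np

# Discrete analogue of (f * g)(t) = ∫ f(a) g(t - a) da:
# flip the kernel, slide it over the input, and take dot products.
def convolve1d(input_arr, kernel):
    k = len(kernel)
    flipped = kernel[::-1]                 # the "reversed" function g(t - a)
    n_out = len(input_arr) - k + 1         # "valid" positions only
    return np.array([np.dot(input_arr[t:t + k], flipped) for t in range(n_out)])

x = np.array([1.0, 2.0, 3.0, 4.0])         # input (the measurement f)
w = np.array([0.5, 0.3, 0.2])              # kernel (the weighting function g)
feature_map = convolve1d(x, w)             # output: the feature map
```

this agrees with NumPy's built-in `np.convolve(x, w, mode='valid')`, which performs the same kernel flip.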
computational features of convolutional networks
stacked convolutional layers
receptive fields of deeper units are larger (but reached indirectly) compared to the receptive fields of shallower units
if layer 2 has a kernel width of 3, then each of its hidden units receives input from 3 units in layer 1.
if layer 3 also has a kernel width of 3, then each of its hidden units receives indirect input from 5 inputs when the stride is 1 (each stride-1 layer widens the receptive field by kernel width − 1); with non-overlapping stride-3 convolutions it would be 9 inputs
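the receptive-field arithmetic above can be sketched directly (a small helper written for these notes, not a library function):

```python
# Receptive-field width of a stack of 1-D convolutional layers.
# Each layer widens the field by (kernel_width - 1) * jump, where
# jump is the product of the strides of all earlier layers.
def receptive_field(kernel_widths, strides=None):
    strides = strides or [1] * len(kernel_widths)
    rf, jump = 1, 1
    for k, s in zip(kernel_widths, strides):
        rf += (k - 1) * jump
        jump *= s
    return rf

stacked_stride1 = receptive_field([3, 3])          # two stride-1 layers, width 3
stacked_stride3 = receptive_field([3, 3], [3, 1])  # non-overlapping first layer
```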
stages of a convolutional layer
pooling and translation
small shifts in the input's location make only small changes to the summary statistics of the pooled regions
pooling makes the network approximately invariant to small translations
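a tiny NumPy demonstration of this invariance (the pool width and input values here are made up for illustration): shifting the input by one position leaves the max-pooled summary unchanged, because each maximum stays inside its pooling region.

```python
import numpy as np

def max_pool1d(x, width):
    # non-overlapping max pooling over regions of the given width
    n = len(x) // width
    return x[:n * width].reshape(n, width).max(axis=1)

x = np.array([0.0, 1.0, 5.0, 0.0, 0.0, 2.0, 0.0, 0.0])
shifted = np.roll(x, 1)            # translate the input one step right

pooled_a = max_pool1d(x, 3 + 1)    # pool width 4
pooled_b = max_pool1d(shifted, 4)  # same summary despite the shift
```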
what convolution hard codes
the concept of a topology
(non convolutional models would have to discover the topology during learning)
locally connected layers (as opposed to convolution)
like a convolution with kernel width (patch size) n, except with no parameter sharing.
each unit has a receptive field of size n, but the incoming weights don't have to be the same for every receptive field.
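the contrast in parameter count can be sketched as follows (sizes and random weights are illustrative): both layers have the same connectivity pattern, but convolution reuses one kernel while the locally connected layer learns a separate weight vector per position.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, k = 8, 3
n_out = n_in - k + 1                       # valid positions

# convolution: one kernel of width k shared everywhere -> k parameters
conv_kernel = rng.standard_normal(k)

# locally connected: a separate width-k weight vector per output unit
# -> n_out * k parameters, same receptive fields, no sharing
local_weights = rng.standard_normal((n_out, k))

x = rng.standard_normal(n_in)
conv_out  = np.array([x[i:i + k] @ conv_kernel      for i in range(n_out)])
local_out = np.array([x[i:i + k] @ local_weights[i] for i in range(n_out)])
```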
iterated pixel labelling
suppose a convolution step produces a label for each pixel. repeatedly applying the same convolution to the resulting label map creates a recurrent convolutional network.
repeated convolutional layers with weights shared across layers are a kind of recurrent network.
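a minimal sketch of this idea, assuming a single shared smoothing kernel and a toy 1-D "label map": the same convolution is applied to its own output, i.e. a recurrent network unrolled over iterations.

```python
import numpy as np

w = np.array([0.25, 0.5, 0.25])            # one kernel shared across all steps
labels = np.array([0.0, 0.0, 1.0, 0.0, 0.0])  # toy per-pixel label scores

# applying the same convolution repeatedly to its own output
# = repeated layers with weights shared across layers
for _ in range(3):
    labels = np.convolve(labels, w, mode='same')
```

each pass keeps the output the same size (zero padding), so the loop can be unrolled to any depth.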
why convolutional networks can handle different input sizes
the same kernel can be applied to an input of any size, so the feature map simply scales with the input. pooling over regions whose size is proportional to the input size then produces a fixed-size output for the following (fully connected) layer.
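this can be sketched with a valid convolution followed by size-proportional ("adaptive") pooling, a hand-rolled illustration rather than a library call: two inputs of different lengths yield outputs of the same shape.

```python
import numpy as np

def conv_valid(x, w):
    k = len(w)
    return np.array([x[i:i + k] @ w[::-1] for i in range(len(x) - k + 1)])

def adaptive_max_pool(x, n_regions):
    # pool over regions whose width scales with the input length,
    # so the output always has exactly n_regions values
    bounds = np.linspace(0, len(x), n_regions + 1).astype(int)
    return np.array([x[a:b].max() for a, b in zip(bounds[:-1], bounds[1:])])

w = np.array([0.25, 0.5, 0.25])
short = np.random.default_rng(1).standard_normal(10)
long  = np.random.default_rng(2).standard_normal(50)

out_short = adaptive_max_pool(conv_valid(short, w), n_regions=4)
out_long  = adaptive_max_pool(conv_valid(long,  w), n_regions=4)
```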
convolutions for audio represented as 2D data (time × frequency)
convolutions over time: invariant to shifts in time
convolutions over frequency: invariant to changes in frequency.
primary visual cortex
differences between human vision and convolutional networks
regularization
dataset augmentation strategies
noise robustness
semi-supervised training
multi-task learning
early stopping
parameter tying and sharing
sparse representations (regularization)
bagging