What should the number of biases be equal to?
The number of output neurons
How do we count the number of layers in a neural network?
By convention, we count only the layers with learnable weights (convolutional and fully connected layers); input and pooling layers are not counted.
What is the formula for calculating the number of parameters for a convolutional layer:
For an input image of 32x32 with convolutional layer of 5x5x6 filters, what is the number of parameters used?
(filter_width * filter_height * input_depth + 1 for bias) * number of filters
(5 * 5 * 1 + 1) * 6 = 156
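The parameter count for a convolutional layer can be sketched as a tiny helper (the function name is my own, not from the notes):

```python
def conv_params(filter_w, filter_h, input_depth, num_filters):
    """Parameters of a conv layer: (filter_w * filter_h * input_depth + 1 bias)
    per filter, times the number of filters."""
    return (filter_w * filter_h * input_depth + 1) * num_filters

# LeNet-style first layer: 5x5 filters on a depth-1 input, 6 filters.
print(conv_params(5, 5, 1, 6))  # 156
```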
What is the formula for calculating the number of parameters for a fully connected layer:
For 120 input nodes with 84 output nodes, what is the number of parameters used?
(Input nodes + 1 for bias) * output nodes
(120 + 1) * 84 = 10164
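The fully connected case follows the same pattern; a minimal sketch (helper name is my own):

```python
def fc_params(input_nodes, output_nodes):
    """Parameters of a fully connected layer: each output neuron has one
    weight per input node plus one bias."""
    return (input_nodes + 1) * output_nodes

# The worked example from the card: 120 inputs, 84 outputs.
print(fc_params(120, 84))  # 10164
```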
How many layers does the LeNet-5 model have?
5:
3 convolution layers and 2 fully connected layers
How many layers does the AlexNet model have?
8:
5 convolution layers and 3 fully connected layers
How many layers does the VGG-16 model have?
16:
13 convolution layers and 3 fully connected layers
Why do we decrease the feature map size and increase the depth of the channels/filters for each stage?
Pooling shrinks the spatial resolution, so the depth is increased to compensate: the network trades spatial detail for a richer set of features, preserving the information content while keeping computation manageable.
What is the advantage of using small filters/kernel size?
Stacking two 3x3 conv (stride 1) layers has the same receptive field as one 5x5 conv layer; however, the stacked smaller filters use fewer parameters for convolution, and hence save memory.
What is the difference in parameter size by using a 3x3 filter rather than a 5x5 filter?
2 * (3^2) = 18 vs. 5^2 = 25 (ignoring bias and assuming a single channel). So we save memory by using the smaller filters.
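The single-channel numbers above generalise to C input/output channels; a small sketch (function names are my own) makes the comparison concrete:

```python
def stacked_3x3_params(channels):
    # Two 3x3 conv layers, C channels in and C channels out, bias ignored.
    return 2 * (3 * 3 * channels * channels)

def single_5x5_params(channels):
    # One 5x5 conv layer, C channels in and C channels out, bias ignored.
    return 5 * 5 * channels * channels

c = 64
print(stacked_3x3_params(c), "vs.", single_5x5_params(c))  # 73728 vs. 102400
```

With C = 1 this reduces to the 18 vs. 25 comparison on the card; the saving holds for any channel count.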
Why use a deeper neural network over a shallow one?
A deeper network interleaves more non-linearities (e.g. ReLU) between its layers, so it can represent more complex functions with fewer parameters than a wide, shallow network.
How does VGG-Net differ from AlexNet and LeNet?
VGG-Net uses only small 3x3 convolution filters, stacked in long sequences, rather than the larger filters used by AlexNet and LeNet, and is much deeper.
How many layers does GoogLeNet have and what makes it different to other models?
It has 22 layers. It uses efficient building blocks such as the inception module
- 12 times fewer parameters than AlexNet
What are auxiliary classifiers?
Additional classifiers placed at earlier layers of the network:
- mitigate the vanishing gradient issue by injecting gradient at intermediate layers
- form of regularisation: earlier layers receive a direct training signal, so their parameters can be updated more effectively
How many auxiliary classifiers does GoogLeNet use?
Two, attached to intermediate layers; they are used only during training and discarded at inference time.
How do inception modules work?
They apply several operations in parallel (1x1, 3x3 and 5x5 convolutions, plus max pooling) to the same input and concatenate the resulting feature maps along the depth dimension.
What are the two types of inception module:
Naive and dimension reduction
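The depth-wise concatenation in an inception module can be sketched with numpy; the branch depths below are illustrative placeholders, not values from the notes:

```python
import numpy as np

# Toy feature maps of shape (height, width, depth); zeros stand in for the
# outputs of the parallel branches ("same" padding keeps height and width).
h, w = 28, 28
branch_1x1 = np.zeros((h, w, 64))   # 1x1 conv branch
branch_3x3 = np.zeros((h, w, 128))  # 3x3 conv branch
branch_5x5 = np.zeros((h, w, 32))   # 5x5 conv branch
branch_pool = np.zeros((h, w, 32))  # 3x3 max-pool branch

# The module's output: all branches concatenated along the depth axis.
out = np.concatenate([branch_1x1, branch_3x3, branch_5x5, branch_pool], axis=-1)
print(out.shape)  # (28, 28, 256)
```

The spatial size is unchanged; only the depth grows, which is why the dimension-reduction variant adds 1x1 convolutions to keep the depth in check.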
Very deep CNNs are prone to degradation, what can be used to avoid/handle this?
Using Residual blocks
What does ResNet use that makes it different from other NNs?
Residual blocks
What are residual blocks?
Blocks in which a skip (identity) connection adds the block's input x to the output of its stacked layers, so the layers only need to learn a residual mapping F(x); the block outputs F(x) + x.
What do residual networks solve?
The vanishing gradient problem: the skip connections give gradients (and features) a direct path through the network, so even very deep models can be trained.
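A residual block can be sketched with plain numpy (weights and shapes are illustrative, not from the notes):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = relu(w2 @ relu(w1 @ x) + x): the skip connection adds the input
    back in, so the weight layers only learn the residual F(x)."""
    return relu(w2 @ relu(w1 @ x) + x)

# With zero weights the residual F(x) is zero, so the block reduces to the
# identity for non-negative inputs -- the skip path carries x through untouched.
x = np.array([1.0, 2.0, 3.0])
w = np.zeros((3, 3))
print(residual_block(x, w, w))  # [1. 2. 3.]
```

This identity-by-default behaviour is exactly why residual blocks avoid the degradation problem: adding more blocks can never make the network worse than the shallower one.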
What is DenseNet's key feature?
Dense connectivity between the layers within each dense block.
What is dense connectivity?
Each layer receives the concatenated feature maps of all preceding layers as its input, and passes its own feature maps on to all subsequent layers.
Why use dense connectivity over resnet blocks?
Concatenation preserves earlier features instead of summing them into the output, encouraging feature reuse and strong gradient flow, and each layer can be narrow, which keeps the parameter count low.
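Dense connectivity can be sketched with numpy; the `layer` stand-in and all sizes below are illustrative placeholders, not values from the notes:

```python
import numpy as np

def layer(x, out_depth):
    # Stand-in for a conv + ReLU layer: just produces a feature map of the
    # requested depth with the same spatial size.
    return np.zeros(x.shape[:2] + (out_depth,))

def dense_block(x, growth_rate, num_layers):
    """Each layer sees the depth-wise concatenation of ALL earlier feature
    maps; every layer adds `growth_rate` new channels."""
    features = [x]
    for _ in range(num_layers):
        inp = np.concatenate(features, axis=-1)
        features.append(layer(inp, growth_rate))
    return np.concatenate(features, axis=-1)

x = np.zeros((8, 8, 16))
print(dense_block(x, growth_rate=12, num_layers=4).shape)  # (8, 8, 64)
```

The output depth is input_depth + num_layers * growth_rate (16 + 4 * 12 = 64 here), which is why DenseNet layers can stay narrow while still reusing every earlier feature.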