Understanding CNN Flashcards by Toby Beckett

What is a convolutional neural network (CNN)

A deep neural network designed for processing visual data.
It uses convolutional layers to detect spatial hierarchies of patterns in images.

What is spatial locality in images

Images contain spatial structure, so nearby pixels are highly correlated.

Features therefore exist in local regions.

How do CNNs exploit spatial locality

Through local receptive fields: each unit looks only at a small region of the input.

What is convolution

Convolution is a mathematical operation that applies a small matrix (kernel) to an image to extract features.

What mathematical operation is involved in convolution

A small matrix (kernel/filter) slides over the image.
At each position it performs:
Element-wise multiplication of the kernel with the image patch beneath it
Summation of the products
The results form a feature map.
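
The slide, multiply, and sum steps can be sketched in plain Python. This is a minimal illustration, assuming stride 1, no padding, and a single-channel image stored as a list of lists; the name `conv2d_valid` and the sample values are made up for the example. Like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding (illustrative helper)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            # Element-wise multiplication of the kernel with the patch at (i, j),
            # followed by summation of the products.
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 image and 2x2 kernel, chosen only for illustration.
image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[2, 4, 6], [0, 2, 4], [6, 0, 2]]
```

A 4×4 input and a 2×2 kernel give a 3×3 feature map, because the kernel fits in three positions along each axis.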

What does a kernel detect

Kernels (filters) detect features such as edges and textures; filters in deeper layers respond to whole objects.
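
As a rough illustration of edge detection, the sketch below applies a hand-written vertical-edge kernel to a tiny image whose left half is dark and right half is bright. The kernel values, the image, and the helper name `conv2d_valid` are assumptions for the example; in a trained CNN the kernel values are learned rather than chosen by hand.

```python
def conv2d_valid(image, kernel):
    """Stride-1, no-padding convolution on lists of lists (illustrative helper)."""
    k_h, k_w = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_h) for dj in range(k_w))
            for j in range(len(image[0]) - k_w + 1)
        ]
        for i in range(len(image) - k_h + 1)
    ]


# Dark (0) on the left, bright (9) on the right: one vertical edge in the middle.
image = [[0, 0, 0, 9, 9, 9] for _ in range(4)]

# Responds to left-to-right increases in brightness.
vertical_edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

response = conv2d_valid(image, vertical_edge_kernel)
print(response)  # [[0, 27, 27, 0], [0, 27, 27, 0]]
```

The response is zero over the flat regions and large where the kernel straddles the edge.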

What does stride control

Stride controls how far the filter moves at each step; padding
preserves spatial dimensions.

What does padding do

Controls output size:
valid → no padding
→ Output smaller than input
same → zero padding
→ Output same size as input (at stride 1)
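
The effect of the two padding modes on output size can be checked with a small helper. This is a sketch under assumptions: `conv_output_size` is a hypothetical function, and the `same` branch follows the convention used by TensorFlow/Keras, where the output size is the input size divided by the stride, rounded up.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial axis (hypothetical helper)."""
    if padding == "valid":
        # Only positions where the kernel fully overlaps the input are counted.
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # Assumed convention: ceil(input_size / stride), independent of kernel size.
        return -(-input_size // stride)
    raise ValueError("padding must be 'valid' or 'same'")


print(conv_output_size(32, 3, padding="valid"))  # 30
print(conv_output_size(32, 3, padding="same"))   # 32
print(conv_output_size(32, 3, stride=2, padding="valid"))  # 15
```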

Define convolutional layers

Extract feature maps from images.

Define pooling layers

Reduce spatial size (dimensionality) and retain key information.

What do activation functions do

Activation functions introduce non-linearity (e.g. ReLU) or convert outputs into probabilities (e.g. softmax).
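
Two common activation functions illustrate the two roles. The card does not name specific functions, so ReLU (non-linearity) and softmax (probabilities) are assumptions chosen as typical examples, and the input values are made up.

```python
import math


def relu(x):
    """Non-linearity: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


hidden = [relu(v) for v in [-2.0, 0.5, 3.0]]
probs = softmax([2.0, 1.0, 0.1])
print(hidden)  # [0.0, 0.5, 3.0]
print(probs)   # three probabilities, largest first
```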

What do fully connected layers do

Perform final decision-making.

What is step 1 of CNN backpropagation

Forward pass: the CNN processes the input image using filters to
find patterns/features (convolution + pooling).

What is step 2 of CNN backpropagation

A loss function measures the difference between the
predicted and actual class.
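
A minimal sketch of such a loss, assuming cross-entropy (a common choice for classification; the card does not name a specific loss). The function name and probability values are illustrative.

```python
import math


def cross_entropy(predicted_probs, true_class):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(predicted_probs[true_class])


predicted = [0.7, 0.2, 0.1]  # hypothetical output for 3 classes

loss_when_right = cross_entropy(predicted, true_class=0)
loss_when_wrong = cross_entropy(predicted, true_class=2)
print(loss_when_right < loss_when_wrong)  # True
```

The loss is small when the network puts high probability on the actual class and grows as that probability shrinks.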

What is step 3 of CNN backpropagation

The CNN computes the gradients of the loss with respect to its weights using the chain rule
(backpropagation).

What is the final step of CNN backpropagation

CNNs adjust filter weights to minimise the
error using optimisation algorithms (e.g. SGD,
Adam).
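
The four steps (forward pass, loss, gradients via the chain rule, weight update) can be sketched on a deliberately tiny model. This is an illustration under assumptions, not a CNN: a single weight `w` stands in for the filter weights, the loss is squared error, and the update is plain gradient descent with a hypothetical learning rate of 0.1.

```python
x, target = 2.0, 4.0   # one training example; the ideal weight is 2.0
w = 0.0                # initial weight (stand-in for filter weights)
learning_rate = 0.1    # assumed value for the example

for step in range(20):
    # Step 1: forward pass.
    prediction = w * x
    # Step 2: loss measures the gap between prediction and target.
    loss = (prediction - target) ** 2
    # Step 3: chain rule, dL/dw = dL/dprediction * dprediction/dw.
    grad_w = 2.0 * (prediction - target) * x
    # Step 4: gradient descent update.
    w -= learning_rate * grad_w

print(round(w, 6))  # 2.0
```

Optimisers such as Adam replace the last line with a more elaborate update rule, but the first three steps stay the same.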

Explain convolution layers in more detail

A kernel (filter) slides over the image, performing element-wise multiplication and summation.

Define stride

Stride is the step size by which the convolutional filter (or kernel) moves across the input
image during the convolution operation.
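
To make the step size concrete, the sketch below lists the starting positions a filter visits along one axis. The helper `filter_positions` and the sizes are hypothetical, and no padding is assumed.

```python
def filter_positions(input_size, kernel_size, stride):
    """Start indices along one axis where the kernel fully fits (no padding)."""
    return list(range(0, input_size - kernel_size + 1, stride))


print(filter_positions(7, 3, stride=1))  # [0, 1, 2, 3, 4]
print(filter_positions(7, 3, stride=2))  # [0, 2, 4]
```

A larger stride visits fewer positions, so the feature map gets smaller.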

What does padding=’valid’ mean

padding=’valid’ means that the convolution is only computed where the input and the filter fully overlap, and
therefore the output is smaller than the input.

What does padding=’same’ mean

padding=’same’ means that the area around the input is padded with zeros so that the output is the same size
as the input (at stride 1).

Show a CNN architecture example

Simple CNN Architecture Example
Given:
Input: 32×32 image (1 channel)
No padding
Stride = 1
Parameter counts are weights only (biases omitted)
Layer 1:
3×3 kernel
1 input channel
16 filters
Output: 30×30×16
Parameters:
3×3×1×16 = 144
Layer 2:
3×3 kernel
16 input channels
32 filters
Output: 28×28×32
Parameters:
3×3×16×32 = 4,608
Fully Connected Layer:
28×28×32 = 25,088 inputs
10 outputs
Parameters:
25,088 × 10 = 250,880
Total Parameters:
144 + 4,608 + 250,880 = 255,632
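
The arithmetic can be reproduced with a few helper functions. The helper names are made up for the example, and two assumptions are needed to match the card's figures: the input has a single channel, and bias terms are not counted. The last line shows the total if one bias per filter and per output unit is included.

```python
def valid_output(size, kernel_size, stride=1):
    """Spatial size after a convolution with no padding."""
    return (size - kernel_size) // stride + 1


def conv_params(kernel_size, in_channels, filters, bias=False):
    """Weights in a conv layer, optionally plus one bias per filter."""
    weights = kernel_size * kernel_size * in_channels * filters
    return weights + (filters if bias else 0)


def dense_params(inputs, outputs, bias=False):
    """Weights in a fully connected layer, optionally plus one bias per output."""
    return inputs * outputs + (outputs if bias else 0)


size1 = valid_output(32, 3)        # 30
size2 = valid_output(size1, 3)     # 28
fc_inputs = size2 * size2 * 32     # 25,088

total = conv_params(3, 1, 16) + conv_params(3, 16, 32) + dense_params(fc_inputs, 10)
total_with_bias = (
    conv_params(3, 1, 16, bias=True)
    + conv_params(3, 16, 32, bias=True)
    + dense_params(fc_inputs, 10, bias=True)
)
print(total)            # 255632
print(total_with_bias)  # 255690
```

The fully connected layer holds almost all of the parameters, even though it is a single layer.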

What is a DCNN

A DCNN (deep convolutional neural network) is simply a CNN with many convolutional layers stacked together.

What is the key idea of DCNN

Key Idea:
Early layers → detect simple features (edges)
Middle layers → detect shapes/textures
Deep layers → detect high-level objects
👉 Depth = ability to learn hierarchical features.

What is the purpose of the pooling layer

Purpose:
Reduce spatial dimensions
Reduce computation
Improve generalisation
Control overfitting

What are types of pooling

Types:
Max pooling → takes the maximum value, keeps the strongest features
Average pooling → takes the mean value
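
Both pooling types can be sketched with one helper. The assumptions here are a 2×2 window, a stride equal to the window size (non-overlapping windows), and a made-up 4×4 feature map; `pool2d` is a hypothetical name.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling over size x size windows (illustrative helper)."""
    if mode == "max":
        reduce_window = max
    elif mode == "average":
        reduce_window = lambda window: sum(window) / len(window)
    else:
        raise ValueError("mode must be 'max' or 'average'")

    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(reduce_window(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [5, 7, 1, 1],
    [0, 2, 8, 4],
    [2, 0, 4, 4],
]
max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="average")
print(max_pooled)  # [[7, 2], [2, 8]]
print(avg_pooled)  # [[4.0, 1.0], [1.0, 5.0]]
```

Either way, the 4×4 map shrinks to 2×2, which is the reduction in spatial size that pooling is used for.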

What is batch normalisation

A technique to stabilise deep learning training:
* Normalises the output of each layer to ensure steady learning.
* Reduces sensitivity to weight initialisation.
* Works as a slight regulariser, which can reduce overfitting.
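
The normalisation step can be sketched for a single feature across one mini-batch. This is a simplified illustration under assumptions: it covers only the training-time computation, `gamma` and `beta` (the learnable scale and shift) are fixed at 1 and 0, and the input values are made up.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one feature over a mini-batch, then scale and shift."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    # eps guards against division by zero when the variance is tiny.
    return [gamma * (v - mean) / math.sqrt(variance + eps) + beta for v in values]


activations = [10.0, 200.0, 3000.0, 40.0]  # widely varying layer outputs
normalised = batch_norm(activations)

norm_mean = sum(normalised) / len(normalised)
norm_var = sum((v - norm_mean) ** 2 for v in normalised) / len(normalised)
print(round(norm_mean, 6), round(norm_var, 6))
```

After normalisation the values have a mean close to 0 and a variance close to 1, whatever the scale of the original activations.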

What is the problem without batch normalisation

Outputs of each layer can become too large or too small, leading to:
Vanishing gradients
Exploding gradients

Explain vanishing gradients

Gradients shrink as they are propagated back through the layers, so the tiny updates cause the network to learn too slowly.

Explain exploding gradients

Gradients grow as they are propagated back through the layers, so the large updates make training unpredictable.
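
Both effects come from the chain rule multiplying many per-layer factors together. The sketch below is a simplified model, assuming every layer contributes the same factor; the factors 0.5 and 1.5 and the depth of 30 are arbitrary illustrative values.

```python
def backpropagated_gradient(per_layer_factor, num_layers, initial=1.0):
    """Multiply a gradient by the same factor once per layer (toy model)."""
    gradient = initial
    for _ in range(num_layers):
        gradient *= per_layer_factor
    return gradient


vanishing = backpropagated_gradient(0.5, 30)  # factors below 1 shrink the gradient
exploding = backpropagated_gradient(1.5, 30)  # factors above 1 blow it up
print(vanishing)
print(exploding)
```

With 30 layers, a factor of 0.5 leaves a gradient below one billionth of its starting value, while a factor of 1.5 grows it past one hundred thousand.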

What are techniques to improve generalisation

Hyperparameter optimisation:
- Learning rate
- Batch size
- Number of filters
- Kernel size
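
One simple way to explore these hyperparameters is a grid search, which enumerates every combination of candidate values. The sketch below only builds the list of combinations; the candidate values are hypothetical, and training and scoring a model for each combination is left out.

```python
from itertools import product

# Hypothetical candidate values for each hyperparameter.
search_space = {
    "learning_rate": [0.01, 0.001],
    "batch_size": [32, 64],
    "num_filters": [16, 32],
    "kernel_size": [3, 5],
}

names = list(search_space)
candidates = [dict(zip(names, values)) for values in product(*search_space.values())]

print(len(candidates))  # 16
print(candidates[0])
```

Each entry in `candidates` is one configuration to train and compare on validation data.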

understanding CNN Flashcards

(31 cards)