Define batch normalization.
A technique that normalizes the inputs to each layer of a neural network (using mini-batch statistics) to improve training speed and stability.
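A minimal sketch of the normalize-then-scale-and-shift step, using NumPy (the function name and toy data are assumptions, not from the notes):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the batch dimension,
    # then scale and shift with the learnable gamma/beta.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(32, 4))  # batch of 32, 4 features
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0))  # near 0 for every feature
print(y.std(axis=0))   # near 1 for every feature
```

After normalization each feature has roughly zero mean and unit variance over the batch, which is what stabilizes training.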
What is the impact of batch normalization on model generalization?
It can improve generalization: the noise introduced by mini-batch statistics acts as a mild regularizer, which can reduce overfitting.
Define transposed convolution.
A convolution operation that increases the spatial dimensions of the input feature map.
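A sketch of the "stamping" view of transposed convolution in NumPy (function name, stride, and toy sizes are assumptions): each input pixel adds a scaled copy of the kernel onto a larger output grid.

```python
import numpy as np

def transposed_conv2d(x, k, stride=2):
    # Each input pixel "stamps" a scaled copy of the kernel onto
    # the (larger) output, with stamps spaced `stride` apart.
    h, w = x.shape
    kh, kw = k.shape
    out = np.zeros(((h - 1) * stride + kh, (w - 1) * stride + kw))
    for i in range(h):
        for j in range(w):
            out[i*stride:i*stride+kh, j*stride:j*stride+kw] += x[i, j] * k
    return out

x = np.array([[1.0, 2.0], [3.0, 4.0]])  # 2x2 input feature map
k = np.ones((3, 3))                     # 3x3 kernel
y = transposed_conv2d(x, k, stride=2)
print(y.shape)  # (5, 5): spatial size grew from 2x2
```

The output size follows (in - 1) * stride + kernel, so a 2x2 map with a 3x3 kernel and stride 2 upsamples to 5x5.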
Epoch
One full pass through the entire dataset.
Step / iteration
Processing one mini-batch of the data.
Example: 5000 photos, batch size 100 => 50 steps per epoch.
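The steps-per-epoch arithmetic above as a one-liner (ceiling division covers the case where the dataset size is not a multiple of the batch size):

```python
import math

dataset_size = 5000
batch_size = 100
steps_per_epoch = math.ceil(dataset_size / batch_size)
print(steps_per_epoch)  # 50
```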
RNN PARAMETER CALCULATIONS
Input dimension = 128, hidden units = 50
Wx (input-to-hidden): 128 x 50 = 6400
Wh (hidden-to-hidden): 50 x 50 = 2500
b (bias): 50
Total = 6400 + 2500 + 50 = 8950
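The parameter count above, computed directly (variable names are mine):

```python
input_dim, hidden = 128, 50

Wx = input_dim * hidden  # input-to-hidden weights
Wh = hidden * hidden     # hidden-to-hidden (recurrent) weights
b = hidden               # bias, one per hidden unit

total = Wx + Wh + b
print(Wx, Wh, b, total)  # 6400 2500 50 8950
```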
Reason for RNN vanishing gradient during bptt?
The gradient signal is repeatedly multiplied by the recurrent weight matrix (Wh) and by the activation function's derivative as it flows backward through each timestep; when these factors are smaller than 1, the product shrinks exponentially with sequence length.
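A toy scalar illustration of that repeated multiplication (the specific weight and derivative values are assumptions chosen to be below 1, as is typical for tanh):

```python
# Backward signal through T timesteps is a product of per-step
# factors w * f'(h); with |w * f'| < 1 it decays exponentially.
w = 0.9       # scalar stand-in for the recurrent weight Wh
deriv = 0.8   # typical magnitude of a tanh derivative
grad = 1.0
for t in range(50):
    grad *= w * deriv
print(grad)  # roughly 0.72**50, vanishingly small
```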
Does an RNN have a skip connection?
Yes, in a sense: through time, since the hidden state carries information directly from one timestep to the next. It is not a skip connection that bypasses the layer itself (as in ResNets).
Language model
A model that gives the probability of the next word given the previous words, or equivalently, the probability of an entire sentence.
Conditional probability
Gives us a way to formally update an old belief into a new belief after a piece of evidence has been observed.
Perplexity
(1 / p(sentence))^(1/N), where N is the number of words.
Perplexity of an RNN language model
e^loss, where loss is the average per-word cross-entropy.
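The two perplexity formulas above agree, which a short check makes concrete (the per-word probabilities are made up for illustration):

```python
import math

# Per-word probabilities assigned by a hypothetical language model
probs = [0.2, 0.1, 0.25, 0.5]
N = len(probs)
p_sentence = math.prod(probs)

# Definition: (1 / p(sentence)) ** (1/N)
ppl_direct = (1.0 / p_sentence) ** (1.0 / N)

# Equivalent: e^loss, where loss is the mean negative log-probability
loss = -sum(math.log(p) for p in probs) / N
ppl_from_loss = math.exp(loss)

print(ppl_direct, ppl_from_loss)  # the two values match
```

Taking the log of the first formula gives exactly the average negative log-probability, which is why exponentiating the cross-entropy loss recovers perplexity.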