Even with Xavier or Kaiming initialization, it can happen by chance that the weights of a neural network are initialized in such a way that the network is unable to learn anything useful.
True
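A contrived PyTorch sketch of one such failure mode (the sizes and the all-negative initialization are made up for illustration, not the card's example): if every ReLU unit starts dead, all gradients are zero and nothing can be learned.

```python
import torch

torch.manual_seed(0)
x = torch.rand(32, 10)                     # inputs in [0, 1)
w = (-torch.rand(10, 5)).requires_grad_()  # every weight is negative
b = torch.zeros(5, requires_grad=True)

out = torch.relu(x @ w + b)                # all pre-activations < 0, so all outputs are 0
loss = ((out - torch.ones(32, 5)) ** 2).mean()
loss.backward()
print(out.abs().sum())     # tensor(0.)
print(w.grad.abs().sum())  # tensor(0.): zero gradient, the network cannot learn
```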
If a pre-trained model is used and no new weights are added, we do not need Xavier or Kaiming initialization at all.
True
It is sufficient for the mean and variance of the distribution of output values to average out to zero and one, respectively, across multiple initializations. In individual cases, these values may deviate.
True
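A minimal PyTorch sketch of this (layer sizes and the 1/√n_input scaling are illustrative): averaged over 100 random initializations, the output mean and std land near 0 and 1, even though any single run deviates.

```python
import torch

n_in, n_out = 100, 100
x = torch.randn(512, n_in)

means, stds = [], []
for _ in range(100):
    w = torch.randn(n_in, n_out) / n_in ** 0.5  # one fresh random initialization
    out = x @ w
    means.append(out.mean())
    stds.append(out.std())

print(torch.stack(means).mean())  # ≈ 0 on average across initializations
print(torch.stack(stds).mean())   # ≈ 1 on average across initializations
```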
Which tensors can be added to each other?
Tensors with the same shape; how the shape tuple is written (commas or spaces) is irrelevant.
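A minimal PyTorch sketch (the values are arbitrary); strictly, PyTorch would also broadcast compatible shapes, which the card glosses over.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])      # shape (2, 2)
b = torch.tensor([[10., 20.], [30., 40.]])  # shape (2, 2): same shape
print(a + b)                                # elementwise sum, shape (2, 2)

c = torch.tensor([1., 2., 3.])              # shape (3,)
# a + c  # RuntimeError: the shapes are not compatible
```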
All standard weight operations can be expressed as matrix multiplications, which is what makes neural network operations so efficient when executed on GPUs.
True
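A minimal PyTorch sketch (the batch and layer sizes are made up): a fully connected layer applied to a whole batch is a single matrix multiplication plus a bias.

```python
import torch

batch, n_in, n_out = 64, 784, 10
x = torch.randn(batch, n_in)
w = torch.randn(n_in, n_out)
b = torch.zeros(n_out)

out = x @ w + b   # one matmul covers all 64 samples at once
print(out.shape)  # torch.Size([64, 10])
```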
Gradient of bias
2⋅(out−target)⋅1/n (the derivative of an MSE loss averaged over n elements; ∂out/∂bias = 1)
Gradient of weight
gradient of bias ⋅ input
Gradient of input
gradient of bias ⋅ weight
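A minimal sketch checking the three formulas above against PyTorch autograd, assuming an elementwise linear model out = input ⋅ weight + bias with an MSE loss (the sizes are arbitrary).

```python
import torch

torch.manual_seed(0)
n = 5
inp = torch.randn(n, requires_grad=True)
w = torch.randn(n, requires_grad=True)
b = torch.randn(n, requires_grad=True)
target = torch.randn(n)

out = inp * w + b                    # elementwise linear model
loss = ((out - target) ** 2).mean()  # MSE loss
loss.backward()

grad_out = 2 * (out.detach() - target) / n              # 2⋅(out−target)⋅1/n
print(torch.allclose(b.grad, grad_out))                 # gradient of bias
print(torch.allclose(w.grad, grad_out * inp.detach()))  # ... ⋅ input
print(torch.allclose(inp.grad, grad_out * w.detach()))  # ... ⋅ weight
```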
Elementwise arithmetic for tensors
Operations (+, −, ⋅, /) applied independently to each pair of corresponding elements; the tensors must have the same shape, and the result has that shape too.
Kaiming Initialization
When using a ReLU activation, scaling the weights by √(2/n_input) preserves the standard deviation of the activations.
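A minimal PyTorch sketch (depth and width are illustrative): the √(2/n_input) factor keeps activations from collapsing toward zero across many ReLU layers.

```python
import torch

torch.manual_seed(0)
n = 512
x = torch.randn(1000, n)
for _ in range(50):
    w = torch.randn(n, n) * (2 / n) ** 0.5  # Kaiming scaling
    x = torch.relu(x @ w)
print(x.mean(), x.std())  # still on the order of 1 after 50 ReLU layers
```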
Xavier Initialization
Scaling the weights by 1/√(n_input) preserves the standard deviation through a linear layer without a ReLU activation.
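The same experiment without ReLU (again a PyTorch sketch with made-up sizes): the 1/√(n_input) factor keeps the standard deviation stable through purely linear layers.

```python
import torch

torch.manual_seed(0)
n = 512
x = torch.randn(1000, n)
for _ in range(50):
    w = torch.randn(n, n) / n ** 0.5  # Xavier scaling
    x = x @ w
print(x.std())  # stays on the order of 1; any single run drifts a bit
```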
What is calculated during backpropagation?
Gradients
What do gradients show?
Gradients indicate how the network should adjust its parameters to reduce the loss; they do not directly measure the quality of the network.
What is the forward pass?
The process of passing input data through the layers of a neural network to produce an output (e.g., predictions or logits).
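A minimal PyTorch sketch (the architecture is made up) of a forward pass producing logits.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))
x = torch.randn(32, 784)  # a batch of 32 inputs
logits = model(x)         # the forward pass
print(logits.shape)       # torch.Size([32, 10])
```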
What is backpropagation?
The process of computing gradients of the loss function with respect to the network's parameters (weights and biases) using the chain rule of calculus.
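A minimal PyTorch sketch (architecture and labels are made up): backward() fills every parameter's .grad with the gradient of the loss with respect to that parameter.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))
x = torch.randn(32, 784)
y = torch.randint(0, 10, (32,))

loss = F.cross_entropy(model(x), y)  # forward pass and loss
loss.backward()                      # backpropagation via the chain rule
print(model[0].weight.grad.shape)    # same shape as the weight tensor itself
```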