What are the steps for training a multilayer neural network using backpropagation?
What is the process for the forward pass?
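In the forward pass, each layer computes a weighted sum of its inputs and applies the activation function, passing the result to the next layer until the output is produced. A minimal Python sketch (the 2–3–1 layer sizes, weight values, and sigmoid activation are illustrative assumptions):

```python
import numpy as np

def sigmoid(x, beta=1.0):
    # Logistic activation: 1 / (1 + e^(-beta * x))
    return 1.0 / (1.0 + np.exp(-beta * x))

def forward(x, V, W):
    # Hidden layer: weighted sum b, then activation z = sigmoid(b)
    b = V @ x
    z = sigmoid(b)
    # Output layer: weighted sum a, then activation y = sigmoid(a)
    a = W @ z
    y = sigmoid(a)
    return b, z, a, y

# Tiny illustrative network: 2 inputs -> 3 hidden -> 1 output
x = np.array([0.5, -0.2])
V = np.full((3, 2), 0.1)   # input -> hidden weights (assumed values)
W = np.full((1, 3), 0.1)   # hidden -> output weights (assumed values)
_, z, _, y = forward(x, V, W)
```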
What is the mean squared error function?
E(X) = 0.5 * ∑_n (y_n - t_n)^2
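A quick numeric illustration of the error function (the values are chosen arbitrarily):

```python
import numpy as np

def mse(y, t):
    # E = 0.5 * sum over n of (y_n - t_n)^2
    return 0.5 * np.sum((y - t) ** 2)

y = np.array([0.8, 0.2])   # network outputs (assumed)
t = np.array([1.0, 0.0])   # targets (assumed)
# E = 0.5 * ((0.8 - 1)^2 + (0.2 - 0)^2) = 0.5 * (0.04 + 0.04) = 0.04
```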
What is the differential of the sigmoid equation?
σ = 1 / (1 + e^-βx) = (1 + e^-βx)^-1
y = u^-1, u = 1 + e^-βx
dy/du = -u^-2, du/dx = -βe^-βx
σ’ = (-u^-2)(-βe^-βx) = βe^-βx / (1 + e^-βx)^2
1 - σ = 1 - 1 / (1 + e^-βx)
1 - σ = (1 + e^-βx - 1) / (1 + e^-βx)
1 - σ = e^-βx / (1 + e^-βx)
σ(1 - σ) = e^-βx / (1 + e^-βx)^2
σ’ = βσ(1 - σ), which reduces to σ(1 - σ) when β = 1 (the convention used below)
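The closed form σ’ = βσ(1 − σ) can be checked against a central finite difference; the evaluation point and β below are chosen arbitrarily:

```python
import numpy as np

def sigmoid(x, beta=1.0):
    return 1.0 / (1.0 + np.exp(-beta * x))

def sigmoid_prime(x, beta=1.0):
    # Closed form: beta * sigma * (1 - sigma)
    s = sigmoid(x, beta)
    return beta * s * (1.0 - s)

# Central finite difference as an independent estimate of the derivative
x, beta, h = 0.7, 2.0, 1e-6
numeric = (sigmoid(x + h, beta) - sigmoid(x - h, beta)) / (2 * h)
```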
What is the differential of the mean squared error function with respect to the output neuron's weighted input (a)?
[Image 6]
E(X) = 0.5 * ∑(σ(a) - t_n)^2
y = 0.5 * u^2, u = σ(a) - t_n
dy/du = u, du/da = σ(a)(1 - σ(a))
dE/da = σ(a)(1 - σ(a))u
dE/da = σ(a)(1 - σ(a))(σ(a) - t_n)
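This result can also be checked numerically for a single output against a finite difference of E itself (the values of a and the target are arbitrary):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def dE_da(a, t):
    # dE/da = sigma(a) * (1 - sigma(a)) * (sigma(a) - t)
    s = sigmoid(a)
    return s * (1 - s) * (s - t)

# Finite-difference check with E(a) = 0.5 * (sigma(a) - t)^2
a, t, h = 0.3, 1.0, 1e-6
E = lambda a_: 0.5 * (sigmoid(a_) - t) ** 2
numeric = (E(a + h) - E(a - h)) / (2 * h)
```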
What is the differential of the weighted-input sum (a) with respect to an output neuron weight (w_n)?
[Image 6]
a = w_0z_0 + w_1z_1 + … + w_nz_n
da/dw_n = d/dw_n (w_nz_n) = z_n
What is the differential of the error with respect to a hidden neuron's weighted input (b)?
Reminder:
> dE/da = σ(1 - σ(a))(σ(a) - t_n)
[Image 6]
a = w_0z_0 + w_1z_1 + … + w_nz_n, where z_1 = σ(b)
da/db = d/db (w_1σ(b))
da/db = w_1σ(b)(1 - σ(b))
da/db = w_1z_1(1 - z_1)
dE/db = dE/da × da/db
dE/da = σ(a)(1 - σ(a))(σ(a) - t_n)
dE/db = σ(a)(1 - σ(a))(σ(a) - t_n)w_1z_1(1 - z_1)
What is the differential of the error with respect to a hidden neuron weight (v_n)?
Reminder:
> dE/db = σ(a)(1 - σ(a))(σ(a) - t_n)w_1z_1(1 - z_1)
[Image 6]
db/dv_n = d/dv_n (v_nx_n) = x_n
dE/dv_n = dE/db × db/dv_n
dE/dv_n = σ(a)(1 - σ(a))(σ(a) - t_n)w_1z_1(1 - z_1)x_n
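The full chain can be verified numerically on a tiny assumed network (two inputs feeding one hidden neuron b, whose output z_1 = σ(b) feeds the output through weight w_1; all values are arbitrary):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([0.4, -0.3])   # inputs x_n (assumed)
v = np.array([0.2, 0.5])    # hidden weights v_n (assumed)
w1, t = 0.7, 1.0            # output weight and target (assumed)

def error(v_):
    b = np.dot(v_, x)
    z1 = sigmoid(b)
    a = w1 * z1
    return 0.5 * (sigmoid(a) - t) ** 2

# Analytic gradient from the chain rule derived above
b = np.dot(v, x); z1 = sigmoid(b); a = w1 * z1; s = sigmoid(a)
grad = s * (1 - s) * (s - t) * w1 * z1 * (1 - z1) * x

# Finite-difference check on v_0
h = 1e-6
e = np.zeros_like(v); e[0] = h
numeric0 = (error(v + e) - error(v - e)) / (2 * h)
```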
For this example, what is the differential of the error with respect to a_k?
[Image 7]
E(X) = 0.5 * ∑_k (z_k - t_k)^2 = 0.5 * ∑_k (σ(a_k) - t_k)^2
y = 0.5 * u^2, u = σ(a_k) - t_k
dy/du = u, du/da_k = σ(a_k)(1 - σ(a_k))
dE/da_k = σ(a_k)(1 - σ(a_k))u (only the k-th term of the sum depends on a_k, so the sum drops)
dE/da_k = σ(a_k)(1 - σ(a_k))(σ(a_k) - t_k)
What is the symbol for the differential of the error with respect to a_k?
δ_k
For this example, what is the differential of the error with respect to w_jk?
[Image 7]
dE/dw_jk = dE/da_k × da_k/dw_jk
da_k/dw_jk = d/dw_jk (w_0kz_0 + w_1kz_1 + … + w_jkz_j + …) = z_j
dE/da_k = δ_k
dE/dw_jk = δ_kz_j
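In vectorized form, dE/dw_jk = δ_kz_j is an outer product of the output deltas with the hidden activations. A sketch with assumed values, including a finite-difference check on one weight:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

z = np.array([0.3, 0.9, 0.5])           # hidden activations z_j (assumed)
W = np.array([[0.1, -0.2, 0.4],
              [0.3,  0.1, -0.1]])       # weights w_jk stored as W[k, j] (assumed)
t = np.array([1.0, 0.0])                # targets t_k (assumed)

a = W @ z
zk = sigmoid(a)
delta_k = zk * (1 - zk) * (zk - t)      # dE/da_k
grad_W = np.outer(delta_k, z)           # dE/dw_jk = delta_k * z_j

# Finite-difference check on the weight w_01 (i.e. W[0, 0])
def E(W_):
    y_ = sigmoid(W_ @ z)
    return 0.5 * np.sum((y_ - t) ** 2)

h = 1e-6
Wp = W.copy(); Wp[0, 0] += h
Wm = W.copy(); Wm[0, 0] -= h
numeric = (E(Wp) - E(Wm)) / (2 * h)
```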
How do you apply gradient descent to the output layer?
[Image 7]
w_jk(t+1) = w_jk(t) - ηδ_kz_j, where η is the learning rate
How do weights propagate through the hidden layers?
The error does not propagate back along a single path. Each hidden neuron connects to multiple neurons in the following layer, so its error signal is the sum of the contributions from every path passing through it.
What is the equation for the error with respect to a_j?
[Image 7]
dE/da_j = ∑_k dE/da_k × da_k/da_j
dE/da_j = ∑_k δ_k × da_k/da_j
da_k/da_j = d/da_j ∑_j (w_jkσ(a_j))
da_k/da_j = w_jkσ(a_j)(1 - σ(a_j)) (only the j-th term of the sum depends on a_j)
da_k/da_j = w_jkz_j(1 - z_j)
dE/da_j = z_j(1 - z_j) ∑_k w_jkδ_k = δ_j
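The hidden-layer delta δ_j = z_j(1 − z_j) ∑_k w_jkδ_k is a single matrix-vector product in code; the activations, weights, and output deltas below are assumed values:

```python
import numpy as np

z = np.array([0.3, 0.9, 0.5])           # hidden activations z_j (assumed)
W = np.array([[0.1, -0.2, 0.4],
              [0.3,  0.1, -0.1]])       # weights w_jk stored as W[k, j] (assumed)
delta_k = np.array([0.05, -0.02])       # output deltas (assumed)

# Back-propagate: delta_j = z_j * (1 - z_j) * sum over k of w_jk * delta_k
delta_j = z * (1 - z) * (W.T @ delta_k)
```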
What is the equation for the error with respect to weights w_ij?
[Image 7]
dE/dw_ij = dE/da_j × da_j/dw_ij
da_j/dw_ij = d/dw_ij (∑_i (w_ij z_i))
da_j/dw_ij = z_i
dE/dw_ij = δ_jz_i
How do you apply gradient descent to the hidden layer?
[Image 7]
w_ij(t+1) = w_ij(t) - ηδ_jz_i
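Putting the forward pass, the two delta computations, and both gradient-descent updates together gives a complete training loop. This is a hedged sketch: the 2–3–1 layer sizes, the OR training data, η = 0.5, the epoch count, and the random seed are all assumptions, not prescriptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Illustrative task: learn logical OR from its four input patterns
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [1]], dtype=float)

V = rng.normal(0.0, 0.1, (3, 2))   # input -> hidden weights w_ij, small random init
W = rng.normal(0.0, 0.1, (1, 3))   # hidden -> output weights w_jk

def total_error(V, W):
    Y = sigmoid(W @ sigmoid(V @ X.T)).T
    return 0.5 * np.sum((Y - T) ** 2)

eta = 0.5
E0 = total_error(V, W)
for _ in range(2000):
    for x, t in zip(X, T):
        z = sigmoid(V @ x)                        # forward: hidden activations z_j
        y = sigmoid(W @ z)                        # forward: output activations z_k
        delta_k = y * (1 - y) * (y - t)           # output deltas
        delta_j = z * (1 - z) * (W.T @ delta_k)   # hidden deltas
        W -= eta * np.outer(delta_k, z)           # gradient descent on w_jk
        V -= eta * np.outer(delta_j, x)           # gradient descent on w_ij
E1 = total_error(V, W)
```

The per-sample (online) updates used here are one common choice; summing the gradients over the whole training set before updating (batch gradient descent) is equally valid.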
Why might the output value change every time the AI is trained?
Because gradient descent can settle in different local minima: the random initial weights lead each training run to a different solution.
What should the weights of the neural network be set to initially?
> Set randomly
> Close to 0
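A common way to realise this with NumPy (the (3, 2) shape and the 0.1 scale are assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)
# Small random values break the symmetry between neurons while keeping
# the sigmoids out of their flat, saturated regions
W = rng.normal(loc=0.0, scale=0.1, size=(3, 2))
```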