How do you find the best set of weights (W’)
Define the cost function J(W’)
Apply the gradient descent algorithm (an iterative algorithm that finds the minimum of the cost function)
Define the cost function J(W’)
It is the sum of losses L(W’) for the misclassified samples
If a sample is misclassified as positive but is actually negative, its contribution is multiplied by -1 before being added to the sum, so every term added is positive
Is J(W’) always positive or negative
Positive, since each loss L(W’) is positive: the contribution of a misclassified negative sample is multiplied by -1, so every term in the sum is positive
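A minimal sketch of this cost function, assuming the perceptron criterion J(W’) = sum over misclassified samples of -y_i (W’ · x_i), with labels y_i in {+1, -1}; the function name and data are illustrative, not from the source:

```python
import numpy as np

def perceptron_cost(W, X, y):
    """Perceptron criterion: sum of -y_i * (W . x_i) over misclassified samples.

    X is (n_samples, n_features); y holds class labels in {+1, -1}.
    """
    scores = X @ W                   # signed score of each sample
    misclassified = y * scores <= 0  # sample is on the wrong side of the boundary
    # For a misclassified negative sample (y = -1), multiplying by -y flips the
    # sign, so each term in the sum is positive and J(W') is always >= 0.
    return float(np.sum(-y[misclassified] * scores[misclassified]))

X = np.array([[1.0, 2.0], [2.0, -1.0], [-1.0, -1.5]])
y = np.array([1, -1, -1])
W = np.array([0.5, 0.5])
print(perceptron_cost(W, X, y))  # only the second sample is misclassified
```

Note how correctly classified samples contribute nothing: only misclassified samples enter the sum, and each enters with a positive sign.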
Define L(W’)
This function measures the distance between a misclassified sample and the decision boundary
The first factor on the RHS is the class label, either +1 or -1, depending on the true sign of the misclassified sample
Describe gradient descent to minimise the cost function
The gradient of the cost function dJ/dw1 shows the slope of J at any point w1
If the gradient is positive we move left, towards the negative
If the gradient is negative, we move right towards the positive
We are trying to find the minimum of the cost function
What is the formula for the new set of weights using the gradient descent rule
W’(t+1) = W’(t) - learning rate * gradient of the cost function
The gradient of the cost function is a vector of partial derivatives that points in the direction of the steepest increase in cost
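The update rule can be sketched in a few lines; here a simple one-dimensional cost J(w) = w², whose gradient is 2w, stands in for the perceptron cost (the function name and the example cost are illustrative assumptions):

```python
def gradient_step(w, grad, learning_rate=0.1):
    """One gradient-descent update: w(t+1) = w(t) - learning_rate * grad."""
    # Subtracting the gradient moves against the direction of steepest
    # increase, i.e. toward lower cost.
    return w - learning_rate * grad

# Minimise J(w) = w^2 starting from w = 3.0; the gradient at w is 2w.
w = 3.0
for _ in range(50):
    w = gradient_step(w, 2 * w)
print(w)  # close to the minimum at w = 0
```

When w is positive the gradient 2w is positive, so the update moves left; when w is negative the update moves right, matching the rule above.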
What is the main effect of choosing a too large learning rate in gradient descent
The updates may overshoot the minimum and bounce back and forth
Advantage and disadvantage of having a small learning rate
It is more stable but converges more slowly
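Both effects can be seen on the same toy cost J(w) = w² (an illustrative choice, not from the source): a small learning rate creeps slowly toward the minimum, while a too-large one overshoots and bounces away from it.

```python
def run(learning_rate, steps=20, w=3.0):
    """Run gradient descent on J(w) = w^2 (gradient 2w) with a fixed rate."""
    for _ in range(steps):
        w = w - learning_rate * 2 * w
    return w

small = run(0.05)  # stable but slow: after 20 steps, still short of the minimum at 0
large = run(1.1)   # overshoots each step and bounces ever further from the minimum
print(small, large)
```

With rate 0.05 each step multiplies w by 0.9, so w shrinks steadily; with rate 1.1 each step multiplies w by -1.2, so w flips sign and grows, which is the overshoot-and-bounce behaviour described above.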
Formula for the new set of weights using gradient descent
Old weights - learning rate * gradient of the cost function