Logistic Regression Flashcards

(13 cards)

1
Q

What is logistic regression? What is it used for?

A
  • one of the most widely used algorithms for classification
  • used to classify data into two categories, e.g. malignant (1) and benign (0)
  • fits an S-shaped curve to the dataset, predicting probabilities rather than direct class labels
    -> indicates which class is more likely
2
Q

What is an alternative name for the logistic function used in logistic regression?

A
  • Sigmoid Function
3
Q

What does the Sigmoid Function look like?

A
  • horizontal axis labeled z, which takes on both positive and negative values
  • outputs values between 0 and 1
  • g(z) = 1 / (1 + e^(-z)) with 0 < g(z) < 1
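A minimal NumPy sketch of this function (the name `sigmoid` and the sample inputs are my own choices):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # exactly 0.5, the midpoint of the S-curve
print(sigmoid(10.0))   # close to 1
print(sigmoid(-10.0))  # close to 0
```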
4
Q

How can logistic regression be expressed?

A
  1. z = w·x + b with w and x being vectors
  2. pass z to the sigmoid function g(z) = 1 / (1 + e^(-z))
    together these form
    f_wb(x) = g(w·x + b) = 1 / (1 + e^(-(w·x + b)))
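Both steps put together in code (a sketch; the example values for w, x, and b are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def f_wb(x, w, b):
    """Logistic regression model: sigmoid of the linear part z = w·x + b."""
    z = np.dot(w, x) + b   # step 1: linear combination
    return sigmoid(z)      # step 2: squash into (0, 1)

# hypothetical parameters and input
w = np.array([1.5, -0.5])
x = np.array([2.0, 1.0])
b = -1.0
print(f_wb(x, w, b))  # sigmoid(1.5*2 - 0.5*1 - 1) = sigmoid(1.5)
```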
5
Q

How can the output of logistic regression be understood?

A
  • “probability” that class is 1

example: x is tumor size
y is 0 (not malignant) or 1 (malignant)

f_wb(x) = 0.7
–> 70% chance that y is 1

6
Q

What is the decision boundary/threshold in logistic regression?

A
  • the value of f_wb(x) at or above which y is predicted to be 1
  • below that it is predicted to be 0
  • because the sigmoid outputs a probability, this is basically saying: when the predicted probability reaches some level X (e.g. 70%), we assign class 1
  • a common threshold is f_wb(x) >= 0.5 gives ŷ = 1
  • that is the same as g(z) >= 0.5, which holds when z >= 0 and thus w·x + b >= 0; then y is 1,
    else y = 0
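The threshold rule sketched in code (the function name `predict` and the sample parameters are my own; the default of 0.5 matches the common choice above):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, w, b, threshold=0.5):
    """Predict class 1 when the probability reaches the threshold, else 0.
    With threshold 0.5 this is equivalent to checking w·x + b >= 0."""
    prob = sigmoid(np.dot(w, x) + b)
    return 1 if prob >= threshold else 0

w = np.array([2.0])
b = -4.0
print(predict(np.array([3.0]), w, b))  # z = 2  -> prob > 0.5 -> 1
print(predict(np.array([1.0]), w, b))  # z = -2 -> prob < 0.5 -> 0
```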
7
Q

What is the decision boundary mathematically?

A
  • for given w and b, the set of values of x where z = w·x + b = 0
  • there the prediction is neutral: g(z) = 0.5, neither clearly 1 nor 0
  • can take on complex shapes when polynomial features are used
8
Q

How does the squared error Cost function hold up for logistic regression? Why is that?

A
  • not well
  • the regular squared error cost function applied to logistic regression produces a graph that is like a squiggly line with multiple local minima -> non-convex
    -> gradient descent could get stuck in a local minimum
9
Q

What is the loss function? How does the choice of how this loss function is calculated affect the cost function?

A
  • the loss function is the part inside the summation of the cost function
  • meaning the cost function is the sum of the losses over all training examples, divided by the number of examples m
  • the choice of loss function determines the shape of the cost function's graph
  • for instance, for logistic regression with the squared error cost function J = (1/m) * Σ (1/2) * (f_wb(x^(i)) - y^(i))², the graph is non-convex
  • replacing the loss (1/2) * (f_wb(x^(i)) - y^(i))² with a different loss function results in a different graph
10
Q

What is the loss function for logistic regression?

A

if y^(i) = 1: -log(f_wb(x^(i)))
if y^(i) = 0: -log(1 - f_wb(x^(i)))

-> convex (bowl-shaped)
-> gradient descent can reach the global minimum
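The two branches as a small sketch (the function name `loss` and the sample probabilities are mine):

```python
import numpy as np

def loss(f_x, y):
    """Logistic loss for one example:
    -log(f) when the target is 1, -log(1 - f) when it is 0."""
    return -np.log(f_x) if y == 1 else -np.log(1.0 - f_x)

print(loss(0.99, 1))  # small loss: confident and correct
print(loss(0.99, 0))  # large loss: confident but wrong
```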

11
Q

What is the better cost function for logistic regression?

A

J(w,b) = (1/m) * Σ_{i=1..m} L(f_wb(x^(i)), y^(i))
with L(f_wb(x^(i)), y^(i)) being the loss function for logistic regression:
if y^(i) = 1: -log(f_wb(x^(i)))
if y^(i) = 0: -log(1 - f_wb(x^(i)))

12
Q

What is the difference between Loss and Cost?

A
  • Loss measures the difference between a single example's prediction and its target value
  • Cost measures the losses over the whole training set
13
Q

What is the basis for the simpler equation of the cost function for logistic regression? How does it look?

A
  • because y can only be 0 or 1
  • Loss function:
    L(f_wb(x^(i)), y^(i)) = -y^(i) * log(f_wb(x^(i))) - (1 - y^(i)) * log(1 - f_wb(x^(i)))

Cost function: J(w,b) = -(1/m) * Σ [y^(i) * log(f_wb(x^(i))) + (1 - y^(i)) * log(1 - f_wb(x^(i)))]
-> the minus sign is pulled out of the sum

the loss function is based on maximum likelihood estimation from statistics
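The simplified cost function as a vectorized sketch (the variable names and the tiny dataset are my own; X holds one example per row):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(X, y, w, b):
    """Simplified logistic cost:
    J(w,b) = -(1/m) * sum(y*log(f) + (1-y)*log(1-f))."""
    m = X.shape[0]
    f = sigmoid(X @ w + b)  # predictions for all m examples at once
    return -np.sum(y * np.log(f) + (1.0 - y) * np.log(1.0 - f)) / m

# tiny hypothetical dataset: one feature, two examples
X = np.array([[1.0], [2.0]])
y = np.array([0.0, 1.0])
print(cost(X, y, np.array([1.0]), 0.0))
```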
