Logistic Regression Flashcards

(13 cards)

1
Q

What is logistic regression? What is it used for?

A
  • one of the most widely used algorithms for classification
  • used to classify data into two categories, e.g. malignant (1) and benign (0)
  • fits an S-shaped curve to the dataset, predicting probabilities rather than direct class labels
    -> indicates which class is more likely
2
Q

What is an alternative name for the logistic function used in logistic regression?

A
  • Sigmoid Function
3
Q

What does the Sigmoid Function look like?

A
  • horizontal axis labeled z, which takes on both positive and negative values
  • outputs values between 0 and 1
  • g(z) = 1 / (1 + e^(-z)) with 0 < g(z) < 1
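A minimal NumPy sketch of this function (the name `sigmoid` and the sample inputs are my own choices):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # exactly 0.5, the midpoint of the S-curve
print(sigmoid(10.0))   # close to 1
print(sigmoid(-10.0))  # close to 0
```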
4
Q

How can logistic regression be expressed?

A
  1. z = w·x + b with w and x being vectors
  2. pass z to the sigmoid function g(z) = 1 / (1 + e^(-z))
    together these form
    f_wb(x) = g(w·x + b) = 1 / (1 + e^(-(w·x + b)))
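Both steps put together in code (a sketch; the example values for w, x, and b are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def f_wb(x, w, b):
    """Logistic regression model: sigmoid of the linear part z = w·x + b."""
    z = np.dot(w, x) + b   # step 1: linear combination
    return sigmoid(z)      # step 2: squash into (0, 1)

# hypothetical parameters and input
w = np.array([1.5, -0.5])
x = np.array([2.0, 1.0])
b = -1.0
print(f_wb(x, w, b))  # sigmoid(1.5*2 - 0.5*1 - 1) = sigmoid(1.5)
```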
5
Q

How can the output of logistic regression be understood?

A
  • “probability” that class is 1

example: x is tumor size
y is 0 (not malignant) or 1 (malignant)

f_wb(x) = 0.7
–> 70% chance that y is 1

6
Q

What is the decision boundary/threshold in logistic regression?

A
  • the value of f_wb(x) at or above which y is predicted to be 1
  • below that it is predicted to be 0
  • because the sigmoid outputs a probability, this is basically saying: when the predicted probability reaches some level X (e.g. 70%), we assign class 1
  • a common threshold is f_wb(x) >= 0.5 gives ŷ = 1
  • that is the same as g(z) >= 0.5, which holds when z >= 0 and thus w·x + b >= 0; then y is 1,
    else y = 0
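The threshold rule sketched in code (the function name `predict` and the sample parameters are my own; the default of 0.5 matches the common choice above):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, w, b, threshold=0.5):
    """Predict class 1 when the probability reaches the threshold, else 0.
    With threshold 0.5 this is equivalent to checking w·x + b >= 0."""
    prob = sigmoid(np.dot(w, x) + b)
    return 1 if prob >= threshold else 0

w = np.array([2.0])
b = -4.0
print(predict(np.array([3.0]), w, b))  # z = 2  -> prob > 0.5 -> 1
print(predict(np.array([1.0]), w, b))  # z = -2 -> prob < 0.5 -> 0
```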
7
Q

What is the decision boundary mathematically?

A
  • for given w and b, the set of values of x where z = w·x + b = 0
  • there the prediction is neutral: g(z) = 0.5, neither clearly 1 nor 0
  • can take on complex shapes when polynomial features are used
8
Q

How does the squared error Cost function hold up for logistic regression? Why is that?

A
  • not well
  • the regular squared error cost function applied to logistic regression produces a graph that is like a squiggly line with multiple local minima -> non-convex
    -> gradient descent could get stuck in a local minimum
9
Q

What is the loss function? How does the choice of how this loss function is calculated affect the cost function?

A
  • the loss function is the part inside the summation of the cost function
  • meaning the cost function is the sum of the losses over all training examples, divided by the number of examples m
  • the choice of loss function determines the shape of the cost function's graph
  • for instance, for logistic regression with the squared error cost function J = (1/m) * Σ (1/2) * (f_wb(x^(i)) - y^(i))², the graph is non-convex
  • replacing the loss (1/2) * (f_wb(x^(i)) - y^(i))² with a different loss function results in a different graph
10
Q

What is the loss function for logistic regression?

A

if y^(i) = 1: -log(f_wb(x^(i)))
if y^(i) = 0: -log(1 - f_wb(x^(i)))

-> convex (bowl-shaped)
-> gradient descent can reach the global minimum
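The two branches as a small sketch (the function name `loss` and the sample probabilities are mine):

```python
import numpy as np

def loss(f_x, y):
    """Logistic loss for one example:
    -log(f) when the target is 1, -log(1 - f) when it is 0."""
    return -np.log(f_x) if y == 1 else -np.log(1.0 - f_x)

print(loss(0.99, 1))  # small loss: confident and correct
print(loss(0.99, 0))  # large loss: confident but wrong
```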

11
Q

What is the better cost function for logistic regression?

A

J(w,b) = (1/m) * Σ_{i=1..m} L(f_wb(x^(i)), y^(i))
with L(f_wb(x^(i)), y^(i)) being the loss function for logistic regression:
if y^(i) = 1: -log(f_wb(x^(i)))
if y^(i) = 0: -log(1 - f_wb(x^(i)))

12
Q

What is the difference between Loss and Cost?

A
  • Loss measures the difference between a single example's prediction and its target value
  • Cost measures the losses over the whole training set
13
Q

What is the basis for the simpler equation of the cost function for logistic regression? How does it look?

A
  • because y can only be 0 or 1
  • Loss function:
    L(f_wb(x^(i)), y^(i)) = -y^(i) * log(f_wb(x^(i))) - (1 - y^(i)) * log(1 - f_wb(x^(i)))

Cost function: J(w,b) = -(1/m) * Σ [y^(i) * log(f_wb(x^(i))) + (1 - y^(i)) * log(1 - f_wb(x^(i)))]
-> the minus sign is pulled out of the sum

the loss function is based on maximum likelihood estimation from statistics
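The simplified cost function as a vectorized sketch (the variable names and the tiny dataset are my own; X holds one example per row):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(X, y, w, b):
    """Simplified logistic cost:
    J(w,b) = -(1/m) * sum(y*log(f) + (1-y)*log(1-f))."""
    m = X.shape[0]
    f = sigmoid(X @ w + b)  # predictions for all m examples at once
    return -np.sum(y * np.log(f) + (1.0 - y) * np.log(1.0 - f)) / m

# tiny hypothetical dataset: one feature, two examples
X = np.array([[1.0], [2.0]])
y = np.array([0.0, 1.0])
print(cost(X, y, np.array([1.0]), 0.0))
```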
