What is the distribution of the outcome in a logistic regression?
Bernoulli
What’s the formula that is the essence of logistic regression?
log[p/(1-p)] = b0 + bx
also, odds = p/(1-p)
What is the range of odds?
0 to infinity
What is the range of log(odds)?
-infinity to +infinity
So why do we use logit?
Because it maps probabilities in (0,1) onto the range of the log odds, which is unbounded in both directions, making this a more natural space for a linear model
(we’re not transforming the outcome, but the probability)
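The three ranges above can be checked numerically. This is a minimal sketch using only the standard library; the function names `odds`, `logit` and `inv_logit` are chosen here for illustration.

```python
import math

def odds(p):
    """Odds of an event with probability p (0 < p < 1); ranges over (0, infinity)."""
    return p / (1 - p)

def logit(p):
    """Log odds; maps (0, 1) onto the whole real line."""
    return math.log(odds(p))

def inv_logit(z):
    """Inverse of logit; maps any real number back into (0, 1)."""
    return 1 / (1 + math.exp(-z))

# Probabilities near 0 give large negative log odds, near 1 large positive ones
for p in (0.01, 0.5, 0.99):
    print(p, round(odds(p), 4), round(logit(p), 4))
```

Note that p = 0.5 corresponds to odds of 1 and log odds of 0, which is why the logit scale is symmetric around an even chance.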
Key assumptions of the logistic model (2)
- Observations are independent
Probability =
exp(b0+bx)/[1+exp(b0+bx)]
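This expression follows from setting the log odds log[p/(1-p)] equal to the linear predictor b0+bx and solving for p. A short derivation, where z is just shorthand introduced here for b0+bx:

```latex
\begin{align*}
\log\frac{p}{1-p} &= z, \qquad z = b_0 + b x \\
\frac{p}{1-p} &= e^{z} \\
p &= e^{z}\,(1-p) \\
p\,(1 + e^{z}) &= e^{z} \\
p &= \frac{e^{z}}{1 + e^{z}} = \frac{\exp(b_0 + b x)}{1 + \exp(b_0 + b x)}
\end{align*}
```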
How do we estimate the coefficients?
With maximum likelihood: we look for the set of coefficients that makes the observed responses maximally likely
Unlike least squares, there is no closed-form solution, so the estimates are found iteratively by numerical optimization (e.g. Newton-Raphson)
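To make the iterative search concrete, here is a minimal sketch of maximum likelihood for one predictor using Newton-Raphson, which is one common choice and not necessarily what a given software package does. The data are made up for illustration, and `fit_logistic` and `log_likelihood` are hypothetical helper names.

```python
import math

def log_likelihood(b0, b1, xs, ys):
    """Bernoulli log-likelihood of the data under p = 1/(1+exp(-(b0+b1*x)))."""
    ll = 0.0
    for x, y in zip(xs, ys):
        p = 1 / (1 + math.exp(-(b0 + b1 * x)))
        ll += y * math.log(p) + (1 - y) * math.log(1 - p)
    return ll

def fit_logistic(xs, ys, n_iter=25, tol=1e-10):
    """Newton-Raphson updates on (b0, b1), starting from zero."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        # Gradient (g0, g1) and negative Hessian (h00, h01, h11) of the log-likelihood
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            w = p * (1 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Newton step: solve the 2x2 system H * delta = g
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Made-up toy data; the two classes overlap, so the estimates are finite
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(xs, ys)
print(b0, b1, log_likelihood(b0, b1, xs, ys))
```

Each pass improves the coefficients until the step size is negligible, so the search is guided by the gradient and curvature of the likelihood at every iteration.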
When does the explanatory variable have a significant effect?
When the beta is more than about 2 standard errors away from 0 (roughly the 5% level)
- but since p is not linear in X, the same change in X has a more drastic impact on p towards the center of the p-range (near 0.5) than at the extremes
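Both points can be illustrated with a small sketch. All numbers below (the estimate 0.8, its standard error 0.3, and the coefficients b0 = 0, b1 = 1) are hypothetical values picked for illustration, and `prob` and `wald_z` are names introduced here.

```python
import math

def prob(b0, b1, x):
    """Predicted probability from the logistic model."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

def wald_z(beta, se):
    """Number of standard errors between the estimate and 0."""
    return beta / se

# Hypothetical estimate and standard error
z = wald_z(0.8, 0.3)
significant = abs(z) > 2  # the 2-standard-error rule of thumb

# Hypothetical coefficients: p = 0.5 at x = 0
b0, b1 = 0.0, 1.0
change_center = prob(b0, b1, 0.5) - prob(b0, b1, -0.5)  # p crosses 0.5
change_tail = prob(b0, b1, 4.5) - prob(b0, b1, 3.5)     # p already close to 1

print(round(z, 2), significant)
print(round(change_center, 3), round(change_tail, 3))
```

The same one-unit increase in x moves p far more when p starts near 0.5 than when it starts near 1, even though the change in log odds is identical in both cases.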
The deviance of a model is…
-2*loglikelihood of the data under the model considered
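As a quick sketch of this definition, the deviance can be computed directly from the observed 0/1 outcomes and a model's predicted probabilities. The outcomes and probabilities below are made up for illustration, and `deviance` is a name chosen here.

```python
import math

def deviance(ys, ps):
    """-2 * Bernoulli log-likelihood of outcomes ys under predicted probabilities ps."""
    ll = sum(y * math.log(p) + (1 - y) * math.log(1 - p) for y, p in zip(ys, ps))
    return -2 * ll

# Made-up outcomes
ys = [1, 0, 1, 1]

# Intercept-only model: every observation gets the overall proportion of 1s
null_ps = [sum(ys) / len(ys)] * len(ys)

# Hypothetical predictions from a model with a predictor
model_ps = [0.9, 0.2, 0.8, 0.7]

print(deviance(ys, null_ps), deviance(ys, model_ps))
```

A lower deviance means the observed data are more likely under that model, so comparing the two values shows how much the predictor improves the fit.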
We can get the RR by…
Predicting the risk of the outcome in each group and taking the ratio of the two predicted risks
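Assuming RR here stands for relative risk (risk ratio), a minimal sketch of the idea is to predict the risk with and without the exposure and divide one by the other. The coefficients are hypothetical, and x is assumed to be a 0/1 exposure indicator.

```python
import math

def prob(b0, b1, x):
    """Predicted risk of the outcome from the logistic model."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

# Hypothetical fitted coefficients; x = 1 exposed, x = 0 unexposed
b0, b1 = -2.0, 1.0

risk_exposed = prob(b0, b1, 1)
risk_unexposed = prob(b0, b1, 0)

relative_risk = risk_exposed / risk_unexposed
odds_ratio = math.exp(b1)  # what the coefficient gives directly

print(round(relative_risk, 3), round(odds_ratio, 3))
```

The coefficient itself only yields an odds ratio, which is further from 1 than the relative risk unless the outcome is rare, so the predicted risks are needed to get the RR.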