What is the equation for multiple linear regression and what does it mean?
Ŷ = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + β₄X₄
Ŷ (Y-hat) = the predicted outcome based on the regression model
X₁, X₂, X₃, X₄ = the predictor variables (the factors influencing Y)
β₀ (beta-zero) = the intercept, representing the predicted value of Y when all predictors are 0
β₁ (beta-one), β₂ (beta-two), β₃ (beta-three), β₄ (beta-four) = the slopes, representing the change in the predicted Y for each 1-unit increase in that predictor, holding the other predictors constant
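The equation above can be sketched in code. This is a minimal numpy example with made-up data (the sample size, data, and coefficient values are illustrative, not from the notes): we build a design matrix with an intercept column and estimate β₀…β₄ by least squares.

```python
import numpy as np

# Hypothetical noise-free data with 4 predictors and known coefficients
rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=(n, 4))                        # columns X1..X4
true_beta = np.array([2.0, 0.5, -1.0, 3.0, 0.0])   # beta0..beta4 (illustrative)
Xd = np.column_stack([np.ones(n), X])               # design matrix: [1, X1..X4]
Y = Xd @ true_beta

# Least-squares estimates of beta0..beta4
beta_hat, *_ = np.linalg.lstsq(Xd, Y, rcond=None)
Y_hat = Xd @ beta_hat                               # Y-hat: predicted outcome
```

With noise-free data the estimated coefficients recover the true ones; with real data they would differ by sampling error.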
All the assumptions for simple linear regression also apply to multiple linear regression, but there is one new one. What is it?
No multicollinearity
Multicollinearity = when predictor variables are strongly correlated with one another, meaning they may be measuring the same underlying thing
Little or moderate multicollinearity is acceptable, but strong multicollinearity is not
How can we assess or detect multicollinearity?
With the Variance Inflation Factor (VIF), computed for each predictor
What is the equation for VIF?
VIFᵢ = 1 / (1 − Rᵢ²)
Where Rᵢ² is the R-squared value obtained by regressing the ith predictor variable on the remaining predictor variables
How do we calculate the VIF by hand?
We run a linear regression for each predictor, treating that predictor as the outcome and the remaining predictors as its predictors; the R² from that regression goes into the VIF formula
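The by-hand procedure can be sketched as a short numpy function (a sketch, not a library routine): for each column, regress it on the other columns, compute R², and apply VIF = 1/(1 − R²).

```python
import numpy as np

def vif(X):
    """VIF for each column of X: regress column i on the remaining columns
    (with an intercept), take that regression's R^2, and return 1/(1 - R^2)."""
    n, p = X.shape
    out = []
    for i in range(p):
        y = X[:, i]                                   # predictor i as the outcome
        others = np.delete(X, i, axis=1)              # remaining predictors
        Xd = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
        resid = y - Xd @ beta
        r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
        out.append(1 / (1 - r2))
    return np.array(out)
```

Independent predictors give VIFs near 1; a predictor that is nearly a linear combination of the others gives a very large VIF.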
What are the key concepts?
R-squared
F-value
Regression coefficient
What does R-squared measure?
The proportion of variance in the outcome variable that is predictable from the predictor variable
- ranges between 0 and 1
What is adjusted R-squared?
adjusts R-squared for the number of predictors in the model
- provides a more accurate measure of goodness of fit when more than one predictor variable is used
- always less than or equal to R squared
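Both quantities above can be computed directly from the residuals; this sketch uses the standard formulas R² = 1 − SSres/SStot and adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where p is the number of predictors (the function name and example values are illustrative).

```python
import numpy as np

def r2_and_adjusted(y, y_hat, p):
    """R^2 and adjusted R^2 for a model with p predictors (excluding intercept)."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)       # total sum of squares
    r2 = 1 - ss_res / ss_tot
    adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return r2, adj
```

Note that the penalty term (n − 1)/(n − p − 1) grows with p, which is why adjusted R² is always less than or equal to R².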
What is Cohen’s f² (f-squared)?
An effect-size measure for the regression model, calculated as f² = R² / (1 − R²)
How do we interpret the effect size?
small = 0.02
medium = 0.15
large = 0.35
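The thresholds above can be wrapped in a small helper (a sketch; the function names and the "negligible" label for values below 0.02 are my additions), using Cohen's f² = R²/(1 − R²):

```python
def cohens_f2(r2):
    """Cohen's f^2 effect size from a model's R^2."""
    return r2 / (1 - r2)

def effect_size_label(f2):
    """Classify f^2 using Cohen's conventional cutoffs."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"
```

For example, a model with R² = 0.5 has f² = 1.0, well past the 0.35 cutoff for a large effect.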