Interpret constant & coefficient
constant = when education equals 0, the predicted mean income is 457
coefficient = with every additional year of education, the mean income increases by 104
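A minimal sketch of how the two numbers combine into a prediction. The values 457 and 104 come from the card above; the function name `predicted_mean_income` is hypothetical, and the income units are whatever the original data used.

```python
# Coefficients from the card above; names are made up for illustration.
CONSTANT = 457  # predicted mean income when education equals 0
SLOPE = 104     # change in mean income per additional year of education


def predicted_mean_income(years_of_education):
    """Predicted mean income for a given number of years of education."""
    return CONSTANT + SLOPE * years_of_education


at_zero = predicted_mean_income(0)       # just the constant
at_twelve = predicted_mean_income(12)
at_thirteen = predicted_mean_income(13)

print(at_zero)                  # the constant
print(at_thirteen - at_twelve)  # one extra year adds exactly the coefficient
```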
What is a prediction error?
Lin regression assumptions:
How does OLS work?
What is R^2?
How is R^2 calculated?
What is the loss function for linear regression?
What does a lin regression model predict?
The mean value of y for a given value of x
(No probabilities, it’s a model of the mean)
How can we make a constant more meaningful? (1)
centering: usually mean-centered (subtract the mean, 12.5 years, from years of education), so the constant becomes the predicted mean income at average education
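A small sketch of what centering does, assuming a simple one-predictor regression. The data are made up for illustration (their mean education is 12, not the 12.5 from the card), and `ols` is a hypothetical helper, not a library function.

```python
def ols(x, y):
    """Return (constant, slope) of a simple least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up data, not the data behind the card.
education = [8, 10, 12, 14, 16]
income = [1250, 1550, 1700, 1950, 2050]

mean_education = sum(education) / len(education)
mean_income = sum(income) / len(income)
education_centered = [e - mean_education for e in education]

const_raw, slope_raw = ols(education, income)
const_centered, slope_centered = ols(education_centered, income)

print(const_raw, slope_raw)            # constant refers to education = 0
print(const_centered, slope_centered)  # constant refers to average education
```

The slope is unchanged by centering; only the constant moves, and it now equals the mean income in the sample.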
How can we make a constant more meaningful? (2)
standardizing: subtract the mean, then divide by the SD
1) Only education standardized: for every 1-SD increase in education, mean income rises by 402
2) Both standardized: for every 1-SD increase in education, mean income rises by 0.3 SD of income
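The two variants can be sketched as follows, again on made-up data (so the numbers will not match 402 or 0.3). `ols` and `standardize` are hypothetical helpers; the population SD (`pstdev`) is an arbitrary choice here, and the sample SD would work the same way.

```python
from statistics import mean, pstdev


def ols(x, y):
    """Return (constant, slope) of a simple least-squares line."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def standardize(values):
    """Subtract the mean, then divide by the SD."""
    m, sd = mean(values), pstdev(values)
    return [(v - m) / sd for v in values]


# Made-up data, not the data behind the card.
education = [8, 10, 12, 14, 16]
income = [1250, 1550, 1700, 1950, 2050]

_, slope_raw = ols(education, income)
# 1) only education standardized: slope is in income units per SD of education
_, slope_x_std = ols(standardize(education), income)
# 2) both standardized: slope is in SDs of income per SD of education
_, slope_both_std = ols(standardize(education), standardize(income))

print(slope_raw, slope_x_std, slope_both_std)
```

In the one-predictor case, the first slope is the raw slope times the SD of education, and the second is that value divided by the SD of income.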
Why would you want to standardize?
Allows comparison
What are the standardized coefficients also called?
What is true about correlations?
1) Standardizing gets rid of scale -> that is the whole point
3) Perfect correlation = 0 error
4) A correlation near 0 may mean the relationship is just not linear -> correlation is only a measure of linear relationships!
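A short sketch of points 3) and 4), using made-up numbers: a perfectly linear relationship gives a correlation of 1 and zero prediction error, while a perfectly predictable but curved relationship (y = x squared on symmetric x) gives a correlation of 0. `pearson_r` is a hypothetical helper written out by hand.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation computed from sums of squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


x = [-2, -1, 0, 1, 2]
y_line = [2 * a + 1 for a in x]  # exactly linear in x
y_curve = [a * a for a in x]     # exactly determined by x, but not linear

r_line = pearson_r(x, y_line)
r_curve = pearson_r(x, y_curve)

# Squared error of the line y = 2x + 1 on the linear data
sse_line = sum((b - (2 * a + 1)) ** 2 for a, b in zip(x, y_line))

print(r_line, sse_line)  # perfect correlation, no error
print(r_curve)           # no linear association despite a perfect curve
```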
Why would we even need a regression, why not only calculate the conditional means?
1) reduce noise -> virtue of abstraction
2) prediction even for data that is not there
3) allows for more control i.e. mediation, moderation, controls, etc.
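Point 2) can be illustrated with a small sketch on made-up data in which nobody has exactly 12 years of education. The conditional means have nothing to say about 12 years, while the fitted line still gives a prediction. All names and numbers here are hypothetical.

```python
from collections import defaultdict

# Made-up data: no observation with 12 years of education.
education = [8, 8, 10, 14, 16, 16]
income = [1200, 1400, 1500, 1900, 2000, 2200]

# Conditional means: average income within each observed education value
groups = defaultdict(list)
for e, inc in zip(education, income):
    groups[e].append(inc)
conditional_means = {e: sum(v) / len(v) for e, v in groups.items()}

# Simple least-squares line through the same data
n = len(education)
mean_x = sum(education) / n
mean_y = sum(income) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(education, income))
sxx = sum((a - mean_x) ** 2 for a in education)
slope = sxy / sxx
constant = mean_y - slope * mean_x

has_mean_for_12 = 12 in conditional_means
prediction_for_12 = constant + slope * 12

print(conditional_means)
print(has_mean_for_12, prediction_for_12)
```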
Why do we square residuals in R^2?
1) prevent cancelling out
2) bigger penalty for large residuals
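Both reasons show up in a tiny example with made-up residuals: the raw residuals sum to zero even though every prediction is off, and after squaring, a residual three times as large counts nine times as much.

```python
# Made-up residuals (observed minus predicted), for illustration only.
residuals = [-3, 3, -1, 1]

raw_sum = sum(residuals)                      # positives and negatives cancel
squared = [r ** 2 for r in residuals]
sum_of_squares = sum(squared)                 # no cancelling after squaring

penalty_small = 1 ** 2  # residual of size 1
penalty_large = 3 ** 2  # residual of size 3

print(raw_sum, sum_of_squares)
print(penalty_large / penalty_small)  # 3x the residual, 9x the penalty
```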
Is my R² too low?
Low R-Squared is often good BUT also a limitation
Is my R² too high?
High R-Squared is often not good
BUT can be
Why a Low R-Squared is often good
Why a Low R-Squared is also a limitation
Why a High R-Squared is often not good
Why a High R-Squared can be good
Very accurate prediction, if the model really captures the relationship
What do we need to control for?