Population vs sample
What do we measure?
Why?
We measure the info that is contained in the data.
Measure only x and y
U is not observable
–> if it were, we could determine the coefficients exactly
How do we measure b0 and b1?
We only have estimated values of b0 and b1
–> determine the estimators
What's OLS?
Ordinary Least Squares
–> choosing both coefficients so that the sum of the squared residuals is minimized.
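Written out as a formula, the minimization could look like the following. The hat notation and the index i = 1, ..., n are assumptions added here for illustration; the notes themselves only use b0, b1 and u.

```latex
\min_{\hat{b}_0,\,\hat{b}_1} \; \sum_{i=1}^{n} \hat{u}_i^{2}
  \;=\; \sum_{i=1}^{n} \left( y_i - \hat{b}_0 - \hat{b}_1 x_i \right)^{2}
```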
Estimator of the slope coeff
Only correct if….
b1
Only defined if the denominator is positive, i.e. x varies in the sample
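A minimal sketch of the slope and intercept estimators for the simple model y = b0 + b1*x + u, assuming the standard closed-form OLS solution. The function name `ols_simple` and the data are made up for illustration. The check on the denominator mirrors the condition above: if x does not vary, the slope cannot be computed.

```python
def ols_simple(x, y):
    """Return (b0_hat, b1_hat) for the model y = b0 + b1*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-deviations of x and y from their means.
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: sum of squared deviations of x from its mean.
    den = sum((xi - x_bar) ** 2 for xi in x)
    if den <= 0:
        raise ValueError("x does not vary in the sample; slope is not defined")
    b1 = num / den
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up illustration data that lie exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
b0_hat, b1_hat = ols_simple(x, y)
print(b0_hat, b1_hat)
```

Because the example data have no error term, the estimates recover the intercept and slope of the line; with real data u is not zero and the estimates only approximate b0 and b1.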
OLS summary
Is it better to have more of the structural or the stochastic term?
The more of the variation in y is in the structural term (rather than the stochastic term), the better
Variance decomposition
Total variance: sum of squared (actual yi - mean y)
Explained v: sum of squared (predicted yi - mean y)
Residual v: sum of squared (actual yi - predicted yi)
Total = explained + residual
R square?
How well does the empirically tested model fit the data?
R-square = 1 - residual var/total var
Interval [0,1], the higher the better
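The decomposition and R-square can be checked numerically. This is a small sketch under the assumption of a simple regression with an intercept, fitted by OLS; the function name `variance_decomposition` and the data are made up for illustration.

```python
def variance_decomposition(x, y):
    """Fit y = b0 + b1*x by OLS and return (total, explained, residual, r2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]  # predicted values

    total = sum((yi - y_bar) ** 2 for yi in y)               # actual - mean
    explained = sum((yh - y_bar) ** 2 for yh in y_hat)       # predicted - mean
    residual = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # actual - predicted
    r2 = 1 - residual / total
    return total, explained, residual, r2


# Made-up illustration data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
total, explained, residual, r2 = variance_decomposition(x, y)
print(total, explained, residual, r2)
```

For these numbers the total sum of squares is 6, of which 3.6 is explained and 2.4 is residual, so R-square is 0.6. The identity total = explained + residual relies on the model containing an intercept.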
Assumptions of the simple regression model?
Needed
1-6
What happens if not fulfilled?
If not fulfilled, results are biased
Assumptions of simple regression model
Not needed
7-8
Heteroscedasticity
effect
variance differs across values of an IV
e.g. age with income
Homoscedasticity
variance is the same across all values of IV
Dealing with heteroscedasticity
-
Solution
+ OLS estimates are still UNBIASED
Solution: use robust standard errors
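As a rough illustration of what "robust" means here, the sketch below compares the classical standard error of the slope with a heteroscedasticity-robust one. It assumes the White (HC0) formula for a simple regression, which is one common choice and not necessarily the one used in the course; the function name and data are made up.

```python
import math


def slope_standard_errors(x, y):
    """Return (b1_hat, classical_se, robust_se) for y = b0 + b1*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

    # Classical variance: one common error variance for all observations.
    sigma2 = sum(u ** 2 for u in resid) / (n - 2)
    classical_var = sigma2 / sxx

    # Robust (HC0) variance: each squared residual is weighted by its own
    # squared x-deviation, so the error variance may differ with x.
    robust_var = sum(((xi - x_bar) ** 2) * (u ** 2)
                     for xi, u in zip(x, resid)) / sxx ** 2

    return b1, math.sqrt(classical_var), math.sqrt(robust_var)


# Made-up illustration data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b1_hat, se_classical, se_robust = slope_standard_errors(x, y)
print(b1_hat, se_classical, se_robust)
```

The slope estimate is the same in both cases; only the standard error, and therefore any significance test built on it, changes.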
Testing for significance of the coeff?
Is there a …. between x and y?
H0: is it ….. 0 ?
Distribution of predicted b1:
In large samples: b1 …. to a …. distribution
In small samples: b1 is ….. distributed
Testing for significance of the coeff?
Is there a RELATIONSHIP between x and y?
H0: b1 is EQUAL to 0 (alternative: b1 is UNEQUAL 0)
Distribution of predicted b1:
In large samples: b1 CONVERGES to a NORMAL distribution
In small samples: b1 is NORMALLY distributed (if the error term is normally distributed)
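A minimal sketch of the significance test for the slope, assuming the classical standard error and using the large-sample normal approximation for the p-value. The function name and data are made up; with only five observations the normal approximation is shown purely for illustration.

```python
import math
from statistics import NormalDist


def slope_significance(x, y):
    """Test H0: b1 = 0 against b1 != 0; return (b1_hat, se, t_stat, p_value)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

    # Classical standard error of the slope.
    se = math.sqrt((sum(u ** 2 for u in resid) / (n - 2)) / sxx)

    # Test statistic: estimated slope relative to its standard error.
    t_stat = b1 / se

    # Two-sided p-value from the standard normal distribution
    # (large-sample approximation).
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return b1, se, t_stat, p_value


# Made-up illustration data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b1_hat, se, t_stat, p_value = slope_significance(x, y)
print(b1_hat, se, t_stat, p_value)
```

A small p-value speaks against H0, i.e. it suggests a relationship between x and y. In small samples the statistic would usually be compared with a t distribution with n - 2 degrees of freedom instead of the normal.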
Multiple regression
“How …is the … size of x1, provided that x2 remains constant?”
Multiple regression
“How LARGE is the EFFECT size of x1, provided that x2 remains constant?”
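One way to see what "provided that x2 remains constant" means is to partial x2 out of x1 first. The sketch below assumes a model with two regressors and an intercept: it regresses x1 on x2, keeps the part of x1 that x2 does not explain, and relates y to that part only. The function names and data are made up for illustration.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


def partial_effect_x1(y, x1, x2):
    """Coefficient on x1 in the regression of y on x1 and x2 (with intercept)."""
    # Step 1: regress x1 on x2 and keep the residuals, i.e. the variation
    # in x1 that is not linearly related to x2.
    a0, a1 = simple_ols(x2, x1)
    r1 = [x1i - (a0 + a1 * x2i) for x1i, x2i in zip(x1, x2)]
    # Step 2: relate y to that leftover variation only.
    return sum(r * yi for r, yi in zip(r1, y)) / sum(r * r for r in r1)


# Made-up data generated exactly as y = 1 + 2*x1 + 3*x2 (no error term).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

b1_multiple = partial_effect_x1(y, x1, x2)  # effect of x1 with x2 held constant
_, b1_simple = simple_ols(x1, y)            # slope when x2 is left out
print(b1_multiple, b1_simple)
```

The partialled-out coefficient recovers the 2 used to build the data, while the simple regression slope is larger because x1 and x2 move together and x1 picks up part of the effect of x2.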
Problems in multiple regression?