Mean independent:
E(u|X) = E(u) and E(u) = 0
Independent:
E(g(u)|X) = E(g(u)) for any function g
E.g. variance of u spreading out as X increases: mean independent but not independent
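A small simulation can make the distinction concrete. This is only a sketch under assumed numbers (the uniform range for X and the form u = X * noise are made up for illustration): the conditional mean of u stays near 0, but its variance grows with X, so u is mean independent of X without being independent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.uniform(1.0, 3.0, size=n)

# Assumed design: E(u|X) = 0, but Var(u|X) = X^2 (heteroskedastic).
u = x * rng.standard_normal(n)

low, high = x < 2.0, x >= 2.0
mean_low, mean_high = u[low].mean(), u[high].mean()   # both close to 0
var_low, var_high = u[low].var(), u[high].var()       # var_high is larger

print(mean_low, mean_high, var_low, var_high)
```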
Assumptions for consistency vs unbiasedness
Consistency: orthogonality, cov(e,X) = 0 (weaker than mean independence)
Unbiasedness: mean independence
- E(e|X) = E(e) = 0
Regressions in both directions implications
Run regression both directions
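One implication worth checking numerically: the slope from regressing X on Y is not the reciprocal of the slope from regressing Y on X. The product of the two slopes equals the squared correlation. A minimal sketch on simulated data (the data-generating numbers are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.standard_normal(n)
y = 2.0 * x + rng.standard_normal(n)   # assumed true slope of 2

cov_xy = np.cov(x, y, ddof=1)[0, 1]
b_yx = cov_xy / x.var(ddof=1)          # slope from regressing Y on X
b_xy = cov_xy / y.var(ddof=1)          # slope from regressing X on Y
r2 = np.corrcoef(x, y)[0, 1] ** 2

# b_yx * b_xy equals r2, so b_xy = 1 / b_yx only if the fit is perfect.
print(b_yx, b_xy, 1.0 / b_yx, r2)
```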
Define descriptive interpretation
“on average a unit increase in X1 is associated with a b* increase in Y, holding X2..Xk constant”
Standard error of regression
s = square root(SSR/(n-k-1))
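As a rough illustration of the formula, the sketch below fits OLS with k = 2 regressors on simulated data and computes s from the residuals. The coefficients and the error standard deviation of 1.5 are assumed values, so s should land near 1.5.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 500, 2
X = rng.standard_normal((n, k))
u = 1.5 * rng.standard_normal(n)             # assumed error sd of 1.5
y = 1.0 + X @ np.array([0.5, -2.0]) + u

Xc = np.column_stack([np.ones(n), X])        # add the intercept column
beta_hat, *_ = np.linalg.lstsq(Xc, y, rcond=None)
resid = y - Xc @ beta_hat

ssr = float(resid @ resid)                   # sum of squared residuals
ser = np.sqrt(ssr / (n - k - 1))             # degrees-of-freedom correction
print(ser)
```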
Least squares assumptions
1) error term is conditional mean 0
2) (X, Y) are iid draws from their joint distribution
3) finite fourth moments - large outliers are unlikely
4) no perfect multicollinearity
Talk about consistency
Means β hat is very close to the true β with high probability when n is large
Talk about asymptotic normality
- √n(β hat - β) => N(0, w^2)
Talk about asymptotic variance of beta / se(β hat)
w^2 = σu^2 / Var(X) under homoskedasticity (write as sum)
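Written as a sum, the standard error might look like the following. This is a sketch for a single regressor under homoskedasticity, writing w^2 as ω²; the estimator s_u^2 with n - 2 degrees of freedom is an assumption of that setting.

```latex
\sqrt{n}\,(\hat\beta_1 - \beta_1) \xrightarrow{d} N(0, \omega^2),
\qquad
\omega^2 = \frac{\sigma_u^2}{\operatorname{Var}(X)}

\hat\omega^2 = \frac{s_u^2}{\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar X)^2},
\qquad
s_u^2 = \frac{1}{n-2}\sum_{i=1}^{n}\hat u_i^2

\operatorname{se}(\hat\beta_1) = \sqrt{\frac{\hat\omega^2}{n}}
  = \sqrt{\frac{s_u^2}{\sum_{i=1}^{n}(X_i - \bar X)^2}}
```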
Talk about imperfect multicollinearity
Hypothesis testing steps
1) state null and alternative
2) get t stat
3) Under the null t -> N(0,1)
4) Decision rule
5) Outcome
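The five steps can be sketched for a two-sided test of H0: β1 = 0 against H1: β1 ≠ 0 at the 5% level. The data are simulated with an assumed slope of 0.5, and the standard error is the homoskedasticity-only version, which is a simplifying assumption.

```python
import math
import numpy as np

rng = np.random.default_rng(3)
n = 1000
x = rng.standard_normal(n)
y = 1.0 + 0.5 * x + rng.standard_normal(n)   # assumed true slope of 0.5

# OLS slope, intercept and homoskedasticity-only standard error.
x_dev = x - x.mean()
b1 = (x_dev @ (y - y.mean())) / (x_dev @ x_dev)
b0 = y.mean() - b1 * x.mean()
resid = y - b0 - b1 * x
s2 = (resid @ resid) / (n - 2)
se_b1 = math.sqrt(s2 / (x_dev @ x_dev))

beta_null = 0.0                                # 1) H0: beta1 = 0
t_stat = (b1 - beta_null) / se_b1              # 2) t statistic
crit = 1.96                                    # 3) N(0,1) critical value, 5% two-sided
reject = abs(t_stat) > crit                    # 4) decision rule
p_value = math.erfc(abs(t_stat) / math.sqrt(2))  # two-sided p-value, 2 * (1 - Phi(|t|))
print(t_stat, reject, p_value)                 # 5) outcome
```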
Talk about one-sided tests
P-value definition and usefulness
Confidence interval interpretation
The collection of null hypothesised values for β that would not be rejected (by a 2-sided t test) at significance level α
- e.g. a 99% confidence interval is the set of null hypotheses that I couldn’t reject in a test at the 1% significance level
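The test-inversion reading can be checked on a grid. In this sketch the estimate 0.52 and standard error 0.03 are hypothetical numbers, and the level is 5%, so the set of nulls that survive the t test should match the usual 95% interval, estimate ± 1.96 se.

```python
import numpy as np

b1, se_b1 = 0.52, 0.03        # hypothetical estimate and standard error
crit = 1.96                   # 5% two-sided critical value

ci_low, ci_high = b1 - crit * se_b1, b1 + crit * se_b1

# Try many candidate null values and keep those the t test does not reject.
grid = np.linspace(0.3, 0.7, 4001)
not_rejected = grid[np.abs((b1 - grid) / se_b1) <= crit]

print(ci_low, ci_high, not_rejected.min(), not_rejected.max())
```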
Polynomial in regression vs linear
Causes of endogeneity
1) omitted variable bias
2) measurement error
3) simultaneity
OVB formula and usefulness
β’ = β + γ Cov(X1, X2) / Var(X1), where γ is the coefficient on the omitted X2
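The formula can be verified numerically. In the sketch below (all coefficients are assumed values), the slope from the short regression that omits X2 equals the long-regression slope plus the X2 coefficient times Cov(X1, X2) / Var(X1). With the sample analogues plugged in, the identity holds exactly.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10_000
x1 = rng.standard_normal(n)
x2 = 0.6 * x1 + rng.standard_normal(n)           # X2 correlated with X1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.standard_normal(n)

def ols(y, *cols):
    """OLS coefficients with an intercept; illustrative helper."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

b_long = ols(y, x1, x2)      # [intercept, beta on X1, gamma on X2]
b_short = ols(y, x1)         # X2 omitted

delta = np.cov(x1, x2)[0, 1] / x1.var(ddof=1)    # Cov(X1, X2) / Var(X1)
implied = b_long[1] + b_long[2] * delta          # beta + gamma * delta

print(b_short[1], implied)   # population value here is 2 + 3 * 0.6 = 3.8
```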
Impact of measurement error in Y
Example for IV for demand elasticity of cigarettes (why a good one)
General sales tax:
Solutions to bad controls
1) Find an instrument for education and estimate model via 2SLS
2) omit from regression
- Interpretation: ‘total effect’ of labour market discrimination inclusive of its effects on educational attainment
Why 2SLS less efficient than OLS
- only looking at the part of X explained by the instrument, so less precise
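A small Monte Carlo can illustrate the precision loss. This is a sketch under assumed parameters: X is actually exogenous here, so both estimators are valid and only their spread differs. With a single instrument, 2SLS reduces to the simple IV ratio used below.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 500, 500
ols_est, iv_est = [], []

for _ in range(reps):
    z = rng.standard_normal(n)
    x = 0.5 * z + rng.standard_normal(n)   # z explains only part of x
    u = rng.standard_normal(n)             # independent of x: OLS is valid here
    y = 1.0 * x + u
    xd, zd, yd = x - x.mean(), z - z.mean(), y - y.mean()
    ols_est.append((xd @ yd) / (xd @ xd))  # uses all variation in x
    iv_est.append((zd @ yd) / (zd @ xd))   # uses only variation driven by z

sd_ols, sd_iv = np.std(ols_est), np.std(iv_est)
print(sd_ols, sd_iv)                       # the IV spread is larger
```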
Tension in choosing instruments
- requiring variables to be exogenous, Cov(Z,u) = 0, while still relevant, Cov(Z,X) ≠ 0
Test for relevance
First-stage F > c = 10 (rule of thumb)
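The relevance check might be sketched as follows: regress X on the instrument and compute the F statistic for the instrument's coefficient, using the homoskedasticity-only formula. The first-stage coefficient of 0.5 is an assumed value.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000
z = rng.standard_normal(n)
x = 0.5 * z + rng.standard_normal(n)       # assumed first stage

# Unrestricted first stage: x on a constant and z.
Z = np.column_stack([np.ones(n), z])
pi_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)
resid = x - Z @ pi_hat
ssr_u = resid @ resid

# Restricted first stage: constant only (instrument excluded).
ssr_r = ((x - x.mean()) ** 2).sum()

q = 1                                       # number of instruments tested
F = ((ssr_r - ssr_u) / q) / (ssr_u / (n - 2))
strong = F > 10                             # rule-of-thumb threshold
print(F, strong)
```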
Test for exogeneity
- Descriptive exogeneity
Can’t test for one Z
- Test for more than one Z:
H0: cov(z1, u) =..= cov(zm, u) = 0
F test: F -> F(m-1, infinity)
Z correlated with other unobserved determinants of Y