Parameter Def
Fixed number describing the population
Estimate
To estimate a parameter means to create an estimate, which is a value used to infer what the parameter is
Estimator
Method (formula) to create an estimate
e.g. the sample average (sum of y values divided by n) is an estimator; applying it to a sample creates an estimate
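A minimal sketch (with made-up sample values) of the estimator/estimate distinction: the estimator is the rule, the estimate is the number it produces on one sample.

```python
# The estimator is the rule; the estimate is its output on one sample.

def sample_average(y):
    """Estimator: a formula that maps any sample to a number."""
    return sum(y) / len(y)

y_sample = [2.0, 4.0, 6.0]           # one made-up draw of data
estimate = sample_average(y_sample)  # the estimate this sample produces: 4.0
```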
What is OLS, what are the estimates from it?
OLS is an estimator
We get the estimates B0 hat and B1 hat
Consider the equation
yi = B0 + B1x1i + B2x2i + ui
Explain each item in it
Y is the outcome variable
x1i, x2i: regressors (independent variables). We are primarily interested in one specific regressor and its effect on the outcome, known as the regressor of interest
B0: constant; the intercept when all regressors are 0
ui: error term; its sample counterpart is the residual (the difference between observed y and OLS-predicted y)
What is the effect of random assignment
There will be no confounders as nothing is correlated with randomised treatment
What is the difference between treatment that is as good as randomly assigned and treatment that is directly randomised?
As good as randomly assigned means we do not directly randomise treatment, but the treatment is still uncorrelated with all other determinants of the outcome.
e.g. rainfall is as good as random by nature
Variance and standard deviation are what
Measures of spread
standard deviation is the square root of the variance
what does the var and standard deviation of y measure
Measure the spread of the values of that variable either in the sample or in population
What do the var and standard deviation of an estimate (e.g. sample average, regression coefficient) measure?
Measure the spread of that estimate across repeated samples
sd(B1 hat): 'if we repeatedly drew data and created an estimate B1 hat using each, how spread out would those estimates be?'
standard error meaning
estimate of the standard deviation of an estimate
Calculating a standard error requires plugging in sample estimates
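The sd-of-an-estimate idea can be sketched by simulation (a hypothetical normal population with assumed mean 10 and sd 2):

```python
# Simulation sketch: sd(estimate) is its spread across repeated samples;
# the standard error estimates that spread from a single sample.
import random
import statistics

random.seed(0)
n = 50

# Repeatedly draw data and compute the sample average each time.
means = [statistics.mean(random.gauss(10, 2) for _ in range(n))
         for _ in range(2000)]
sd_of_estimate = statistics.stdev(means)  # close to 2 / sqrt(50) ~ 0.283

# Standard error: plug the sample sd into sigma / sqrt(n) using one sample.
one_sample = [random.gauss(10, 2) for _ in range(n)]
se = statistics.stdev(one_sample) / n ** 0.5
```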
What is a counterfactual
Counterfactual is the outcome that would have been observed under another treatment status that didn’t happen.
This is unobserved e.g. the treatment group in the absence of treatment
Define Bias
Difference between our estimate of the causal effect and the true causal effect
avg. potential outcome in absence of treatment for the treated - avg. potential outcome in absence of treatment for the control
(ybar 0,d=1 - ybar 0,d=0)
What is the ‘Sample Average Treatment Effect on the Treated’
Avg. Potential outcome of treatment for treated individuals - Avg. Potential outcome in absence of treatment for treated individuals
(Ybar 1,d=1) - (Ybar 0,d=1)
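A toy sketch of the SATT with made-up potential outcomes (in real data the untreated outcomes of the treated are the unobserved counterfactual):

```python
# SATT = avg(y1 | D=1) - avg(y0 | D=1), with invented numbers.
y1_treated = [8.0, 9.0, 10.0]  # outcomes of treated units under treatment
y0_treated = [5.0, 6.0, 7.0]   # their outcomes in absence of treatment (counterfactual)

satt = sum(y1_treated) / len(y1_treated) - sum(y0_treated) / len(y0_treated)
# satt = 9.0 - 6.0 = 3.0
```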
What is the bias if we randomly assign treatment?
Bias in sample is expected to be 0
Thus
E(potential outcome in absence of treatment, given treated individuals) = E(potential outcome in absence of treatment, given control group)
E(y0i | Di=1) = E(y0i | Di=0)
What is the notation for a potential outcome?
y1i = potential outcome if treated (D=1)
y0i = potential outcome if untreated (D=0)
Given no bias, what is the Average Treatment Effect (ATE)
ybar1i - ybar0i is the average treatment effect for everyone in the sample
E( potential outcome of treatment - potential outcome in absence of treatment)
ATE=ATT
How do we justify that treatment is randomly assigned
Comparing characteristics of treatment and control individuals
If they look the same on observable characteristics, it is reasonable to claim they would be similar on unobservable characteristics -> thus no bias
How do we compare characteristics of treatment and control individuals
Through a T-test
e.g. Xbar1 - Xbar0 for each characteristic X
evaluate if treated and control look the same
How would you do a comparison of means of treatment and control?
Null Hypoth: Diff in means is 0
T-test is done by dividing the difference in means by the standard error of the difference in means
What does Var(X - Y) = ?
Var(X - Y) = Var(X) + Var(Y) - 2Cov(X, Y)
If so, calculate SE(Xbar1 - Xbar0)
With independent samples Cov = 0, so SE(Xbar1 - Xbar0) = sqrt(SE(Xbar1)^2 + SE(Xbar0)^2) = sqrt(s1^2/n1 + s0^2/n0)
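The balance check can be sketched with invented data; assuming independent samples, the covariance term drops out of Var(X - Y):

```python
# Two-sample t statistic for a difference in means (made-up data).
import statistics

x_treat = [1.2, 0.8, 1.5, 1.1, 0.9]  # characteristic X for treated units
x_ctrl  = [1.0, 1.3, 0.7, 1.2, 1.1]  # characteristic X for control units

diff = statistics.mean(x_treat) - statistics.mean(x_ctrl)

# Var(Xbar1 - Xbar0) = Var(Xbar1) + Var(Xbar0) when samples are independent.
se = (statistics.variance(x_treat) / len(x_treat)
      + statistics.variance(x_ctrl) / len(x_ctrl)) ** 0.5

t_stat = diff / se  # small |t| -> cannot reject that the means are equal
```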
Interpret all the terms in linear regression:
yi = B0 + B1x1i + B2x2i + ui
yi - outcome for observation i
B0 - expected value of y if x1=0 and x2=0
B1 is the average change in yi associated with x1i increasing by 1, holding fixed all other x (in this example, just x2).
B2 is the average change in yi associated with x2i increasing by 1, holding fixed all other x (in this example, just x1).
ui is the effect of all factors other than x1 and x2 on y.
How do we estimate parameters of our linear regression form?
We use OLS, which chooses coefficients that minimise the sum of squared residuals
with a single regressor we have formulas for parameters that solve the minimisation problem
What are the formulas that solve the OLS minimisation, when we have a single regressor
The B1 that solves this is B1 hat = Cov(x1, y) / Var(x1)
The B0 that solves this is B0 hat = ybar - B1 hat * xbar
These formulas only apply if there is a single regressor.