Lecture 1: Panel Regression Flashcards

(18 cards)

1
Q

What is the linear regression formula?

A

Yi = a + β1X1i + β2X2i + … + ei,
where:
Yi = dependent variable for observation i
X1i, X2i, … = independent variables for observation i
a, β = coefficients to be estimated
ei = error term for observation i

Example: if the independent variables are years of experience and job description:
Wage = a + β1 × years of experience + β2 × job description

2
Q

What are the five outputs of a linear regression in R?

A
  • Estimate: point estimate of β
  • Std. Error: precision of the estimate
  • t value: Estimate / Std. Error
  • Pr(>|t|): p-value (significance)
  • R-squared: fraction of the variance in Y explained by the model
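
To make these outputs concrete, here is a minimal sketch that computes four of them by hand for a regression with a single X. The data are made up for illustration, and the p-value is left out because it needs the t distribution, which the Python standard library does not provide.

```python
import math

# Hypothetical data: years of experience (x) and hourly wage (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [12.0, 15.0, 14.0, 18.0, 21.0]


def simple_ols(x, y):
    """Return the slope's Estimate, Std. Error, t value and the model R-squared."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx                      # Estimate
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(e ** 2 for e in residuals)   # unexplained variation
    tss = sum((yi - y_bar) ** 2 for yi in y)  # total variation

    sigma2 = rss / (n - 2)                 # residual variance (2 estimated coefficients)
    std_error = math.sqrt(sigma2 / sxx)    # Std. Error of the slope

    return {
        "estimate": slope,
        "std_error": std_error,
        "t_value": slope / std_error,      # t value = Estimate / Std. Error
        "r_squared": 1 - rss / tss,        # fraction of variance explained
    }


result = simple_ols(x, y)
print(result)
```

A larger t value (in absolute terms) corresponds to a smaller p-value, which is why the two columns always move together in the R output.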
3
Q

What are three types of data in panel regression?

A
  1. Cross-section
     Structure: many units, one period
  2. Time series
     Structure: one unit, many periods
  3. Panel
     Structure: many units, many periods
4
Q

What is a balanced panel?

A

A panel in which every unit has the same number of observations.

5
Q

What is an unbalanced panel?

A

A panel in which the units do not all have the same number of observations.
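
The balanced/unbalanced distinction can be checked mechanically by counting observations per unit. This is a small sketch using the definition on these cards; the function name `is_balanced` and the example rows are hypothetical.

```python
from collections import Counter


def is_balanced(rows):
    """rows is a list of (unit, period) pairs, one per observation.

    The panel counts as balanced when every unit has the same
    number of observations.
    """
    counts = Counter(unit for unit, _period in rows)
    return len(set(counts.values())) <= 1


# Hypothetical panels: firms observed in survey waves 1 to 3.
balanced_rows = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2), ("B", 3)]
unbalanced_rows = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 3)]

print(is_balanced(balanced_rows))
print(is_balanced(unbalanced_rows))
```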

6
Q

What is a pooled regression model?

A

Running a standard linear OLS regression on panel data, treating all observations as if they were independent and ignoring which unit they belong to.

7
Q

What are the two main problems with pooled regression models?

A
  • Technical: the OLS assumption of uncorrelated errors is violated, because observations within a unit are often correlated. This leads to biased coefficients and standard errors.
  • Conceptual: pooled OLS lumps together the across-unit and within-unit variation in the variables.
8
Q

Give two reasons why errors are often correlated within units in a pooled regression model.

A
  • Unobserved unit-constant or time-constant variables that affect the dependent variable and possibly the independent variables.
  • Unobserved unit or time variables: variables that have a constant impact over time within units but are not measured in the dataset.
9
Q

What are the two types of variation within panel data?

A
  • Across-unit variation: different units have different values of X and Y, which gives a relationship between X and Y across units.
  • Within-unit variation: the relationship between X and Y within each unit over time.
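
The two kinds of variation can point in opposite directions. The sketch below uses made-up numbers for two units: within each unit Y falls as X rises, but the unit with the higher X values also has the higher Y values, so a pooled regression picks up a positive slope.

```python
def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical panel: two units, three periods each.
panel = {
    "A": {"x": [1.0, 2.0, 3.0], "y": [10.0, 9.0, 8.0]},
    "B": {"x": [6.0, 7.0, 8.0], "y": [20.0, 19.0, 18.0]},
}

# Pooled regression: stack all observations and ignore the unit labels.
pooled_x = [v for unit in panel.values() for v in unit["x"]]
pooled_y = [v for unit in panel.values() for v in unit["y"]]
pooled_slope = slope(pooled_x, pooled_y)

# Within-unit relationship: one slope per unit, using only that unit's data.
within_slopes = {name: slope(d["x"], d["y"]) for name, d in panel.items()}

print("pooled slope:", pooled_slope)      # positive: driven by across-unit differences
print("within slopes:", within_slopes)    # negative within each unit
```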
10
Q

Which type of variation is more credible for establishing causality?

A

Within-unit variation is more credible for establishing that X causes Y to change, because it compares the same unit against itself over time, which holds the unit's time-constant characteristics fixed.

11
Q

What does the regression formula look like when using only within-unit variation?

A

Yit = a + β1X1it + … + βnXnit + ci + eit, where i indexes units, t indexes time periods, and:

ci = a constant that differs for each unit, i.e. a separate intercept for each unit. This is called the fixed effect.

12
Q

What are three ways of running regressions using within-unit variation?

A
  • Estimate a separate dummy variable for each unit
  • Demean the Y and X variables using unit-level averages (within transformation)
  • Estimate a statistical distribution for the fixed effects (random effects)
13
Q

How do regressions of panel data work when using separate dummy variables?

A

If one uses a separate dummy variable for each unit, each dummy is just another line in the R output, with its own Estimate, Std. Error, t value and p-value. The coefficients on the X variables then show the effect of the variation within units.
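
As an illustration of the dummy-variable approach, the sketch below builds one dummy column per unit and solves the OLS normal equations by hand. The data are made up and fit exactly, and the small solver is only a stand-in for what a regression routine does internally.

```python
def solve(a, b):
    """Solve the linear system a * beta = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(design, y):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical panel: two units (A and B), three periods each.
units = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
y = [10.0, 9.0, 8.0, 20.0, 19.0, 18.0]

# Design matrix columns: x, dummy for unit A, dummy for unit B.
# No common intercept, because the two dummies already play that role.
design = [[xi, 1.0 if u == "A" else 0.0, 1.0 if u == "B" else 0.0]
          for xi, u in zip(x, units)]

beta_x, effect_a, effect_b = ols(design, y)
print("slope on x:", beta_x)
print("unit intercepts:", effect_a, effect_b)
```

Each dummy coefficient is that unit's own intercept, while the slope on x is identified only from changes within units.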

14
Q

What is the main drawback of using separate dummy variables for each unit in panel data regressions?

A

It is not suitable for larger datasets, because every unit adds another coefficient to estimate. For those you would use the within transformation instead.

15
Q

What are the steps of a within-transformation regression? Also give three benefits of the within-transformation method.

A
  Steps:
  • Determine the mean value of the dependent variable (Y) for each unit over its time periods t
  • For each time period, subtract that mean: Y - Mean(Y). The X variables are demeaned in the same way.
  Benefits:
  • Units that have only one observation (one time t) are removed from your dataset, as they do not contain any within-unit variation
  • Time fixed effects can be accounted for with the wave variable (the different time periods observed in the dataset)
  • You can also combine different fixed effects, in order to capture effects that are constant within both parameter 1 and parameter 2
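
The demeaning steps can be sketched in a few lines. The data and the helper name `demean_by_unit` are hypothetical; unit C has a single observation to show why such units carry no within-unit variation.

```python
from collections import defaultdict


def demean_by_unit(units, values):
    """Subtract each unit's own mean from its observations."""
    groups = defaultdict(list)
    for unit, value in zip(units, values):
        groups[unit].append(value)
    means = {unit: sum(vs) / len(vs) for unit, vs in groups.items()}
    return [value - means[unit] for unit, value in zip(units, values)]


# Hypothetical panel: units A and B have three periods, unit C only one.
units = ["A", "A", "A", "B", "B", "B", "C"]
x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0, 4.0]
y = [10.0, 9.0, 8.0, 20.0, 19.0, 18.0, 50.0]

x_dm = demean_by_unit(units, x)
y_dm = demean_by_unit(units, y)

# Regress demeaned Y on demeaned X. No intercept is needed, because both
# demeaned variables have mean zero within every unit.
within_slope = sum(a * b for a, b in zip(x_dm, y_dm)) / sum(a * a for a in x_dm)

print("demeaned x:", x_dm)
print("within slope:", within_slope)
```

The single observation of unit C demeans to zero in both X and Y, so it adds nothing to either sum and cannot influence the slope.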
16
Q

What is meant by clustering standard errors?

A

Clustering standard errors is a way to account for correlation in the error terms within groups (clusters). Clusters should be defined at the same level as your fixed effects.
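
A minimal sketch of the idea, assuming a single X, unit fixed effects removed by demeaning, and clusters defined at the unit level. It uses the basic cluster-robust (sandwich) formula without any small-sample correction, so statistical packages that apply such corrections will report somewhat different numbers. The data are made up.

```python
import math
from collections import defaultdict


def demean_by_unit(units, values):
    """Subtract each unit's own mean from its observations."""
    groups = defaultdict(list)
    for unit, value in zip(units, values):
        groups[unit].append(value)
    means = {unit: sum(vs) / len(vs) for unit, vs in groups.items()}
    return [value - means[unit] for unit, value in zip(units, values)]


# Hypothetical panel: three units, three periods each.
units = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0, 3.0, 4.0, 5.0]
y = [10.0, 9.5, 8.0, 20.0, 18.5, 18.5, 15.0, 14.5, 12.5]

x_dm = demean_by_unit(units, x)
y_dm = demean_by_unit(units, y)

sxx = sum(a * a for a in x_dm)
slope = sum(a * b for a, b in zip(x_dm, y_dm)) / sxx
residuals = [b - slope * a for a, b in zip(x_dm, y_dm)]

# Sum x * residual within each cluster first, then square the cluster totals.
# Errors may be correlated inside a cluster but are treated as independent
# across clusters.
cluster_scores = defaultdict(float)
for unit, a, e in zip(units, x_dm, residuals):
    cluster_scores[unit] += a * e

cluster_variance = sum(s ** 2 for s in cluster_scores.values()) / sxx ** 2
cluster_se = math.sqrt(cluster_variance)

print("within slope:", slope)
print("cluster-robust std. error:", cluster_se)
```

With only three clusters this is purely illustrative; cluster-robust standard errors are generally considered reliable only when the number of clusters is reasonably large.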

17
Q

What are three characteristics of panel regression outcomes?

A
  • Controls for unobserved unit-specific effects that are constant over time
  • Looks only at within-unit variation
  • Makes panel regression better at answering questions such as ‘did X cause Y’ compared to standard OLS regression
18
Q

Give two things panel regression cannot do.

A
  • Control for unobserved unit-specific effects that vary over time
  • Solve all issues in answering ‘did X cause Y?’