IntrEcon Flashcards

(81 cards)

1
Q

Econometrics

A

The science (and art) of using economic theory and statistical techniques to analyse economic data

2
Q

Causality

A

A specific action leads to a specific, measurable consequence

3
Q

Flow

A

Measured over a period of time (e.g., income earned during a year)

4
Q

Ceteris paribus

A

Other (relevant) factors being equal

5
Q

Stock

A

Measured at a particular point in time (e.g., wealth on 31 December)

6
Q

A cross-section of data

A
  • Is collected across sample units in a particular time period
  • The observations relate to discrete units - individuals, households, enterprises or countries – and a variable is some characteristic of these units
  • The ordering of the observations in the sample does not matter
7
Q

Sample units

A
  • Individual entities
  • May be firms, people, households, states, or countries
8
Q

Time series data

A
  • A sequence of data points, measured typically at successive times spaced at uniform time intervals (e.g., daily, weekly, monthly, quarterly, yearly).
  • data collected on the same observational unit at multiple time periods. The time dimension imposes a natural ordering.
  • The underlying process is typically continuous not discrete.
9
Q

A panel of data/ longitudinal data

A
  • Has observations on individual micro-units that are followed over time
  • We observe the same micro-unit (individuals, firms, or countries) for a number of time periods.
10
Q

A balanced panel

A

A panel in which we have the same number of time-period observations for each micro-unit

11
Q

The slope

A

The slope of the (population) regression line is the expected effect on Yi of a unit change in Xi
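In symbols, the card's statement reads as follows (standard simple-regression notation; the symbols themselves are not in the original card):

```latex
% Population regression line: beta_1 is the slope
Y_i = \beta_0 + \beta_1 X_i + u_i, \qquad i = 1, \dots, n
% Interpretation: the expected effect on Y of a unit change in X
\beta_1 = \frac{\Delta E(Y_i \mid X_i = x)}{\Delta x}
```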

12
Q

Regression R²

A
  • Measures the fraction of the variance of Yi that is explained by Xi
  • It is unitless and ranges between 0 (no fit) and 1 (perfect fit)
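As a formula (standard notation, not in the original card):

```latex
R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS},
\qquad ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2,
\qquad TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2,
\qquad SSR = \sum_{i=1}^{n} \hat{u}_i^2
```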
13
Q

Standard error of the regression (SER)

A
  • Measures the spread of the observations around the regression line, measured in the units of the dependent variable
  • Measures the average ‘size’ of the OLS residual
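As a formula (standard notation, not in the original card); note the degrees-of-freedom correction n - 2, which card 15 contrasts with the RMSE's division by n:

```latex
SER = s_{\hat{u}} = \sqrt{\frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2} = \sqrt{\frac{SSR}{n-2}}
```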
14
Q

OLS residual

A

The ‘mistake’ made by the OLS regression line for a single observation: ûi = Yi - Ŷi

15
Q

The root mean squared error (RMSE)

A
  • Measures the same thing as the SER; the minor difference is division by n instead of n - 2
  • As n → ∞, the difference becomes negligible
16
Q

2 steps to figure out the sampling distribution of the OLS estimator

A
  1. Probability framework for linear regression
  2. Distribution of the OLS estimator
17
Q

Probability framework for linear regression

A

Is summarised by:
1. The random variables Y and X
2. The joint distribution of (Y, X)
3. Data collection by simple random sampling

18
Q

Population

A

Group of interest

19
Q

E(β̂1)

A

If E(β̂1) = β1, then OLS is unbiased

20
Q

var(β̂1)

A
  • Is a measure of sampling uncertainty
  • We need to derive a formula to compute the standard error of β̂1
21
Q

Distribution of β̂1 in small samples

A

It is very complicated in general

22
Q

Distribution of β̂1 in large samples

A

In large samples, β̂1 is normally distributed

23
Q

The law of large numbers

A

As n increases, the distribution of Ȳ becomes more tightly centred around µ_Y

24
Q

The central limit theorem

A

As n increases, the sampling distribution of Ȳ is approximately normal
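The standard statement of the theorem (notation as in the law-of-large-numbers card; not in the original card):

```latex
\frac{\bar{Y} - \mu_Y}{\sigma_Y / \sqrt{n}} \;\xrightarrow{d}\; N(0, 1) \quad \text{as } n \to \infty
```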

25
Q

Consistent

A

An estimator is consistent if the probability that it falls within an interval of the true population value tends to one as the sample size increases

26
Q

Convergence in probability / consistency

A

The property that Ȳ is near µ_Y with increasingly high probability as n increases

27
Q

5 steps to learn about the slope of the (population) regression line

A
  1. State the population parameter of interest
  2. Provide an estimator of this population parameter
  3. Derive the sampling distribution of the estimator; this requires certain assumptions. In large samples this sampling distribution will be normal by the CLT
  4. The square root of the estimated variance of the sampling distribution is the standard error (SE) of the estimator
  5. Use the SE to construct t-statistics, for hypothesis tests, and confidence intervals
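The 5 steps can be sketched on simulated data. This is an illustrative sketch, not part of the course material: the true parameters, the simulated sample, and the use of homoskedasticity-only standard errors are all assumptions made here for brevity.

```python
# Sketch of steps 2-5 for the slope of the population regression line,
# using simulated data (homoskedasticity-only SE, for brevity).
import numpy as np

rng = np.random.default_rng(0)
n = 1000
beta0, beta1 = 2.0, 0.5            # step 1: true population parameters (assumed here)
X = rng.normal(size=n)
Y = beta0 + beta1 * X + rng.normal(size=n)

# Step 2: OLS estimator of the slope
x_dev, y_dev = X - X.mean(), Y - Y.mean()
b1_hat = (x_dev @ y_dev) / (x_dev @ x_dev)
b0_hat = Y.mean() - b1_hat * X.mean()

# Steps 3-4: estimated variance of the sampling distribution -> standard error
resid = Y - b0_hat - b1_hat * X
s2 = (resid @ resid) / (n - 2)            # SER squared
se_b1 = np.sqrt(s2 / (x_dev @ x_dev))     # homoskedasticity-only SE

# Step 5: t-statistic for H0: beta1 = 0, and a 95% confidence interval
t_stat = b1_hat / se_b1
ci = (b1_hat - 1.96 * se_b1, b1_hat + 1.96 * se_b1)
print(b1_hat, t_stat, ci)
```

With n = 1000 the estimate lands close to the true slope and the t-statistic is far out in the rejection region, as the large-sample theory on the earlier cards predicts.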
28
Q

Type I error

A

The null hypothesis is rejected when in fact it is true

29
Q

Type II error

A

The null hypothesis is not rejected when in fact it is false

30
Q

Significance level

A

The prespecified rejection probability of a statistical hypothesis test when the null hypothesis is true

31
Q

Critical value of the test statistic

A

The value of the test statistic for which the test just rejects the null hypothesis at the given significance level

32
Q

Rejection region

A

The set of values of the test statistic for which the test rejects the null hypothesis

33
Q

Acceptance region

A

The set of values of the test statistic for which the test does not reject the null hypothesis

34
Q

Size of the test

A

The probability that the test incorrectly rejects the null hypothesis when it is true

35
Q

Power of the test

A

The probability that the test correctly rejects the null hypothesis when the alternative is true

36
Q

p-value

A
  • The probability of obtaining a test statistic, by random sampling variation, at least as adverse to the null hypothesis value as the statistic actually observed, assuming that the null hypothesis is correct
  • The smallest significance level at which you can reject the null hypothesis

37
Q

95% confidence interval for β1

A
  • An interval that contains the true parameter value of β1 in 95% of all possible randomly drawn samples
  • An interval that contains the true parameter value of β1 with 95% probability
  • The set of values of β1 that cannot be rejected using a two-sided hypothesis test with a 5% significance level

38
Q

var(ui|Xi = xi) = σ²_u is constant

A

The variance of the conditional distribution of ui given Xi does not depend on Xi

39
Q

Homoskedasticity

A

If var(ui|Xi = xi) = σ²_u is constant, then ui is said to be homoskedastic; otherwise ui is heteroskedastic

40
Q

Implications of heteroskedasticity

A
  • The OLS estimator is still unbiased, consistent, and asymptotically normal
  • Homoskedasticity-only standard errors are inappropriate
  • The t-statistic computed using homoskedasticity-only standard errors does not have a standard normal distribution, even in large samples

41
Q

Homoskedasticity-only standard errors

A

These are only valid if the errors are homoskedastic

42
Q

Heteroskedasticity-robust standard errors

A

These are valid whether or not the errors are heteroskedastic

43
Q

Gauss-Markov theorem

A

States that, under a set of conditions known as the Gauss-Markov conditions, the OLS estimator β̂1 has the smallest conditional variance given X1, ..., Xn of all linear conditionally unbiased estimators of β1; that is, the OLS estimator is BLUE (Best Linear Unbiased Estimator)

44
Q

Limitations of the Gauss-Markov theorem

A
  • The condition of homoskedasticity often doesn't hold
  • The result is only valid for linear estimators
  • OLS is more sensitive to outliers than some alternative estimators

45
Q

Omitted variable bias

A

The bias in the OLS estimator that arises when one or more included regressors are correlated with an omitted variable

46
Q

2 conditions for omitted variable bias

A
  • At least one of the included regressors must be correlated with the omitted variable
  • The omitted variable must be a determinant of the dependent variable, Y

47
Q

Implications of omitted variable bias

A
  • OVB is a problem, whether the sample is large or small
  • The size of the bias depends on corr(Xi, ui) = ρXu: the larger |ρXu|, the larger the bias
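The standard formula quantifying this (textbook result, not in the original card): the OLS slope converges not to β1 but to β1 plus a bias term driven by ρXu.

```latex
\hat{\beta}_1 \;\xrightarrow{p}\; \beta_1 + \rho_{Xu} \cdot \frac{\sigma_u}{\sigma_X}
```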
48
Q

Population regression line

A

The relationship that holds between Y and the X's on average in the population

49
Q

Coefficient

A

The expected change in Yi resulting from changing Xi by one unit, holding the rest constant.

50
Q

Intercept

A

The expected value of Y, when all X's equal 0

51
Q

OLS estimators

A

The values of b0, b1, ..., bk that minimise the sum of squared prediction mistakes
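Written out, the minimisation problem is (standard notation, not in the original card):

```latex
(\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_k)
= \underset{b_0, b_1, \dots, b_k}{\arg\min} \;
\sum_{i=1}^{n} \left( Y_i - b_0 - b_1 X_{1i} - \dots - b_k X_{ki} \right)^2
```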
52
Q

Frisch-Waugh theorem

A
  • States that the OLS coefficient from regressing the residualised Y on the residualised X1 (both with the other X's partialled out) equals the OLS coefficient on X1 in the multiple regression
  • This third regression estimates the effect of X1 on Y using what is left over after removing (controlling for) the effect of the other X's

53
Q

(Multi)collinearity

A

When data are the result of an uncontrolled experiment, many of the economic variables may move together in systematic ways

54
Q

Imperfect multicollinearity

A

Two or more of the regressors are (highly) correlated

55
Q

Ways to address multicollinearity

A
  • Obtain more information
  • Introduce nonsample information
  • Variable reduction via Principal Component Analysis (PCA)

56
Q

Joint hypothesis

A

A hypothesis that imposes two or more restrictions on the regression coefficients

57
Q

Large-sample distribution of the F-statistic

A

The distribution of the average of q independently distributed squared standard normal random variables

58
Q

Chi-squared distribution with q degrees of freedom

A

The distribution of the sum of q independent squared standard normal random variables

59
Q

The 'overall' regression F-statistic

A

Tests the joint hypothesis that all the slope coefficients are zero

60
Q

The F-statistic when q = 1

A
  • When q = 1, the joint hypothesis reduces to a null hypothesis on a single regression coefficient
  • The F-statistic is then the square of the t-statistic
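The F = t² identity can be checked numerically. This is an illustrative sketch with simulated data and homoskedasticity-only formulas; the true parameters and sample are assumptions made here.

```python
# Numerical check that F = t^2 when testing a single coefficient (q = 1),
# using simulated data and homoskedasticity-only formulas.
import numpy as np

rng = np.random.default_rng(42)
n = 200
X = rng.normal(size=n)
Y = 1.0 + 0.8 * X + rng.normal(size=n)

# Unrestricted OLS: Y on a constant and X
x_dev, y_dev = X - X.mean(), Y - Y.mean()
b1 = (x_dev @ y_dev) / (x_dev @ x_dev)
b0 = Y.mean() - b1 * X.mean()
ssr_u = np.sum((Y - b0 - b1 * X) ** 2)

# Restricted OLS under H0: beta1 = 0 (Y on a constant only)
ssr_r = np.sum((Y - Y.mean()) ** 2)

# Homoskedasticity-only F-statistic with q = 1 restriction
F = ((ssr_r - ssr_u) / 1) / (ssr_u / (n - 2))

# Homoskedasticity-only t-statistic for H0: beta1 = 0
se_b1 = np.sqrt((ssr_u / (n - 2)) / (x_dev @ x_dev))
t = b1 / se_b1

print(F, t ** 2)   # the two agree (up to floating-point error)
```

The agreement is algebraic, not a coincidence of this sample: SSR_r - SSR_u equals the explained sum of squares of the simple regression, which is exactly b1² times the variation in X.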
61
Q

A 95% confidence set for 2 (or more) coefficients

A

A set that contains the true population values of these coefficients in 95% of randomly drawn samples

62
Q

Confidence ellipse

A

A 'fat sausage' with the long part of the sausage oriented in the lower-left/ upper-right direction

63
Q

If the errors are homoskedastic

A

There should be no patterns of any sort in the residuals

64
Q

If the errors are heteroskedastic

A

They may tend to exhibit greater variation in some systematic way

65
Q

White test for heteroskedasticity

A

The LM-statistic for testing that all of the coefficients in the auxiliary equation are zero, except for the intercept

66
Q

Control variable

A
  • Is not the object of interest in the study...
  • ... rather, it is a regressor included to hold constant factors that, if neglected, could lead the estimated causal effect of interest to suffer from omitted variable bias

67
Q

Conditional mean independence

A
  • Requires that the conditional expectation of ui given X1i and X2i is independent of X1i, although it may depend on X2i
  • E(ui|X1i, X2i) = E(ui|X2i)

68
Q

Base specification

A

A core or base set of regressors, chosen using a combination of judgement about omitted variable bias, economic theory, expert judgement, and knowledge of how the data were collected

69
Q

Alternative specification

A

An alternative set of regressors

70
Q

(Adjusted) R²

A

Tells you whether the regressors are good at predicting, or 'explaining', the values of the dependent variable in the sample of data on hand

71
Q

What (adjusted) R² doesn't tell you

A

Whether:
  • An included variable is statistically significant
  • The regressors are a true cause of the movements in the dependent variable
  • There is omitted variable bias
  • You have chosen the most appropriate set of regressors

72
Q

'Nonlinear'

A

Refers to models that are nonlinear in the X's but linear in the β's, so they can be estimated via OLS

73
Q

Polynomials

A

The population regression function is approximated by a quadratic, cubic, or higher-degree polynomial

74
Q

Logarithms

A

Y and/or X is transformed by taking its logarithm, which provides a 'percentages' interpretation of the coefficients that makes sense in many applications
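The three standard log specifications and their (approximate, small-change) interpretations; this summary is textbook-standard but not in the original card:

```latex
\text{linear-log:} \quad Y_i = \beta_0 + \beta_1 \ln(X_i) + u_i
\quad\Rightarrow\quad \text{a 1\% increase in } X \text{ changes } Y \text{ by about } 0.01\,\beta_1
\\
\text{log-linear:} \quad \ln(Y_i) = \beta_0 + \beta_1 X_i + u_i
\quad\Rightarrow\quad \text{a one-unit increase in } X \text{ changes } Y \text{ by about } 100\,\beta_1\,\%
\\
\text{log-log:} \quad \ln(Y_i) = \beta_0 + \beta_1 \ln(X_i) + u_i
\quad\Rightarrow\quad \beta_1 \text{ is the elasticity of } Y \text{ with respect to } X
```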
75
Q

Polynomials: a general strategy

A
  1. Identify a possible nonlinear relationship
  2. Specify a nonlinear function and estimate its parameters by OLS
  3. Determine whether the nonlinear model improves upon a linear model
  4. Plot the estimated nonlinear regression function
  5. Estimate the effect of a change in X on Y

76
Q

Xi * Di

A

Through the use of this interaction term, the population regression line relating Yi to the continuous variable Xi can have a slope that depends on the binary variable Di
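The interacted regression and its two implied slopes (standard notation, not in the original card):

```latex
Y_i = \beta_0 + \beta_1 X_i + \beta_2 D_i + \beta_3 (X_i \times D_i) + u_i
\\
\text{slope} =
\begin{cases}
\beta_1 & \text{if } D_i = 0 \\
\beta_1 + \beta_3 & \text{if } D_i = 1
\end{cases}
```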
77
Q

RESET

A
  • REgression Specification Error Test
  • It adds polynomials in the OLS fitted values to the equation to detect general functional form misspecification

78
Q

3 estimation methods

A
  • n - 1 binary regressors OLS model
  • Fixed effects (FE) regression model
  • Difference estimator, without an intercept

79
Q

'Clustered' standard error formula

A

Is needed because observations for the same entity are not independent, even though observations across entities are independent if entities are drawn by simple random sampling

80
Q

Autocorrelated

A

Correlated with itself

81