Adjusted R²
The coefficient of determination adjusted for degrees of freedom.
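As a minimal sketch, the standard formula is adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of slope coefficients; the function name and figures below are illustrative:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1),
    where n is the number of observations and k the number of slopes."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Illustrative figures: R^2 = 0.60, n = 30 observations, k = 3 slopes
print(round(adjusted_r2(0.60, 30, 3), 4))
```

Note that the adjustment can only lower the figure relative to R², and adding a variable with little explanatory power can decrease it.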
Akaike’s Information Criterion (AIC)
A criterion for model selection that weighs goodness of fit against the number of estimated parameters; preferred when the goal is prediction, since among candidate models the one with the lower AIC is estimated to forecast better.
Schwarz’s Bayesian Information Criterion (BIC)
A criterion for model selection that penalizes additional parameters more heavily than AIC; preferred when the goal is goodness of fit with parsimony, since among candidate models the one with the lower BIC is the better-fitting parsimonious model.
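One common formulation computes both criteria from the model's sum of squared errors (SSE); this is a sketch under that assumption, with illustrative function names:

```python
import math

def aic(n, k, sse):
    """One common formulation: AIC = n * ln(SSE / n) + 2 * (k + 1)."""
    return n * math.log(sse / n) + 2 * (k + 1)

def bic(n, k, sse):
    """BIC replaces AIC's 2 with ln(n), so it penalizes extra
    coefficients more heavily whenever ln(n) > 2, i.e. n >= 8."""
    return n * math.log(sse / n) + math.log(n) * (k + 1)
```

Because ln(n) exceeds 2 for any realistic sample size, BIC selects more parsimonious models than AIC.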
Joint test of hypotheses
What are its degrees of freedom? What are its conclusions?
A form of hypothesis test that determines whether a group of q independent variables, excluded from a restricted model, jointly has significant power to explain changes in the dependent variable relative to the unrestricted model.
The test is F-distributed with q and n - k - 1 degrees of freedom, where q is the number of excluded variables and k is the number of slope coefficients in the unrestricted model.
H₀: the slope coefficients on all q excluded variables equal 0
Hₐ: at least one of the q slope coefficients ≠ 0
Rejecting H₀ means that the q variables jointly have significant explanatory power, so the unrestricted model is preferred.
Failing to reject H₀ means that the q variables do not add significant explanatory power, so the restricted model is adequate.
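As a minimal sketch, the statistic is F = [(SSE_restricted - SSE_unrestricted)/q] / [SSE_unrestricted/(n - k - 1)]; the function name and figures are illustrative:

```python
def joint_f_stat(sse_r, sse_u, q, n, k):
    """F = [(SSE_restricted - SSE_unrestricted) / q]
           / [SSE_unrestricted / (n - k - 1)],
    where k is the number of slopes in the unrestricted model."""
    return ((sse_r - sse_u) / q) / (sse_u / (n - k - 1))

# Illustrative figures: dropping q = 2 variables raises SSE from 100 to 120
# with n = 53 observations and k = 5 slopes in the unrestricted model
print(round(joint_f_stat(120, 100, 2, 53, 5), 2))
```

A large statistic means the restriction cost substantial explanatory power, favoring the unrestricted model.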
General linear F-test
What are its degrees of freedom? What are its conclusions?
A form of hypothesis test that determines whether an entire regression has significant power to explain changes in the dependent variable.
The test is F-distributed with k and n - k - 1 degrees of freedom.
H₀: b₁ = b₂ = … = bₖ = 0
Hₐ: At least one b ≠ 0
Rejecting H₀ means that at least one independent variable has significant explanatory power, so the regression as a whole is useful.
Failing to reject H₀ means that none of the independent variables has significant explanatory power.
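As a minimal sketch, the statistic is F = MSR/MSE computed from the ANOVA sums of squares; the function name and figures are illustrative:

```python
def overall_f_stat(rss, sse, n, k):
    """F = MSR / MSE, where MSR = RSS / k (regression sum of squares
    over its k degrees of freedom) and MSE = SSE / (n - k - 1)."""
    msr = rss / k
    mse = sse / (n - k - 1)
    return msr / mse

# Illustrative figures: RSS = 60, SSE = 40, n = 25 observations, k = 4 slopes
print(overall_f_stat(60, 40, 25, 4))
```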
Model specification
The set of independent variables in a regression model as well as its functional form; in order for a model to be correctly specified, it must:
1.) Be grounded in economic reasoning
2.) Be parsimonious
3.) Perform well out-of-sample
4.) Have the appropriate functional form
5.) Satisfy regression assumptions
Omitted variable bias
What can it lead to?
A functional form misspecification in which one or more independent variables with significant explanatory power for the dependent variable are missing from the regression; leads to biased and inconsistent coefficient estimates, and the residuals may exhibit heteroskedasticity and/or serial correlation.
Inappropriate form of variables
What can it lead to?
A functional form misspecification in which a nonlinear relationship between the independent and dependent variables is ignored; may lead to heteroskedasticity.
Inappropriate scaling of variables
What can it lead to?
A functional form misspecification in which variables must be transformed before estimating the regression; may lead to heteroskedasticity and/or multicollinearity.
Inappropriate pooling of variables
What can it lead to?
A functional form misspecification in which the regression pools observations from different contexts (e.g. fiscal regime, recession) leading to data clustering; may lead to heteroskedasticity and/or serial correlation.
Breusch-Pagan test
What is it? How is it calculated? What are its conclusions?
A form of hypothesis test that determines whether or not conditional heteroskedasticity exists in a regression model.
BP = nR², where R² is from regressing the squared residuals on the original independent variables; the statistic is chi-squared distributed with k degrees of freedom (one-tailed test).
H₀: no conditional heteroskedasticity exists
Hₐ: conditional heteroskedasticity exists
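A numpy sketch of the auxiliary-regression calculation, assuming a matrix of regressors and a residual vector are already available (function name illustrative):

```python
import numpy as np

def breusch_pagan(x, resid):
    """BP = n * R^2, where R^2 comes from the auxiliary regression of
    the squared residuals on the independent variables in x."""
    n = len(resid)
    z = np.column_stack([np.ones(n), x])          # add an intercept
    y = resid ** 2                                # squared residuals
    beta, *_ = np.linalg.lstsq(z, y, rcond=None)  # auxiliary OLS fit
    fitted = z @ beta
    r2 = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
    return n * r2  # compare to chi-squared with k degrees of freedom
```

Since R² lies between 0 and 1, the statistic lies between 0 and n; a large value indicates the squared residuals are explained by the regressors, i.e. conditional heteroskedasticity.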
Durbin-Watson test
What is it? What are its degrees of freedom? What are its conclusions?
A form of hypothesis test that determines whether a model exhibits first-order serial correlation.
The test statistic is compared to lower and upper critical values, dₗ and dᵤ, which depend on n and k; DW ≈ 2(1 - r), where r is the correlation between consecutive residuals.
H₀: no positive first-order serial correlation (DW = 2)
Hₐ: positive first-order serial correlation (DW < 2)
If DW < dₗ, reject H₀: the model exhibits positive serial correlation.
If DW > dᵤ, fail to reject H₀; if dₗ ≤ DW ≤ dᵤ, the test is inconclusive.
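The statistic itself is the sum of squared successive residual differences over the sum of squared residuals; a numpy sketch (function name illustrative):

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum((e_t - e_(t-1))^2) / sum(e_t^2); approximately
    2 * (1 - r), so values near 2 suggest no serial correlation."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
```

Residuals that keep the same sign push DW below 2; residuals that alternate sign push it above 2.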
Breusch-Godfrey test
What is it? What are its degrees of freedom? What does it conclude?
A form of hypothesis test that determines whether a model exhibits serial correlation up to an order p.
The test is F-distributed with p and n - p - k - 1 degrees of freedom.
H₀: no pth-order serial correlation exists
Hₐ: pth-order serial correlation exists
Variance inflation factor
What is it? How is it calculated? What are its conclusions?
A figure representing the degree to which the variance of an estimated slope coefficient is inflated by multicollinearity with the other independent variables.
VIF = 1/(1 - R²), where R² is from regressing the independent variable in question on the remaining independent variables
VIF > 5: Further investigation into the independent variable is warranted
VIF > 10: A serious multicollinearity issue is present with regard to the independent variable
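A numpy sketch of the calculation for one column of a regressor matrix, assuming the data are already in an (n, k) array (function name illustrative):

```python
import numpy as np

def vif(x, j):
    """VIF for column j of x: 1 / (1 - R^2), where R^2 comes from
    regressing x[:, j] on the remaining columns plus an intercept."""
    y = x[:, j]
    others = np.delete(x, j, axis=1)
    z = np.column_stack([np.ones(x.shape[0]), others])
    beta, *_ = np.linalg.lstsq(z, y, rcond=None)
    fitted = z @ beta
    r2 = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
    return 1 / (1 - r2)
```

An independent variable uncorrelated with the others has a VIF of 1, the minimum possible value; the figure grows without bound as the variable becomes a near-exact linear combination of the others.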
Maximum likelihood estimation
a.k.a. MLE
A method that estimates values for the intercept and slope coefficients in a logistic regression; the logit equivalent of ordinary least squares (OLS).
Likelihood ratio test
What is it and how is it calculated?
A joint test of hypotheses for a logit regression which uses the chi-squared distribution.
Log-likelihoods are negative, so a log-likelihood closer to 0 indicates a better fit; a large LR statistic leads to rejecting the restricted model in favor of the unrestricted model.
LR = -2(LLR - LLU), where LLR = log-likelihood of restricted model and LLU = log-likelihood of unrestricted model
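As a minimal sketch of the card's formula, with illustrative figures:

```python
def likelihood_ratio(ll_restricted, ll_unrestricted):
    """LR = -2 * (LL_restricted - LL_unrestricted); compared to a
    chi-squared distribution with q degrees of freedom, where q is the
    number of restrictions. Since LL_unrestricted >= LL_restricted,
    the statistic is always non-negative."""
    return -2 * (ll_restricted - ll_unrestricted)

# Illustrative figures: restricted logit LL = -120.5, unrestricted LL = -115.2
print(round(likelihood_ratio(-120.5, -115.2), 1))
```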
Time series
A set of observations on a variable measured over successive periods of time; in a simple trend regression, time serves as the independent variable.
Studentized residual
A t-statistic which is used to determine whether an observation is an outlier.
Trend
A long-term pattern of the dependent variable’s movement in a particular direction.
Leverage (regression)
What is it? How is it used to determine influence?
A measure of how distant an observation's value of the independent variables is from the values of the other observations; leverage ranges from 0 to 1, with higher values indicating greater potential influence on the regression.
If the leverage of an observation > 3[(k+1)/n], then the observation is potentially influential
Linear trend
A trend in which the dependent variable moves at a constant rate with respect to time.
Studentized deleted residual
What is it? How is it used to determine influence?
A figure that quantifies the effect of removing an observation from a regression on the residuals of that regression.
If |studentized deleted residual| > 3, the observation is an outlier
If |studentized deleted residual| > critical t-value with n-k-2 degrees of freedom at a selected significance level, the observation is potentially influential
Log-linear trend
A trend in which the dependent variable grows at a constant rate with respect to time (i.e., exponentially); estimated by regressing the natural logarithm of the dependent variable on time.
Autoregressive model
a.k.a. AR model
A type of time-series regression in which the dependent variable is modeled to be explained by previous values of itself.
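A numpy sketch of fitting the simplest case, an AR(1) model xₜ = b₀ + b₁xₜ₋₁, by ordinary least squares (function name illustrative):

```python
import numpy as np

def fit_ar1(series):
    """Fit the AR(1) model x_t = b0 + b1 * x_(t-1) by regressing each
    observation on the previous one (with an intercept)."""
    x = np.asarray(series, dtype=float)
    z = np.column_stack([np.ones(len(x) - 1), x[:-1]])  # intercept + lag
    (b0, b1), *_ = np.linalg.lstsq(z, x[1:], rcond=None)
    return b0, b1
```

Higher-order AR(p) models extend this by adding further lagged values of the variable as regressors.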