Predicted Values
Values of the dependent variable calculated from the estimated regression coefficients and an assumed value of the independent variable.
\hat(Y) = \hat(b0) + \hat(b1)*X
Confidence interval
An interval estimate around a predicted value of the dependent variable.
Standard error of the estimate (SEE)
An estimated regression does not describe the relationship between the dependent and independent variables perfectly.
The SEE is the standard deviation of the error term.
Confidence interval of Predicted Y
\hat(Y) +- (t * s_f)
Where
s_f^2 = SEE^2 [1 + 1/n + (X - \bar(X))^2 / ((n-1) * s_x^2)]
Degrees of freedom is n - 2
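A quick numerical sketch of the interval on a small hypothetical data set (the critical value 3.182 is the two-tailed 5% t value with 3 degrees of freedom, taken from a t table):

```python
import math

# Hypothetical data set
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n

# OLS fit: slope from Cov(X, Y) / Var(X), then intercept
s_xx = sum((x - x_bar) ** 2 for x in X)  # equals (n - 1) * s_x^2
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / s_xx
b0 = y_bar - b1 * x_bar

# SEE from the residual sum of squares, df = n - 2
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
see = math.sqrt(sse / (n - 2))

# Standard error of the forecast at X = 4, then the interval
x0 = 4
y_hat = b0 + b1 * x0
s_f = see * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
t_crit = 3.182  # two-tailed 5%, df = n - 2 = 3
lower, upper = y_hat - t_crit * s_f, y_hat + t_crit * s_f
```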
Functional forms
When the relationship between X and Y is not linear, fitting a linear model would result in biased predictions.
Natural log transformation examples.
Log-lin: Log(Y) = b0 + b1*X
Lin-log: Y = b0 + b1*Log(X)
Log-log: Log(Y) = b0 + b1*Log(X)
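A sketch of the log-lin case: with hypothetical data where Y grows roughly like e^X, regressing Log(Y) on X recovers a near-perfect line with slope close to 1, and predictions come back to the original scale via exp().

```python
import math

def ols(xs, ys):
    """Return (b0, b1) for a simple OLS fit of ys on xs."""
    n = len(xs)
    xb, yb = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - xb) * (y - yb) for x, y in zip(xs, ys))
          / sum((x - xb) ** 2 for x in xs))
    return yb - b1 * xb, b1

X = [1, 2, 3, 4, 5]
Y = [2.7, 7.4, 20.1, 54.6, 148.4]  # roughly e^X, so Log(Y) is linear in X

# Log-lin: regress Log(Y) on X; the slope should be close to 1
b0, b1 = ols(X, [math.log(y) for y in Y])

# Undo the transformation with exp() to predict on the original scale
pred = math.exp(b0 + b1 * 3)
```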
Simple linear regression
Explains the variation in a dependent variable in terms of the variation in a single independent variable.
Dependent variable
The variable whose variation is explained by the independent variable. Sometimes also referred to as the explained variable, endogenous variable, or the predicted variable.
Independent variable
The variable used to explain the variation of the dependent variable. Sometimes referred to as the explanatory variable, exogenous variable, or the predicting variable.
Ordinary Least Squares (OLS) formula
Y = b0 + b1X + e
Regression Coefficients
b1 = Cov(X,Y)/\sigma_X^2
b0 = \bar(Y) - \hat(b1)*\bar(X)
Assumptions of Linear Regression
1) There is a linear relationship between the dependent and independent variables.
2) Variance of the error terms is constant (homoskedasticity)
3) Error terms are independently distributed.
4) Error terms are normally distributed.
Homoskedasticity
The case where prediction errors all have the same constant variance
Heteroskedasticity
The variance of the error terms is not constant.
Analysis of Variance (ANOVA)
SSE + SSR = SST
SST = \sum(Y -\bar(Y))^2
SSE = \sum(Y -\hat(Y))^2
SSR = \sum(\hat(Y) -\bar(Y))^2
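The decomposition can be verified numerically on a toy OLS fit (hypothetical data):

```python
# Hypothetical data set and OLS fit
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar
Y_hat = [b0 + b1 * x for x in X]

sst = sum((y - y_bar) ** 2 for y in Y)               # total variation
sse = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))  # unexplained
ssr = sum((yh - y_bar) ** 2 for yh in Y_hat)         # explained
# SSE + SSR == SST (up to floating-point rounding)
```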
Mean Square Regression (MSR)
MSR = SSR/k
Mean Square Error (MSE)
MSE = SSE/(n-k-1)
Coefficient of Determination (R^2)
R^2 measures the percentage of total variation in Y explained by the variation in X: R^2 = SSR/SST
Standard Error of the Estimate (SEE)
Measures the accuracy of predicted values from the regression equation.
SEE = (SSE/(n-2))^.5 = MSE^.5
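Both summary measures fall out of the same ANOVA pieces. A sketch on hypothetical data (k = 1 for simple regression, so n - k - 1 = n - 2):

```python
import math

# Hypothetical data set and OLS fit
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar
Y_hat = [b0 + b1 * x for x in X]

sst = sum((y - y_bar) ** 2 for y in Y)
ssr = sum((yh - y_bar) ** 2 for yh in Y_hat)
sse = sst - ssr

r2 = ssr / sst                  # share of variation in Y explained by X
see = math.sqrt(sse / (n - 2))  # standard deviation of the residuals
```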
The F-Statistic
Tests whether the independent variables explain the variation in the dependent variable.
Notes:
1) One-tailed test
2) Degrees of freedom: k (numerator) and n - k - 1 (denominator)
F-Statistic Formula
F = MSR/MSE = (SSR/k)/(SSE/(n-k-1))
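Computed on a toy fit (hypothetical data, k = 1):

```python
# Hypothetical data set and OLS fit
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n, k = len(X), 1
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar
Y_hat = [b0 + b1 * x for x in X]

ssr = sum((yh - y_bar) ** 2 for yh in Y_hat)
sse = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))
f_stat = (ssr / k) / (sse / (n - k - 1))  # MSR / MSE
```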
Regression Coefficient t-Test
t = (\hat(b1) - b1)/s_b1
where,
s_b1 = SEE/(\sum(X - \bar(X))^2)^.5
Degrees of freedom is n - 2
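A sketch of the t-statistic for testing the null b1 = 0 (hypothetical data); note that in simple regression t^2 equals the F-statistic:

```python
import math

# Hypothetical data set and OLS fit
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
s_xx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / s_xx
b0 = y_bar - b1 * x_bar

# SEE from the residuals, then the standard error of the slope
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
see = math.sqrt(sse / (n - 2))
s_b1 = see / math.sqrt(s_xx)

t_stat = (b1 - 0) / s_b1  # test H0: b1 = 0, df = n - 2
```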