Module 8 - Regression Analysis Flashcards

Question

What does **SSE** stand for in regression analysis?

Answer 1

Sum of Squared Errors ## Footnote SSE provides a metric for the total amount of uncertainty surrounding the regression line.

Answer 2

TRUE ## Footnote Residuals can have both positive and negative values, so squaring them ensures they contribute positively to the sum.

Answer 3

The point where the line passes through the mean of the X and Y data ## Footnote It is calculated using the formula: \( \hat{a} = \bar{Y} - \hat{b}\,\bar{X} \).
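
As a rough illustration of the intercept formula on this card, the sketch below fits a line by ordinary least squares and checks that it passes through the point of means. The data values and the helper name `fit_line` are made up for the example and do not come from the module.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (a_hat, b_hat) for y = a_hat + b_hat * x by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b_hat = sxy / sxx
    # Intercept from the card's formula: a_hat = Y_bar - b_hat * X_bar.
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a_hat, b_hat = fit_line(xs, ys)

# The fitted line evaluated at the mean of X should return the mean of Y.
y_at_x_bar = a_hat + b_hat * mean(xs)
print(a_hat, b_hat, y_at_x_bar)
```

Because the intercept is defined from the two means, the fitted line always goes through the point (mean of X, mean of Y), whatever the slope turns out to be.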

Answer 4

Statistical uncertainty surrounding the equation ## Footnote It represents the residuals and indicates that the relationship is statistical rather than purely functional.

Answer 5

* It is independent of X * It has a probability distribution ## Footnote These assumptions are critical for the validity of the regression model.

Answer 6

To estimate the range within which the true parameter values lie ## Footnote Confidence intervals provide insight into the reliability of the predictions made by the model.

Answer 7

An error term that will always persist despite improvements in data collection and normalization ## Footnote It indicates that some level of error is inherent in the data.

Answer 8

* Independent of X * Normally distributed with a mean of zero and constant variance (homoscedasticity) ## Footnote These assumptions are crucial for the validity of Ordinary Least Squares regression.

Answer 9

The assumption that the error term is independent of X and has constant variance ## Footnote This means that the spread of the residuals should be constant across all levels of the independent variable.

Answer 10

To check the OLS assumptions by examining a residual plot ## Footnote It helps determine if the regression model is appropriate for the data.

Answer 11

* Fitted slope \( \hat{b} \) * y-intercept \( \hat{a} \) ## Footnote These coefficients are derived from minimizing the sum of squared errors (SSE).

Answer 12

TRUE ## Footnote The error term accounts for the variability in the data that cannot be explained by the model.

Answer 13

Symmetry about the x-axis with constant variability ## Footnote This indicates that a normal distribution with a mean of 0 is a reasonable assumption.

Answer 14

* Sum of Squares Total (SST) * Sum of Squared Errors (SSE) * Sum of Squares Regression (SSR) ## Footnote These measures help assess the goodness of fit of the regression model.

Answer 15

SST = SSE + SSR ## Footnote This equation shows how total variation is partitioned into explained and unexplained variation.
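
The partition on this card can be checked numerically. The sketch below uses a small made-up data set (not from the module), fits the least-squares line, and computes the three sums of squares directly from their definitions.

```python
from statistics import mean

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least-squares slope and intercept.
x_bar, y_bar = mean(xs), mean(ys)
b_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
a_hat = y_bar - b_hat * x_bar
y_hat = [a_hat + b_hat * x for x in xs]

sst = sum((y - y_bar) ** 2 for y in ys)                # total variation
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # unexplained variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation

print(f"SST={sst:.2f}  SSE={sse:.2f}  SSR={ssr:.2f}")
```

For this data the sums come out to about 6.0, 2.4, and 3.6, so SSE and SSR add up to SST. The identity relies on the line being the least-squares fit with an intercept; it need not hold for an arbitrary line.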

Answer 16

Total variation in Y ## Footnote It is calculated as the sum of the squared differences between each data point and the mean of the y-values.

Answer 17

Unexplained variation around the regression line ## Footnote It is calculated as the sum of the squared differences between each observed y-value and its corresponding predicted y-value.

Answer 18

Explained variation from the average to the regression line ## Footnote It is calculated as the sum of the squared differences between predicted y-values and the mean of the y-values.

Answer 19

* SST: n-1 d.f. * SSR: k-1 d.f. * SSE: n-k d.f. ## Footnote Degrees of freedom are crucial for determining the statistical significance of the regression model.

Answer 20

Larger average variation due to loss of degrees of freedom ## Footnote This reflects the trade-off between model complexity and explanatory power.

Answer 21

MSE = SSE / (n - k) ## Footnote Where SSE is the sum of squared errors, n is the number of data points, and k is the number of parameters.

Answer 22

The value of the denominator for each mean measure of variation ## Footnote It imposes a penalty for each new parameter introduced in the model.

Answer 23

Uncertainty in the estimated regression coefficients and the regression equation ## Footnote It indicates how much variability exists in the estimated values.

Answer 24

t-distribution ## Footnote This occurs when estimating the mean of a normally distributed population with a small sample size and unknown population standard deviation.

Answer 25

The square root of the Mean Squared Error (MSE) ## Footnote It provides a quantitative measure of the variability of an estimate or forecast made using the regression equation.

Answer 26

CV = SEE / mean of y-values ## Footnote It expresses the SEE as a percentage of the mean and indicates the percent error associated with an estimate.
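
The MSE, SEE, and CV cards chain together, so a short sketch may help show how one feeds the next. The function name `error_measures` and the input numbers are hypothetical, chosen only for illustration.

```python
import math

def error_measures(sse, n, k, y_bar):
    """Return (MSE, SEE, CV) from SSE, sample size n, parameter count k, and mean of y."""
    mse = sse / (n - k)    # mean squared error: SSE over its degrees of freedom
    see = math.sqrt(mse)   # standard error of the estimate
    cv = see / y_bar       # coefficient of variation, as a fraction of the mean
    return mse, see, cv

# Hypothetical inputs: SSE = 2.4 from 5 points, 2 parameters
# (slope and intercept), and a mean y-value of 4.
mse, see, cv = error_measures(sse=2.4, n=5, k=2, y_bar=4.0)
print(f"MSE={mse:.3f}  SEE={see:.3f}  CV={cv:.1%}")
```

With these inputs the CV is a little over 22%, meaning a typical estimate would be off by roughly that share of the average y-value.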

Answer 27

0.05 (5%) ## Footnote At this level, concluding that a cost driver is significant when it actually is not happens only 5% of the time.

Answer 28

H0: b = 0; H1: b ≠ 0 ## Footnote The null hypothesis states that there is no relationship, while the alternative suggests there is a significant relationship.

Answer 29

t = Estimated Coefficient / Standard Error ## Footnote This statistic is used to determine the statistical significance of the regression coefficients.
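
A minimal sketch of the test for the slope, assuming a one-variable regression. The standard error formula SE(b) = SEE / sqrt(Sxx) is the usual simple-regression result and is not stated on the cards; the input numbers and the function name are hypothetical.

```python
import math

def slope_t_statistic(b_hat, see, sxx):
    """t = b_hat / SE(b_hat), with SE(b_hat) = SEE / sqrt(Sxx) for a one-variable regression."""
    se_b = see / math.sqrt(sxx)
    return b_hat / se_b

# Hypothetical inputs: slope 0.6, SEE = sqrt(0.8), Sxx = 10, from n = 5 points.
t_stat = slope_t_statistic(b_hat=0.6, see=math.sqrt(0.8), sxx=10.0)

# Two-tailed critical value for alpha = 0.05 with n - k = 3 degrees of freedom,
# taken from a standard t-table.
T_CRITICAL = 3.182
reject_h0 = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Here the t statistic is about 2.12, below the critical value, so H0: b = 0 would not be rejected at the 5% level for this tiny sample.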

Answer 30

The probability that a relationship between x and y at least as strong as the one observed would arise if H0 is true ## Footnote A small p-value (less than the significance level) indicates statistical significance.

Answer 31

p-value ## Footnote A significance level of \( \alpha = 0.05 \) is used to determine statistical significance.

Answer 32

TRUE ## Footnote If the p-value is greater than 0.05, the coefficient is not statistically significant.

Answer 33

Regression analysis ## Footnote They help determine whether to accept the relationship in question.

Answer 34

The independent variable is a good predictor of the dependent variable ## Footnote They assess the significance of each independent variable.

Answer 35

Good model for regression as a whole ## Footnote They evaluate the overall significance of the regression model.

Answer 36

Same result ## Footnote This indicates the relationship is assessed consistently.

Answer 37

Indicator for goodness of fit ## Footnote It shows how much variability in the data is accounted for by the regression.

Answer 38

Better fit of the regression ## Footnote Values closer to 1.0 are preferred.

Answer 39

Not statistically significant ## Footnote It is greater than the significance level of 0.05.

Answer 40

Check the functional form ## Footnote This may involve analyzing residual plots for better models.

Answer 41

R² = explained variation / total variation ## Footnote It can also be expressed as R² = 1 - SSE/SST.

Answer 42

Sum of squares due to regression ## Footnote It measures the explained variation in the dependent variable.

Answer 43

Total sum of squares ## Footnote It measures the total variation in the dependent variable.

Answer 44

Sum of squares due to error ## Footnote It measures the unexplained variation in the dependent variable.

Answer 45

R² = SSR / SST ## Footnote For the toy problem, R² = 30/48 = 0.625.
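
The two forms of R² given on these cards can be checked against the toy-problem figures. The sketch below takes SSR = 30 and SST = 48 from the card and derives SSE from the identity SST = SSE + SSR; the variable names are illustrative only.

```python
# Figures quoted on the card: SSR = 30 and SST = 48.
ssr, sst = 30.0, 48.0
sse = sst - ssr                  # from SST = SSE + SSR

r2_explained = ssr / sst         # explained variation / total variation
r2_unexplained = 1 - sse / sst   # one minus the unexplained share

print(r2_explained, r2_unexplained)
```

Both expressions give the same value because they are algebraically equivalent once SSE is written as SST minus SSR.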

Answer 46

Strong relationship ## Footnote It suggests a good fit for the regression model.

Answer 47

Larger blue arrow ## Footnote It represents the total variability in the Y data.

Answer 48

Variability remaining after fitting the regression line ## Footnote It shows the variability in the Y data that is left after accounting for the regression.

Answer 49

Sum of Squares due to Regression ## Footnote SSR is a measure of the variation explained by the regression model.

Answer 50

Total Sum of Squares ## Footnote SST measures the total variation in the dependent variable.

Answer 51

R² ## Footnote R² indicates the proportion of variance in the dependent variable that can be explained by the independent variable.

Answer 52

FALSE ## Footnote Taking the square root of R² gives only the magnitude of the correlation coefficient \( r \); its sign matches the sign of the fitted slope, so \( r \) ranges from -1 to 1.

(76 cards)