How does linear regression help decision makers?
What does linear regression do?
It analyses the linear relationship between two or more variables: it attempts to fit a straight line through the points on a chart of the dependent variable (DV) against the independent variable(s) (IVs).
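To make "fitting a straight line" concrete, here is a minimal sketch of ordinary least squares with a single IV. The function name `fit_line` and the data are made up for illustration; they are not from the online banking case, and SPSS does this calculation for you.

```python
def fit_line(xs, ys):
    """Least-squares line for one IV: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = co-variation of X and Y divided by the variation of X
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data that lie exactly on the line y = 1 + 2x
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```

With real data the points scatter around the line, and the fit picks the line that minimises the sum of squared vertical distances to the points.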
What measurement level is used in linear regression? (dependent variable)
Interval or ratio scales
What is the difference between a simple (bivariate) and a multiple linear regression?
Simple analyses the linear relationship between one DV and one IV
Multiple analyses the linear relationship between one DV and multiple IVs
What is the basic function of linear regression?
What does a conceptual chart of a multiple linear regression look like? (online banking case)
What is the first step to carrying out a linear regression (in SPSS)? (online banking case)
Select customer profitability as DV and other variables as IVs
What does this mean?
R-squared value: All five variables explain 5.7% of the variation in customer profitability
What is the difference between adjusted R square vs normal R square?
R squared is the share of the total variance in the DV that is explained by the IV(s). R squared never decreases when you add more variables, even irrelevant ones, so it tends to be inflated in models with many IVs. Adjusted R square corrects for the number of IVs and is the more conservative measure - USE ADJUSTED R SQUARE!
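The correction can be sketched with the standard adjusted R squared formula, 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of IVs. The R squared of 0.057 and the five IVs come from the case; the sample size of 1000 is an assumption, since the notes do not give it.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Penalise R squared for the number of IVs in the model."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


r2 = 0.057   # from the online banking output
n = 1000     # ASSUMPTION: the real sample size is not stated in the notes
adj = adjusted_r_squared(r2, n_obs=n, n_predictors=5)
print(adj)   # always a bit lower than r2 when there is at least one IV
```

The penalty grows with the number of IVs and shrinks with the sample size, so the gap between the two measures is largest for small samples with many variables.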
What can you tell from this?
That the model as a whole is statistically significant: you can reject the null hypothesis that no relationship exists between the DV and the IVs.
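The overall significance test is an F-test, and the F statistic can be written in terms of R squared. The sketch below uses that standard formula; the sample size is again an assumption, so the resulting number is illustrative and will not match the case output.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F statistic: explained variance per IV over unexplained variance per residual df."""
    explained = r_squared / n_predictors
    unexplained = (1 - r_squared) / (n_obs - n_predictors - 1)
    return explained / unexplained


# R squared and number of IVs from the case; n_obs=1000 is an ASSUMPTION
f_value = f_statistic(0.057, n_obs=1000, n_predictors=5)
print(f_value)
```

The p-value comes from the F distribution with (k, n - k - 1) degrees of freedom; SPSS reports it in the "Sig." column of the ANOVA table, so there is no need to look it up by hand.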
What is the difference between unstandardised and standardised beta?
Unstandardised β: change in Y (raw units) for a 1-unit increase in X, holding the other IVs constant (good for interpretability).
Standardised β: change in Y (SD units) for a 1 SD increase in X, holding the other IVs constant (good for comparing predictors).
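The two are linked by a simple rescaling: standardised β = unstandardised β × SD(X) / SD(Y). The sketch below uses made-up data with one IV (all names and numbers are hypothetical) to show why only the standardised version is comparable across predictors: changing the units of X changes the unstandardised β but not the standardised one.

```python
import statistics


def slope(xs, ys):
    """Unstandardised beta for a single IV (least squares)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def standardised_beta(b, xs, ys):
    """Rescale an unstandardised beta into SD units."""
    return b * statistics.stdev(xs) / statistics.stdev(ys)


# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b = slope(xs, ys)
beta = standardised_beta(b, xs, ys)

# Same IV measured in units 1000 times smaller
xs_rescaled = [x * 1000 for x in xs]
b_rescaled = slope(xs_rescaled, ys)
beta_rescaled = standardised_beta(b_rescaled, xs_rescaled, ys)

print(b, b_rescaled)        # unstandardised betas differ by a factor of 1000
print(beta, beta_rescaled)  # standardised betas are the same
```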
What can you tell about the significance of the coefficients?
All regression coefficients are statistically significant except “district”.
(So all the other variables make a statistically significant unique contribution to explaining the DV.)
Knowing “district” is insignificant, what should we do?
A simpler model (fewer variables) is preferable when it explains the DV just as well. Since district code is insignificant, you can remove it and run the regression again! Here, doing so does not change the R squared, so the reduced model is the better one.
When removing “district” from the linear regression, what should you check?
That the R squared value did not change. It is possible that it gets worse; in that case, don’t remove the variable!
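A small sketch of this check, using the standard adjusted R squared formula. The R squared of 0.057 with five IVs is from the case; the sample size and the "worse" R squared of 0.050 are assumptions chosen only to illustrate the two possible outcomes.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Penalise R squared for the number of IVs in the model."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


N = 1000  # ASSUMPTION: sample size is not given in the notes

full = adjusted_r_squared(0.057, N, n_predictors=5)

# Outcome A: R squared unchanged after dropping one IV -> adjusted R squared goes up
reduced_same = adjusted_r_squared(0.057, N, n_predictors=4)

# Outcome B (hypothetical): R squared drops noticeably -> adjusted R squared goes down
reduced_worse = adjusted_r_squared(0.050, N, n_predictors=4)

print(full, reduced_same, reduced_worse)
```

In outcome A the reduced model explains just as much with one variable fewer, so dropping the variable is justified; in outcome B the removed variable was carrying explanatory power and should stay.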
What can you tell from the estimated equation? (about the profitability)
When a customer uses the online (as compared to offline) channel, the predicted profitability is higher by 18.240, holding the other variables constant.
What can you tell from the estimated equation? (about age)
When a customer’s age increases by one unit, the predicted profitability increases by 18.279, holding the other variables constant.
HOWEVER, WATCH OUT!
Always look at how variables are coded: here, age increasing by 1 unit means going up one age group, not getting one year older!
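The interpretation of both coefficients can be sketched as follows. The two coefficients are the ones quoted above; the intercept is a placeholder and the remaining IVs are left out, because the notes do not give them. The dummy coding 1 = online, 0 = offline is an assumption.

```python
B_ONLINE = 18.240     # coefficient for the online channel (from the notes)
B_AGE_GROUP = 18.279  # coefficient for age, coded as age GROUP (from the notes)
INTERCEPT = 0.0       # PLACEHOLDER: the real intercept is not given in the notes


def predicted_profit(online, age_group):
    """Partial prediction: other IVs are omitted, i.e. held constant."""
    return INTERCEPT + B_ONLINE * online + B_AGE_GROUP * age_group


# Moving up one age group (e.g. code 2 -> code 3), same channel
age_effect = predicted_profit(online=1, age_group=3) - predicted_profit(online=1, age_group=2)

# Switching from offline (0) to online (1), same age group
online_effect = predicted_profit(online=1, age_group=2) - predicted_profit(online=0, age_group=2)

print(age_effect, online_effect)
```

Both differences are independent of the intercept, which is why the coefficients can be interpreted without it. What the "unit" means is decided entirely by the coding of the variable, so always check the codebook first.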