does correlation imply causation?
no, and some correlations are spurious and mean nothing at all
Confounding variables?
in observational studies, a variable that influences one or both of our variables of interest can give the appearance of a causal relationship (an unmeasured factor affecting the result)
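A small simulation can make confounding concrete. The sketch below uses made-up numbers, not data from any study: a hypothetical hidden variable z drives both x and y, which have no direct link to each other. The helper names pearson_r and residuals_on are invented for this example.

```python
import math
import random

def pearson_r(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def residuals_on(z, a):
    """Residuals of a after a least-squares fit of a on z (removes z's linear effect)."""
    mz, ma = sum(z) / len(z), sum(a) / len(a)
    slope = sum((u - mz) * (v - ma) for u, v in zip(z, a)) / sum((u - mz) ** 2 for u in z)
    return [v - (ma + slope * (u - mz)) for u, v in zip(z, a)]

rng = random.Random(0)                        # fixed seed so the run is repeatable
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]       # hidden confounder (assumed)
x = [zi + rng.gauss(0, 1) for zi in z]        # x depends only on z
y = [zi + rng.gauss(0, 1) for zi in z]        # y depends only on z

r_raw = pearson_r(x, y)                       # clearly positive despite no direct link
r_adjusted = pearson_r(residuals_on(z, x), residuals_on(z, y))  # close to zero
print(f"r(x, y) = {r_raw:.2f}, after adjusting for z = {r_adjusted:.2f}")
```

x and y look associated only because both follow z; once the linear effect of z is removed from each, the correlation largely disappears.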
Experimental artifacts?
in experimental studies, the experimental procedure itself can create conditions that affect the variables of interest, bias the measurements and modify the relationship
what are x and y in regression?
x is the explanatory variable, y is the response variable. correlation makes no such distinction: the two variables are treated symmetrically
does correlation yield a formula for predicting y from x?
no, but regression does
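The contrast can be sketched in a few lines. This is a minimal illustration on made-up data, with hypothetical helper names (pearson_r, fit_line): correlation returns a single number that is the same whichever variable comes first, while regression returns an intercept and slope that can be used to predict y from x.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation; swapping x and y gives the same value."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares intercept and slope for predicting y from x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    return my - slope * mx, slope

x = [1, 2, 3, 4, 5]        # explanatory variable (made-up values)
y = [2, 4, 5, 4, 5]        # response variable (made-up values)

r_xy = pearson_r(x, y)     # one number, no prediction formula
r_yx = pearson_r(y, x)     # identical: correlation has no x/y roles
b0, b1 = fit_line(x, y)    # a formula: y_hat = b0 + b1 * x
predicted = b0 + b1 * 6    # prediction for a new x value

print(f"r = {r_xy:.3f}, line: y_hat = {b0:.2f} + {b1:.2f} * x, y_hat(6) = {predicted:.2f}")
```

For these numbers the slope is 0.6 and the intercept is 2.2, so the fitted line predicts 5.8 at x = 6.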
what is correlation?
a statistical technique for measuring the strength and direction of the linear association between two numerical variables
what is a bivariate normal distribution?
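when the two variables are jointly normal: each has a normal distribution on its own, and every linear combination of them is also normally distributed (normal marginals alone are not enough)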
which 5 parameters describe the bivariate normal distribution?
mean and variance of Y1, mean and variance of Y2, and rho (ρ)
what is rho (ρ)?
the population correlation coefficient (strength and direction of the linear association, between -1 and 1)
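To see how the five parameters pin down the distribution, here is a rough sketch that draws pairs from a bivariate normal using two independent standard normal draws. The function name sample_bivariate_normal and all parameter values are assumptions chosen for illustration.

```python
import math
import random

def sample_bivariate_normal(mu1, var1, mu2, var2, rho, n, seed=0):
    """Draw n (y1, y2) pairs from a bivariate normal with the five given parameters."""
    rng = random.Random(seed)
    s1, s2 = math.sqrt(var1), math.sqrt(var2)
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)   # independent standard normals
        y1 = mu1 + s1 * z1
        # mixing z1 into y2 with weight rho gives the pair correlation rho
        y2 = mu2 + s2 * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        pairs.append((y1, y2))
    return pairs

def pearson_r(a, b):
    """Sample Pearson correlation, used here as an estimate of rho."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return sab / math.sqrt(sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b))

pairs = sample_bivariate_normal(mu1=10, var1=4, mu2=-3, var2=9, rho=0.7, n=5000)
y1 = [p[0] for p in pairs]
y2 = [p[1] for p in pairs]
r = pearson_r(y1, y2)       # sample estimate, should land near rho = 0.7
mean_y1 = sum(y1) / len(y1) # should land near mu1 = 10
print(f"sample r = {r:.2f}, sample mean of Y1 = {mean_y1:.2f}")
```

The sample correlation r is the estimate of the population parameter rho; with a large sample it should sit close to the value passed in.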
what is the difference between ANOVA and Linear regression?
in ANOVA the explanatory variable is categorical and the response variable is numerical
in linear regression both the explanatory and the response variable are numerical
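The difference shows up in the shape of the data. As a rough sketch with made-up numbers, one-way ANOVA takes numerical responses grouped by category label and compares between-group to within-group variation, whereas regression would take (x, y) pairs of numbers. The function name one_way_anova_f is hypothetical.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA; groups maps a category label to a list of numbers."""
    values = [v for g in groups.values() for v in g]
    grand_mean = sum(values) / len(values)
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
    # variation of the observations around their own group mean
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# ANOVA input: categorical explanatory variable (A, B, C), numerical response
anova_data = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 7]}
f_stat = one_way_anova_f(anova_data)

# regression input, for contrast: numerical x paired with numerical y
regression_data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

print(f"F = {f_stat:.1f}")
```

For these numbers the between-group sum of squares is 26 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, giving F = 13.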
what does the line of best fit estimate?
the relationship between the X (independent/explanatory) and Y (dependent/response) variables, estimated from our sample
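For reference, assuming the line is fitted by ordinary least squares (the usual method), the estimates can be written as follows. The symbols b_0 (intercept) and b_1 (slope) are notation introduced here, not taken from the notes.

```latex
\hat{y} = b_0 + b_1 x,
\qquad
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}
```

These values minimise the sum of squared residuals, and the fitted line always passes through the point of sample means.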
what are the characteristics of an ideal residual plot?
-horizontal band of points
-centred on zero
-no curvature (consistent with a linear relationship)
-constant width (no funnel shape, homoscedastic)
-no outliers
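Some of these checks can be approximated numerically before looking at a plot. The sketch below is a crude illustration on made-up data: it computes residuals from a least-squares line, confirms they are centred on zero, and compares the residual spread at small and large x. The spread_ratio helper is a hypothetical rule of thumb, not a standard test.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b1 = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    return my - b1 * mx, b1

def residuals(x, y):
    """Observed y minus fitted y for each point."""
    b0, b1 = fit_line(x, y)
    return [v - (b0 + b1 * u) for u, v in zip(x, y)]

def spread_ratio(x, res):
    """RMS residual in the larger-x half divided by RMS residual in the smaller-x half.

    Hypothetical rule of thumb: values far from 1 hint at a funnel shape.
    """
    ordered = [r for _, r in sorted(zip(x, res))]
    half = len(ordered) // 2
    rms = lambda vals: math.sqrt(sum(v * v for v in vals) / len(vals))
    return rms(ordered[half:]) / rms(ordered[:half])

x = list(range(1, 11))
signs = [1, -1] * 5
y_even = [2 * u + 0.5 * s for u, s in zip(x, signs)]        # constant noise size
y_funnel = [2 * u + 0.1 * u * s for u, s in zip(x, signs)]  # noise grows with x

res_even = residuals(x, y_even)
res_funnel = residuals(x, y_funnel)
ratio_even = spread_ratio(x, res_even)       # near 1: constant width
ratio_funnel = spread_ratio(x, res_funnel)   # well above 1: funnel shape
print(f"even: {ratio_even:.2f}, funnel: {ratio_funnel:.2f}")
```

Least-squares residuals always sum to zero when the line includes an intercept, so "centred on zero" refers to the band staying around zero across the whole range of x, which still needs a look at the plot itself.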
what are considered as outliers?
an observation of Y that is very different from all the others; it is often influential for the line of best fit
what can we do about outliers?
redo the analysis without the outliers and compare the results; to omit an outlier from the final analysis we need an independent reason for doing so
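The with-and-without comparison might look like this. It is a minimal sketch on made-up data in which the last point is deliberately placed far from the others; fit_line is a hypothetical helper.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b1 = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sum((u - mx) ** 2 for u in x)
    return my - b1 * mx, b1

x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 30]     # last value is made up to act as an outlier

b0_all, b1_all = fit_line(x, y)                  # fit using every point
b0_trim, b1_trim = fit_line(x[:-1], y[:-1])      # fit with the suspect point left out

print(f"with outlier:    y_hat = {b0_all:.2f} + {b1_all:.2f} * x")
print(f"without outlier: y_hat = {b0_trim:.2f} + {b1_trim:.2f} * x")
```

Here a single point more than doubles the slope (from 2 to about 4.57), which shows it is influential. That comparison alone does not justify dropping the point; the independent reason still has to come from outside the fit.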