statistical inference
why do we use random sampling?
what are the three main explanations for any observed effect?
bias vs confounding
bias is a systematic error in the design or implementation of a study that creates an association which is not true
confounding is an association that is true, but potentially misleading
hypothesis testing
involves choosing between 2 propositions: the null hypothesis (no effect) and the alternative hypothesis (there is an effect)
we are looking to “reject” the null hypothesis (we want to show that the observed effect is greater than what we would expect based on chance alone)
the null hypothesis states that there is no effect in the entire population (the one you are trying to represent); if it is true, any effect observed in the sample arose by chance alone
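a minimal sketch of "greater than what we would expect based on chance alone", using a permutation test for a difference in means. the permutation test is one illustrative choice, not the only way to test a hypothesis, and the function name and the data are hypothetical:

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Under the null hypothesis the group labels are interchangeable, so we
    shuffle the pooled values and count how often a difference at least as
    large as the observed one appears by chance.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations

# Hypothetical measurements, made up for illustration only.
treated = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 6.1, 5.7]
control = [4.2, 4.8, 4.5, 5.0, 4.1, 4.7, 4.4, 4.9]

p = permutation_p_value(treated, control)
print(f"p = {p:.4f}")
print("reject the null hypothesis" if p < 0.05 else "fail to reject the null hypothesis")
```

a small p means the observed difference would rarely arise from shuffling labels alone, which is the evidence used to reject the null hypothesis.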
P value
P value and significance
P > 0.05
confidence intervals
null value for difference in means? relative risk? odds ratio?
difference in means = 0
RR = 1
OR = 1
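a rough sketch of how the null value is used with a confidence interval for RR and OR from a 2x2 table. it assumes the common Wald-type interval on the log scale with z = 1.96, and the counts and function names are hypothetical:

```python
from math import exp, log, sqrt

Z_95 = 1.96  # assumed normal critical value for a 95% interval

def relative_risk(a, b, c, d):
    """a, b = exposed with/without outcome; c, d = unexposed with/without outcome."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr, exp(log(rr) - Z_95 * se_log), exp(log(rr) + Z_95 * se_log)

def odds_ratio(a, b, c, d):
    """Same table layout as relative_risk."""
    odds = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds, exp(log(odds) - Z_95 * se_log), exp(log(odds) + Z_95 * se_log)

def includes_null(lower, upper, null_value):
    """True when the interval contains the null value (no significant effect)."""
    return lower <= null_value <= upper

# Hypothetical counts: 30/100 exposed and 15/100 unexposed have the outcome.
rr, rr_lo, rr_hi = relative_risk(30, 70, 15, 85)
or_, or_lo, or_hi = odds_ratio(30, 70, 15, 85)

print(f"RR = {rr:.2f} (95% CI {rr_lo:.2f} to {rr_hi:.2f}), includes 1: {includes_null(rr_lo, rr_hi, 1)}")
print(f"OR = {or_:.2f} (95% CI {or_lo:.2f} to {or_hi:.2f}), includes 1: {includes_null(or_lo, or_hi, 1)}")
```

for ratios the interval is checked against 1; for a difference in means the same check is made against 0.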
confidence intervals around differences between groups?
confidence interval vs P-value?
confidence interval is more informative than a p-value
in addition to statistical significance (given by the P value), the CI also gives you an idea of how large or how small the effect is likely to be
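a small sketch of the point above: the same data give both a P value and a CI, but only the CI shows the plausible size of the effect. it assumes a normal (z) approximation, which is a simplification for small samples where a t-based interval is usual, and the data and function name are hypothetical:

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def diff_in_means_summary(group_a, group_b, confidence=0.95):
    """Return (difference, (ci_low, ci_high), two-sided p) using a z approximation."""
    diff = mean(group_a) - mean(group_b)
    # Standard error of the difference between two independent means.
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    p_value = 2 * (1 - NormalDist().cdf(abs(diff) / se))
    return diff, ci, p_value

# Hypothetical measurements, made up for illustration only.
group_a = [128, 135, 121, 140, 132, 138, 127, 133, 136, 129]
group_b = [122, 126, 118, 131, 125, 129, 120, 124, 127, 123]

diff, ci, p_value = diff_in_means_summary(group_a, group_b)
print(f"difference in means = {diff:.1f}")
print(f"95% CI = {ci[0]:.2f} to {ci[1]:.2f}")
print(f"p = {p_value:.4f}")
```

in this sketch the 95% CI excludes 0 exactly when p < 0.05, because both are built from the same standard error and critical value.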
what are the two types of quantitative variables?
continuous (measurement) and discrete
what are the two types of qualitative variables?
nominal and ordinal
what type of variable is age?
continuous (quantitative)
what type of variable is a score of 1-5?
discrete (quantitative) (it is a number, but not continuous)
what type of variable is sex?
nominal (qualitative)
what type of variable is age category?
ordinal (qualitative) (some ordering of things)
what types of variables are categorical?
discrete, nominal, ordinal
what are dichotomous variables?
a type of categorical variable that is binary (e.g., has the outcome or not)
what descriptive statistics can be used for continuous variables?
- variance, range (for spread or distribution)
what inference/hypothesis testing can be used for continuous variables?
variance and standard deviation are what?
statistics that tell you how tightly data are clustered around the mean
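a minimal illustration with two hypothetical data sets that share the same mean but differ in spread. it uses the sample variance (dividing by n - 1), which is the default in Python's statistics module:

```python
from statistics import mean, stdev, variance

# Hypothetical values, chosen so both sets have a mean of 10.
tight = [9.0, 10.0, 10.0, 10.0, 11.0]
wide = [2.0, 6.0, 10.0, 14.0, 18.0]

for name, data in (("tight", tight), ("wide", wide)):
    # variance = average squared distance from the mean (n - 1 denominator)
    # standard deviation = square root of the variance, in the data's own units
    print(f"{name}: mean={mean(data):.1f} variance={variance(data):.2f} sd={stdev(data):.2f}")
```

the smaller the variance and standard deviation, the more tightly the values cluster around the mean.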
standard error
the standard error of the mean tells us how VARIABLE sample means are likely to be from one sample to the next (if you were to do repeated sampling)
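a sketch of the repeated-sampling idea: draw many samples from a hypothetical normal population, then compare the spread of the sample means with the usual formula SE = SD / sqrt(n). the population parameters, sample size, and function name are assumptions made for illustration:

```python
import random
from math import sqrt
from statistics import mean, stdev

def standard_error(sample):
    """Estimate the standard error of the mean from a single sample."""
    return stdev(sample) / sqrt(len(sample))

rng = random.Random(42)
POP_MEAN, POP_SD = 100.0, 15.0  # hypothetical population
N, N_SAMPLES = 25, 2000         # sample size and number of repeated samples

# Repeated sampling: record the mean of each sample.
sample_means = []
for _ in range(N_SAMPLES):
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(mean(sample))

empirical_se = stdev(sample_means)   # how variable the sample means actually were
theoretical_se = POP_SD / sqrt(N)    # what the formula predicts

one_sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
print(f"spread of sample means: {empirical_se:.2f}")
print(f"formula SD / sqrt(n):   {theoretical_se:.2f}")
print(f"estimate from 1 sample: {standard_error(one_sample):.2f}")
```

the standard deviation describes the spread of individual values, while the standard error describes the spread of sample means, so it shrinks as the sample size grows.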