Statistical Inference
Want to draw conclusions about a population when we only have sample data
Want to make a statement about a feature of the population
Steps for Hypothesis Testing
Type I Error
Rejecting a true null hypothesis (false positive).
The significance level (alpha) is the probability of making this error
Type II Error
Failing to reject a false null hypothesis (false negative)
Beta is the probability of making this error (power = 1 - beta)
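A minimal simulation sketch of both error rates, under assumptions that are not in the notes: a one-sample z-test of H0: mu = 0 with known sigma = 1, samples of size 30, alpha = 0.05, and a hypothetical true mean of 0.5 when H0 is false. The function names are illustrative only.

```python
import random
from statistics import NormalDist, fmean

def z_test_p_value(sample, null_mean=0.0, sigma=1.0):
    """Two-sided p-value for a one-sample z-test (sigma assumed known)."""
    z = (fmean(sample) - null_mean) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_mean, n=30, alpha=0.05, reps=5000, seed=1):
    """Fraction of simulated samples in which H0: mu = 0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < alpha:
            rejections += 1
    return rejections / reps

# H0 is true here, so every rejection is a Type I error (rate should be near alpha).
type_i_rate = rejection_rate(true_mean=0.0)

# H0 is false here, so every failure to reject is a Type II error (rate estimates beta).
type_ii_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Type I error rate:  {type_i_rate:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

Raising alpha in this sketch lowers the Type II rate and raises the Type I rate, which is the usual trade-off between the two errors.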
p-value
the probability of getting a test statistic as extreme or more extreme (in the direction of the alternative hypothesis), assuming the null hypothesis is true.
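One way to see this definition in action is a simulation-based p-value. The sketch below assumes hypothetical data (60 successes in 100 trials) and the hypotheses H0: p = 0.5 vs Ha: p > 0.5; none of these numbers come from the notes.

```python
import random
from math import comb

# Hypothetical data: 60 successes in n = 100 trials.
# H0: p = 0.5   vs   Ha: p > 0.5
n, observed_successes, null_p = 100, 60, 0.5

rng = random.Random(0)
reps = 10_000
as_extreme = 0
for _ in range(reps):
    # Simulate one sample of size n in a world where H0 is true.
    simulated = sum(rng.random() < null_p for _ in range(n))
    # "As extreme or more extreme" in the direction of Ha means >= observed.
    if simulated >= observed_successes:
        as_extreme += 1
simulated_p_value = as_extreme / reps

# Exact binomial tail probability for comparison (this shortcut is valid only for null_p = 0.5).
exact_p_value = sum(comb(n, k) for k in range(observed_successes, n + 1)) / 2 ** n

print(f"simulated p-value: {simulated_p_value:.4f}")
print(f"exact p-value:     {exact_p_value:.4f}")
```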
confidence intervals
use when we want to estimate a population parameter with a range of plausible values (choosing between 2 competing theories is the job of a hypothesis test)
Confidence interval steps
Confidence level interpretation
“If we gathered repeated random samples of the same size and calculated a CL% confidence interval for each, we would expect CL% of the resulting confidence intervals to contain the true parameter of interest.”
in the long run, we expect CL% of intervals built this way to capture the true parameter
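The repeated-sampling interpretation above can be checked with a short coverage simulation. This sketch assumes a hypothetical true proportion of 0.3, samples of size 200, and the normal-approximation (Wald) interval for one proportion; the helper name `wald_interval` is illustrative.

```python
import random
from statistics import NormalDist

def wald_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for one proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin

# Hypothetical population: the true proportion is known only because we are simulating.
true_p, n, reps = 0.3, 200, 2000
rng = random.Random(42)

covered = 0
for _ in range(reps):
    successes = sum(rng.random() < true_p for _ in range(n))
    low, high = wald_interval(successes, n)
    if low <= true_p <= high:
        covered += 1

coverage = covered / reps
print(f"share of 95% intervals containing the true p: {coverage:.3f}")
```

The share should land close to 0.95, though not exactly on it, because the Wald interval is only approximate and the simulation uses a finite number of samples.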
name scenario: 1 categorical
proportion (p)
name scenario: 1 quantitative
mean (mu)
name scenario: 2 categorical
proportion_1 - proportion_2
name scenario: 1 categorical and 1 quantitative
mean_1 - mean_2
name scenario: 2 quantitative
simple linear regression
name scenario: 3 categorical
multiple logistic regression
name scenario: 3 quantitative
multiple linear regression
confidence interval interpretation
“We are CL% confident that the true parameter of interest lies in (a, b).”
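As a sketch of how the template sentence gets filled in, the example below builds a 95% interval for proportion_1 - proportion_2. The counts (45 of 100 vs 30 of 100) are hypothetical, and the normal-approximation formula is one common choice rather than the only one.

```python
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """Normal-approximation interval for p1 - p2 from two independent samples."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = (p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2) ** 0.5
    diff = p1_hat - p2_hat
    return diff - z_star * std_error, diff + z_star * std_error

# Hypothetical counts: 45 successes of 100 in group 1, 30 of 100 in group 2.
low, high = two_proportion_interval(45, 100, 30, 100)

print(
    f"We are 95% confident that the true difference in proportions "
    f"(p1 - p2) lies in ({low:.3f}, {high:.3f})."
)
```

Because this interval excludes 0, it also suggests that the two population proportions differ.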
describe a simulated sampling distribution
Shape
Outliers
Center
Spread
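The four features above can each be given a rough numeric summary. This sketch simulates a sampling distribution of sample means, assuming a hypothetical right-skewed population (exponential with mean 1) and samples of size 40; the 1.5 x IQR rule for flagging outliers is one common convention.

```python
import random
from statistics import fmean, median, quantiles, stdev

rng = random.Random(7)
n, reps = 40, 3000

# Each entry is the mean of one simulated sample of size n.
sample_means = [
    fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

# Center and spread of the simulated sampling distribution.
center = fmean(sample_means)
spread = stdev(sample_means)  # estimates the standard error of the sample mean

# Shape: a mean noticeably above the median hints at right skew.
skew_hint = center - median(sample_means)

# Outliers: values beyond 1.5 * IQR from the quartiles.
q1, _, q3 = quantiles(sample_means, n=4)
iqr = q3 - q1
outliers = [m for m in sample_means if m < q1 - 1.5 * iqr or m > q3 + 1.5 * iqr]

print(f"center: {center:.3f}")
print(f"spread: {spread:.3f}")
print(f"mean - median: {skew_hint:.4f}")
print(f"outliers flagged: {len(outliers)}")
```

In practice a histogram of `sample_means` is the usual way to judge shape; the numeric hint here is only a stand-in.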
p-value interpretation
The p-value is the probability of observing a sample statistic as extreme as the one actually observed (or more extreme, in the direction of the alternative hypothesis), assuming the null hypothesis is true.