What is the core problem in statistics about never knowing the truth?
We do not know whether a hypothesis is actually true or false (e.g., whether Jinder’s coin is fair or not); we only observe sample data and make a decision based on those data.
What does statistics provide a framework for?
Statistics therefore provides a framework for reasoning and making decisions under uncertainty, not a way of directly discovering the truth.
In statistics, two things exist simultaneously. What are they?
1) The real truth about the hypothesis (which we cannot observe).
2) Our decision based on the data (reject or not reject H₀). Because we do not know the truth, our decision can be correct or incorrect. Rejecting the null hypothesis does not prove that it is false, and failing to reject it does not prove that it is true.
What is a type 1 error?
Type I error (false positive): the null hypothesis is true, but we reject it.
What is a type II error?
Type II error (false negative): the null hypothesis is false, but we do not reject it.
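The four possible combinations of truth and decision can be written out as a small lookup. This is only an illustrative sketch: the function name `classify_decision` and its arguments are hypothetical, and in real analyses `null_is_true` is never known (it is only available in simulations where we choose the truth ourselves).

```python
def classify_decision(null_is_true: bool, null_rejected: bool) -> str:
    """Label a test decision, given the (normally unobservable) truth about H0."""
    if null_is_true and null_rejected:
        return "Type I error (false positive)"
    if not null_is_true and not null_rejected:
        return "Type II error (false negative)"
    # Remaining cases: true H0 not rejected, or false H0 rejected.
    return "correct decision"


for truth in (True, False):
    for rejected in (True, False):
        print(f"H0 true={truth}, rejected={rejected}: {classify_decision(truth, rejected)}")
```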
True or false: statistical testing guarantees correct conclusions
FALSE: statistical testing cannot guarantee correct conclusions. Because the decision rests on sample data, both Type I and Type II errors remain possible.
What does it mean when the p-value is large?
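A large p-value means the observed data are compatible with the null hypothesis, so we do not reject it. This is not proof that the null hypothesis is true.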
What does it mean when the p-value is small?
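A small p-value (at or below 𝛼) means the observed data would be surprising if the null hypothesis were true. The result is called statistically significant, and we reject the null hypothesis.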
Is statistical testing qualitative or quantitative?
Statistical hypothesis testing is a quantitative inference framework.
What does statistical hypothesis testing evaluate?
It evaluates how compatible the data are with an assumed model = the null hypothesis (H₀)
–> Core idea: we evaluate how surprising the observed data would be if the null hypothesis (H₀) were true.
How does one decide on when to reject the null hypothesis?
The significance level, denoted by 𝛼 (alpha), is the threshold we set before analyzing the data to decide how much incompatibility with the null model we are willing to tolerate before rejecting it.
What does estimation ask and what does hypothesis testing ask?
1) Estimation asks - How large is the effect?
2) Hypothesis testing asks - Is there any effect at all?
Why do people prefer statistical hypothesis testing over estimation?
What does statistical hypothesis testing focus on?
Statistical hypothesis testing does not focus on the exact proportion value, but on whether there is evidence that the proportion differs from a specified value
What is the P-value?
The p-value is the probability, calculated under the assumed null hypothesis (H₀), of observing a value of the test statistic (θ) as extreme as, or more extreme than, the one actually observed
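As a worked illustration of this definition, here is a minimal sketch of an exact two-sided p-value for the coin example. It assumes H₀ is "the coin is fair", uses the number of heads as the test statistic, and treats "more extreme" as "at least as far from half the flips as the observed count". The function name `coin_p_value` is made up for this example.

```python
from math import comb


def coin_p_value(heads: int, flips: int) -> float:
    """Two-sided exact p-value for H0: the coin is fair (P(heads) = 0.5)."""
    observed_distance = abs(heads - flips / 2)
    # Under H0 every sequence of flips is equally likely, so the p-value is
    # the share of sequences whose head count is at least as far from
    # flips / 2 as the observed head count.
    extreme_sequences = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= observed_distance
    )
    return extreme_sequences / 2 ** flips


print(coin_p_value(60, 100))  # 60 heads in 100 flips
print(coin_p_value(50, 100))  # exactly half heads: nothing is less extreme
```

With 60 heads in 100 flips the p-value comes out at about 0.057, so at 𝛼 = 0.05 this result alone would not lead to rejecting H₀.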
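Why are we the ones who set the Type I error rate (α) but not the Type II error rate (β)?
Because α is defined from the probability (sampling) distribution that assumes H₀ is true, a distribution we can construct ourselves, we can fix it in advance. It represents the probability of committing a Type I error (a false positive); that is, rejecting H₀ when H₀ is actually true. β, by contrast, depends on the true effect size, which is unknown.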
What are the highlighted tails of the t-distribution related to statistical hypothesis testing?
The highlighted tails are the rejection regions: values of the test statistic that are rare if the null hypothesis is true. Their total area equals α. An observed value that falls there suggests that there might be an effect, so you reject the null hypothesis.
In statistical hypothesis testing, we construct the sampling distribution under the assumption that…
…the null hypothesis (H₀) is true.
–> This means that all values in that distribution, including the observed sample value, are outcomes that could occur if H₀ were true.
Where must a value lie on the distribution such that we reject the null hypothesis?
If the observed sample value lies in a region of the distribution that is sufficiently unlikely under H₀, we conclude that the result is improbable under the null model and reject H₀.
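To make the idea of a rejection region concrete, the sketch below lists which head counts would lead to rejecting H₀ in the coin example. It assumes an exact two-sided test of "the coin is fair"; the names `rejection_region` and `type_one_error_rate` are hypothetical helpers, not part of any library.

```python
from math import comb


def rejection_region(flips: int, alpha: float = 0.05) -> list[int]:
    """Head counts whose two-sided exact p-value under a fair coin is <= alpha."""
    total_sequences = 2 ** flips
    region = []
    for heads in range(flips + 1):
        distance = abs(heads - flips / 2)
        # p-value: share of outcomes at least as far from flips / 2 as `heads`.
        extreme = sum(
            comb(flips, k)
            for k in range(flips + 1)
            if abs(k - flips / 2) >= distance
        )
        if extreme / total_sequences <= alpha:
            region.append(heads)
    return region


def type_one_error_rate(flips: int, alpha: float = 0.05) -> float:
    """Probability, under a fair coin, that the head count lands in the region."""
    region = rejection_region(flips, alpha)
    return sum(comb(flips, k) for k in region) / 2 ** flips


region = rejection_region(100)
print(min(k for k in region if k > 50))  # smallest head count above 50 that rejects
print(type_one_error_rate(100))
```

Because head counts are discrete, the probability of landing in the region under H₀ comes out at or below 𝛼 rather than exactly equal to it.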
What is the protection against incorrectly rejecting a true null hypothesis (a Type I error) determined by?
The protection against incorrectly rejecting a true null hypothesis (a Type I error) is determined by the chosen alpha level (the significance level).
How can you reduce the probability of failing to reject a false null hypothesis (Type II error)?
Reducing it typically requires increasing statistical power, often by increasing the sample size.
When does a Type I error occur?
A Type I error occurs when a true null hypothesis is incorrectly rejected (i.e., rejecting the null hypothesis when it should not be rejected). Its probability is the significance level (α), which is determined by us and remains unaffected by the sample size (n)
When does a Type II error occur?
A Type II error occurs when a false null hypothesis is not rejected (i.e., not rejecting the null hypothesis when it should have been rejected). Its probability is β, which is more complex to estimate (advanced stats). This probability decreases as the sample size increases.
What is the power of a test (1 - β)?
The power of a test (1 - β) is the probability of correctly rejecting the null hypothesis when it is truly false. This probability increases as the sample size grows.
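The relationship between sample size, α, β, and power can be sketched numerically. The example below assumes a simple setting that the notes do not specify: a one-sided z-test of H₀: mean = 0 against mean > 0 with a known standard deviation. The function name `z_test_power` and the effect size of 0.5 are hypothetical choices for illustration.

```python
from statistics import NormalDist


def z_test_power(effect: float, sd: float, n: int, alpha: float = 0.05) -> float:
    """P(reject H0) for a one-sided z-test of H0: mean = 0 vs. mean > 0."""
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)  # cutoff chosen under H0
    shift = effect * n ** 0.5 / sd  # centre of the z statistic when the true mean is `effect`
    return 1 - std_normal.cdf(critical_z - shift)


for n in (10, 20, 40, 80):
    type_one = z_test_power(effect=0.0, sd=1.0, n=n)  # H0 true: rejection rate is alpha
    power = z_test_power(effect=0.5, sd=1.0, n=n)     # H0 false: rejection rate is the power
    print(f"n={n:3d}  Type I rate={type_one:.3f}  power={power:.3f}  beta={1 - power:.3f}")
```

In this setting the Type I rate stays at 𝛼 for every n, while power rises and β falls as n grows.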