What are the 5 steps in statistical hypothesis testing?
1) Transform the research hypothesis into a statistical question.
2) State the null and alternative hypotheses.
3) Compute the observed value of the metric of interest (the test statistic).
4) Compute the p-value.
5) Draw a conclusion by comparing the p-value against the significance level (α). If the p-value is greater than α, do not reject H0; if the p-value is less than or equal to α, reject H0.
In the following example, define the null and alternative hypotheses:
Example: Normal human body temperature, as kids are taught in North America, is 98.6°F. But how well is this supported by data?
1) H0 (null hypothesis): the mean human body temperature is 98.6°F.
2) HA (alternative hypothesis): the true population mean is different from 98.6°F.
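As a rough illustration, the five steps can be walked through for the body temperature example with a one-sample t-test. This is only a sketch under stated assumptions: the temperatures below are made-up values (not real data), α = 0.05 is an assumed significance level, and the helper names t_pdf and two_sided_p are hypothetical. The p-value is obtained by numerically integrating the t density so that only the Python standard library is needed.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the t density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3           # P(0 < T < |t|)
    return max(0.0, 1 - 2 * central_half)  # area in both tails

# Steps 1-2: H0: mu = 98.6, HA: mu != 98.6 (two-sided)
mu0 = 98.6
alpha = 0.05  # assumed significance level

# Hypothetical sample of body temperatures in degrees F (illustration only)
temps = [98.4, 98.6, 97.8, 98.8, 97.9, 98.2, 98.0, 98.9, 98.1, 98.3]

# Step 3: observed test statistic
n = len(temps)
xbar = statistics.mean(temps)
se = statistics.stdev(temps) / math.sqrt(n)  # standard error of the mean
t_obs = (xbar - mu0) / se

# Step 4: p-value from the t-distribution with n - 1 degrees of freedom
p_value = two_sided_p(t_obs, df=n - 1)

# Step 5: compare the p-value against alpha
decision = "reject H0" if p_value <= alpha else "do not reject H0"
print(f"t = {t_obs:.3f}, df = {n - 1}, p = {p_value:.4f} -> {decision}")
```

In practice a statistics package would normally supply the p-value; the hand-rolled integration here is only meant to keep the sketch self-contained.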
What are the effects of increasing sample size on hypothesis testing?
Increasing the sample size shrinks the standard error, so the sampling distribution of the mean becomes narrower. For the same observed difference, the p-value (the probability of observing a value equal to or more extreme than the one observed, assuming H0 is true) therefore becomes smaller.
–> Increasing the sample size likely provides more statistical power, so we may be able to reject the null hypothesis.
Can we say with 100% confidence that increasing sample size increased statistical power (precision)?
NO: Statistical power is defined relative to a true effect, and since the truth is unknown in real data, we cannot directly observe or confirm power; we can only infer that, if a real effect exists, a larger sample size would make it more likely to detect it.
What happens to p-values as sample size increases?
Increasing the sample size decreases the standard error, which makes the absolute value of t (the test statistic) larger for a given observed difference, which in turn leads to smaller p-values.
–> Smaller p-values make it more likely that the null hypothesis is rejected.
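The chain "larger n, smaller standard error, larger |t|" can be checked with a few lines of arithmetic. The sketch below holds a hypothetical sample mean and standard deviation fixed (both values are assumptions chosen for illustration) and varies only n, using SE = s / sqrt(n) and t = (mean - μ0) / SE.

```python
import math

mu0 = 98.6   # hypothesized mean under H0
xbar = 98.3  # hypothetical sample mean, held fixed
s = 0.4      # hypothetical sample standard deviation, held fixed

results = []
for n in (10, 40, 160):
    se = s / math.sqrt(n)    # standard error shrinks as n grows
    t = (xbar - mu0) / se    # same difference, divided by a smaller SE
    results.append((n, se, t))
    print(f"n = {n:3d}  SE = {se:.4f}  t = {t:.2f}")
```

Because the standard error scales with 1 / sqrt(n), quadrupling the sample size halves the standard error and doubles |t| when the mean and standard deviation stay the same.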
What is the power of a test (1 - β) and what happens to it as sample size increases?
It is the probability of rejecting the null hypothesis when it is truly false.
–> This probability increases as sample size increases.
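To make the link between power and sample size concrete, power can be computed exactly in a simplified setting. The sketch below is not the t-test: it assumes a two-sided z-test with a known population standard deviation, and the true effect size, standard deviation, and the function name z_test_power are all hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided z-test when the true mean differs from H0 by `effect`."""
    z = NormalDist()                      # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)     # rejection threshold for |Z|
    shift = effect * math.sqrt(n) / sigma # mean of Z when H0 is false
    # P(Z > z_crit) + P(Z < -z_crit) under the shifted distribution
    return z.cdf(shift - z_crit) + z.cdf(-z_crit - shift)

# Assumed true effect of 0.2 degrees F and known sigma of 0.4 (illustration only)
powers = {n: z_test_power(effect=0.2, sigma=0.4, n=n) for n in (10, 40, 160)}
for n, power in powers.items():
    print(f"n = {n:3d}  power = {power:.3f}")
```

Setting the effect to zero returns α, which matches the definition of the significance level as the rejection probability when H0 is true.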
What are the 2 important assumptions of the one-sample t-test?
1) The data are assumed to represent a random and independent sample from the population = unbiased
2) The variable of interest (e.g., human body temperature) is assumed to follow an approximately normal distribution in the population. This assumption is particularly important for small sample sizes, as it justifies the use of the t-distribution.
Scientific question: Does clear-cutting a forest affect the number of salamanders present? Is this a one-sample or two-sample hypothesis test, and why?
It is a two-sample test (a comparison of two sample means) because there are two treatments: clear-cutting and no clear-cutting (control).
In the following example, what are the 2 types of variables seen?
Scientific question: Does clear-cutting a forest affect the number of salamanders present?
1) Treatment is a categorical variable
2) Number of salamanders is a numerical variable.
Scientific question: Does clear-cutting a forest affect the number of salamanders present?
Two main alternative study designs affect the choice of statistical test. What are they?
1) In the two-sample design, each treatment group is composed of an independent, random sample of units.
2) In the paired design, both treatments are applied to every sampled unit (here, forest plots).
What is the advantage of a paired design when doing the comparison of 2 means?
The advantage of a paired design is that it minimizes the impact of variability among sampling units that is unrelated to the treatment (e.g., local environmental differences among observational units), thereby increasing the precision of the results. It also reduces the influence of confounding variables.
What is an example of a confounder variable that could be present in the following example?
Scientific question: Does clear-cutting a forest affect the number of salamanders present?
Notice that clear-cutting occurred more frequently in wet soils, whereas areas without clear-cutting were predominantly dry. If soil moisture plays a critical role for salamanders, this non-random distribution of sampling units could bias the results and affect the conclusions.
–> With a paired design, confounding effects are largely removed because a confounder such as soil moisture is likely to affect both treatments equally within each pair.
What is a confounding variable?
A confounding variable is a third variable, often unmeasured, that influences both the supposed cause and the supposed effect.
What are 5 other examples of paired study designs?
1) Comparing patient weight before and after hospitalization.
2) Comparing fish species diversity in lakes before and after heavy metal contamination.
3) Testing effects of sunscreen applied to one arm of each subject compared with a placebo applied to the other arm.
4) Testing effects of smoking in a sample of smokers, each of which is compared with a non-smoker closely matched by age, weight, and ethnic background.
5) Testing effects of socioeconomic condition on dietary preferences by comparing identical twins raised in separate adoptive families that differ in their socioeconomic conditions.
What is a previously seen example of paired design?
The speed of the same male spider measured before and after it self-amputates: each male provides both measurements, so this is a paired design.
What are H0 and HA in the following example?
Are males with high testosterone paying a cost for this extra mating success in other ways (trade-offs)? OR Is avian humoral immunocompetence (i.e., ability of the immune system to produce antibodies to defend against pathogens) suppressed by testosterone?
1) H0: The mean change in antibody production in the population after testosterone implants is zero.
2) HA: The mean change in antibody production in the population after testosterone implants is different from zero.
What is μd?
μd is the population mean difference between treatments.
What is the mean of the differences between paired observations in two samples equal to (property of means)?
The mean of the differences between paired observations is equal to the difference between the two sample means (this is a property of means).
The mean of the differences between paired observations is equal to the difference between the two sample means (this is a property of means): what does this property allow for?
This property allows us to analyze within-pair differences directly using a one-sample t-test, which properly accounts for the pairing and reduces variability.
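–> Importantly, if we instead treated the data as two independent samples, we would lose this pairing information. The variability among units would then inflate the standard error, which typically results in a less powerful test, even though the two-sample test has more degrees of freedom (2n - 2 rather than n - 1).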
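A small numerical sketch may help here. The measurements below are hypothetical (six units measured before and after a treatment, with large differences among units), chosen only to show that the mean of the within-pair differences equals the difference between the two means, and how the paired and two-sample t statistics can diverge on the same numbers.

```python
import math
import statistics

# Hypothetical paired measurements on the same six units (illustration only)
before = [10.0, 14.0, 18.0, 22.0, 26.0, 30.0]
after = [11.0, 14.5, 19.5, 23.0, 27.5, 31.5]
n = len(before)

# Property of means: mean of differences == difference of means
diffs = [a - b for a, b in zip(after, before)]
mean_of_diffs = statistics.mean(diffs)
diff_of_means = statistics.mean(after) - statistics.mean(before)

# Paired analysis: one-sample t-test on the differences against 0
se_paired = statistics.stdev(diffs) / math.sqrt(n)
t_paired = mean_of_diffs / se_paired
df_paired = n - 1

# Same data treated as two independent samples (pairing ignored)
se_two_sample = math.sqrt(
    statistics.variance(before) / n + statistics.variance(after) / n
)
t_two_sample = diff_of_means / se_two_sample
df_two_sample = 2 * n - 2

print(f"mean of differences = {mean_of_diffs:.4f}, difference of means = {diff_of_means:.4f}")
print(f"paired:     t = {t_paired:.2f}, df = {df_paired}")
print(f"two-sample: t = {t_two_sample:.2f}, df = {df_two_sample}")
```

In this made-up data the units differ a lot from one another but respond similarly to the treatment, so the differences vary far less than the raw measurements and the paired standard error is much smaller.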
What are the 3 assumptions for paired t-tests (paired comparison of 2 means)?
1) The observational units are randomly and independently sampled from the population, because that's how the t-distribution was built.
2) The observations are paired (i.e., the two measurements in each pair are meaningfully linked because they come from the same or a matched unit).
3) The differences within pairs are approximately normally distributed in the population (especially important for small sample sizes).
What is an example doing a comparison of 2 independent sample means?
Do the spikes of horned lizards provide protection against predation from loggerhead shrikes?
In the following example, what are the HA and H0?
Do the spikes of horned lizards provide protection against predation from loggerhead shrikes?
1) H0: Lizards killed by shrikes and living lizards do not differ in mean horn length (i.e., μ1 = μ2).
2) HA: Lizards killed by shrikes and living lizards differ in mean horn length (i.e., μ1 ≠ μ2).
What is the quantity sp^2?
The quantity sp^2 is called the pooled sample variance and is the average of the sample variances weighted by their degrees of freedom (related to sample sizes).
What is pooled sample variance?
Pooled sample variance is a weighted average of individual sample variances, where groups with larger sample sizes (higher degrees of freedom) receive greater weight.
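The weighting can be written out directly. The sketch below computes the pooled sample variance as (df1 * s1^2 + df2 * s2^2) / (df1 + df2) and then uses it in a two-sample t statistic. The horn lengths are made-up values for illustration, not data from the lizard study, and the function name pooled_variance is a hypothetical helper.

```python
import math
import statistics

def pooled_variance(sample_a, sample_b):
    """Average of the two sample variances, weighted by degrees of freedom."""
    df_a = len(sample_a) - 1
    df_b = len(sample_b) - 1
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    return (df_a * var_a + df_b * var_b) / (df_a + df_b)

# Hypothetical horn lengths in mm (illustration only)
living = [24.1, 25.3, 23.8, 26.0, 24.7, 25.5, 24.9, 26.3]
killed = [21.9, 23.0, 22.4, 21.5, 23.6]

sp2 = pooled_variance(living, killed)

# Two-sample t statistic using the pooled variance
se_diff = math.sqrt(sp2 * (1 / len(living) + 1 / len(killed)))
t_stat = (statistics.mean(living) - statistics.mean(killed)) / se_diff
df = len(living) + len(killed) - 2

print(f"sp^2 = {sp2:.4f}, t = {t_stat:.2f}, df = {df}")
```

Because the weights are degrees of freedom, the pooled value always lies between the two sample variances and sits closer to the variance of the larger group.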