Lecture 15; Two-sample testing under heteroscedasticity: Flashcards by im dying Unknown

What is non-significance in statistics?

Indicates the result does not reach the threshold for statistical significance (e.g., p > 0.05).
Means there’s not enough evidence to reject the null hypothesis at the chosen significance level (alpha).
Does not imply the absence of an effect; rather, it may reflect that the effect is not statistically detectable given the sample (i.e., a potential Type II error).

What is insignificance in statistics?

Implies a lack of importance or relevance, which is not the intended message in statistics: a non-significant result can still motivate collecting more data and leaves room for new discoveries.
Even non-significant results can be meaningful, especially in exploratory research.

What are the 4 assumptions of a two-sample comparison of means?

1) Each of the two samples is a random sample from its population.
2) The variable (e.g., horn length) is normally distributed for each population.
3) The standard deviation (and variance) of the variable is the same in both populations.
4) The theoretical sampling distribution of the differences between sample means assuming H0 as true follows a t-distribution only if the samples are drawn from populations with equal variances.

If the null hypothesis is true and variances are different, what is the effect on alpha?

When the assumption of equal variances is violated (heteroscedasticity), the actual Type I error rate of tests that assume equal variances (e.g., the classical Student’s two-sample t-test) can deviate from the nominal α. This deviation can go in either direction: the test can become liberal (inflated Type I error) or conservative (deflated Type I error).
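
The two directions can be illustrated with a small simulation. This is only a sketch under stated assumptions: the sample sizes (10 and 40), the standard deviations (1 and 4), the seed, and the hard-coded critical value for 48 degrees of freedom are illustrative choices, not values from the lecture.

```python
import math
import random
import statistics


def student_t(sample1, sample2):
    """Classical two-sample t statistic using the pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    pooled_var = ((n1 - 1) * statistics.variance(sample1)
                  + (n2 - 1) * statistics.variance(sample2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(sample1) - statistics.mean(sample2)) / se


def type_i_rate(n1, sd1, n2, sd2, t_crit, n_sim=2000, seed=1):
    """Share of simulated data sets (equal true means) where |t| > t_crit."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        a = [rng.gauss(0, sd1) for _ in range(n1)]  # H0 is true: both means are 0
        b = [rng.gauss(0, sd2) for _ in range(n2)]
        if abs(student_t(a, b)) > t_crit:
            rejections += 1
    return rejections / n_sim


# Assumption: two-tailed 5% critical value of t with 10 + 40 - 2 = 48 df.
T_CRIT_DF48 = 2.0106

# Smaller sample has the larger variance: the test tends to be liberal.
liberal = type_i_rate(10, 4.0, 40, 1.0, T_CRIT_DF48)
# Smaller sample has the smaller variance: the test tends to be conservative.
conservative = type_i_rate(10, 1.0, 40, 4.0, T_CRIT_DF48)
print(liberal, conservative)
```

In this setup the first rate should land well above 0.05 and the second well below it, even though the nominal α is 0.05 in both cases.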

Define heteroscedasticity

Unequal variances

Define homoscedasticity

Equal variances

What is the intuition underlying a two-sample test of variances?

Their ratios are F-distributed
For each sample pair, calculate the ratio of their variances: divide the larger variance by the smaller variance

How does the F-distribution differ from the t-distribution?

Note that the F-distribution changes with the sample size (df) of the numerator and denominator.
The F-distribution has two degrees-of-freedom parameters: one for the numerator (the sample with the larger variance) and one for the denominator (the sample with the smaller variance)

What do we do because the F-distribution is asymmetric?

We typically place the larger variance in the numerator so that the F-statistic is ≥ 1 and easier to interpret.

–> Using the reverse ratio leads to a different numerical value, but it corresponds to the opposite tail of the same distribution, resulting in an equivalent test and conclusion.

How do you calculate F for the F-distribution?

F = sample variance(1) / sample variance(2), where each sample variance estimates its population variance
–> variance is sd^2
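
A minimal sketch of the calculation, using each sample's variance as the estimate of its population variance and placing the larger one in the numerator. The function name `variance_ratio` and the two small data sets are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import statistics


def variance_ratio(sample_a, sample_b):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical data: variances are 20/3 and 1.
group_1 = [2.0, 4.0, 6.0, 8.0]
group_2 = [4.0, 5.0, 6.0]
f_stat, df_num, df_den = variance_ratio(group_1, group_2)
print(f_stat, df_num, df_den)
```

The degrees of freedom follow the variances: the numerator df belongs to whichever sample supplied the larger variance.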

For the F-distribution, how is the exact two-tailed p-value calculated?

Multiplying the one-tailed p-value by 2 gives the exact two-tailed p-value, provided that the larger variance is in the numerator.
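
The doubling step can be sketched as follows. To stay within the Python standard library, this example approximates the upper-tail probability by simulation rather than reading it from the F-distribution, so the result is an estimate, not the exact p-value. The function name, sample sizes, number of simulations, and seed are all assumptions.

```python
import random
import statistics


def two_tailed_f_pvalue(f_obs, n_num, n_den, n_sim=5000, seed=1):
    """Approximate two-tailed p-value for a variance ratio with the larger variance on top."""
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(n_sim):
        # Under H0 both samples come from normal populations with equal variance.
        num = statistics.variance([rng.gauss(0, 1) for _ in range(n_num)])
        den = statistics.variance([rng.gauss(0, 1) for _ in range(n_den)])
        if num / den >= f_obs:
            at_least_as_large += 1
    one_tailed = at_least_as_large / n_sim
    # Double the upper-tail probability; cap at 1 because a probability cannot exceed it.
    return min(1.0, 2 * one_tailed)


p_small_ratio = two_tailed_f_pvalue(1.2, 15, 15)  # variances nearly equal
p_large_ratio = two_tailed_f_pvalue(6.0, 15, 15)  # variances very different
print(p_small_ratio, p_large_ratio)
```

With access to a statistics library, the simulated tail would normally be replaced by the upper-tail probability of the F-distribution with (n_num - 1, n_den - 1) degrees of freedom.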

What are the 2 assumptions for the two-sample comparison of variances?

1) Both samples are independently drawn at random from their respective statistical populations (live and dead).
2) The variable (e.g., horn length) is normally distributed in each statistical population (live and dead).

When the variances of two samples are unequal, a different version of the t-test should be used to compare their means. What is it?

Welch’s t-test

Why isn’t heteroscedasticity an issue for paired t-tests?

Because it operates on a single sample of differences between paired observations rather than on two separate samples.
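
A short sketch of why only one variance is involved: the paired test reduces the data to a single list of differences before anything else is computed. The function name `paired_t` and the before/after values are hypothetical.

```python
import math
import statistics


def paired_t(before, after):
    """Paired t statistic and df, computed from the single sample of differences."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Only one standard deviation exists here, so there are no two variances to compare.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Hypothetical paired measurements on the same four units.
before = [10, 12, 9, 11]
after = [12, 15, 10, 14]
t_stat, df = paired_t(before, after)
print(t_stat, df)
```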

When is Welch’s t-test used?

If variances differ

For the following example: Research question: Does the presence of brook trout affect the survivorship of salmon?

What is the H0 and HA?

1) H0: The variance of the proportion of chinook salmon surviving is the same in streams with and without brook trout (i.e., σ₁² = σ₂²).
2) HA: The variance of the proportion of chinook salmon surviving differs in streams with and without brook trout (i.e., σ₁² ≠ σ₂²).

What are the 2 differences between the standard t-test and Welch’s t-test?

1) The standard t-test for comparing two sample means uses a common variance estimator (i.e., pooled variance). Welch’s test does not assume equal population variances, so each sample provides its own estimate of variability.
2) Overall uncertainty depends on how reliable each variance estimate is. When sample sizes are small or variances are very different, this uncertainty increases, effectively reducing the degrees of freedom in comparison to the standard two-sample t-test

Describe the Welch’s t-Test degrees of freedom

In Welch’s t-test (and other tests), degrees of freedom can be non-whole numbers.
This happens because Welch’s test uses an adjusted formula to better handle differences in group variances, rather than assuming equal variances.

Why non-whole numbers for degrees of freedom for Welch’s t-test and what is the effect of this?

The adjustment in Welch’s formula results in a fractional degree of freedom, reflecting the sample sizes and variances of both groups more accurately.
Non-whole degrees of freedom in Welch’s test help provide a more accurate result by accounting for unequal variances between groups.
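
The cards do not state the adjusted formula. A common form is the Welch–Satterthwaite approximation, sketched below on the assumption that this is the adjustment meant. The function name `welch_t` and the data are hypothetical.

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    # Each sample contributes its own variance estimate; nothing is pooled.
    v1 = statistics.variance(sample1) / n1
    v2 = statistics.variance(sample2) / n2
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(v1 + v2)
    # The df depend on both sample sizes and both variances, so they are rarely whole.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical data: variances 20/3 and 1, sample sizes 4 and 3.
group_1 = [2.0, 4.0, 6.0, 8.0]
group_2 = [7.0, 8.0, 9.0]
t_stat, welch_df = welch_t(group_1, group_2)
pooled_df = len(group_1) + len(group_2) - 2
print(t_stat, welch_df, pooled_df)
```

For these numbers the Welch degrees of freedom come out at about 4.08, a non-whole value below the 5 degrees of freedom the standard two-sample t-test would use.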

TRUE or FALSE: When the null hypothesis for means is true (equal 𝜇) but the variances differ, the risk of false positives exceeds the pre-established alpha level (in general).

TRUE

With smaller degrees of freedom, the p-value for Welch’s t-test tends to be larger than that of the standard t-test. How is this remedied?

As a result, Welch’s t-test adjusts the p-value, making it more difficult to reject the null hypothesis.
This adjusted p-value ensures that the risk of committing a Type I error (false positive) aligns with the original significance level (alpha)

Why is there such emphasis on ensuring that the Type I error rate (α) is properly controlled?

For example, Welch’s t-test maintains the correct Type I error rate when variances differ, and procedures addressing multiple testing or p-hacking aim to prevent an inflated risk of false discoveries.

When you reject H0, you are making a positive claim. If that claim is wrong (Type I error), what can it create?

If that claim is wrong (Type I error), it can: create false knowledge, propagate through future studies, and lead to poor decisions or wasted resources.

–> In contrast, a Type II error is more conservative: you simply fail to make a claim. While this can also be costly (e.g., missing a real species), it does not introduce false conclusions into the scientific record.

What are the 3 potential impacts of a Type I error?

1) Wastes time and resources: Pursuing a non-existent effect.
2) Can cause harm: Approving an ineffective drug or treatment.
3) Loss of credibility: Damages trust in scientific findings.

What are 3 reasons why Type I errors are considered to be worse than Type II errors?

1) False Hope or Danger: Imagine a new drug is approved but it doesn't work; this could lead to serious consequences.
2) More Difficult to Detect: Once published, Type I errors may persist longer in the scientific record.
3) Damage to Reputation: Especially in fields where public safety or health is involved.

Welch’s t-test: comparing two sample means when their variances are significantly different: What are 3 relevant issues with it?

1) The sample size may be too small to detect meaningful differences.
2) Differences in variances reduce the degrees of freedom, which significantly lowers the statistical power.
3) Even if the means are not truly different (i.e., H0 is true, something we cannot verify directly), it is important to recognize when the variances differ statistically. Such differences can be meaningful in their own right and may have important implications, particularly in contexts such as conservation.

Note that we use the terms “assume” homoscedasticity and “assume” heteroscedasticity: WHY?

Because we cannot know with certainty whether the true population variances differ. Our inference is based solely on the F-test outcome, which either leads us to reject or fail to reject the null hypothesis.

(27 cards)