What is a controlled (experimental) study?
Often discrete (categorical) variation in the predictor variable, with manipulation or assignment of treatments by the researcher.
What is an uncontrolled (observational) study?
Often continuous variation in the predictor variable, with no manipulation or assignment of treatments by the researcher.
What are the 4 classes of statistical design?
1) Continuous Dependent + Continuous Independent = Regression
2) Continuous Dependent + Categorical Independent = t-tests and ANOVA
3) Categorical Dependent + Continuous Independent = Logistic Regression
4) Categorical Dependent + Categorical Independent = Tabular
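The four classes above amount to a lookup on the types of the two variables. A minimal sketch, assuming a hypothetical helper named `choose_analysis` (the name and the string labels are made up for illustration):

```python
def choose_analysis(response: str, predictor: str) -> str:
    """Return the class of analysis for a (dependent, independent) pair of variable types."""
    table = {
        ("continuous", "continuous"): "regression",
        ("continuous", "categorical"): "t-test or ANOVA",
        ("categorical", "continuous"): "logistic regression",
        ("categorical", "categorical"): "tabular analysis",
    }
    return table[(response, predictor)]

# A continuous response (e.g., shift in circadian rhythm) with a categorical predictor (treatment)
print(choose_analysis("continuous", "categorical"))
```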
A study once suggested that shining light on the back of the knees could reset the human circadian clock and potentially prevent jet lag, sparking excitement among scientists, entrepreneurs, and the public.
Subjects may have been exposed to light reaching their eyes indirectly, even though the light was aimed at the back of the knees.
What were the 3 treatments to test “the knees who say night”?
3 treatments were created; each of 22 people was assigned to one of the 3 treatments:
1) Control: light device present without producing light
2) Knees: goggles blocked all light from the eyes while light was shone on the back of the knees
3) Light in the eyes
What is ANOVA as demonstrated by the knee study?
A statistical method used to compare the means of three or more independent groups to determine if at least one group mean is statistically different from the others.
Can samples from different statistical populations have similar means, and why is this important?
YES: This is important because statistical populations can have identical means but differ in their variances (and other aspects).
- This indicates that samples could be drawn from distinct populations, yet their means may not differ. This aligns with the null hypothesis (H0), which assumes no difference in means between groups.
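A small simulation can make this concrete. The sketch below assumes two normal populations with the same mean (0) but different standard deviations (1 and 5). These values, the sample size, and the seed are arbitrary choices for illustration, not data from any study.

```python
import random
from statistics import mean, variance

rng = random.Random(42)  # fixed seed so the run is repeatable

# Two populations with the same mean but different spread (assumed values)
narrow = [rng.gauss(0.0, 1.0) for _ in range(1000)]
wide = [rng.gauss(0.0, 5.0) for _ in range(1000)]

# Sample means are both close to 0, while the sample variances differ widely
print("means:", mean(narrow), mean(wide))
print("variances:", variance(narrow), variance(wide))
```

A test that only compares means would treat these two samples as consistent with H0, even though they come from distinct populations.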
What are generalized forms of H0 and HA preferred by the professor?
What is sampling error due to?
Remember: Sampling error is due to sampling variation, i.e., samples drawn from statistical populations that share the same mean will still differ in their sample means by chance.
What variables are involved in an ANOVA?
An ANOVA always involves one continuous response variable (e.g., shift in circadian rhythm) and one categorical predictor variable.
–> The categorical variable (predictor) is divided into groups, which are often referred to as treatments or factor levels (e.g., dead or alive).
When studying only a single factor, what type of ANOVA is used?
One-way ANOVA
What does ANOVA analyze? How is this achieved by the F-statistic?
How does HETEROSCEDASTICITY (unequal variances) influence the F-ratio in the F-statistic?
HETEROSCEDASTICITY reduces the ability of the F-ratio to detect differences in means among groups.
What is the formula for the ratio of variances to calculate the F-statistic?
F = Variance among group means (due to “treatment”) / Variance within groups. The within-group variance is referred to as error or residual variation; it represents the variation not explained by the differences in means among groups.
F = Group mean square/Error mean square
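The ratio can be computed by hand for a one-way design. This is a minimal sketch, not a definitive implementation: the function name `f_statistic` is hypothetical and the three groups are made-up numbers chosen so the arithmetic is easy to follow. Each mean square is a sum of squares divided by its degrees of freedom (number of groups minus 1 for the numerator, total observations minus number of groups for the denominator).

```python
def f_statistic(groups):
    """Return (F, df_among, df_within) for a one-way ANOVA on a list of groups."""
    g = len(groups)
    n_total = sum(len(x) for x in groups)
    means = [sum(x) / len(x) for x in groups]
    grand_mean = sum(sum(x) for x in groups) / n_total

    # Sum of squares among group means (the "treatment" variation)
    ss_among = sum(len(x) * (m - grand_mean) ** 2 for x, m in zip(groups, means))
    # Sum of squares within groups (error or residual variation)
    ss_within = sum((v - m) ** 2 for x, m in zip(groups, means) for v in x)

    df_among = g - 1
    df_within = n_total - g
    group_mean_square = ss_among / df_among
    error_mean_square = ss_within / df_within
    return group_mean_square / error_mean_square, df_among, df_within

# Made-up illustration data: group means are 2, 3, and 6
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
F, df1, df2 = f_statistic(groups)
print(F, df1, df2)  # group mean square 13, error mean square 1, so F = 13 with df 2 and 6
```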
What does a lower F-statistic mean?
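A lower F-statistic means the variance among group means is small relative to the variance within groups. Heteroscedastic = increased variance in some groups, which can inflate the denominator and lower F.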
What 2 things does the F-statistic detect?
1) Variation among means of different groups → numerator
2) Variance within groups → denominator
We require a test statistic that can detect variations in means across multiple groups: what achieves this and how?
The F-statistic achieves this by evaluating the ratio of two variance components.
What is the role of degrees of freedom for estimating variance in an unbiased manner?
What do the numerator and denominator degrees of freedom for the F-statistic depend on?
The numerator degrees of freedom depend on the number of groups (g − 1), and the denominator degrees of freedom depend on the total number of observations and the number of groups (N − g).
TRUE or FALSE: ANOVA is directional
FALSE: non-directional
What does it mean when we say ANOVA is non-directional?
How is the p-value estimated from the F-distribution seen in ANOVA?
The p-value is estimated as the proportion of F-values in the null distribution that are equal to or greater than the observed F-value (i.e., a one-tailed test).
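One way to see this tail proportion is to build a null distribution by shuffling group labels (a permutation approach). This is a swapped-in illustration, not necessarily the method used in the lecture, which may rely on the theoretical F-distribution. The function names, the made-up data, the number of permutations, and the seed are all assumptions.

```python
import random

def f_ratio(groups):
    """One-way ANOVA F: group mean square / error mean square."""
    n_total = sum(len(x) for x in groups)
    means = [sum(x) / len(x) for x in groups]
    grand = sum(sum(x) for x in groups) / n_total
    ms_among = sum(len(x) * (m - grand) ** 2 for x, m in zip(groups, means)) / (len(groups) - 1)
    ms_within = sum((v - m) ** 2 for x, m in zip(groups, means) for v in x) / (n_total - len(groups))
    return ms_among / ms_within

def permutation_p_value(groups, n_perm=2000, seed=1):
    """Proportion of null F-values equal to or greater than the observed F."""
    rng = random.Random(seed)
    observed = f_ratio(groups)
    pooled = [v for x in groups for v in x]
    sizes = [len(x) for x in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # under H0, group labels are interchangeable
        shuffled, start = [], 0
        for n in sizes:
            shuffled.append(pooled[start:start + n])
            start += n
        # Small tolerance so ties with the observed F are counted despite rounding
        if f_ratio(shuffled) >= observed - 1e-9:
            hits += 1
    return hits / n_perm

# Made-up illustration data (not from the knee study)
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
p = permutation_p_value(groups)
print(p)
```

Only the upper tail is counted because large F-values, whichever group means differ, are the evidence against H0.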
What are the 3 assumptions of ANOVA?
1) Observations are independent random samples from their respective populations.
2) The response variable (e.g., shift in circadian rhythm) is approximately normally distributed within each treatment group (we will revisit this assumption later).
3) Variances are equal across groups (homoscedasticity). Unequal variances can distort F-values and lead to unreliable inference about differences among means (discussed in a later lecture).
What does Levene’s test do?
An inferential statistical method used to assess the equality (homogeneity) of variances across two or more samples (groups).
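Levene's statistic can be sketched as a one-way ANOVA F-ratio computed on absolute deviations from each group's mean. The code below assumes the mean-centered form of the test. The function name `levene_w` and the data are made up for illustration.

```python
def levene_w(groups):
    """Levene's W: an ANOVA F-ratio on absolute deviations from each group mean."""
    # Step 1: replace each observation by its absolute deviation from its group mean
    abs_dev = []
    for x in groups:
        m = sum(x) / len(x)
        abs_dev.append([abs(v - m) for v in x])

    # Step 2: run the usual among/within variance ratio on those deviations
    k = len(abs_dev)
    n_total = sum(len(z) for z in abs_dev)
    z_means = [sum(z) / len(z) for z in abs_dev]
    grand = sum(sum(z) for z in abs_dev) / n_total
    among = sum(len(z) * (zm - grand) ** 2 for z, zm in zip(abs_dev, z_means)) / (k - 1)
    within = sum((v - zm) ** 2 for z, zm in zip(abs_dev, z_means) for v in z) / (n_total - k)
    return among / within

# Same spread in every group: W is 0
equal_spread = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0], [6.0, 7.0, 8.0, 9.0]]
# Third group is much more variable: W is large
unequal_spread = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0], [0.0, 5.0, 10.0, 15.0]]
print(levene_w(equal_spread), levene_w(unequal_spread))
```

A large W is compared against an F-distribution with (k − 1, N − k) degrees of freedom. Library versions such as `scipy.stats.levene` also offer median centering, which is less sensitive to non-normal data.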