MANOVA
Multivariate Analysis of Variance
Extension of the ANOVA
ANOVA
k independent samples
k related samples
Is there a difference between the population means?
Extension of the t-test
Key Assumptions
• Between-subjects analysis: involves only comparisons between groups of subjects.
• Only one dependent variable from each subject.
• Univariate analysis.
Univariate tests can be used in certain circumstances.
Need to meet certain assumptions
• Independence of observations
• Normality of distribution (assume that the data comes from a population that is normally distributed)
• Homogeneity of Variance (the groups are drawn from populations with the same variance)
The distinction between design and analysis
• Design: Between-subjects
• Analysis: ANOVA
Types of ANOVA
Repeated Measures ANOVA
Factorial ANOVA
Mixed Design ANOVA
- Repeated and between-subjects component
Can you provide an example of a design where each of these analyses would be appropriate?
Multivariate
A number of measurements taken on each subject (generally will be correlated).
Together give more / better information than separately
No assumptions that variables come from the same distributions
When sampling, subjects must constitute a homogeneous collection with respect to all characteristics which may affect the values of the variate.
MANOVA is
Multiple dependent or outcome variables and you are interested in group differences.
MANOVA asks: have mean differences among groups on a combination of response variables occurred by chance?
MANOVA actually compares the groups on a new response variable that is a linear combination of the observed response variables, where the linear combination is chosen so as to maximize the difference between the groups.
Hypotheses about the means are tested by comparing variances.
• Hence, MANOVA(riance).
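The idea of a linear combination chosen to maximize the group difference can be sketched numerically. The following is a minimal illustration, not a full MANOVA: it assumes NumPy, uses made-up simulated data, and the function name `first_discriminant` is hypothetical. It finds the weights a that maximize the ratio of between-group to within-group variation, (a'Ba)/(a'Wa), where B and W are the between- and within-group sums of squares and cross-products matrices.

```python
import numpy as np

def first_discriminant(Y, groups):
    """Weights a that maximize (a'Ba)/(a'Wa), plus the maximized ratio."""
    Y = np.asarray(Y, dtype=float)
    groups = np.asarray(groups)
    grand_mean = Y.mean(axis=0)
    p = Y.shape[1]
    B = np.zeros((p, p))  # between-group SSCP
    W = np.zeros((p, p))  # within-group SSCP
    for g in np.unique(groups):
        Yg = Y[groups == g]
        d = Yg.mean(axis=0) - grand_mean
        B += len(Yg) * np.outer(d, d)
        R = Yg - Yg.mean(axis=0)
        W += R.T @ R
    # Solve B a = lambda W a through a Cholesky factor of W (W = L L').
    L_inv = np.linalg.inv(np.linalg.cholesky(W))
    vals, vecs = np.linalg.eigh(L_inv @ B @ L_inv.T)  # eigenvalues ascending
    a = L_inv.T @ vecs[:, -1]  # weights for the largest eigenvalue
    return a, vals[-1], B, W

# Made-up illustration data: 3 groups of 20 subjects, 2 correlated DVs.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], 20)
shift = np.array([[0.0, 0.0], [0.5, -0.5], [1.0, 0.2]])[groups]
noise = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=60)
Y = shift + noise

a, ratio, B, W = first_discriminant(Y, groups)
print("weights:", a)
print("between/within ratio of the composite:", ratio)
```

No single DV on its own can give a larger between/within ratio than this composite, which is why an overall multivariate difference can exist even when the separate univariate differences are small.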
MANOVA vs ANOVA
ANOVA is better because:
• simpler analysis.
• MANOVA assumes more.
• interpretation of the effects of explanatory variables on any single response variable is difficult in MANOVA.
• often MANOVA is less powerful than ANOVA.
MANOVA is better because:
• having more outcome variables increases the chance of finding what really changes as a result of different treatments and their interactions.
• may be an overall difference, though no difference in separate univariate tests. Thus MANOVA may be more powerful.
Statistical Issues
Variables for MANOVA
Hypothesis Testing
t-test
μ1 = μ2
ANOVA
μ1 = μ2 = μ3 = ... = μn
MANOVA
μ1 = μ2 = μ3 = ... = μn for DV1 &
μ1 = μ2 = μ3 = ... = μn for DV2
– the alternative hypothesis is that there is at least one difference (across groups) in at least one of the DVs or in the DV composite
Testing the null
In ANOVA, variance is partitioned into:
• SS total = SS between + SS within
• so if SS between is much larger than SS within (relative to their degrees of freedom), the null is rejected
• Similar approach in MANOVA
• however, SS (which are scalars) are replaced by sums of squares and cross-products (SSCP) matrices because we need to take the correlations (covariances) of the DVs into account
• we use determinants to get a summary index of variance in these matrices
• c.f. mean square in ANOVA
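The partition described above can be checked numerically. This is a minimal sketch assuming NumPy and a small made-up data set (three groups of three subjects, two DVs); the function name `sscp_partition` is hypothetical. It builds the total (T), between-group (B) and within-group (W) SSCP matrices, confirms that T = B + W, shows that the diagonal of T holds the ordinary univariate sums of squares, and takes determinants as scalar summaries.

```python
import numpy as np

def sscp_partition(Y, groups):
    """Return total, between-group and within-group SSCP matrices."""
    Y = np.asarray(Y, dtype=float)
    groups = np.asarray(groups)
    grand_mean = Y.mean(axis=0)
    C = Y - grand_mean
    T = C.T @ C                      # total SSCP
    B = np.zeros_like(T)             # between-group SSCP
    W = np.zeros_like(T)             # within-group SSCP
    for g in np.unique(groups):
        Yg = Y[groups == g]
        d = Yg.mean(axis=0) - grand_mean
        B += len(Yg) * np.outer(d, d)
        R = Yg - Yg.mean(axis=0)
        W += R.T @ R
    return T, B, W

# Made-up scores: rows are subjects, columns are DV1 and DV2.
Y = np.array([[2, 3], [3, 4], [4, 6],
              [5, 5], [6, 8], [7, 7],
              [8, 9], [9, 12], [10, 11]], dtype=float)
groups = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

T, B, W = sscp_partition(Y, groups)

# The multivariate analogue of SS total = SS between + SS within.
print("T equals B + W:", np.allclose(T, B + W))

# Diagonal entries are the univariate SS; off-diagonals are cross-products.
ss_total_dv1 = ((Y[:, 0] - Y[:, 0].mean()) ** 2).sum()
print("T[0, 0] equals SS total for DV1:", np.isclose(T[0, 0], ss_total_dv1))

# Determinants reduce each matrix to a single index of variance.
print("det(W):", np.linalg.det(W), "det(T):", np.linalg.det(T))
```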
MANOVA Assumptions
• Multivariate Normality
• The sampling distributions of the DVs and all linear combinations of them are normal.
• Homogeneity of Variance-Covariance Matrices
• Box’s M tests this, but because the test is very sensitive it is advised that p < 0.001 is used as the criterion
• Linearity
• It is assumed that linear relationships between all pairs of DVs exist
• Multicollinearity and Singularity
• Multicollinearity – the correlation between pairs of variables is high (r > .80)
• Singularity
• A variable is redundant: it is a linear combination of two or more of the other variables.
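The multicollinearity and singularity checks can be screened for before running the analysis. The sketch below assumes NumPy and made-up simulated DVs; the function name `collinearity_report` and its `r_cutoff` parameter are hypothetical, with the cutoff taken from the r > .80 rule of thumb above. Singularity is detected as a rank deficiency in the centered data matrix.

```python
import numpy as np

def collinearity_report(Y, r_cutoff=0.80):
    """Flag highly correlated DV pairs and check for singularity."""
    Y = np.asarray(Y, dtype=float)
    R = np.corrcoef(Y, rowvar=False)  # columns are variables
    p = R.shape[0]
    high_pairs = [(i, j, R[i, j])
                  for i in range(p) for j in range(i + 1, p)
                  if abs(R[i, j]) > r_cutoff]
    # Singular if some DV is an exact linear combination of the others.
    singular = np.linalg.matrix_rank(Y - Y.mean(axis=0)) < p
    return high_pairs, singular

# Made-up DVs: dv2 nearly duplicates dv1, and dv3 is exactly dv1 + dv2.
rng = np.random.default_rng(1)
dv1 = rng.normal(size=50)
dv2 = dv1 + 0.3 * rng.normal(size=50)
dv3 = dv1 + dv2
Y = np.column_stack([dv1, dv2, dv3])

pairs, singular = collinearity_report(Y)
for i, j, r in pairs:
    print(f"DV{i + 1} and DV{j + 1}: r = {r:.2f}")
print("singular:", singular)
```

Dropping the redundant variable (here the third column) removes the singularity, although the first two DVs would still be flagged as multicollinear.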
Generalised Linear Model
outcome = (model) + error
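For a one-way design with several DVs, "outcome = (model) + error" can be written as Y = XB + E, where X codes group membership. The following is a minimal sketch assuming NumPy, made-up data, and cell-means coding (one indicator column per group, a choice made here for illustration). Under that coding the least-squares coefficients are simply the group means on each DV, and the error part is each subject's deviation from their group mean.

```python
import numpy as np

# Made-up scores: rows are subjects, columns are DV1 and DV2.
Y = np.array([[2, 3], [3, 4], [4, 6],
              [5, 5], [6, 8], [7, 7],
              [8, 9], [9, 12], [10, 11]], dtype=float)
groups = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

# Design matrix: one indicator column per group (cell-means coding).
levels = np.unique(groups)
X = (groups[:, None] == levels[None, :]).astype(float)

# Least-squares fit of all DVs at once.
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)

model = X @ coef      # the part of each outcome explained by group
error = Y - model     # what is left over

print("coefficients (group means per DV):")
print(coef)
print("outcome equals model + error:", np.allclose(Y, model + error))
```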
Rationale and Procedure
Ways to calculate the F Statistic
• Wilks’ λ (lambda)
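As an illustration of how this statistic leads to an F value, the sketch below computes lambda as det(W) / det(T), the ratio of the within-group to the total SSCP determinant, so values near 0 indicate large group differences. It assumes NumPy and made-up data, and the function names are hypothetical. The conversion to F shown here is the special case for exactly two DVs; other numbers of DVs need a different (often approximate) conversion, which statistical packages handle.

```python
import numpy as np

def wilks_lambda(Y, groups):
    """Wilks' lambda = det(W) / det(T) for a one-way design."""
    Y = np.asarray(Y, dtype=float)
    groups = np.asarray(groups)
    C = Y - Y.mean(axis=0)
    T = C.T @ C                       # total SSCP
    W = np.zeros_like(T)              # within-group SSCP
    for g in np.unique(groups):
        R = Y[groups == g] - Y[groups == g].mean(axis=0)
        W += R.T @ R
    return np.linalg.det(W) / np.linalg.det(T)

def f_from_wilks_two_dvs(lam, n_total, n_groups):
    """F conversion that applies only when there are exactly two DVs."""
    root = np.sqrt(lam)
    f = (1 - root) / root * (n_total - n_groups - 1) / (n_groups - 1)
    df1 = 2 * (n_groups - 1)
    df2 = 2 * (n_total - n_groups - 1)
    return f, df1, df2

# Made-up scores: 3 groups of 3 subjects, 2 DVs.
Y = np.array([[2, 3], [3, 4], [4, 6],
              [5, 5], [6, 8], [7, 7],
              [8, 9], [9, 12], [10, 11]], dtype=float)
groups = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

lam = wilks_lambda(Y, groups)
f, df1, df2 = f_from_wilks_two_dvs(lam, n_total=len(Y), n_groups=3)
print("lambda:", lam)
print("F:", f, "on", df1, "and", df2, "degrees of freedom")
```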
Single Factor MANOVA test: between-subjects effects, aka one-way MANOVA
Factorial MANOVA test: Also between-subjects effects
Repeated measures MANOVA test: Without any between-subjects factors*
Unequal Cell Sizes
The impact of having more people in one condition over another condition
Missing Data
MANOVA: how much data is missing?
There are techniques that help alleviate this.
Missing data is problematic when there is a lot of it.
The type of missing data also matters: is it systematic? If so, there is a problem in the design.
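A first look at both questions (how much is missing, and whether it looks systematic) can be taken by tabulating the proportion of missing values per DV within each group. This is a minimal sketch assuming NumPy, made-up data with missing scores coded as NaN, and a hypothetical function name. Markedly different rates across groups would hint that the missingness is systematic, but the table is only a descriptive screen, not a formal test.

```python
import numpy as np

def missing_rate_by_group(Y, groups):
    """Proportion of missing (NaN) values per DV within each group."""
    Y = np.asarray(Y, dtype=float)
    groups = np.asarray(groups)
    return {g.item(): np.isnan(Y[groups == g]).mean(axis=0)
            for g in np.unique(groups)}

# Made-up scores: two groups, two DVs; NaN marks a missing score.
Y = np.array([[2.0, 3.0],
              [3.0, 4.0],
              [4.0, 6.0],
              [5.0, np.nan],
              [6.0, np.nan],
              [7.0, 7.0]])
groups = np.array([0, 0, 0, 1, 1, 1])

rates = missing_rate_by_group(Y, groups)
print("overall proportion missing:", np.isnan(Y).mean())
for g, rate in rates.items():
    print(f"group {g}: proportion missing per DV = {rate}")
```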