What case study was used for an EFA?
Shopping Behaviour
12 variables measured on a Likert scale (1-7)
n = 490 (respondents)
Two of the variables had an * next to them to indicate they were reverse-coded. Why is reverse-coding carried out?
Two main reasons:
What are the 3 assumptions we have to be aware of?
What are the 3 preliminary analyses that determine whether it makes sense to conduct an EFA?
Correlations between the variables
Sample adequacy (KMO)
Bartlett’s test
What scale is used to determine whether the variables are sufficiently correlated with one another?
> 0.5: moderately high
> 0.6: high
> 0.7: very high
If there is no correlation between the variables, why does an EFA not make sense?
An EFA’s entire purpose is GROUPING correlated variables into underlying factors. If the variables are not correlated with one another, there is nothing to group, which undermines the entire purpose of the analysis!
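As a rough illustration of this correlation check, the sketch below computes pairwise Pearson correlations for a tiny made-up data set and labels them with the scale from the notes. The item names and responses are hypothetical (not the shopping-behaviour data), and reading each threshold as a lower bound on the absolute correlation is an assumption.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def label(r):
    """Label |r| using the scale from the notes (thresholds as lower bounds: an assumption)."""
    a = abs(r)
    if a > 0.7:
        return "very high"
    if a > 0.6:
        return "high"
    if a > 0.5:
        return "moderately high"
    return "below the scale"

# Hypothetical 1-7 Likert responses from 8 respondents.
items = {
    "item_a": [2, 3, 4, 5, 6, 7, 1, 4],
    "item_b": [1, 3, 5, 5, 6, 7, 2, 4],
    "item_c": [4, 4, 3, 5, 4, 3, 5, 4],
}

for (name_x, x), (name_y, y) in combinations(items.items(), 2):
    r = pearson(x, y)
    print(f"{name_x} vs {name_y}: r = {r:.2f} ({label(r)})")
```

If most pairs fall below the scale, there is little shared variance for factors to capture.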
What is the scale used for “Sample Adequacy” (preliminary analysis)?
Kaiser-Meyer-Olkin (KMO) measure (scale as listed in picture)
What does Bartlett’s test need to be?
Significant!
Does this KMO and Bartlett’s preliminary analysis show an EFA makes sense?
YES.
Because Bartlett’s test is significant
and
KMO is at 0.82, which is >0.8, and therefore “deserving” according to the scale
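To make the two checks concrete, here is a minimal sketch of how KMO and Bartlett’s test statistic can be computed from a correlation matrix. The 3x3 correlation matrix is hypothetical (the case study has 12 variables), n = 490 is taken from the notes, and 7.815 is the tabulated 5% chi-square critical value for 3 degrees of freedom, hard-coded because the Python standard library has no chi-square distribution.

```python
from math import log, sqrt

def invert(m):
    """Gauss-Jordan inversion of a square matrix; returns (inverse, determinant)."""
    n = len(m)
    a = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(m)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        p = a[col][col]
        det *= p
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a], det

def kmo(corr):
    """Overall KMO: squared correlations relative to squared correlations plus squared partial correlations."""
    inv, _ = invert(corr)
    n = len(corr)
    r2 = p2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                p2 += partial ** 2
    return r2 / (r2 + p2)

def bartlett_statistic(corr, n_obs):
    """Bartlett's test of sphericity: returns (chi-square statistic, degrees of freedom)."""
    p = len(corr)
    _, det = invert(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * log(det)
    return chi2, p * (p - 1) // 2

# Hypothetical correlation matrix for three variables.
corr = [[1.0, 0.6, 0.5],
        [0.6, 1.0, 0.4],
        [0.5, 0.4, 1.0]]

CRITICAL_5PCT_DF3 = 7.815  # tabulated chi-square critical value, alpha = 0.05, df = 3

chi2, df = bartlett_statistic(corr, n_obs=490)
print(f"KMO = {kmo(corr):.3f}")
print(f"Bartlett chi2 = {chi2:.1f}, df = {df}, significant: {chi2 > CRITICAL_5PCT_DF3}")
```

Bartlett’s test asks whether the correlation matrix differs from an identity matrix (no correlations at all), which is why it needs to be significant before an EFA is worth running.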
What is step 2 (following the preliminary analysis)?
DETERMINING the NUMBER of FACTORS
What are the two main ways the number of factors is determined?
The Kaiser Criterion and the Scree Plot
What is the “Kaiser Criterion”?
Helps us determine number of factors
Select all factors with an eigenvalue GREATER than 1
What is the problem with factors that have an eigenvalue SMALLER than 1?
They are NO better than a SINGLE variable: each variable has a variance of 1.0 due to standardisation, so a factor with an eigenvalue below 1 explains less variance than one original variable
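The Kaiser Criterion amounts to a simple count, sketched below. The eigenvalues are hypothetical values for 12 standardised variables (they sum to 12), not the actual case-study output.

```python
# Hypothetical eigenvalues for 12 standardised variables; they sum to 12.
eigenvalues = [3.9, 2.2, 1.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.35, 0.3, 0.25]

def kaiser_count(eigs):
    """Number of factors whose eigenvalue is strictly greater than 1."""
    return sum(1 for e in eigs if e > 1)

print("Factors to keep (Kaiser):", kaiser_count(eigenvalues))
```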
What is a “Scree Plot”?
Helps us determine number of factors
Plot of Eigenvalues against the number of factors in order of extraction. Select one factor LESS than the ELBOW.
Why is it said that the number of factors should exceed a “cumulative percentage variance” of 50% (preferably 75%)?
The rule of thumb “≥ 50% (preferably ~75%)” is basically saying:
If your factors explain very little (e.g., 20–30%), you’ve reduced the data too aggressively and the factors may not represent the variables well.
If they explain a lot, the factor solution is doing a decent job of summarising the dataset.
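The cumulative percentage is each factor’s eigenvalue divided by the total variance (the number of standardised variables), summed up factor by factor. A small sketch, again with hypothetical eigenvalues rather than the case-study values:

```python
from itertools import accumulate

# Hypothetical eigenvalues for 12 standardised variables; they sum to 12.
eigenvalues = [3.9, 2.2, 1.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.35, 0.3, 0.25]

def cumulative_percent(eigs):
    """Cumulative percentage of total variance explained after each factor."""
    total = sum(eigs)
    return [100 * c / total for c in accumulate(eigs)]

def factors_needed(eigs, threshold):
    """Smallest number of factors whose cumulative percentage reaches the threshold."""
    for k, pct in enumerate(cumulative_percent(eigs), start=1):
        if pct >= threshold:
            return k
    return None

pct = cumulative_percent(eigenvalues)
for k, value in enumerate(pct, start=1):
    print(f"{k:2d} factors: {value:6.2f}%")
print("Factors needed for 50%:", factors_needed(eigenvalues, 50))
print("Factors needed for 75%:", factors_needed(eigenvalues, 75))
```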
Using the “Kaiser Criterion”, how many factors should you choose?
3 factors, because each has an eigenvalue > 1
Together, the three factors explain a cumulative 59.774% of the variance
Using the “Scree Plot”, how many factors should you choose?
1 LESS THAN ELBOW
In this case the elbow is at 3, so only 2 factors
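The elbow is normally read off the plot by eye. As a rough sketch only, the code below draws a text version of a scree plot and locates the elbow with one possible heuristic: the point with the largest second difference, i.e. the sharpest kink. Both the heuristic and the eigenvalues are assumptions for illustration, not part of the case study.

```python
# Hypothetical eigenvalues for 12 standardised variables; they sum to 12.
eigenvalues = [3.9, 2.2, 1.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.35, 0.3, 0.25]

def elbow(eigs):
    """1-based position of the sharpest kink (largest second difference)."""
    second_diff = [eigs[i - 1] - 2 * eigs[i] + eigs[i + 1]
                   for i in range(1, len(eigs) - 1)]
    # second_diff[0] belongs to the 2nd eigenvalue, hence the +2.
    return second_diff.index(max(second_diff)) + 2

# Text scree plot: one bar per factor, in order of extraction.
for k, e in enumerate(eigenvalues, start=1):
    print(f"{k:2d} | {'#' * round(e * 10)} {e}")

elbow_at = elbow(eigenvalues)
n_factors = elbow_at - 1  # rule from the notes: one LESS than the elbow
print(f"Elbow at factor {elbow_at}, so keep {n_factors} factors")
```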
If the “Kaiser Criterion” shows to use 3 factors, but the “Scree Plot” shows to use 2, what should you do?
COMPARE THE 3-FACTOR SOLUTION, AND 2-FACTOR SOLUTION, determine WHICH is more SUITABLE!
What is done to facilitate the interpretation of a factor solution?
ROTATING THE FACTORS (using VARIMAX)!
Varimax makes a factor solution easier to interpret by “cleaning up” the loadings so each variable tends to belong clearly to one factor.
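A minimal sketch of the idea for the two-factor case: rotate the loading matrix by an angle and keep the angle that maximises the varimax criterion (the variance of the squared loadings within each factor). The loadings are hypothetical, and the grid search is a simplification; statistical packages use an iterative algorithm, often with Kaiser normalisation.

```python
from math import cos, sin, pi

def rotate(loadings, theta):
    """Orthogonally rotate a two-factor loading matrix by angle theta (radians)."""
    c, s = cos(theta), sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings (raw varimax criterion)."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        sq = [row[j] ** 2 for row in loadings]
        mean = sum(sq) / p
        total += sum((v - mean) ** 2 for v in sq) / p
    return total

def varimax_two_factors(loadings, steps=900):
    """Grid-search the angle in [0, pi/2) that maximises the varimax criterion."""
    angles = [k * (pi / 2) / steps for k in range(steps)]
    best = max(angles, key=lambda t: varimax_criterion(rotate(loadings, t)))
    return rotate(loadings, best)

def communalities(loadings):
    """Sum of squared loadings per variable."""
    return [sum(v ** 2 for v in row) for row in loadings]

# Hypothetical unrotated loadings: every variable loads on both factors.
unrotated = [[0.45, 0.40],
             [0.50, 0.45],
             [0.40, -0.45],
             [0.45, -0.50]]

rotated = varimax_two_factors(unrotated)
for before, after in zip(unrotated, rotated):
    print(before, "->", [round(v, 2) for v in after])
print("Communalities before:", [round(h, 4) for h in communalities(unrotated)])
print("Communalities after: ", [round(h, 4) for h in communalities(rotated)])
```

Because the rotation is orthogonal, each variable’s communality is the same before and after; only the split across the two factors changes.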
What does “VARIMAX” do?
Varimax is an orthogonal rotation, so it does not change the overall fit: the communalities and the total variance explained stay the same. It only redistributes how variables load across factors to make the pattern clearer.
Before rotation: a variable loads .45 on Factor 1 and .40 on Factor 2 (ambiguous).
After varimax: it might load .60 on Factor 1 and .05 on Factor 2 (clear, and the communality is still about .36).
What is this called?
CROSS-LOADING
What is “CROSS-LOADING”?
Cross-loading is when one item/variable loads noticeably on more than one factor—so it doesn’t clearly “belong” to a single factor.
Is this problematic?
Whilst cross-loading is generally problematic, the secondary loading here is really low (only 0.2), so it’s not really a problem in this case
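One way to screen a rotated solution for cross-loadings is to flag every item with two or more loadings above a cut-off, as sketched below. The 0.3 cut-off, the item names and the loadings are all assumptions for illustration; the notes do not state a threshold.

```python
def cross_loading_items(loadings, names, threshold=0.3):
    """Names of items with two or more absolute loadings at or above the threshold."""
    flagged = []
    for name, row in zip(names, loadings):
        if sum(1 for v in row if abs(v) >= threshold) >= 2:
            flagged.append(name)
    return flagged

# Hypothetical rotated loadings on three factors.
names = ["item_a", "item_b", "item_c"]
loadings = [
    [0.70, 0.20, 0.05],   # secondary loading of 0.2: below the cut-off, not flagged
    [0.45, 0.40, 0.10],   # loads noticeably on two factors: flagged
    [0.10, 0.05, 0.80],
]

print("Cross-loading items:", cross_loading_items(loadings, names))
```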
What are the steps to interpreting (and comparing 2 vs 3 factor in this case)?