latent variables.
Latent variables are constructs that are hidden or inferred rather than directly observed (self-esteem,
intelligence, motivation, anxiety).
Classical Test Theory (CTT)
Classical Test Theory (CTT) offers a simple but powerful framework that explains why the reliability
and validity concepts you have previously learned about matter:
Observed Score (X) = True Score (T) + Error (E)
Random Error
Definition: Fluctuations due to chance (mood, distractions, test conditions)
Effect: Lowers reliability (the consistency you learned to assess in PSYC206)
CTT insight: Averaging across multiple items or occasions reduces random error impact
Systematic Error
Definition: Consistent bias in one direction (poor translation, culturally inappropriate wording,
social desirability)
Effect: Threatens validity (especially construct validity from your PSYC206 framework)
CTT insight: Unlike random error, systematic error doesn’t cancel out with more items; detecting it
requires theoretical and empirical investigation
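Sketch (simulation): the two CTT insights above can be checked with a small simulation. This is only an illustration under made-up assumptions: a true score of 50, random error with SD 10, and a constant bias of 3 points standing in for systematic error. All names and numbers are hypothetical.

```python
import random
import statistics

def simulate_mean_score(true_score, n_items, random_sd, bias, rng):
    """Average n_items item scores, each X = T + bias + random error."""
    items = [true_score + bias + rng.gauss(0, random_sd) for _ in range(n_items)]
    return statistics.fmean(items)

def error_summary(n_items, bias, n_people=2000, seed=1):
    """Return (mean error, SD of error) of the averaged score across simulated people."""
    rng = random.Random(seed)
    true_score = 50.0  # assumed true score, identical for everyone for simplicity
    errors = [
        simulate_mean_score(true_score, n_items, 10.0, bias, rng) - true_score
        for _ in range(n_people)
    ]
    return statistics.fmean(errors), statistics.stdev(errors)

mean_err_1, sd_err_1 = error_summary(n_items=1, bias=3.0)
mean_err_20, sd_err_20 = error_summary(n_items=20, bias=3.0)

# The spread of the error (random error) shrinks as items are averaged,
# while the mean error (systematic bias) stays near 3 regardless.
print(f"1 item:   mean error {mean_err_1:.2f}, SD of error {sd_err_1:.2f}")
print(f"20 items: mean error {mean_err_20:.2f}, SD of error {sd_err_20:.2f}")
```

With more items the SD of the error drops, but the mean error stays close to the bias. Adding items helps reliability and does nothing for this kind of validity threat.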
Cronbach’s α in terms of internal consistency
Estimates the proportion of observed-score variance that is “true score” variance, but
assumes all items reflect the true score equally strongly (this assumption is called tau-equivalence)
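Sketch (computing α): alpha is usually computed as α = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. A minimal version, assuming complete data with one row per respondent; the function name and the tiny data set are invented for illustration.

```python
import statistics

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of rows (one row per respondent, k items each)."""
    k = len(item_scores[0])
    # Sample variance of each item (column) across respondents
    item_vars = [statistics.variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of each respondent's total (summed) score
    total_var = statistics.variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 5 respondents x 3 items on a 1-5 scale
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

In practice you would use a statistics package rather than hand-rolled code; the point here is only to show which variances go into the coefficient.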
McDonald’s ω in terms of internal consistency
A more sophisticated alternative to alpha based on factor analysis principles. Omega does not assume tau-equivalence.
Practical guidance: Report both α and ω when
possible; omega is increasingly preferred in methodologically rigorous research
Why isn’t higher reliability always better?
Reliability follows the Goldilocks principle: not too low (< .70 suggests measurement problems), not too high (> .95 might indicate redundant items rather than
excellent measurement), but “just right” for your research context. There is no universally “perfect” value
for reliability coefficients.
Test-Retest Reliability
CTT logic: If the latent variable is stable and measurement error is truly random, scores should
remain consistent over time.
Poor test-retest reliability? This could indicate measurement error OR a genuine change in the
construct.
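Sketch (simulation): under the CTT logic above, if true scores are perfectly stable and error is purely random, the correlation between two testing occasions should approach true-score variance divided by observed-score variance. The simulation below assumes normally distributed true scores (mean 100, SD 15) and two error levels; all names and values are hypothetical.

```python
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5

def simulate_retest_r(error_sd, n_people=5000, seed=7):
    """Correlate two occasions that share stable true scores but have fresh random error."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(100, 15) for _ in range(n_people)]
    time1 = [t + rng.gauss(0, error_sd) for t in true_scores]
    time2 = [t + rng.gauss(0, error_sd) for t in true_scores]
    return pearson_r(time1, time2)

r_low_error = simulate_retest_r(error_sd=5)    # expected near 225 / (225 + 25) = .90
r_high_error = simulate_retest_r(error_sd=15)  # expected near 225 / (225 + 225) = .50
print(f"small error: r = {r_low_error:.2f}; large error: r = {r_high_error:.2f}")
```

Here the construct never changes, so any drop in the correlation comes from measurement error alone. With real data a low correlation cannot be read this way, because genuine change in the construct produces the same pattern.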
Inter-Rater Reliability
CTT logic: Different raters should reach similar conclusions if they’re observing the same true
score with minimal measurement error.
Systematic differences between raters suggest systematic error, not just random error.
Psychometrics
The science of developing (and evaluating) reliable and valid measures of unobservable variables.
Variables
Observable characteristics that differ between or change within our object of study
Values
The actual measurements or observations recorded for a variable (the levels it takes)
Score (X)
One specific observed value of a variable for a single participant
Data
Groups of scores across one or more variables
Constructs
Concepts that we infer to exist from our theories (e.g. depression, attitudes, self-esteem, personality)
Latent variables = constructs = variables we cannot directly observe
Scale
A collection of related items designed to measure a latent variable
bivariate regression
Effect size
The degree to which the phenomenon is present in the population.
The strength of the relationship between two or more variables
The amount of anything that’s of research interest
Raw vs Standardised effect size
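Sketch (raw vs standardised): one common way to illustrate the contrast is a raw mean difference (in the original units of the measure) next to the same difference divided by a pooled standard deviation, often reported as Cohen’s d. This is an assumption about what the heading refers to, and the two groups below are invented data.

```python
import statistics

def raw_and_standardised_difference(group_a, group_b):
    """Return (raw mean difference, difference in pooled-SD units)."""
    raw = statistics.fmean(group_a) - statistics.fmean(group_b)
    na, nb = len(group_a), len(group_b)
    # Pooled variance: sample variances weighted by their degrees of freedom
    pooled_var = (
        (na - 1) * statistics.variance(group_a)
        + (nb - 1) * statistics.variance(group_b)
    ) / (na + nb - 2)
    return raw, raw / pooled_var ** 0.5

# Hypothetical scores for two groups on the same scale
group_a = [12, 14, 16, 18, 20]
group_b = [10, 12, 14, 16, 18]
raw_diff, std_diff = raw_and_standardised_difference(group_a, group_b)
print(f"raw difference = {raw_diff:.1f} points; standardised difference = {std_diff:.2f} SDs")
```

The raw value is easy to interpret when the units are meaningful. The standardised value lets you compare effects across measures that use different scales.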
Confidence interval (CI)
An estimated range of values that seem reasonable based on what we’ve observed. The center is still the sample mean, but we’ve got some room on either side for our uncertainty.
A 95% CI tells us that if we calculated the confidence interval from 100 different samples, about 95 of those intervals would contain the true population mean.
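Sketch (simulation): the “about 95 out of 100” interpretation can be checked by drawing many samples from a population whose mean we know and counting how often the interval captures it. The population values (mean 100, SD 15), the sample size of 50 and the use of 1.96 standard errors (a large-sample approximation) are all assumptions made for illustration. The sketch uses 1,000 samples rather than 100 so the proportion is more stable.

```python
import random
import statistics

def ci_95(sample):
    """Approximate 95% CI for the mean: sample mean +/- 1.96 standard errors."""
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return m - 1.96 * se, m + 1.96 * se

def coverage(true_mean=100.0, sd=15.0, n=50, n_samples=1000, seed=42):
    """Proportion of simulated samples whose CI contains the true population mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        low, high = ci_95(sample)
        hits += low <= true_mean <= high  # True counts as 1
    return hits / n_samples

proportion_covering = coverage()
print(f"Proportion of intervals containing the true mean: {proportion_covering:.3f}")
```

Each individual interval either contains the true mean or it doesn’t. The 95% describes how often the procedure succeeds across repeated samples.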