What is validity in psychological testing?
How well a test measures what it purports to measure in a particular context.
What assumption does validity rely on?
That the test is reliable — it must measure something consistently before validity can be established.
Is validity a property of the test itself?
No — it depends on the application, population, and purpose of the test.
Why might test users conduct their own validation studies?
To ensure the test is valid for their specific group or context.
What is a local validation study?
A study done when a test is adapted for a specific group, e.g., converting a standardized test into Braille for visually impaired examinees.
What are the seven dimensions of validity?
What is a criterion in validation research?
An external indicator of what the test measures.
Examples:
- Another test score
- An observable behavior
- A physiological finding
- A future event or outcome
- Performance or achievement in a relevant task
- A “gold standard” score or rating
- An expert’s appraisal
- An informant’s rating
How is criterion validity typically assessed?
Using correlation coefficients between test scores and criterion measures (or correct classification rates for categorical outcomes).
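A minimal sketch of both approaches, assuming made-up data: the scores, the criterion values, and the cutoff of 14 are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

# Hypothetical data: test scores and a continuous criterion for 8 examinees.
test_scores = np.array([12, 15, 9, 20, 17, 11, 14, 19], dtype=float)
criterion = np.array([2.9, 3.4, 2.5, 3.9, 3.6, 2.8, 3.1, 3.7])


def validity_coefficient(x, y):
    """Pearson correlation between test scores and a criterion measure."""
    return float(np.corrcoef(x, y)[0, 1])


def correct_classification_rate(predicted, actual):
    """Proportion of cases where the test's classification matches the criterion."""
    predicted = np.asarray(predicted)
    actual = np.asarray(actual)
    return float(np.mean(predicted == actual))


r = validity_coefficient(test_scores, criterion)

# Hypothetical cutoff: scores of 14 or higher are classified as "pass".
predicted_pass = test_scores >= 14
# Hypothetical categorical criterion (e.g., whether each examinee actually passed).
actual_pass = np.array([False, True, False, True, True, True, False, True])
rate = correct_classification_rate(predicted_pass, actual_pass)

print(f"validity coefficient r = {r:.2f}")
print(f"correct classification rate = {rate:.2f}")
```

A higher r (or a higher classification rate) suggests stronger criterion-related evidence, but what counts as "high enough" depends on the purpose of the test.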
What is concurrent validity?
When a test and a criterion are measured at essentially the same time.
EX. A psychological test is used to measure a current state such as stress, and blood pressure or heart rate variability is measured at the same time as the criterion.
What is predictive validity?
When a test is used to forecast a criterion observed well after the test was given, i.e., when a test predicts a future criterion. (The term is also sometimes used loosely as a synonym for criterion validity in general.)
EX. An entry test is used to predict success in grad school and is validated by examining future grades, publications, and completion.
What is incremental validity?
The added validity/predictive power when we add a new measure to an existing battery (i.e. how much more do I know about the construct when I add another measure?)
EX. If I add a clinician rating of affect to my assessment, which already includes a depression self-report measure, can I more accurately identify depression?
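One common way to put a number on incremental validity is the change in R² when the new measure is added to a regression that already contains the existing one. The sketch below assumes simulated data; the variable names (`self_report`, `clinician_rating`, `severity`) and the effect sizes are hypothetical.

```python
import numpy as np


def r_squared(predictors, outcome):
    """R^2 from an ordinary least-squares fit that includes an intercept."""
    X = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    residuals = outcome - X @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((outcome - outcome.mean()) ** 2)
    return float(1 - ss_res / ss_tot)


# Simulated (hypothetical) data for 200 examinees.
rng = np.random.default_rng(0)
n = 200
self_report = rng.normal(size=n)
clinician_rating = 0.5 * self_report + rng.normal(size=n)
severity = 0.6 * self_report + 0.4 * clinician_rating + rng.normal(size=n)

# Step 1: existing battery only (self-report measure).
r2_base = r_squared(self_report.reshape(-1, 1), severity)
# Step 2: existing battery plus the new measure (clinician rating).
r2_full = r_squared(np.column_stack([self_report, clinician_rating]), severity)

# Incremental validity: variance explained beyond the existing battery.
delta_r2 = r2_full - r2_base
print(f"R^2 base = {r2_base:.3f}, R^2 full = {r2_full:.3f}, delta = {delta_r2:.3f}")
```

If the change in R² is close to zero, the new measure adds little beyond what the existing battery already captures.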
What is face validity?
How relevant or appropriate test items appear to be “on the face of it.”
When might you not want a test to have face validity?
→ When measuring something unflattering
→ When testing for things people may lie about (e.g., malingering or factitious disorder)
Is face validity scientific evidence of test validity?
No — it’s based on judgment or appearance, not empirical data.
What is content validity?
A judgment of how adequately a test samples behavior representative of the universe of behavior it was designed to measure.
What does content validity depend on?
Accurate domain sampling: ensuring test items cover the entire construct being measured.
→ Poor domain sampling can lower both reliability and validity
→ Also requires clear, non-confusing items and the exclusion of irrelevant items
EX. A math test that includes only geometry questions and ignores algebra or statistics has poor content validity.
What is construct validity?
A judgment about whether test scores appropriately reflect an individual’s standing on the construct being measured.
→ Umbrella term for all kinds of validity
What is a construct?
A scientific concept or idea used to describe or explain behavior (e.g., intelligence, anxiety, motivation).
What provides evidence for construct validity?
Anything that supports the idea that the test accurately captures the intended construct — correlations, group differences, developmental trends, etc.
What are the types of evidence for construct validity?
Convergent evidence
Discriminant evidence
What is convergent evidence?
Scores on the test tend to correlate highly in the positive direction with scores on other tests designed to measure the same (or a similar) construct, and in the negative direction with scores on tests measuring the opposite construct.
EX. A depression test should correlate highly with other depression tests. Because depression and anxiety often occur together, it should also show a moderate correlation with anxiety tests.
What is discriminant evidence?
A validity coefficient showing little relationship between test scores and other variables with which scores on the test should not theoretically be correlated.
→ You don’t want your test to be correlated with other tests that measure something unrelated to what you’re measuring
→ Happiness/motivation is not discriminant evidence for a depression test: depression involves a lack of happiness, so the two are negatively correlated, but still correlated
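The expected pattern for convergent and discriminant evidence can be sketched with simulated scores. Everything here is hypothetical: two depression tests share a latent construct, an anxiety test is partly related to it, and `shoe_size` stands in for a theoretically unrelated variable.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical latent depression level that both depression tests tap into.
latent_depression = rng.normal(size=n)

depression_test_a = latent_depression + 0.5 * rng.normal(size=n)
depression_test_b = latent_depression + 0.5 * rng.normal(size=n)
# Anxiety is simulated as partly related to depression.
anxiety_test = 0.5 * latent_depression + rng.normal(size=n)
# A variable that should not theoretically relate to depression.
shoe_size = rng.normal(size=n)


def corr(x, y):
    """Pearson correlation between two score vectors."""
    return float(np.corrcoef(x, y)[0, 1])


r_convergent = corr(depression_test_a, depression_test_b)  # same construct
r_related = corr(depression_test_a, anxiety_test)          # related construct
r_discriminant = corr(depression_test_a, shoe_size)        # unrelated variable

print(f"same construct:     r = {r_convergent:.2f}")
print(f"related construct:  r = {r_related:.2f}")
print(f"unrelated variable: r = {r_discriminant:.2f}")
```

In this setup the same-construct correlation should be high, the related-construct correlation moderate, and the unrelated-variable correlation near zero.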
What are some other forms of evidence?
- Homogeneity
- Changes with age
- Pretest-posttest changes
- Distinct groups