What is validity?
A judgment or estimate of how well a test measures what it’s supposed to measure in a particular context
What is the relationship between validity and reliability?
Reliability is required but not sufficient for validity
What is validation? Who plays a role in it?
The process of gathering and evaluating evidence about validity
This can be done by both test developers and test users
What is local validation?
When test users aim to determine the validity of a test within their own local settings or conditions, using their own group of test takers
What are the 3 main categories of validity (from easiest to hardest to establish)?
Content → Criterion-Related → Construct
What is Content Validity?
How well a test samples behaviors that are representative of the broader set of behaviors that it’s designed to measure
In other words, it measures how well test items/topics adequately represent the content that should be included based on the operational definition being used
What is Face Validity?
A form of content validity, it is a judgment concerning how relevant the test items appear to be on the face of it
This is the simplest form of validity to establish, but some tests are intentionally designed to have low levels of it
What is a Test Blueprint?
Part of the process of creating content validity, it is a plan regarding the types of information covered by the items, the number of items tapping into each area of coverage, and the organization of the items in the test
How do we typically establish content validity?
Expert panels: obtain expert ratings on the degree of item importance and scrutinize what is missing from the measure
Focus Groups: having the general population react to the measure
What is Criterion-Related Validity?
Evaluates the relationship between scores obtained on one test and scores obtained on other tests or measures
What is a criterion? What does it need in order to be adequate?
A standard against which a test or test score is evaluated
Must be…
1- relevant to the matter at hand
2- valid for the purpose for which it's being used
3- uncontaminated, as in it cannot be a part of the predictor
What ways can we establish criterion-related validity (in order from easiest to hardest to establish)?
1 - Concurrent validity
2 - Predictive validity
3 - Incremental validity
What is Concurrent Validity?
The degree to which a test score is related to some criterion measure obtained at the SAME time
What is Predictive Validity?
The degree to which a test score predicts some criterion measure (or outcome) obtained at a FUTURE time
What is a Base Rate? How does it influence predictive validity?
The extent to which a phenomenon exists in the population
The less frequent it is, the more difficult it would be to show predictive validity
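A minimal sketch of why this is so, using hypothetical numbers: hold a test's sensitivity (true-positive rate) and specificity (true-negative rate) fixed, and watch how the proportion of correct positive predictions collapses when the base rate drops.

```python
# Sketch (hypothetical numbers, not from the text): with sensitivity and
# specificity held constant, a rare phenomenon makes positive predictions
# far less trustworthy than a common one.

def positive_predictive_value(base_rate, sensitivity=0.90, specificity=0.90):
    """Proportion of positive predictions that are true positives."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

# Common phenomenon (base rate 50%): most positive calls are correct.
common = positive_predictive_value(base_rate=0.50)   # 0.90
# Rare phenomenon (base rate 1%): the same test is now wrong on most positive calls.
rare = positive_predictive_value(base_rate=0.01)     # ~0.083

print(round(common, 3), round(rare, 3))
```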
What is the Hit Rate when establishing predictive validity? What are the two kinds?
The ability of the measure to accurately predict results
Two possibilities…
1- true positive
2- true negative
What is the Miss Rate when establishing predictive validity? What are the two kinds?
Failure to identify something accurately
Two possibilities…
1- False positive or Type I error: saying that something will happen and then it does not
2- False negative or Type II error: saying that something will not happen and then it does
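The four outcomes above can be tallied directly from predictions versus actual outcomes. A sketch with made-up data (1 = phenomenon present, 0 = absent):

```python
# Sketch (made-up data): counting hits (true positives/negatives) and
# misses (false positives/negatives) for a predictor against outcomes.

predicted = [1, 1, 0, 0, 1, 0, 1, 0]
actual    = [1, 0, 0, 0, 1, 1, 1, 0]

true_pos  = sum(p == 1 and a == 1 for p, a in zip(predicted, actual))  # hit
true_neg  = sum(p == 0 and a == 0 for p, a in zip(predicted, actual))  # hit
false_pos = sum(p == 1 and a == 0 for p, a in zip(predicted, actual))  # Type I miss
false_neg = sum(p == 0 and a == 1 for p, a in zip(predicted, actual))  # Type II miss

hit_rate  = (true_pos + true_neg) / len(actual)
miss_rate = (false_pos + false_neg) / len(actual)

print(hit_rate, miss_rate)
```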
What is the Validity Coefficient? What is it affected by?
A correlation coefficient between test scores and scores on the criterion measure
Affected by restriction or inflation of range
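A sketch with made-up scores: the validity coefficient is just the Pearson r between test and criterion scores, and restricting the sample's range (e.g., keeping only the top half of test takers) typically shrinks it.

```python
from math import sqrt

# Sketch (made-up scores): Pearson r as a validity coefficient, and the
# effect of restriction of range on it.

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x)
                      * sum((b - my) ** 2 for b in y))

test_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
criterion   = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

r_full = pearson_r(test_scores, criterion)                # full range
r_restricted = pearson_r(test_scores[5:], criterion[5:])  # top half only

print(round(r_full, 2), round(r_restricted, 2))
```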
What is Incremental Validity?
The degree to which an additional predictor explains something about the criterion measures that is not explained by predictors already being used
Essentially saying "this test adds to the prediction of the criterion beyond the tests already in use"
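A sketch with assumed correlations (hypothetical values, not from the text): with two predictors, the criterion variance they jointly explain can be computed from the pairwise correlations, and incremental validity is the gain over the first predictor alone.

```python
# Sketch (hypothetical correlations): incremental validity of a new test
# (predictor 2) over a test already in use (predictor 1), via the standard
# two-predictor multiple-correlation formula.

def r_squared_two_predictors(r1, r2, r12):
    """R^2 of the criterion on both predictors.
    r1, r2: each predictor's validity coefficient;
    r12: the correlation between the two predictors."""
    return (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)

r1, r2, r12 = 0.50, 0.40, 0.30          # assumed values for illustration
r2_both = r_squared_two_predictors(r1, r2, r12)
incremental = r2_both - r1**2           # variance explained beyond predictor 1

print(round(r2_both, 3), round(incremental, 3))
```

Note that if the new predictor were highly correlated with the old one (r12 near 1), the incremental gain would approach zero even if its own validity coefficient were respectable.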
What is Construct Validity? How do we acquire evidence for it?
The ability of a test to measure a theorized construct. Essentially, does the measure map onto the THEORY the way we would expect it to (as in, do high scorers and low scorers behave as theorized?)
Establishing content and criterion-related validity will also provide evidence for construct validity, but construct validity requires additional evidence beyond those
What are the different forms of evidence for construct validity? (7 things)
1- evidence of homogeneity
2- evidence of changes
3- evidence of pretest/posttest changes
4- evidence from distinct groups
5- convergent evidence
6- discriminant evidence
7- factor analysis
What is evidence of homogeneity?
How uniform a test is in measuring a single construct (established using evidence from internal reliability)
Ex: if I believe that my construct is narrow, then my internal consistency should be high
What is evidence of changes?
Whether the construct changes over time in the way it's expected to, established using evidence from test-retest reliability
What is evidence of posttest or retest changes?
Test scores change as a result of some kind of experience or intervention between pretest and posttest, established using evidence from dynamic assessment