Why is it not strictly accurate to talk about the validity of a test (hint: one test could be used in more than one context)?
It is not strictly accurate to talk about the validity of a test because test scores can be used and interpreted in more than one way. Validity is therefore evaluated for a particular use of the scores in a particular context: the question is how persuasive the evidence is that the test measures what it is claimed to measure in that context.
What are constructs?
Constructs are unobservable, underlying, hypothetical traits or characteristics that we can try to measure indirectly using tests.
What is construct underrepresentation?
The part of the variation in the underlying construct that the measure fails to capture.
What is construct irrelevant variance?
The part of the variation captured by a measure that is unrelated to the construct.
What’s the difference between content and face validity?
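There is no difference in kind between content and face validity: both are judgments about whether the test content appears to cover what it is supposed to measure. The only difference is whose judgment it is. Content validity is usually based on the opinions of experts, whereas face validity tends to be based on the opinion of the person taking the test.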
Give your own example of a testing situation where content validity would be of central importance.
A competence test, for example a hazard perception test, where the test content needs to cover the skills the test-taker is being certified as having.
How could I create a university examination that had great empirical validity but poor content validity?
By including questions that successfully predict students' understanding of the course without being about the content covered in the lectures.
Why would it not be okay for the PSYC3020 quizzes to have poor content validity (even if they could reliably tell apart high and low scoring students)?
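Because it would not be fair: students are entitled to be assessed on the content the course actually taught, and quizzes with poor content validity would be assessing them on something else.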
How could you go about evaluating content validity?
By asking a group of experts for their opinions on whether the test content covers what it is supposed to measure.
List five types of validity.
Content validity, face validity, convergent validity, discriminant validity, and criterion validity.
What does hypothesis testing have to do with evaluating the validity of a test?
To design an empirical validation study, you need to form hypotheses about how the measure ought to perform if it is valid, and then test those hypotheses against data.
Is reliability necessary for validity?
Yes
Is validity necessary for reliability?
No
Explain why reliability is necessary for validity but validity is not necessary for reliability.
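Without reliability there is nothing stable enough to be valid or not valid, so reliability is necessary for validity. Validity is not necessary for reliability, because a test can measure something consistently without measuring what it is intended to measure.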
Give two examples of things that might restrict the range of scores in a test and indicate what influence this could have on the validity coefficient.
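Non-random attrition and self-selection. Both can restrict the range of scores in the sample, which tends to reduce (attenuate) the validity coefficient.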
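The effect of a restricted range on a validity coefficient can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are made up, the criterion is modelled as the test score plus unrelated noise, and the restriction is modelled by keeping only people who scored above the mean. The helper name pearson_r is introduced here for illustration.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data (assumption): criterion = test score + unrelated noise.
test_scores = [rng.gauss(0, 1) for _ in range(5000)]
criterion = [t + rng.gauss(0, 1) for t in test_scores]

# Validity coefficient in the full, unrestricted sample.
r_full = pearson_r(test_scores, criterion)

# Restricted range: keep only people who scored above the mean,
# as might happen if only high scorers volunteer or stay in the study.
kept = [(t, c) for t, c in zip(test_scores, criterion) if t > 0]
r_restricted = pearson_r([t for t, _ in kept], [c for _, c in kept])

print(f"full range:       r = {r_full:.2f}")
print(f"restricted range: r = {r_restricted:.2f}")
```

The relationship between test and criterion is the same in both samples, yet the restricted sample should give a noticeably smaller coefficient, simply because there is less variation in test scores for the criterion to track.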
Give an example of how self-selection during participant recruitment might restrict the range of test scores.
Self-selection may restrict the range of test scores because the people who choose to take part may not be distributed like the real-world population: part of the population (for example, people who expect to score poorly) may be missing from the sample.
How big does a validity coefficient have to be for a test to be considered valid?
The required magnitude depends entirely on the context – there is no specific cutoff.
What is criterion validity?
Criterion validity is evidence that test scores correspond to an accurate measure of the outcome of interest (the criterion).
Give your own example of criterion validity.
Give a new test to 50 surgeons, obtain real-world patient outcomes for each surgeon, and compare the test scores with those outcomes.
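The comparison in the surgeon example is usually summarised as a correlation between test scores and the criterion (the validity coefficient). The sketch below assumes that approach and uses invented numbers for eight hypothetical surgeons rather than 50; the outcome measure (percentage of operations without complications) and the helper name pearson_r are also assumptions made for illustration.

```python
def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Invented data for eight hypothetical surgeons (not real results).
test_scores = [55, 62, 70, 48, 81, 66, 74, 59]
# Assumed criterion: percentage of operations without complications.
patient_outcomes = [88, 90, 93, 85, 97, 91, 94, 89]

# The validity coefficient is the test-criterion correlation.
validity_coefficient = pearson_r(test_scores, patient_outcomes)
print(f"validity coefficient: r = {validity_coefficient:.2f}")
```

A positive coefficient here would count as evidence of criterion validity; how large it needs to be depends on the context in which the test will be used.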
Give three examples of criterion measures that could be used to evaluate the criterion validity of relevant tests.
GPA at the end of first year to validate a university admissions test; supervisors' ratings of job performance to validate a clerical aptitude test; panel ratings of the creativity displayed in artistic products to validate a test of creative thinking.
What is the criterion variable used to evaluate criterion validity?
The criterion is the standard against which the test is evaluated.
What is the method of contrasted groups?
An approach to criterion validity in which you check whether the test scores of different groups of people differ in the way you would expect.
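As a rough sketch of the method of contrasted groups, the code below compares two groups that would be expected to differ on a hazard perception test. The groups (experienced versus novice drivers), the scores, and the choice of Cohen's d with a pooled standard deviation as the summary are all assumptions made for illustration, not a prescribed procedure.

```python
from statistics import mean, variance

# Invented scores for two hypothetical groups (not real data).
experienced = [78, 85, 90, 72, 88, 81]
novice = [60, 66, 71, 58, 75, 63]

# Hypothesis: if the test is valid, experienced drivers should score higher.
mean_difference = mean(experienced) - mean(novice)

# Cohen's d: mean difference divided by the pooled standard deviation.
n1, n2 = len(experienced), len(novice)
pooled_variance = (
    (n1 - 1) * variance(experienced) + (n2 - 1) * variance(novice)
) / (n1 + n2 - 2)
cohens_d = mean_difference / pooled_variance ** 0.5

print(f"mean difference: {mean_difference:.1f}")
print(f"Cohen's d:       {cohens_d:.2f}")
```

If the groups differ in the predicted direction, that supports the test's validity; if they do not, either the test or the hypothesis about the groups is in doubt.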
What is criterion contamination?
Where the criterion used to assess the validity of a test is itself determined, at least in part, by that test, thereby undermining the logic of criterion validity.
Give an example of criterion contamination when evaluating criterion validity.
Validating a test for schizophrenia by seeing whether it can tell people who have been diagnosed from people who have not, and then discovering that those people were diagnosed using the very test being validated (circular).