Reliability, Validity, Utility Flashcards

(48 cards)

1
Q

TRUE OR FALSE?
Psychological assessments are only useful if the tests we use are consistent, accurate, and practical.

A

TRUE.

2
Q

THREE MAJOR CONCEPTS
Consistency

A

Reliability

3
Q

THREE MAJOR CONCEPTS
Accuracy

A

Validity

4
Q

THREE MAJOR CONCEPTS
Practical usefulness

A

Utility

5
Q

Consistency of measurement; the degree to which test scores are stable, dependable, and free from random error.

A

Reliability

6
Q

Key Idea: If I measure the same thing again, will I get the same result?

A

Reliability

7
Q

TYPES OF RELIABILITY
Same test given at two different times; should produce similar scores.

Example: Taking an IQ test in January and again in February.

A

Test-Retest Reliability

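As a numeric companion to this card: test-retest reliability is typically reported as the Pearson correlation between the two administrations. A minimal sketch in Python; the scores below are invented for illustration, not taken from the card.

```python
# Test-retest reliability: correlate scores from two sittings of the
# same test. An r close to 1.0 means scores are stable over time.
def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

january = [98, 105, 110, 121, 133]    # hypothetical first administration
february = [100, 104, 112, 119, 130]  # same examinees one month later

print(round(pearson_r(january, february), 2))  # near 1.0: highly reliable
```

In practice you would likely use a library routine such as `scipy.stats.pearsonr`; the hand-rolled version just makes the computation visible.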
8
Q

TYPES OF RELIABILITY
Agreement between different scorers/observers.

Example: Two clinicians rating the same patient’s behavior.

A

Inter-Rater Reliability

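Inter-rater agreement is often summarized with Cohen's kappa, which corrects raw percent agreement for the agreement expected by chance alone. (Kappa is not named on the card; it is one common statistic for this.) The two clinicians' ratings below are made up:

```python
# Cohen's kappa: chance-corrected agreement between two raters.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum(freq_a[l] * freq_b[l] for l in labels) / n ** 2
    return (observed - expected) / (1 - expected)

clinician_1 = ["anxious", "calm", "anxious", "calm", "anxious", "calm"]
clinician_2 = ["anxious", "calm", "anxious", "anxious", "anxious", "calm"]

print(round(cohens_kappa(clinician_1, clinician_2), 3))  # 0.667: well beyond chance
```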
9
Q

TYPES OF RELIABILITY
Two different but equivalent versions of a test; should give similar results.

Example: Version A and Version B of an exam.

A

Parallel-Forms Reliability

10
Q

TYPES OF RELIABILITY
Consistency of items within the same test; measured by Cronbach’s alpha.

Example: All items on a depression scale should relate to depression.

A

Internal Consistency

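Cronbach's alpha, mentioned on this card, can be computed straight from its definition: alpha = k/(k−1) · (1 − sum of item variances / variance of total scores). A sketch with hypothetical 1–5 responses on a four-item scale (rows = respondents, columns = items):

```python
# Cronbach's alpha: do items on the same scale vary together?
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(rows):
    k = len(rows[0])                 # number of items
    items = list(zip(*rows))         # column-wise item scores
    totals = [sum(r) for r in rows]  # each respondent's total score
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

responses = [  # hypothetical 4-item depression-scale answers
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

print(round(cronbach_alpha(responses), 3))  # 0.959; above ~0.7 is conventionally acceptable
```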
11
Q

Accuracy of measurement; the degree to which a test measures what it is supposed to measure.

A

Validity

12
Q

Key Idea: Am I measuring the right thing?

A

Validity

13
Q

TYPES OF VALIDITY
Does the test cover all relevant aspects of the concept?

Example: A math test that only asks about addition has poor content validity.

A

Content Validity

14
Q

TYPES OF VALIDITY
How well test scores correspond to an external standard (the criterion), either measured at the same time or in the future.

A

Criterion-Related Validity

15
Q

TYPES OF VALIDITY | CRITERION-RELATED VALIDITY
Measured at the same time (e.g., depression test vs. clinical diagnosis).

A

Concurrent Validity

16
Q

TYPES OF VALIDITY | CRITERION-RELATED VALIDITY
Predicts future performance (e.g., SAT predicting college GPA).

A

Predictive Validity

17
Q

TYPES OF VALIDITY
Does the test really measure the theoretical construct it claims to measure?

Example: Does an anxiety scale truly capture anxiety and not just stress or shyness?

A

Construct Validity

18
Q

TYPES OF VALIDITY | CONSTRUCT VALIDITY
Test correlates with similar measures.

A

Convergent Validity

19
Q

TYPES OF VALIDITY | CONSTRUCT VALIDITY
Test does not correlate with unrelated measures.

A

Discriminant Validity

20
Q

Practical value of the test; practicality. The usefulness of a test in a real-world setting, considering both benefits and costs.

A

Utility

21
Q

Key Idea: Is this test worth using?

A

Utility

22
Q

FACTORS AFFECTING UTILITY
A test that isn’t consistent or accurate won’t be useful.

A

Reliability & Validity

23
Q

FACTORS AFFECTING UTILITY
Time, money, effort vs. value gained.

A

Costs vs. Benefits

24
Q

FACTORS AFFECTING UTILITY
Accessible and fair across groups.

A

Fairness

25
Q

FACTORS AFFECTING UTILITY
Easy to administer, score, and interpret.

A

Practicality

26
Q

FACTORS AFFECTING UTILITY
Identify whether this statement relates to Reliability/Validity, Costs vs. Benefits, Fairness, or Practicality:
An IQ test that gives widely different scores to the same student each time won't be useful.

A

Reliability

27
Q

FACTORS AFFECTING UTILITY
Identify whether this statement relates to Reliability/Validity, Costs vs. Benefits, Fairness, or Practicality:
A "creativity test" that actually measures vocabulary is not useful, even if it's consistent.

A

Validity

28
Q

FACTORS AFFECTING UTILITY
Identify whether this statement relates to Reliability/Validity, Costs vs. Benefits, Fairness, or Practicality:
A company considers a complex personality test for hiring. It costs P5,000 per applicant but only slightly improves hiring decisions. The cost outweighs the benefit, so the test has low utility.

A

Costs vs. Benefits

29
Q

FACTORS AFFECTING UTILITY
Identify whether this statement relates to Reliability/Validity, Costs vs. Benefits, Fairness, or Practicality:
A short, free online survey predicts job performance almost as well as the expensive one.

A

Costs vs. Benefits

30
Q

FACTORS AFFECTING UTILITY
Identify whether this statement relates to Reliability/Validity, Costs vs. Benefits, Fairness, or Practicality:
A math placement test written only in English may disadvantage students who are proficient in math but not fluent in English.

A

Fairness

31
Q

FACTORS AFFECTING UTILITY
Identify whether this statement relates to Reliability/Validity, Costs vs. Benefits, Fairness, or Practicality:
A well-designed nonverbal reasoning test avoids language bias, making it fairer and more useful.

A

Fairness

32
Q

FACTORS AFFECTING UTILITY
Identify whether this statement relates to Reliability/Validity, Costs vs. Benefits, Fairness, or Practicality:
A test requiring expensive equipment, three hours of administration, and advanced statistical scoring is impractical for most schools.

A

Practicality

33
Q

FACTORS AFFECTING UTILITY
Identify whether this statement relates to Reliability/Validity, Costs vs. Benefits, Fairness, or Practicality:
A 30-minute test with easy-to-score answer sheets is more practical and therefore more useful.

A

Practicality

34
Q

TRUE OR FALSE?
A test must be reliable to be useful, but reliability alone is not enough. A test must also be valid to ensure it measures what it claims. Finally, a test must have utility to be practical in real-world settings. Together, these three concepts determine whether a psychological test is worth using in practice.

A

TRUE.

35
Q

OTHER KEY CONCEPTS IN PSYCHOLOGICAL ASSESSMENT
The process of giving a test the same way every time to ensure fairness; includes norms (what's "average" for a group).

Example: IQ tests are standardized on large populations to establish what an "average" score is.

A

Standardization

36
Q

OTHER KEY CONCEPTS IN PSYCHOLOGICAL ASSESSMENT
Reference points used to interpret a score.

A

Norms

37
Q

TYPES OF NORMS
Compare a score with those of same-age peers.

A

Age Norms

38
Q

TYPES OF NORMS
Compare a score with those of students in the same grade.

A

Grade Norms

39
Q

TYPES OF NORMS
Tell the percentage of people who scored lower.

A

Percentile Ranks
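A percentile rank is simple to compute. One common convention, used below, counts only scores strictly below the given score; other conventions add half of any ties. The norm-group scores are made up:

```python
# Percentile rank: % of the norm group scoring below a given score.
def percentile_rank(score, norm_scores):
    below = sum(s < score for s in norm_scores)
    return 100 * below / len(norm_scores)

norm_group = [85, 90, 95, 100, 100, 105, 110, 115, 120, 130]

print(percentile_rank(110, norm_group))  # 60.0 -> scored above 6 of 10 peers
```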
40
Q

OTHER KEY CONCEPTS IN PSYCHOLOGICAL ASSESSMENT
No test is perfect; there is always some error. Errors may come from the test itself, the environment, or the test-taker (mood, fatigue).

Observed Score = True Score + Error

A

Errors in Measurement

41
Q

ERRORS IN MEASUREMENT
The score you actually get on a test (e.g., 85/100).

A

Observed Score

42
Q

ERRORS IN MEASUREMENT
Your real level of the trait being measured (e.g., your actual intelligence, ability, or depression level).

A

True Score

43
Q

ERRORS IN MEASUREMENT
Everything else that affects your score but is not part of the true ability (e.g., guessing, bad instructions, being tired, noise in the testing room).

A

Error
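The formula Observed Score = True Score + Error can be illustrated with a tiny simulation: if error is random with mean zero, repeated observed scores scatter around the true score, and their average converges to it. All numbers below are illustrative:

```python
# Classical test theory sketch: observed = true + random error.
import random

random.seed(0)
true_score = 85
# 10,000 hypothetical re-administrations, each with random error (sd = 3)
observed = [true_score + random.gauss(0, 3) for _ in range(10_000)]

mean_observed = sum(observed) / len(observed)
print(round(mean_observed, 1))  # close to 85: error averages out
```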
44
Q

OTHER KEY CONCEPTS IN PSYCHOLOGICAL ASSESSMENT
Examining test questions to see whether they are good measures; includes the difficulty index (easy vs. hard items) and the discrimination index (does the item distinguish between high and low scorers?).

A

Item Analysis

45
Q

ITEM ANALYSIS
Indicates whether an item is easy or hard.

A

Difficulty Index

46
Q

ITEM ANALYSIS
Indicates whether an item distinguishes between high and low scorers.

A

Discrimination Index
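Both indices from these cards can be sketched in a few lines: the difficulty index is the proportion answering the item correctly, and a simple discrimination index (one of several in use) is the difference in that proportion between the top- and bottom-scoring examinees on the whole test. The data are invented:

```python
# Item analysis for one question (1 = correct, 0 = wrong).
def difficulty(item_scores):
    return sum(item_scores) / len(item_scores)

def discrimination(item_scores, total_scores, group_n=3):
    # Rank examinees by total test score, then compare the item's
    # difficulty in the top group vs. the bottom group.
    ranked = sorted(zip(total_scores, item_scores), reverse=True)
    top = [item for _, item in ranked[:group_n]]
    bottom = [item for _, item in ranked[-group_n:]]
    return difficulty(top) - difficulty(bottom)

item = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]              # answers to one question
totals = [95, 88, 80, 75, 70, 60, 55, 50, 45, 40]  # total test scores

print(difficulty(item))             # 0.5 -> moderately hard
print(discrimination(item, totals)) # positive -> separates high from low scorers
```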
47
Q

OTHER KEY CONCEPTS IN PSYCHOLOGICAL ASSESSMENT
A test must not be unfair to groups based on culture, language, gender, or socioeconomic status.

Example: A math test with word problems written in complex English may disadvantage non-native speakers.

A

Fairness and Bias

48
Q

OTHER KEY CONCEPTS IN PSYCHOLOGICAL ASSESSMENT
Informed consent, confidentiality, appropriate use, test security. Tests are powerful tools; misuse can harm people.

A

Ethical Issues in Testing