Sample 1 Flashcards

(242 cards)

1
Q

Objective: Quantify a construct

A. Testing
B. Assessment

A

A. Testing

2
Q

Objective: Answer a referral question

A. Testing
B. Assessment

A

B. Assessment

3
Q

Focus: Nomothetic approach

A. Testing
B. Assessment

A

A. Testing

4
Q

Focus: Idiographic approach

A. Testing
B. Assessment

A

B. Assessment

5
Q

Process: Individual or group

A. Testing
B. Assessment

A

A. Testing

6
Q

Process: Individual only

A. Testing
B. Assessment

A

B. Assessment

7
Q

Outcome: Test score or psychometric report

A. Testing
B. Assessment

A

A. Testing

8
Q

Outcome: Psychological report

A. Testing
B. Assessment

A

B. Assessment

9
Q

Source: Test takers only

A. Testing
B. Assessment

A

A. Testing

10
Q

Source: Collateral sources

A. Testing
B. Assessment

A

B. Assessment

11
Q

Evaluator: Tester is not the key

A. Testing
B. Assessment

A

A. Testing

12
Q

Evaluator: Assessor is the key

A. Testing
B. Assessment

A

B. Assessment

13
Q

Duration: Shorter

A. Testing
B. Assessment

A

A. Testing

14
Q

Duration: Longer

A. Testing
B. Assessment

A

B. Assessment

15
Q

Cost: Inexpensive

A. Testing
B. Assessment

A

A. Testing

16
Q

Cost: Expensive

A. Testing
B. Assessment

A

B. Assessment

17
Q

Qualification: RPm

A. Testing
B. Assessment

A

A. Testing

18
Q

Qualification: RPsy

A. Testing
B. Assessment

A

B. Assessment

19
Q

Gathering and integration of psychology-related data for the purpose of making a psychological evaluation that is accomplished through the use of tools, tests, interviews, case studies, behavioral observation, and specially designed apparatuses

A. Psychological Testing
B. Remuneration
C. In-person check-in
D. Psychological Assessment

A

D. Psychological Assessment

20
Q

Interactive approach

A. Dynamic Psychological Assessment
B. Collaborative Psychological Assessment
C. Therapeutic Psychological Assessment

A

A. Dynamic Psychological Assessment

21
Q

Evaluation-intervention-evaluation (“sandwich method”)

A. Dynamic Psychological Assessment
B. Collaborative Psychological Assessment
C. Therapeutic Psychological Assessment

A

A. Dynamic Psychological Assessment

22
Q

Assessor and assessee may work as “partners” from the initial contact to the final feedback

A. Dynamic Psychological Assessment
B. Collaborative Psychological Assessment
C. Therapeutic Psychological Assessment

A

B. Collaborative Psychological Assessment

23
Q

Therapeutic self-discovery and new understandings throughout the process

A. Dynamic Psychological Assessment
B. Collaborative Psychological Assessment
C. Therapeutic Psychological Assessment

A

C. Therapeutic Psychological Assessment

24
Q

Use of tests and other tools to evaluate abilities and skills relevant to success or failure in a preschool or school context

A. Educational Assessment
B. Retrospective Assessment
C. Remote Assessment
D. Ecological Momentary Assessment

A

A. Educational Assessment

25
Q

Intelligence tests, achievement tests, reading comprehension tests

A. Educational Assessment
B. Retrospective Assessment
C. Remote Assessment
D. Ecological Momentary Assessment

A

A. Educational Assessment

26
Q

Use of evaluative tools to conclude psychological aspects of a person as they existed at some point in time before the assessment

A. Educational Assessment
B. Retrospective Assessment
C. Remote Assessment
D. Ecological Momentary Assessment

A

B. Retrospective Assessment

27
Q

Use of psychological tools to gather data and draw conclusions about subjects who are not in physical proximity to the person or people conducting the evaluation

A. Educational Assessment
B. Retrospective Assessment
C. Remote Assessment
D. Ecological Momentary Assessment

A

C. Remote Assessment

28
Q

“In the moment” evaluation of specific problems and related cognitive and behavioral variables at the very time and place that they occur

A. Educational Assessment
B. Retrospective Assessment
C. Remote Assessment
D. Ecological Momentary Assessment

A

D. Ecological Momentary Assessment

29
Q

Process of measuring psychology-related variables by means of devices or procedures designed to obtain a sample of behavior

A. Psychological Testing
B. Remuneration
C. In-person check-in
D. Psychological Assessment

A

A. Psychological Testing

30
Q

A device or procedure designed to measure variables related to psychology

A. Psychological Test
B. Psychological Assessment
C. Psychological Placebo
D. Scam

A

A. Psychological Test

31
Q

Subject matter

A. Content
B. Format
C. Item
D. Administration Procedure
E. Score

A

A. Content

32
Q

Form, plan, structure, arrangement, layout

A. Content
B. Format
C. Item
D. Administration Procedure
E. Score

A

B. Format

33
Q

A specific stimulus to which a person responds overtly, with that response being scored or evaluated

A. Content
B. Format
C. Item
D. Administration Procedure
E. Score

A

C. Item

34
Q

Individual basis or group administration

A. Content
B. Format
C. Item
D. Administration Procedure
E. Score

A

D. Administration Procedure

35
Q

Code or summary statement, usually but not necessarily numerical in nature, that reflects an evaluation of performance on a test

A. Content
B. Format
C. Item
D. Administration Procedure
E. Score

A

E. Score

36
Q

The process of assigning scores to performances

A. Scoring
B. Cut Score
C. Psychometric Soundness
D. Psychometrics
E. Psychometrician

A

A. Scoring

37
Q

A reference point, usually numerical, derived by judgment and used to divide a set of data into two or more classifications

A. Scoring
B. Cut Score
C. Psychometric Soundness
D. Psychometrics
E. Psychometrician

A

B. Cut Score

38
Q

Technical quality

A. Scoring
B. Cut Score
C. Psychometric Soundness
D. Psychometrics
E. Psychometrician

A

C. Psychometric Soundness

39
Q

The science of psychological measurement

A. Scoring
B. Cut Score
C. Psychometric Soundness
D. Psychometrics
E. Psychometrician

A

D. Psychometrics

40
Q

A professional who uses, analyzes, and interprets psychological test data

A. Scoring
B. Cut Score
C. Psychometric Soundness
D. Psychometrics
E. Psychometrician

A

E. Psychometrician

41
Q

Which statement best defines psychometric properties?

A. Ethical standards governing psychological practice
B. Statistical procedures for diagnosing disorders
C. Technical qualities and characteristics of psychological tests and assessments that determine scientific soundness and practical usefulness
D. Theoretical assumptions behind personality theories

A

C. Technical qualities and characteristics of psychological tests and assessments that determine scientific soundness and practical usefulness

42
Q

Psychometric properties are primarily used to determine which aspect of psychological tests?

A. Their cultural fairness
B. Their scientific soundness and practical usefulness
C. Their length and format
D. Their popularity among clinicians

A

B. Their scientific soundness and practical usefulness

43
Q

Why are psychometric properties crucial in psychological testing?

A. They ensure that a test measures what it intends to measure
B. They ensure that a test produces consistent results
C. They ensure that a test allows meaningful interpretation of scores
D. All of the above

A

D. All of the above

44
Q

Reliability in psychological testing refers to what concept?

A. Accuracy of score interpretation
B. Consistency in measurement
C. Validity of test content
D. Predictive usefulness

A

B. Consistency in measurement

45
Q

What is a reliability coefficient?

A. A score that reflects item difficulty
B. A qualitative judgment of test usefulness
C. An index of reliability indicating the ratio between true score variance and total score variance
D. A measure of examiner bias

A

C. An index of reliability indicating the ratio between true score variance and total score variance

46
Q

What is the numerical range of a reliability coefficient?

A. 0 to 1
B. −1 to +1
C. 0 to 100
D. 1 to 10

A

A. 0 to 1

47
Q

Reliability is synonymous with which pair of terms?

A. Accuracy and validity
B. Stability and objectivity
C. Consistency and dependability
D. Precision and fairness

A

C. Consistency and dependability

48
Q

What happens to a test’s reliability when a greater proportion of total variance is attributed to true variance?

A. Reliability decreases
B. Reliability remains unchanged
C. Reliability becomes unpredictable
D. Reliability increases

A

D. Reliability increases

49
Q

What does the true score formula conceptually represent?

A. Obtained score
B. Mean score
C. Reliability coefficient
D. Examiner judgment of performance
E. Relationship between all choices

A

E. Relationship between all choices

50
Q

In the true score formula, what does x represent?

A. The correlation coefficient
B. The obtained score
C. The true variance
D. The standard error

A

B. The obtained score

51
Q

In the true score formula, what does x̄ represent?

A. Error variance
B. Observed score
C. Mean score
D. Reliability index

A

C. Mean score

52
Q

In reliability analysis, what does Rxx represent?

A. Correlation coefficient
B. Ratio of variances
C. Mean score difference
D. Error component

A

A. Correlation coefficient

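The true score formula described in cards 49–52 (obtained score x, mean score x̄, and reliability coefficient Rxx) matches Kelley's estimated true score formula, T̂ = x̄ + Rxx(x − x̄). A minimal sketch with hypothetical score values:

```python
# Kelley's estimated true score formula (cards 49-52):
#   T_hat = x_bar + Rxx * (x - x_bar)
# where x is the obtained score, x_bar the mean score, and Rxx the
# reliability coefficient. All numbers below are illustrative.

def estimated_true_score(x: float, x_bar: float, rxx: float) -> float:
    """Regress an obtained score toward the group mean by the reliability."""
    return x_bar + rxx * (x - x_bar)

# A perfectly reliable test (Rxx = 1) leaves the obtained score unchanged;
# a totally unreliable one (Rxx = 0) regresses it all the way to the mean.
print(estimated_true_score(x=120, x_bar=100, rxx=0.90))  # 118.0
print(estimated_true_score(x=120, x_bar=100, rxx=0.0))   # 100.0
```

The lower the reliability, the more the estimate shrinks toward the mean, which is why obtained scores from unreliable tests should be interpreted cautiously.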
53
Q

One goal of reliability is to estimate what aspect of psychological testing?

A. Cultural bias
B. Errors in psychological measurement
C. Norm group representativeness
D. Examiner competence

A

B. Errors in psychological measurement

54
Q

Another goal of reliability is to develop techniques that serve what purpose?

A. Increase test difficulty
B. Shorten testing time
C. Reduce measurement errors
D. Improve score distributions

A

C. Reduce measurement errors

55
Q

In reliability theory, true variance is compared against which combination?

A. Standard deviation and mean
B. Observed score and reliability coefficient
C. Content variance and criterion variance
D. True score variance plus error variance

A

D. True score variance plus error variance

56
Q

Classical Test Theory assumes that a test score reflects which components?

A. Ability and motivation
B. Knowledge and practice effects
C. True score and examiner bias
D. True score and error

A

D. True score and error

57
Q

In Classical Test Theory, what does error represent?

A. Intentional test manipulation
B. A stable trait of the test taker
C. The component of the observed score unrelated to the ability being measured
D. Differences between test forms

A

C. The component of the observed score unrelated to the ability being measured

58
Q

Which equation represents the Classical Test Theory model?

A. X = V + R
B. X = T + E
C. T = X − V
D. E = X + T

A

B. X = T + E

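The Classical Test Theory model in cards 55–58 (X = T + E, with reliability as the ratio of true variance to total variance) can be illustrated with a small simulation; the score distributions below are hypothetical:

```python
import random
import statistics

random.seed(0)

# Classical Test Theory: each observed score is a stable true score plus
# random error (X = T + E), and reliability is the ratio of true-score
# variance to total observed variance. The distributions are illustrative:
# true scores ~ N(100, 15), error ~ N(0, 5).
true_scores = [random.gauss(100, 15) for _ in range(10_000)]
observed = [t + random.gauss(0, 5) for t in true_scores]  # X = T + E

# Reliability = Var(T) / Var(X); with these parameters the theoretical
# value is 15^2 / (15^2 + 5^2) = 0.90.
reliability = statistics.pvariance(true_scores) / statistics.pvariance(observed)
print(round(reliability, 2))
```

As card 48 states, the larger the share of total variance that is true variance, the closer this ratio gets to 1.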
59
Q

Variance is particularly useful in test theory for what purpose?

A. Describing sources of test score variability
B. Calculating raw scores
C. Identifying item difficulty
D. Ranking test takers

A

A. Describing sources of test score variability

60
Q

What does true variance represent?

A. Variance from irrelevant random sources
B. Variance from testing conditions
C. Variance from examiner differences
D. Variance from true differences

A

D. Variance from true differences

61
Q

Error variance refers to variance caused by what factor?

A. True ability differences
B. Irrelevant random sources
C. Consistent test content
D. Standardized administration

A

B. Irrelevant random sources

62
Q

Measurement error includes which factors?

A. Only test-taker abilities
B. Only examiner scoring decisions
C. All factors in measurement other than the variable being measured
D. Only random fluctuations

A

C. All factors in measurement other than the variable being measured

63
Q

Random error is best described as which type of source?

A. Unavoidable and caused by unpredictable fluctuations
B. Predictable and consistent
C. Intentional and controllable
D. Systematic and proportional

A

A. Unavoidable and caused by unpredictable fluctuations

64
Q

What distinguishes systematic error from random error?

A. It is unpredictable
B. It is avoidable if corrected
C. It is always larger
D. It cannot be identified

A

B. It is avoidable if corrected

65
Q

Systematic error is typically what in relation to the true value?

A. Completely unrelated
B. Randomly fluctuating
C. Constant or proportionate
D. Temporarily inconsistent

A

C. Constant or proportionate

66
Q

Item sampling or content sampling refers to what type of variation?

A. Variation in scoring procedures
B. Variation among test administrators
C. Variation in test-taker motivation
D. Variation among items within a test and between tests

A

D. Variation among items within a test and between tests

67
Q

Test environment factors include which of the following?

A. Room temperature
B. Lighting
C. Ventilation
D. Noise
E. All of the above

A

E. All of the above

68
Q

Which is an example of a test taker variable affecting reliability?

A. Examiner professionalism
B. Scoring rubrics
C. Room size
D. Lack of sleep

A

D. Lack of sleep

69
Q

Which of the following is considered a test taker variable?

A. Examiner’s demeanor
B. Effects of drugs
C. Item wording
D. Test timing

A

B. Effects of drugs

70
Q

Examiner-related variables include which factor?

A. Emotional problems of the test taker
B. Casual life experiences
C. Examiner’s physical appearance and nonverbal gestures
D. Content sampling

A

C. Examiner’s physical appearance and nonverbal gestures

71
Q

Professionalism is categorized under which source of error?

A. Test construction
B. Test taker variables
C. Test environment
D. Examiner-related variables

A

D. Examiner-related variables

72
Q

Internal reliability refers to what type of consistency?

A. Consistency across different raters
B. Consistency across time
C. Consistency within the test itself
D. Consistency across test versions

A

C. Consistency within the test itself

73
Q

Which example best illustrates internal reliability?

A. Two examiners giving similar scores
B. A pretest matching a posttest
C. Identical scores across schools
D. Different items expressing the same meaning

A

D. Different items expressing the same meaning

74
Q

External consistency evaluates reliability by comparing results across what dimensions?

A. Test items only
B. Individuals and time
C. Content areas
D. Scoring methods

A

B. Individuals and time

75
Q

A pretest–posttest comparison is an example of which type of reliability?

A. Internal reliability
B. Content validity
C. External consistency
D. Construct validity

A

C. External consistency

76
Q

Test–retest reliability is best described as a reliability estimate obtained by

A. correlating scores from two equivalent halves of a test
B. correlating scores from the same sample across two administrations
C. comparing scores from different groups taking the same test
D. analyzing item difficulty across test items

A

B. correlating scores from the same sample across two administrations

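Test–retest reliability (card 76) is just the Pearson correlation between the two administrations' scores. A self-contained sketch, with made-up score lists for the two sessions:

```python
import math

# Test-retest reliability: correlate scores from the same sample across two
# administrations of the same test. The two score lists below are
# hypothetical illustrative data, not from any real study.

def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson product-moment correlation between paired scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

time1 = [12, 15, 9, 20, 17, 11]   # first administration
time2 = [13, 14, 10, 19, 18, 12]  # second administration, same sample
print(round(pearson_r(time1, time2), 3))
```

A high positive coefficient indicates stable scores across the interval; as card 77 notes, longer intervals tend to lower it.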
77
Q

The coefficient of stability refers to the idea that

A. reliability increases when test difficulty decreases
B. reliability coefficients remain constant regardless of time
C. longer time intervals tend to reduce reliability coefficients
D. shorter tests always produce unstable scores

A

C. longer time intervals tend to reduce reliability coefficients

78
Q

Test sophistication occurs when

A. test takers improve due to coaching
B. items are remembered, especially difficult or confusing ones
C. scores are affected by item sampling error
D. test length is increased

A

B. items are remembered, especially difficult or confusing ones

79
Q

Test wiseness may affect test scores by

A. lowering reliability coefficients
B. increasing measurement error
C. inflating the apparent abilities of test takers
D. reducing internal consistency

A

C. inflating the apparent abilities of test takers

80
Q

Mortality in test–retest reliability refers to

A. item difficulty imbalance
B. loss of test materials
C. absence of some participants in the second session
D. failure to counterbalance test forms

A

C. absence of some participants in the second session

81
Q

In addressing mortality, the recommended action is to

A. remove the first test scores of absent participants
B. replace missing participants with new ones
C. shorten the test interval
D. recalculate item difficulty

A

A. remove the first test scores of absent participants

82
Q

Counterbalancing is used primarily to

A. increase internal consistency
B. reduce practice effects within items
C. ensure homogeneity of items
D. avoid carryover effects by varying test sequences

A

D. avoid carryover effects by varying test sequences

83
Q

McDonald’s Omega assesses how well items

A. predict future performance
B. measure different constructs
C. correlate with external criteria
D. consistently measure a single underlying construct

A

D. consistently measure a single underlying construct

84
Q

Rulon’s Formula is best described as a method that assesses test consistency by

A. comparing scores from two different tests
B. correlating item difficulty with total score
C. comparing scores obtained from two halves of the same test
D. evaluating agreement among multiple raters

A

C. comparing scores obtained from two halves of the same test

85
Q

Rulon’s Formula is considered a counterpart of the

A. Kappa formula
B. Kendall’s W
C. Cronbach’s alpha
D. Spearman-Brown formula

A

D. Spearman-Brown formula

86
Q

The primary purpose of splitting a test when applying Rulon’s Formula is to

A. create two equivalent halves
B. increase test difficulty
C. eliminate unreliable items
D. rank test takers

A

A. create two equivalent halves

87
Q

One commonly used way of dividing a test into two halves for Rulon’s Formula is

A. first half versus second half
B. easy items versus difficult items
C. odd-numbered items versus even-numbered items
D. objective items versus subjective items

A

C. odd-numbered items versus even-numbered items

88
Q

In Rulon’s Formula, one variance that must be calculated is the variance of the

A. item difficulties
B. total scores for each person
C. rater judgments
D. test administration times

A

B. total scores for each person

89
Q

Another required calculation in Rulon’s Formula is the variance of the

A. individual item scores
B. mean test scores
C. differences between scores on the two halves
D. percentile ranks

A

C. differences between scores on the two halves

90
Q

Odd–even reliability refers specifically to

A. alternating easy and hard items
B. splitting the test by content area
C. assigning odd-numbered items to one half and even-numbered items to the other
D. comparing two different test forms

A

C. assigning odd-numbered items to one half and even-numbered items to the other

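Cards 84–90 describe Rulon's split-half method: split the test (commonly odd vs. even items), then compare the variance of the half-score differences against the variance of total scores. A minimal sketch using a hypothetical item-response matrix:

```python
import statistics

# Rulon's split-half reliability:
#   r = 1 - Var(difference between halves) / Var(total score)
# Each row below is one person's hypothetical item responses
# (1 = correct, 0 = incorrect) on an eight-item test.
responses = [
    [1, 1, 1, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
]

totals, diffs = [], []
for person in responses:
    odd = sum(person[0::2])   # odd-numbered items (1, 3, 5, 7)
    even = sum(person[1::2])  # even-numbered items (2, 4, 6, 8)
    totals.append(odd + even)
    diffs.append(odd - even)

rulon = 1 - statistics.pvariance(diffs) / statistics.pvariance(totals)
print(round(rulon, 3))  # 0.837
```

When the two halves are truly equivalent the differences carry only error variance, which is why this ratio estimates reliability without the half-to-full-length correction that the Spearman-Brown approach applies.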
91
Q

Kappa statistics are primarily used when dealing with

A. interval data
B. nominal data
C. ratio data
D. continuous scores

A

B. nominal data

92
Q

Fleiss Kappa is most appropriate when measuring agreement among

A. two test halves
B. two raters
C. three or more raters
D. ranked data sets

A

C. three or more raters

93
Q

Cohen’s Kappa is used to determine agreement between

A. test items and total scores
B. two raters
C. multiple test forms
D. several ranking judges

A

B. two raters

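Cohen's Kappa (card 93) corrects two raters' raw agreement on nominal categories for the agreement expected by chance: kappa = (p_o − p_e) / (1 − p_e). A sketch with illustrative ratings:

```python
from collections import Counter

# Cohen's Kappa for two raters assigning nominal categories.
# p_o = observed proportion of agreement; p_e = proportion of agreement
# expected by chance from each rater's marginal category frequencies.
# The rating lists below are made-up illustrative data.

def cohens_kappa(rater1: list[str], rater2: list[str]) -> float:
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum((c1[cat] / n) * (c2[cat] / n) for cat in c1.keys() | c2.keys())
    return (p_o - p_e) / (1 - p_e)

r1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
r2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]
print(round(cohens_kappa(r1, r2), 2))  # 0.5
```

A kappa of 1 is perfect agreement and 0 is chance-level agreement; Fleiss' Kappa generalizes the same idea to three or more raters.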
94
Q

Kendall’s W is specifically designed for use with

A. nominal data
B. dichotomous data
C. rankings or ordinal data
D. interval-scale scores

A

C. rankings or ordinal data

95
Q

A restricted range of scores typically results in a

A. lower correlation coefficient
B. higher correlation coefficient
C. perfect correlation
D. negative correlation

A

A. lower correlation coefficient

96
Q

An inflated range of scores generally leads to a

A. lower correlation coefficient
B. higher correlation coefficient
C. perfect correlation
D. negative correlation

A

B. higher correlation coefficient

97
Q

A power test is characterized by

A. items of uniform difficulty
B. extremely short time limits
C. a time limit long enough for test takers to attempt all items
D. all test takers obtaining perfect scores

A

C. a time limit long enough for test takers to attempt all items

98
Q

In a power test, some items are designed so that

A. all test takers can answer them correctly
B. no time pressure is applied
C. guessing is eliminated
D. no test takers can obtain a perfect score

A

D. no test takers can obtain a perfect score

99
Q

A speed test contains items that are

A. increasingly difficult
B. varied in complexity
C. of uniform level of difficulty
D. based on ranking tasks

A

C. of uniform level of difficulty

100
Q

When given generous time limits, a defining feature of a speed test is that

A. all test takers should be able to complete all items correctly
B. only high-ability test takers finish
C. scores depend on item difficulty
D. perfect scores are impossible

A

A. all test takers should be able to complete all items correctly

101
Q

The main distinction between speed tests and power tests lies in

A. scoring procedures
B. item format
C. time limits and item difficulty
D. number of test items

A

C. time limits and item difficulty

102
Q

Which concept directly affects whether a correlation coefficient becomes higher or lower?

A. Test length
B. Restriction or inflation of range
C. Number of raters
D. Item numbering

A

B. Restriction or inflation of range

103
Q

Which statistical method is most appropriate for measuring agreement using nominal categories with two evaluators?

A. Fleiss Kappa
B. Kendall’s W
C. Rulon’s Formula
D. Cohen’s Kappa

A

D. Cohen’s Kappa

104
Q

Which statement best describes the main purpose of Domain Sampling Theory?

A. To explain how test scores change due to testing conditions
B. To model the probability that a person with a given ability can perform a task
C. To estimate how specific sources of variation contribute to test scores under defined conditions
D. To identify observable and unobservable traits in test performance

A

C. To estimate how specific sources of variation contribute to test scores under defined conditions

105
Q

The Domain of Behavior is best described as:

A. The set of test scores obtained under identical testing conditions
B. The universe of items that could conceivably measure a behavior
C. The observable outcomes of a latent trait
D. The specific facets involved in test administration

A

B. The universe of items that could conceivably measure a behavior

106
Q

Within Domain Sampling Theory, the Domain of Behavior is considered a:

A. Hypothetical construct
B. Measurable variable
C. Manifest trait
D. Statistical estimate

A

A. Hypothetical construct

107
Q

Generalizability Theory primarily explains test score variation as resulting from:

A. Differences in latent traits among individuals
B. Errors in prediction and estimation
C. The number of test items used
D. Variables in the testing situation

A

D. Variables in the testing situation

109
Q

In Generalizability Theory, the Universe refers to:

A. All possible latent traits being measured
B. The observable behaviors shown during testing
C. The probability model used to predict performance
D. The details of the particular test situation

A

D. The details of the particular test situation

110
Q

Which of the following is identified as a facet in Generalizability Theory?

A. The number of items included in the test
B. The unobservable trait being measured
C. The standard error of the estimate
D. The predicted test score

A

A. The number of items included in the test

111
Q

Which option correctly identifies another facet within the testing universe?

A. The purpose of test administration
B. The universe score
C. The domain of behavior
D. The manifestation trait

A

A. The purpose of test administration

112
Q

The Universe Score represents:

A. The average score across different testing situations
B. The score predicted from a regression equation
C. The score obtained when facets vary
D. The score obtained when all facets remain exactly the same

A

D. The score obtained when all facets remain exactly the same

113
Q

A Generalizability Study is conducted to examine:

A. How well predicted scores match observed scores
B. Whether scores remain consistent across different testing situations
C. The discrimination power of individual test items
D. The relationship between latent and manifest traits

A

B. Whether scores remain consistent across different testing situations

114
Q

The primary focus of a Decision Study is to:

A. Determine how useful test scores are for decision-making
B. Model performance probabilities
C. Measure item discrimination
D. Identify sources of test score variation

A

A. Determine how useful test scores are for decision-making

115
Q

Item-Response Theory is also known as:

A. Domain Sampling Theory
B. Generalizability Theory
C. Classical Test Theory
D. Latent-Trait Theory

A

D. Latent-Trait Theory

116
Q

Item-Response Theory is designed to model:

A. How testing conditions affect observed scores
B. The probability that a person with a certain ability can perform at a given level
C. The consistency of scores across administrations
D. The difference between predicted and observed values

A

B. The probability that a person with a certain ability can perform at a given level

117
Q

A latent trait is defined as a trait that is:

A. Directly measurable
B. Observable during testing
C. Unobservable
D. A facet of test administration

A

C. Unobservable

118
Q

A manifestation trait differs from a latent trait because it is:

A. Hypothetical
B. Statistically predicted
C. Unobservable
D. Observable

A

D. Observable

119
Q

In Item-Response Theory, discrimination refers to an item’s ability to:

A. Predict future test performance
B. Distinguish between individuals with different levels of the measured trait
C. Increase test reliability across situations
D. Reduce the standard error of the estimate

A

B. Distinguish between individuals with different levels of the measured trait

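Cards 116 and 119 describe what an IRT item characteristic curve captures. One common form (a sketch, not the deck's own model) is the two-parameter logistic (2PL), where b is item difficulty and a is discrimination; all parameter values below are hypothetical:

```python
import math

# Two-parameter logistic (2PL) IRT model: probability that a person with
# ability theta answers a dichotomous item correctly, given the item's
# difficulty b and discrimination a. Parameter values are illustrative.

def p_correct(theta: float, a: float, b: float) -> float:
    return 1 / (1 + math.exp(-a * (theta - b)))

# At theta == b the probability is exactly 0.5; a larger discrimination a
# makes the curve steeper, separating test takers whose abilities lie
# just above and just below the item's difficulty more sharply.
print(p_correct(theta=0.0, a=1.5, b=0.0))           # 0.5
print(round(p_correct(theta=1.0, a=1.5, b=0.0), 3))
```

Polytomous items (card 121) use extensions of this idea, such as graded-response models, with one curve per response category.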
120
Q

Dichotomous test items are characterized by:

A. Only two possible responses
B. More than two response options
C. Continuous scoring
D. Variable testing conditions

A

A. Only two possible responses

121
Q

Polytomous test items are defined as items that:

A. Measure latent traits only
B. Have more than two possible responses
C. Produce universe scores
D. Are limited to correct or incorrect answers

A

B. Have more than two possible responses

122
Q

The Standard Error of the Difference is used to help determine:

A. Whether a predicted score is accurate
B. How observable a trait is
C. Whether a difference between scores is statistically significant
D. How many facets should be included

A

C. Whether a difference between scores is statistically significant

123
Q

The Standard Error of Estimate refers to the standard error of the difference between:

A. Two observed scores
B. Two predicted scores
C. Latent and manifest traits
D. Predicted and observed values

A

D. Predicted and observed values

124
Q

Which concept specifically focuses on the accuracy of predictions rather than score differences?

A. Standard Error of the Difference
B. Standard Error of Estimate
C. Universe Score
D. Domain of Behavior

A

B. Standard Error of Estimate

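The Standard Error of the Difference (card 122) is conventionally built from each test's standard error of measurement, SEM = SD·√(1 − r); a difference between two scores that is small relative to SE_diff may reflect nothing but measurement error. A sketch with hypothetical test parameters:

```python
import math

# Standard error of measurement (SEM) and the standard error of the
# difference between two scores:
#   SEM = SD * sqrt(1 - r)
#   SE_diff = sqrt(SEM1**2 + SEM2**2)
# The standard deviations and reliabilities below are illustrative.

def sem(sd: float, reliability: float) -> float:
    return sd * math.sqrt(1 - reliability)

def se_difference(sem1: float, sem2: float) -> float:
    return math.sqrt(sem1 ** 2 + sem2 ** 2)

sem1 = sem(sd=15, reliability=0.91)  # about 4.5
sem2 = sem(sd=15, reliability=0.84)  # about 6.0
print(round(se_difference(sem1, sem2), 1))  # 7.5
```

A score difference is typically judged significant only when it is large relative to SE_diff (e.g., exceeding it by a multiple tied to the chosen confidence level).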
125
Q

Which type of validity refers to the degree of control among variables within a study?

A. External validity
B. Conceptual validity
C. Internal validity
D. Face validity

A

C. Internal validity

126
Q

Increasing random assignment primarily strengthens which type of validity?

A. Internal validity
B. Face validity
C. External validity
D. Conceptual validity

A

A. Internal validity

127
Q

Which form of validity is most concerned with whether research results can be generalized?

A. Conceptual validity
B. Internal validity
C. Face validity
D. External validity

A

D. External validity

128
Q

Random selection is specifically associated with increasing which type of validity?

A. Internal validity
B. External validity
C. Face validity
D. Conceptual validity

A

B. External validity

129
Q

Which type of validity focuses on individuals and their unique histories and behaviors?

A. Face validity
B. Internal validity
C. Conceptual validity
D. External validity

A

C. Conceptual validity

130
Q

Which validity is based on how a test appears to measure what it is intended to measure to the person taking or viewing it?

A. Conceptual validity
B. Face validity
C. External validity
D. Internal validity

A

B. Face validity

131
Q

Lawshe’s Content Validity Ratio (CVR) relies on judgments provided by whom?

A. Study participants
B. Researchers only
C. Statistical software
D. Subject matter experts (SMEs)

A

D. Subject matter experts (SMEs)

132
Q

In Lawshe’s CVR method, how are individual items typically evaluated by SMEs?

A. By ranking items from best to worst
B. By scoring items on a numerical difficulty scale
C. By rating essentiality categories
D. By voting to accept or reject items

A

C. By rating essentiality categories

133
Q

Which of the following is one of the standard essentiality ratings used in the CVR process?

A. Highly valid
B. Useful but not essential
C. Strongly disagree
D. Moderately effective

A

B. Useful but not essential

134
Q

The CVR for an item is calculated using which key piece of information?

A. The average relevance score across items
B. The total number of test items
C. The proportion of items with universal agreement
D. The number of experts who rate the item as essential

A

D. The number of experts who rate the item as essential

135
Q

What does a positive CVR value indicate?

A. Exactly half of the experts rated the item as essential
B. Fewer than half of the experts rated the item as essential
C. The item lacks relevance
D. More than half of the experts rated the item as essential

A

D. More than half of the experts rated the item as essential

136
Q

A CVR value of zero occurs when which condition is met?

A. All experts rate the item as essential
B. No experts rate the item as essential
C. Exactly half of the experts rate the item as essential
D. The average relevance score equals three

A

C. Exactly half of the experts rate the item as essential

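Cards 134–136 can be tied together with Lawshe's formula, CVR = (n_e − N/2) / (N/2), where n_e is the number of SMEs rating the item "essential" and N is the panel size. A minimal sketch with hypothetical panel counts:

```python
# Lawshe's Content Validity Ratio:
#   CVR = (n_essential - N/2) / (N/2)
# Positive when more than half the SMEs rate the item essential, zero when
# exactly half do, negative when fewer than half do. Panel counts below
# are illustrative.

def cvr(n_essential: int, n_experts: int) -> float:
    half = n_experts / 2
    return (n_essential - half) / half

print(cvr(8, 10))   # 0.6  -> more than half rated the item essential
print(cvr(5, 10))   # 0.0  -> exactly half
print(cvr(3, 10))   # -0.4 -> fewer than half
```

In practice each item's CVR is compared against a critical value that depends on the number of experts, and items falling below it are dropped.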
137
The Content Validity Index (CVI) can be calculated at which levels? A. Participant and population levels B. Item and scale levels C. Variable and construct levels D. Predictor and criterion levels
B. Item and scale levels
138
The I-CVI represents what measurement? A. The proportion of experts who rate an item as highly relevant B. The number of items rated essential C. The average CVR across all items D. The percentage of items with zero CVR
A. The proportion of experts who rate an item as highly relevant
139
On a 4-point relevance scale, which ratings are typically considered “highly relevant” for calculating the I-CVI? A. 1 or 2 B. 2 or 3 C. 3 or 4 D. 4 only
C. 3 or 4
140
Which description best defines S-CVI/Ave? A. The proportion of items with zero CVR B. The average of the I-CVIs across all items C. The number of experts who agree on relevance D. The percentage of essential items only
B. The average of the I-CVIs across all items
141
S-CVI/UA refers to which calculation approach? A. The mean relevance score for each item B. The ratio of essential to nonessential items C. The proportion of items achieving universal agreement D. The number of SMEs rating items as useful
C. The proportion of items achieving universal agreement
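The I-CVI and the two scale-level indices above can be sketched as follows (the ratings are invented for illustration; one row per item, one column per expert):

```python
def i_cvi(ratings):
    """Item-level CVI: proportion of experts rating the item 3 or 4
    on a 4-point relevance scale."""
    return sum(r >= 3 for r in ratings) / len(ratings)

# Hypothetical panel of 4 experts rating 3 items
items = [[4, 3, 4, 4], [3, 4, 2, 4], [4, 4, 4, 4]]
icvis = [i_cvi(item) for item in items]               # [1.0, 0.75, 1.0]

s_cvi_ave = sum(icvis) / len(icvis)                   # average of the I-CVIs
s_cvi_ua = sum(v == 1.0 for v in icvis) / len(icvis)  # share of items with universal agreement
```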
142
Incremental validity evaluates what specific contribution? A. The appearance of a test to participants B. The generalizability of findings C. The control of variables in a study D. The added explanatory power of an additional predictor
D. The added explanatory power of an additional predictor
143
Incremental validity focuses on explaining variance in relation to what? A. Independent variables only B. Participant behavior histories C. Criterion measures D. Sampling methods
C. Criterion measures
144
Which situation best reflects the concept of incremental validity? A. Selecting participants randomly to improve generalizability B. Adding a new predictor that explains variance beyond existing predictors C. Asking experts to judge item relevance D. Designing a test that appears valid to respondents
B. Adding a new predictor that explains variance beyond existing predictors
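In the special case of standardized, uncorrelated predictors, each predictor's squared validity coefficient adds directly to R², which makes the incremental contribution easy to see. A toy illustration (all coefficient values are hypothetical):

```python
# Validity coefficients of two uncorrelated predictors against the criterion
r_y_x1 = 0.50   # existing predictor
r_y_x2 = 0.30   # candidate predictor being added

r2_base = r_y_x1 ** 2                    # variance explained by x1 alone
r2_full = r_y_x1 ** 2 + r_y_x2 ** 2      # variance explained by x1 and x2 together
delta_r2 = r2_full - r2_base             # incremental validity of x2
```

A nonzero `delta_r2` is what "explaining variance beyond existing predictors" means; in practice the increment is estimated with hierarchical regression rather than this shortcut.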
145
What best describes the multitrait-multimethod matrix (MTMM)? A. A statistical technique used to reduce data dimensionality B. A method for assessing the construct validity of a set of measures in a study C. A graphical tool for identifying latent variables D. A procedure for estimating factor loadings
B. A method for assessing the construct validity of a set of measures in a study
146
What does the MTMM provide researchers? A. A way to calculate eigenvalues B. A process for generating theories C. A structured way to evaluate convergent and discriminant validity simultaneously D. A method for selecting the number of factors
C. A structured way to evaluate convergent and discriminant validity simultaneously
147
How many types of correlation coefficients are analyzed in the MTMM? A. Two B. Three C. Four D. Five
C. Four
148
Which correlations form the reliability diagonal in the MTMM? A. Heteromethod-heterotrait correlations B. Monomethod-monotrait correlations C. Monomethod-heterotrait correlations D. Heteromethod-monotrait correlations
B. Monomethod-monotrait correlations
149
What do monomethod-monotrait correlations represent? A. The validity of different methods B. The agreement between different traits C. The reliability of each measure D. The variance explained by each factor
C. The reliability of each measure
150
How should monomethod-monotrait correlations compare to other correlations in the MTMM? A. They should be the lowest B. They should be moderate C. They should be statistically insignificant D. They should be the highest in the entire matrix
D. They should be the highest in the entire matrix
151
Which MTMM correlations form the validity diagonal? A. Monomethod-monotrait correlations B. Heteromethod-monotrait correlations C. Monomethod-heterotrait correlations D. Heteromethod-heterotrait correlations
B. Heteromethod-monotrait correlations
152
What do heteromethod-monotrait correlations represent? A. The same trait measured by different methods B. Different traits measured by the same method C. Different traits measured by different methods D. Reliability estimates of a single method
A. The same trait measured by different methods
153
Significant heteromethod-monotrait correlations provide evidence for which type of validity? A. Discriminant validity B. Criterion validity C. Face validity D. Convergent validity
D. Convergent validity
154
What does convergent validity indicate? A. Measures of different traits disagree B. Measures of the same trait converge or agree C. Methods produce unrelated results D. Traits are statistically independent
B. Measures of the same trait converge or agree
155
Which correlations are found in the heterotrait-monomethod triangles? A. The same trait measured by different methods B. Different traits measured by the same method C. Different traits measured by different methods D. Reliability estimates of a single method
B. Different traits measured by the same method
156
How should monomethod-heterotrait correlations typically appear? A. High, to show convergence B. Moderate, to show overlap C. Low, to demonstrate discrimination D. Zero, to show independence
C. Low, to demonstrate discrimination
157
What type of validity is supported by low monomethod-heterotrait correlations? A. Predictive validity B. Convergent validity C. Content validity D. Discriminant validity
D. Discriminant validity
158
What do heteromethod-heterotrait correlations involve? A. Same traits and same methods B. Different traits and different methods C. Same traits and different methods D. Different traits and same methods
B. Different traits and different methods
159
How should heteromethod-heterotrait correlations compare to other correlations in the MTMM? A. They should be the highest B. They should be moderate C. They should be the lowest in the matrix D. They should be equal to reliability estimates
C. They should be the lowest in the matrix
160
What does having the lowest heteromethod-heterotrait correlations further support? A. Reliability B. Internal consistency C. Convergent validity D. Discriminant validity
D. Discriminant validity
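A toy 2-trait × 2-method layout makes the four MTMM correlation classes concrete. The values below are invented purely to show the expected ordering from the cards above:

```python
# Traits A and B, each measured by methods 1 and 2 (hypothetical correlations)
reliability     = {"A1-A1": 0.90, "B1-B1": 0.88, "A2-A2": 0.91, "B2-B2": 0.89}  # monotrait-monomethod
convergent      = {"A1-A2": 0.70, "B1-B2": 0.68}  # monotrait-heteromethod (validity diagonal)
discriminant_mm = {"A1-B1": 0.25, "A2-B2": 0.23}  # heterotrait-monomethod triangles
discriminant_hh = {"A1-B2": 0.15, "B1-A2": 0.14}  # heterotrait-heteromethod triangles

# Expected pattern: reliabilities highest, then convergent validities,
# then heterotrait correlations, with heterotrait-heteromethod lowest.
assert min(reliability.values()) > max(convergent.values())
assert min(convergent.values()) > max(discriminant_mm.values())
assert min(discriminant_mm.values()) > max(discriminant_hh.values())
```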
161
What is factor analysis primarily designed to identify? A. Correlation coefficients B. Factors or specific variables on which people may differ C. Measurement errors D. Sampling distributions
B. Factors or specific variables on which people may differ
162
Which description best defines factors in factor analysis? A. Attributes B. Characteristics C. Dimensions D. All of the above
D. All of the above
163
What is the most common purpose of factor analysis? A. Hypothesis testing B. Construct validation C. Data reduction D. Reliability estimation
C. Data reduction
164
How does factor analysis simplify complex data? A. By increasing the number of variables B. By reducing the number of variables C. By maximizing correlations D. By eliminating variance
B. By reducing the number of variables
165
What benefit does data reduction provide? A. Easier interpretation of datasets B. More precise sampling C. Higher reliability coefficients D. Increased subjectivity
A. Easier interpretation of datasets
166
What does structure discovery in factor analysis help uncover? A. Measurement error B. Sample bias C. Underlying dimensions within observed variables D. External validity
C. Underlying dimensions within observed variables
167
How is factor analysis used in construct validation? A. To calculate eigenvalues B. To test hypotheses statistically C. To rank participants D. To ensure items measure the intended underlying construct
D. To ensure items measure the intended underlying construct
168
What are examples of constructs factor analysis may help validate? A. Age and gender B. Intelligence and anxiety C. Height and weight D. Reaction time and speed
B. Intelligence and anxiety
169
What does factor analysis provide regarding observed variables? A. Correlation matrices B. Reliability coefficients C. Estimates of factor loadings D. Norm-referenced scores
C. Estimates of factor loadings
170
What is a factor loading? A. A measure of sample size B. An estimate of explained variance across components C. A test of statistical significance D. Information about how strongly an observed variable relates to a factor
D. Information about how strongly an observed variable relates to a factor
171
What does a factor loading convey about test scores? A. Their reliability B. The extent to which the factor determines them C. Their norm group placement D. Their distribution shape
B. The extent to which the factor determines them
172
What is the primary goal of exploratory factor analysis (EFA)? A. To confirm a specific model B. To test theory C. To explore underlying factor structure D. To maximize variance
C. To explore underlying factor structure
173
When is exploratory factor analysis typically used? A. When factor structure is already known B. When no preconceived idea of factors exists C. When validating norms D. When conducting cross-validation
B. When no preconceived idea of factors exists
174
How is exploratory factor analysis often described conceptually? A. Theory-testing B. Error-correcting C. Norm-referencing D. Theory-generating
D. Theory-generating
175
What is the main purpose of confirmatory factor analysis (CFA)? A. To reduce dimensionality B. To discover factors C. To confirm a hypothesized factor structure D. To estimate reliability
C. To confirm a hypothesized factor structure
176
How is confirmatory factor analysis typically characterized? A. Theory-generating B. Theory-testing C. Data-mining D. Norm-building
B. Theory-testing
177
What is the main purpose of principal component analysis (PCA)? A. To identify latent traits B. To validate constructs C. To reduce data dimensionality D. To estimate reliability
C. To reduce data dimensionality
178
How does PCA summarize data? A. By increasing the number of variables B. By summarizing variance with fewer variables C. By eliminating correlations D. By estimating factor loadings
B. By summarizing variance with fewer variables
179
What is the primary goal of dimensionality reduction in PCA? A. Reducing the number of variables B. Increasing explained error C. Confirming theory D. Testing hypotheses
A. Reducing the number of variables
180
How are principal components selected in PCA? A. Based on reliability estimates B. Based on theory C. To capture maximum possible variance D. To minimize correlations
C. To capture maximum possible variance
181
What does orthogonality of principal components mean? A. They are linearly dependent B. They overlap substantially C. They are correlated D. They are uncorrelated
D. They are uncorrelated
182
What is a scree plot? A. A plot of factor loadings B. A plot of eigenvalues against component number C. A histogram of test scores D. A matrix of correlations
B. A plot of eigenvalues against component number
183
What is the purpose of a scree plot? A. To assess reliability B. To test hypotheses C. To decide how many components to retain D. To estimate norms
C. To decide how many components to retain
184
What feature of the scree plot is used to guide component retention? A. The highest eigenvalue B. The average variance C. The sample size D. The elbow or point of inflection
D. The elbow or point of inflection
185
What does the explained variance ratio represent? A. The reliability of components B. The proportion of variance along each principal component C. The correlation between variables D. The error variance
B. The proportion of variance along each principal component
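For a 2×2 correlation matrix the eigenvalues are simply 1 ± r, which makes the explained variance ratio easy to compute by hand. A minimal sketch with an invented correlation:

```python
r = 0.6                       # hypothetical correlation between two standardized variables
# Eigenvalues of the correlation matrix [[1, r], [r, 1]] are 1 + r and 1 - r
eigenvalues = [1 + r, 1 - r]  # [1.6, 0.4]

total = sum(eigenvalues)      # equals the number of variables (2)
explained_variance_ratio = [ev / total for ev in eigenvalues]   # [0.8, 0.2]
# The first principal component captures 80% of the variance here, and the
# two components are orthogonal (uncorrelated) by construction.
```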
186
What does the Kaiser Criterion (K1 Rule) recommend retaining? A. Factors with eigenvalues less than 1.0 B. Only the first factor C. Factors before the elbow D. Factors with eigenvalues greater than 1.0
D. Factors with eigenvalues greater than 1.0
187
What is a limitation of the Kaiser Criterion? A. It is highly subjective B. It underestimates factor numbers C. It tends to overestimate the number of factors D. It ignores eigenvalues
C. It tends to overestimate the number of factors
188
Under what condition does the Kaiser Criterion especially tend to overestimate factors? A. When sample size is small B. When reliability is low C. When variance is minimal D. When the number of variables is large
D. When the number of variables is large
189
How is the Kaiser Criterion generally regarded in terms of accuracy? A. The most accurate method B. Moderately accurate C. Highly precise D. Not the most accurate method
D. Not the most accurate method
190
What does the Elbow Method (Scree Test) involve plotting? A. Loadings against variables B. Eigenvalues against factor number C. Variance against sample size D. Scores against norms
B. Eigenvalues against factor number
191
What factors are retained using the Elbow Method? A. Those after the elbow B. Only the first factor C. Those before the elbow D. Those with the lowest variance
C. Those before the elbow
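Both retention rules can be applied to the same eigenvalue list (values invented for illustration):

```python
eigenvalues = [3.2, 1.8, 1.1, 0.6, 0.4, 0.3]   # hypothetical, sorted descending

# Kaiser criterion (K1 rule): retain factors with eigenvalue > 1.0
kaiser_retained = [ev for ev in eigenvalues if ev > 1.0]   # retains 3 factors

# Scree/elbow approach: inspect the successive drops and retain factors
# before the elbow; the judgment is subjective, since different readers
# may place the elbow at different drops.
drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]
```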
192
What is a key limitation of the Elbow Method? A. It requires large samples B. It is computationally complex C. It is highly subjective D. It always underestimates factors
C. It is highly subjective
193
Why can the elbow point be problematic to identify? A. It changes with rotation B. It is often ambiguous or hard to define C. It depends on reliability D. It requires confirmatory analysis
B. It is often ambiguous or hard to define
194
What consequence can arise from ambiguity in the elbow point? A. Reduced reliability B. Different researchers selecting different numbers of factors C. Loss of validity D. Increased error variance
B. Different researchers selecting different numbers of factors
195
What is cross-validation? A. Validation using the same sample B. Revalidation of a test using a different group C. Norming a test D. Estimating factor loadings
B. Revalidation of a test using a different group
196
What happens to validity after cross-validation in some cases? A. It increases B. It remains constant C. It disappears D. It decreases
D. It decreases
197
What is the decrease in validity after cross-validation called? A. Criterion contamination B. Validity shrinkage C. Reliability decay D. Sampling bias
B. Validity shrinkage
198
What does co-validation involve? A. Validating one test multiple times B. Validating tests across cultures C. Validation of more than one test from the same group D. Validation using multiple criteria
C. Validation of more than one test from the same group
199
What does co-norming refer to? A. Norming one test repeatedly B. Norming tests across populations C. Norming more than one test from the same group D. Norming after cross-validation
C. Norming more than one test from the same group
200
Which description best defines norms in the context of psychological or educational testing? A. Individual raw scores obtained from a single test administration B. Test performance data from a specific group used as a reference for interpreting individual scores C. Statistical formulas used to calculate test reliability D. The process of converting scores into percentile ranks
B. Test performance data from a specific group used as a reference for interpreting individual scores
201
What does norming refer to? A. Comparing individual scores to national averages B. Selecting test items for inclusion in an assessment C. The process of deriving norms D. Assigning grades based on test results
C. The process of deriving norms
202
Which group is described as the normative sample? A. Individuals who score above the average on a test B. Test developers who design the assessment C. People whose scores are excluded from analysis D. A group whose performance on a test is analyzed for reference
D. A group whose performance on a test is analyzed for reference
203
Why is a normative sample important? A. It determines how difficult the test items should be B. It provides a reference for evaluating individual test performance C. It eliminates the need for raw scores D. It guarantees equal performance across test takers
B. It provides a reference for evaluating individual test performance
204
Which type of norm converts raw scores from a standardization sample into a rank-based format? A. Developmental norms B. Local norms C. Percentile norms D. Subgroup norms
C. Percentile norms
205
What does a percentile express? A. The percentage of correct answers on a test B. The proportion of items answered incorrectly C. The percentage of people whose scores fall above a given raw score D. The percentage of people whose scores fall below a particular raw score
D. The percentage of people whose scores fall below a particular raw score
206
Which statement correctly describes percentage correct? A. The number of people scoring below a given raw score B. Raw score divided by the total number of test takers C. The number of correct responses divided by the total number of items, multiplied by 100 D. A comparison between two different tests
C. The number of correct responses divided by the total number of items, multiplied by 100
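The distinction between the two statistics can be sketched directly (function names are mine; the norm-group scores are invented):

```python
def percentile_rank(score, group_scores):
    """Percentage of people in the group scoring below the given raw score."""
    below = sum(s < score for s in group_scores)
    return 100 * below / len(group_scores)

def percentage_correct(n_correct, n_items):
    """Proportion of items answered correctly, expressed as a percentage."""
    return 100 * n_correct / n_items

group = [10, 12, 15, 18, 20, 22, 25, 28, 30, 35]   # hypothetical norm group
print(percentile_rank(25, group))      # 60.0 -> scored above 60% of the group
print(percentage_correct(25, 40))      # 62.5 -> answered 62.5% of items correctly
```

Note that the same raw score of 25 yields different numbers: the percentile compares the person to other people, while percentage correct compares the score to the test itself.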
207
Which category of norms is developed based on characteristics that change or are affected by stages of life? A. National norms B. Percentile norms C. Subgroup norms D. Developmental norms
D. Developmental norms
208
Grade norms are best described as norms based on: A. Age-equivalent scores B. Nationally representative samples C. Grade-level performance D. Local population performance
C. Grade-level performance
209
Which type of developmental norm uses age-equivalent scores? A. Age norms B. Grade norms C. Local norms D. National anchor norms
A. Age norms
210
Which norms are derived from a sample that represents the population at a national level? A. Subgroup norms B. National norms C. Local norms D. Percentile norms
B. National norms
211
What is the primary function of national anchor norms? A. To describe performance within a local population B. To measure developmental changes across age C. To provide an equivalency table for comparing scores on two tests D. To convert raw scores into percentages
C. To provide an equivalency table for comparing scores on two tests
212
Which type of norm involves dividing the normative sample based on criteria used during sample selection? A. Developmental norms B. Local norms C. National anchor norms D. Subgroup norms
D. Subgroup norms
213
What distinguishes subgroup norms from national norms? A. Subgroup norms use smaller test forms B. Subgroup norms are always locally developed C. Subgroup norms segment the normative sample using specific criteria D. Subgroup norms focus only on age differences
C. Subgroup norms segment the normative sample using specific criteria
214
Which norms are most often developed by test users themselves? A. National norms B. Percentile norms C. Local norms D. Developmental norms
C. Local norms
215
What do local norms provide? A. Comparisons between two equivalent tests B. Normative information relative to the performance of a local population C. Age-equivalent interpretations of scores D. National-level test comparisons
B. Normative information relative to the performance of a local population
216
Which pairing is correctly matched? A. Percentile norms – age-equivalent scores B. Developmental norms – traits affected by stages of life C. Local norms – nationally representative samples D. National norms – locally developed by test users
B. Developmental norms – traits affected by stages of life
217
A score interpretation that focuses on how many individuals scored lower than a specific raw score relies on: A. Percentage correct B. Grade norms C. Percentile norms D. Local norms
C. Percentile norms
218
Which statement accurately differentiates percentage correct from percentile? A. Percentage correct compares performance across age groups B. Percentile reflects the proportion of items answered correctly C. Percentage correct is based on rank ordering D. Percentile reflects how many people scored below a given score
D. Percentile reflects how many people scored below a given score
218
Which term refers specifically to the process rather than the data or group? A. Normative sample B. Norms C. Percentile D. Norming
D. Norming
219
What best describes the Fixed Reference Group Scoring System? A. Test scores are adjusted for each new group of test takers B. A predetermined passing score is applied across all administrations C. Scores from one group are used as the basis for future score calculations D. Individual performance determines score interpretation
C. Scores from one group are used as the basis for future score calculations
220
In the Fixed Reference Group Scoring System, which group influences future test scoring? A. One original group of test takers B. The most recent group of test takers C. All groups combined over time D. A randomly selected sample group
A. One original group of test takers
221
A reliability coefficient of 0.92 is interpreted as: A. Good B. Adequate C. Excellent D. May have limited applicability
C. Excellent
222
Which reliability coefficient range is labeled as “Good”? A. 0.70 – 0.79 B. 0.80 – 0.89 C. Below 0.70 D. 0.90 and up
B. 0.80 – 0.89
223
A reliability coefficient below 0.70 suggests the test: A. Is excellent B. Is adequate C. Has strong consistency D. May have limited applicability
D. May have limited applicability
224
A validity coefficient of 0.36 would be interpreted as: A. Likely to be useful B. Depends on the circumstances C. Very beneficial D. Unlikely to be useful
C. Very beneficial
225
Which validity coefficient range is considered “Likely to be Useful”? A. 0.11 – 0.20 B. Above 0.35 C. Below 0.11 D. 0.21 – 0.35
D. 0.21 – 0.35
226
A validity coefficient of 0.15 falls under which interpretation? A. Very beneficial B. Depends on the circumstances C. Likely to be useful D. Unlikely to be useful
B. Depends on the circumstances
227
A Cronbach’s alpha value of 0.91 is interpreted as: A. Good B. Acceptable C. Excellent D. Questionable
C. Excellent
227
What interpretation corresponds to a validity coefficient below 0.11? A. Very beneficial B. Likely to be useful C. Depends on the circumstances D. Unlikely to be useful
D. Unlikely to be useful
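The validity-coefficient cards above follow one rule-of-thumb table, which can be written as a lookup (function name is mine):

```python
def interpret_validity(r):
    """Rule-of-thumb labels for validity coefficients used in these cards."""
    if r > 0.35:
        return "Very beneficial"
    if r >= 0.21:
        return "Likely to be useful"
    if r >= 0.11:
        return "Depends on the circumstances"
    return "Unlikely to be useful"

print(interpret_validity(0.36))   # Very beneficial
print(interpret_validity(0.15))   # Depends on the circumstances
```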
228
Which Cronbach’s alpha range is classified as “Questionable”? A. 0.50 ≤ α < 0.60 B. 0.60 ≤ α < 0.70 C. 0.70 ≤ α < 0.80 D. α < 0.50
B. 0.60 ≤ α < 0.70
229
A Cronbach’s alpha value of 0.55 would be described as: A. Poor B. Unacceptable C. Acceptable D. Good
A. Poor
230
Which Cronbach’s alpha value indicates an unacceptable level? A. 0.65 B. 0.72 C. 0.48 D. 0.85
C. 0.48
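The Cronbach's alpha cards use the common George & Mallery-style cutoffs, which reduce to a simple lookup (function name is mine):

```python
def interpret_alpha(a):
    """Common rule-of-thumb labels for Cronbach's alpha."""
    if a >= 0.9:
        return "Excellent"
    if a >= 0.8:
        return "Good"
    if a >= 0.7:
        return "Acceptable"
    if a >= 0.6:
        return "Questionable"
    if a >= 0.5:
        return "Poor"
    return "Unacceptable"

print(interpret_alpha(0.91))   # Excellent
print(interpret_alpha(0.55))   # Poor
print(interpret_alpha(0.48))   # Unacceptable
```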
231
When the p-value is less than or equal to α, the correct decision is to: A. Accept the null hypothesis B. Reject the null hypothesis C. Modify the null hypothesis D. Delay the decision
B. Reject the null hypothesis
232
A p-value greater than α leads to which action? A. Reject the null hypothesis B. Revise the alternative hypothesis C. Ignore the hypothesis D. Accept the null hypothesis
D. Accept the null hypothesis
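The two decision cards reduce to a single comparison against the significance level α (values below are hypothetical; note that "accept the null" is conventionally phrased "fail to reject the null"):

```python
alpha = 0.05   # significance level chosen before the test (hypothetical)
decisions = {p: ("reject H0" if p <= alpha else "fail to reject H0")
             for p in (0.03, 0.20)}
```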
233
Measurement is best defined as: A. The interpretation of test scores B. Assigning numbers or symbols to characteristics C. Comparing observed and expected outcomes D. Eliminating error from testing
B. Assigning numbers or symbols to characteristics
234
Error refers to: A. Incorrect test scoring B. Random guessing by examinees C. Factors influencing a score beyond what is measured D. Poor test construction
C. Factors influencing a score beyond what is measured
235
Scales are described as: A. Methods for interpreting validity B. Statistical models of reliability C. Sets of numbers or symbols assigned to objects D. Errors affecting measurement
C. Sets of numbers or symbols assigned to objects
236
Which type of scale consists of a countable set of values that can be infinite? A. Discrete scale B. Nominal scale C. Ordinal scale D. Continuous scale
D. Continuous scale
237
A discrete scale is characterized by being: A. Countable in a finite amount of time B. Infinite and uncountable C. Based on equal intervals only D. Dependent on ratio properties
A. Countable in a finite amount of time
238
The property of magnitude refers to: A. The absence of a measured attribute B. Equal distances between values C. Moreness and comparison D. Zero point measurement
C. Moreness and comparison
239
Equal interval means that: A. Values can be ranked only B. Zero indicates absence of a trait C. Differences between scale points are consistent D. The scale has infinite values
C. Differences between scale points are consistent
240
The ratio property exists when the scale has a true zero point, meaning that at zero: A. Differences between values are equal B. Rankings can be established C. Measurement error is eliminated D. Nothing of the property being measured exists
D. Nothing of the property being measured exists
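The three properties defined in the last cards (magnitude, equal interval, and a true zero) combine into the standard levels of measurement. A summary table, drawn from standard textbook treatments rather than the cards themselves:

```python
# magnitude: values can be ordered ("moreness")
# equal_interval: differences between adjacent scale points are consistent
# ratio: a true zero exists at which nothing of the property is present
scales = {
    "nominal":  {"magnitude": False, "equal_interval": False, "ratio": False},
    "ordinal":  {"magnitude": True,  "equal_interval": False, "ratio": False},
    "interval": {"magnitude": True,  "equal_interval": True,  "ratio": False},
    "ratio":    {"magnitude": True,  "equal_interval": True,  "ratio": True},
}
```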