Sample 1 Flashcards

(242 cards)

1
Q

Objective: Quantify a construct

A. Testing
B. Assessment

A

A. Testing

2
Q

Objective: Answer a referral question

A. Testing
B. Assessment

A

B. Assessment

3
Q

Focus: Nomothetic approach

A. Testing
B. Assessment

A

A. Testing

4
Q

Focus: Idiographic approach

A. Testing
B. Assessment

A

B. Assessment

5
Q

Process: Individual or group

A. Testing
B. Assessment

A

A. Testing

6
Q

Process: Individual only

A. Testing
B. Assessment

A

B. Assessment

7
Q

Outcome: Test score or psychometric report

A. Testing
B. Assessment

A

A. Testing

8
Q

Outcome: Psychological report

A. Testing
B. Assessment

A

B. Assessment

9
Q

Source: Test takers only

A. Testing
B. Assessment

A

A. Testing

10
Q

Source: Collateral sources

A. Testing
B. Assessment

A

B. Assessment

11
Q

Evaluator: Tester is not the key

A. Testing
B. Assessment

A

A. Testing

12
Q

Evaluator: Assessor is the key

A. Testing
B. Assessment

A

B. Assessment

13
Q

Duration: Shorter

A. Testing
B. Assessment

A

A. Testing

14
Q

Duration: Longer

A. Testing
B. Assessment

A

B. Assessment

15
Q

Cost: Inexpensive

A. Testing
B. Assessment

A

A. Testing

16
Q

Cost: Expensive

A. Testing
B. Assessment

A

B. Assessment

17
Q

Qualification: RPm

A. Testing
B. Assessment

A

A. Testing

18
Q

Qualification: RPsy

A. Testing
B. Assessment

A

B. Assessment

19
Q

Gathering and integration of psychology-related data for the purpose of making a psychological evaluation that is accomplished through the use of tools, tests, interviews, case studies, behavioral observation, and specially designed apparatuses

A. Psychological Testing
B. Remuneration
C. In-person check-in
D. Psychological Assessment

A

D. Psychological Assessment

20
Q

Interactive approach

A. Dynamic Psychological Assessment
B. Collaborative Psychological Assessment
C. Therapeutic Psychological Assessment

A

A. Dynamic Psychological Assessment

21
Q

Evaluation-intervention-evaluation (“sandwich method”)

A. Dynamic Psychological Assessment
B. Collaborative Psychological Assessment
C. Therapeutic Psychological Assessment

A

A. Dynamic Psychological Assessment

22
Q

Assessor and assessee may work as “partners” from the initial contact to the final feedback

A. Dynamic Psychological Assessment
B. Collaborative Psychological Assessment
C. Therapeutic Psychological Assessment

A

B. Collaborative Psychological Assessment

23
Q

Therapeutic self-discovery and new understandings throughout the process

A. Dynamic Psychological Assessment
B. Collaborative Psychological Assessment
C. Therapeutic Psychological Assessment

A

C. Therapeutic Psychological Assessment

24
Q

Use of tests and other tools to evaluate abilities and skills relevant to success or failure in a preschool or school context

A. Educational Assessment
B. Retrospective Assessment
C. Remote Assessment
D. Ecological Momentary Assessment

A

A. Educational Assessment

25
Q

Intelligence tests, achievement tests, reading comprehension tests

A. Educational Assessment
B. Retrospective Assessment
C. Remote Assessment
D. Ecological Momentary Assessment

A

A. Educational Assessment

26
Q

Use of evaluative tools to conclude psychological aspects of a person as they existed at some point in time before the assessment

A. Educational Assessment
B. Retrospective Assessment
C. Remote Assessment
D. Ecological Momentary Assessment

A

B. Retrospective Assessment

27
Q

Use of psychological tools to gather data and draw conclusions about subjects who are not in physical proximity to the person or people conducting the evaluation

A. Educational Assessment
B. Retrospective Assessment
C. Remote Assessment
D. Ecological Momentary Assessment

A

C. Remote Assessment

28
Q

“In the moment” evaluation of specific problems and related cognitive and behavioral variables at the very time and place that they occur

A. Educational Assessment
B. Retrospective Assessment
C. Remote Assessment
D. Ecological Momentary Assessment

A

D. Ecological Momentary Assessment

29
Q

Process of measuring psychology-related variables by means of devices or procedures designed to obtain a sample of behavior

A. Psychological Testing
B. Remuneration
C. In-person check-in
D. Psychological Assessment

A

A. Psychological Testing

30
Q

A device or procedure designed to measure variables related to psychology

A. Psychological Test
B. Psychological Assessment
C. Psychological Placebo
D. Scam

A

A. Psychological Test

31
Q

Subject matter

A. Content
B. Format
C. Item
D. Administration Procedure
E. Score

A

A. Content

32
Q

Form, plan, structure, arrangement, layout

A. Content
B. Format
C. Item
D. Administration Procedure
E. Score

A

B. Format

33
Q

A specific stimulus to which a person responds overtly, with that response being scored or evaluated

A. Content
B. Format
C. Item
D. Administration Procedure
E. Score

A

C. Item

34
Q

Individual basis or group administration

A. Content
B. Format
C. Item
D. Administration Procedure
E. Score

A

D. Administration Procedure

35
Q

Code or summary statement, usually but not necessarily numerical in nature, that reflects an evaluation of performance on a test

A. Content
B. Format
C. Item
D. Administration Procedure
E. Score

A

E. Score

36
Q

The process of assigning scores to performances

A. Scoring
B. Cut Score
C. Psychometric Soundness
D. Psychometrics
E. Psychometrician

A

A. Scoring

37
Q

A reference point, usually numerical, derived by judgment and used to divide a set of data into two or more classifications

A. Scoring
B. Cut Score
C. Psychometric Soundness
D. Psychometrics
E. Psychometrician

A

B. Cut Score

38
Q

Technical quality

A. Scoring
B. Cut Score
C. Psychometric Soundness
D. Psychometrics
E. Psychometrician

A

C. Psychometric Soundness

39
Q

The science of psychological measurement

A. Scoring
B. Cut Score
C. Psychometric Soundness
D. Psychometrics
E. Psychometrician

A

D. Psychometrics

40
Q

A professional who uses, analyzes, and interprets psychological test data

A. Scoring
B. Cut Score
C. Psychometric Soundness
D. Psychometrics
E. Psychometrician

A

E. Psychometrician

41
Q

Which statement best defines psychometric properties?

A. Ethical standards governing psychological practice
B. Statistical procedures for diagnosing disorders
C. Technical qualities and characteristics of psychological tests and assessments that determine scientific soundness and practical usefulness
D. Theoretical assumptions behind personality theories

A

C. Technical qualities and characteristics of psychological tests and assessments that determine scientific soundness and practical usefulness

42
Q

Psychometric properties are primarily used to determine which aspect of psychological tests?

A. Their cultural fairness
B. Their scientific soundness and practical usefulness
C. Their length and format
D. Their popularity among clinicians

A

B. Their scientific soundness and practical usefulness

43
Q

Why are psychometric properties crucial in psychological testing?

A. They ensure that a test measures what it intends to measure
B. They ensure that a test produces consistent results
C. They ensure that a test allows meaningful interpretation of scores
D. All of the above

A

D. All of the above

44
Q

Reliability in psychological testing refers to what concept?

A. Accuracy of score interpretation
B. Consistency in measurement
C. Validity of test content
D. Predictive usefulness

A

B. Consistency in measurement

45
Q

What is a reliability coefficient?

A. A score that reflects item difficulty
B. A qualitative judgment of test usefulness
C. An index of reliability indicating the ratio between true score variance and total score variance
D. A measure of examiner bias

A

C. An index of reliability indicating the ratio between true score variance and total score variance

46
Q

What is the numerical range of a reliability coefficient?

A. 0 to 1
B. −1 to +1
C. 0 to 100
D. 1 to 10

A

A. 0 to 1

47
Q

Reliability is synonymous with which pair of terms?

A. Accuracy and validity
B. Stability and objectivity
C. Consistency and dependability
D. Precision and fairness

A

C. Consistency and dependability

48
Q

What happens to a test’s reliability when a greater proportion of total variance is attributed to true variance?

A. Reliability decreases
B. Reliability remains unchanged
C. Reliability becomes unpredictable
D. Reliability increases

A

D. Reliability increases

49
Q

What does the true score formula conceptually represent?

A. Obtained score
B. Mean score
C. Reliability coefficient
D. Examiner judgment of performance
E. Relationship between all choices

A

E. Relationship between all choices

50
Q

In the true score formula, what does x represent?

A. The correlation coefficient
B. The obtained score
C. The true variance
D. The standard error

A

B. The obtained score

51
Q

In the true score formula, what does x̄ represent?

A. Error variance
B. Observed score
C. Mean score
D. Reliability index

A

C. Mean score

52
Q

In reliability analysis, what does Rxx represent?

A. Correlation coefficient
B. Ratio of variances
C. Mean score difference
D. Error component

A

A. Correlation coefficient

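The true score formula described in cards 49–52 (obtained score x, mean score x̄, and reliability coefficient Rxx) matches Kelley's estimated true score formula, T̂ = x̄ + Rxx(x − x̄). A minimal sketch with hypothetical score values:

```python
# Kelley's estimated true score formula (cards 49-52):
#   T_hat = x_bar + Rxx * (x - x_bar)
# where x is the obtained score, x_bar the mean score, and Rxx the
# reliability coefficient. All numbers below are illustrative.

def estimated_true_score(x: float, x_bar: float, rxx: float) -> float:
    """Regress an obtained score toward the group mean by the reliability."""
    return x_bar + rxx * (x - x_bar)

# A perfectly reliable test (Rxx = 1) leaves the obtained score unchanged;
# a totally unreliable one (Rxx = 0) regresses it all the way to the mean.
print(estimated_true_score(x=120, x_bar=100, rxx=0.90))  # 118.0
print(estimated_true_score(x=120, x_bar=100, rxx=0.0))   # 100.0
```

The lower the reliability, the more the estimate shrinks toward the mean, which is why obtained scores from unreliable tests should be interpreted cautiously.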
53
Q

One goal of reliability is to estimate what aspect of psychological testing?

A. Cultural bias
B. Errors in psychological measurement
C. Norm group representativeness
D. Examiner competence

A

B. Errors in psychological measurement

54
Q

Another goal of reliability is to develop techniques that serve what purpose?

A. Increase test difficulty
B. Shorten testing time
C. Reduce measurement errors
D. Improve score distributions

A

C. Reduce measurement errors

55
Q

In reliability theory, true variance is compared against which combination?

A. Standard deviation and mean
B. Observed score and reliability coefficient
C. Content variance and criterion variance
D. True score variance plus error variance

A

D. True score variance plus error variance

56
Q

Classical Test Theory assumes that a test score reflects which components?

A. Ability and motivation
B. Knowledge and practice effects
C. True score and examiner bias
D. True score and error

A

D. True score and error

57
Q

In Classical Test Theory, what does error represent?

A. Intentional test manipulation
B. A stable trait of the test taker
C. The component of the observed score unrelated to the ability being measured
D. Differences between test forms

A

C. The component of the observed score unrelated to the ability being measured

58
Q

Which equation represents the Classical Test Theory model?

A. X = V + R
B. X = T + E
C. T = X − V
D. E = X + T

A

B. X = T + E

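The Classical Test Theory model in cards 55–58 (X = T + E, with reliability as the ratio of true variance to total variance) can be illustrated with a small simulation; the score distributions below are hypothetical:

```python
import random
import statistics

random.seed(0)

# Classical Test Theory: each observed score is a stable true score plus
# random error (X = T + E), and reliability is the ratio of true-score
# variance to total observed variance. The distributions are illustrative:
# true scores ~ N(100, 15), error ~ N(0, 5).
true_scores = [random.gauss(100, 15) for _ in range(10_000)]
observed = [t + random.gauss(0, 5) for t in true_scores]  # X = T + E

# Reliability = Var(T) / Var(X); with these parameters the theoretical
# value is 15^2 / (15^2 + 5^2) = 0.90.
reliability = statistics.pvariance(true_scores) / statistics.pvariance(observed)
print(round(reliability, 2))
```

As card 48 states, the larger the share of total variance that is true variance, the closer this ratio gets to 1.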
59
Q

Variance is particularly useful in test theory for what purpose?

A. Describing sources of test score variability
B. Calculating raw scores
C. Identifying item difficulty
D. Ranking test takers

A

A. Describing sources of test score variability

60
Q

What does true variance represent?

A. Variance from irrelevant random sources
B. Variance from testing conditions
C. Variance from examiner differences
D. Variance from true differences

A

D. Variance from true differences

61
Q

Error variance refers to variance caused by what factor?

A. True ability differences
B. Irrelevant random sources
C. Consistent test content
D. Standardized administration

A

B. Irrelevant random sources

62
Q

Measurement error includes which factors?

A. Only test-taker abilities
B. Only examiner scoring decisions
C. All factors in measurement other than the variable being measured
D. Only random fluctuations

A

C. All factors in measurement other than the variable being measured

63
Q

Random error is best described as which type of source?

A. Unavoidable and caused by unpredictable fluctuations
B. Predictable and consistent
C. Intentional and controllable
D. Systematic and proportional

A

A. Unavoidable and caused by unpredictable fluctuations

64
Q

What distinguishes systematic error from random error?

A. It is unpredictable
B. It is avoidable if corrected
C. It is always larger
D. It cannot be identified

A

B. It is avoidable if corrected

65
Q

Systematic error is typically what in relation to the true value?

A. Completely unrelated
B. Randomly fluctuating
C. Constant or proportionate
D. Temporarily inconsistent

A

C. Constant or proportionate

66
Q

Item sampling or content sampling refers to what type of variation?

A. Variation in scoring procedures
B. Variation among test administrators
C. Variation in test-taker motivation
D. Variation among items within a test and between tests

A

D. Variation among items within a test and between tests

67
Q

Test environment factors include which of the following?

A. Room temperature
B. Lighting
C. Ventilation
D. Noise
E. All of the above

A

E. All of the above

68
Q

Which is an example of a test taker variable affecting reliability?

A. Examiner professionalism
B. Scoring rubrics
C. Room size
D. Lack of sleep

A

D. Lack of sleep

69
Q

Which of the following is considered a test taker variable?

A. Examiner’s demeanor
B. Effects of drugs
C. Item wording
D. Test timing

A

B. Effects of drugs

70
Q

Examiner-related variables include which factor?

A. Emotional problems of the test taker
B. Casual life experiences
C. Examiner’s physical appearance and nonverbal gestures
D. Content sampling

A

C. Examiner’s physical appearance and nonverbal gestures

71
Q

Professionalism is categorized under which source of error?

A. Test construction
B. Test taker variables
C. Test environment
D. Examiner-related variables

A

D. Examiner-related variables

72
Q

Internal reliability refers to what type of consistency?

A. Consistency across different raters
B. Consistency across time
C. Consistency within the test itself
D. Consistency across test versions

A

C. Consistency within the test itself

73
Q

Which example best illustrates internal reliability?

A. Two examiners giving similar scores
B. A pretest matching a posttest
C. Identical scores across schools
D. Different items expressing the same meaning

A

D. Different items expressing the same meaning

74
Q

External consistency evaluates reliability by comparing results across what dimensions?

A. Test items only
B. Individuals and time
C. Content areas
D. Scoring methods

A

B. Individuals and time

75
Q

A pretest–posttest comparison is an example of which type of reliability?

A. Internal reliability
B. Content validity
C. External consistency
D. Construct validity

A

C. External consistency

76
Q

Test–retest reliability is best described as a reliability estimate obtained by

A. correlating scores from two equivalent halves of a test
B. correlating scores from the same sample across two administrations
C. comparing scores from different groups taking the same test
D. analyzing item difficulty across test items

A

B. correlating scores from the same sample across two administrations

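Test–retest reliability (card 76) is just the Pearson correlation between the two administrations' scores. A self-contained sketch, with made-up score lists for the two sessions:

```python
import math

# Test-retest reliability: correlate scores from the same sample across two
# administrations of the same test. The two score lists below are
# hypothetical illustrative data, not from any real study.

def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson product-moment correlation between paired scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

time1 = [12, 15, 9, 20, 17, 11]   # first administration
time2 = [13, 14, 10, 19, 18, 12]  # second administration, same sample
print(round(pearson_r(time1, time2), 3))
```

A high positive coefficient indicates stable scores across the interval; as card 77 notes, longer intervals tend to lower it.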
77
Q

The coefficient of stability refers to the idea that

A. reliability increases when test difficulty decreases
B. reliability coefficients remain constant regardless of time
C. longer time intervals tend to reduce reliability coefficients
D. shorter tests always produce unstable scores

A

C. longer time intervals tend to reduce reliability coefficients

78
Q

Test sophistication occurs when

A. test takers improve due to coaching
B. items are remembered, especially difficult or confusing ones
C. scores are affected by item sampling error
D. test length is increased

A

B. items are remembered, especially difficult or confusing ones

79
Q

Test wiseness may affect test scores by

A. lowering reliability coefficients
B. increasing measurement error
C. inflating the apparent abilities of test takers
D. reducing internal consistency

A

C. inflating the apparent abilities of test takers

80
Q

Mortality in test–retest reliability refers to

A. item difficulty imbalance
B. loss of test materials
C. absence of some participants in the second session
D. failure to counterbalance test forms

A

C. absence of some participants in the second session

81
Q

In addressing mortality, the recommended action is to

A. remove the first test scores of absent participants
B. replace missing participants with new ones
C. shorten the test interval
D. recalculate item difficulty

A

A. remove the first test scores of absent participants

82
Q

Counterbalancing is used primarily to

A. increase internal consistency
B. reduce practice effects within items
C. ensure homogeneity of items
D. avoid carryover effects by varying test sequences

A

D. avoid carryover effects by varying test sequences

83
Q

McDonald’s Omega assesses how well items

A. predict future performance
B. measure different constructs
C. correlate with external criteria
D. consistently measure a single underlying construct

A

D. consistently measure a single underlying construct

84
Q

Rulon’s Formula is best described as a method that assesses test consistency by

A. comparing scores from two different tests
B. correlating item difficulty with total score
C. comparing scores obtained from two halves of the same test
D. evaluating agreement among multiple raters

A

C. comparing scores obtained from two halves of the same test

85
Q

Rulon’s Formula is considered a counterpart of the

A. Kappa formula
B. Kendall’s W
C. Cronbach’s alpha
D. Spearman-Brown formula

A

D. Spearman-Brown formula

86
Q

The primary purpose of splitting a test when applying Rulon’s Formula is to

A. create two equivalent halves
B. increase test difficulty
C. eliminate unreliable items
D. rank test takers

A

A. create two equivalent halves

87
Q

One commonly used way of dividing a test into two halves for Rulon’s Formula is

A. first half versus second half
B. easy items versus difficult items
C. odd-numbered items versus even-numbered items
D. objective items versus subjective items

A

C. odd-numbered items versus even-numbered items

88
Q

In Rulon’s Formula, one variance that must be calculated is the variance of the

A. item difficulties
B. total scores for each person
C. rater judgments
D. test administration times

A

B. total scores for each person

89
Q

Another required calculation in Rulon’s Formula is the variance of the

A. individual item scores
B. mean test scores
C. differences between scores on the two halves
D. percentile ranks

A

C. differences between scores on the two halves

90
Q

Odd–even reliability refers specifically to

A. alternating easy and hard items
B. splitting the test by content area
C. assigning odd-numbered items to one half and even-numbered items to the other
D. comparing two different test forms

A

C. assigning odd-numbered items to one half and even-numbered items to the other

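Cards 84–90 describe Rulon's split-half method: split the test (commonly odd vs. even items), then compare the variance of the half-score differences against the variance of total scores. A minimal sketch using a hypothetical item-response matrix:

```python
import statistics

# Rulon's split-half reliability:
#   r = 1 - Var(difference between halves) / Var(total score)
# Each row below is one person's hypothetical item responses
# (1 = correct, 0 = incorrect) on an eight-item test.
responses = [
    [1, 1, 1, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
]

totals, diffs = [], []
for person in responses:
    odd = sum(person[0::2])   # odd-numbered items (1, 3, 5, 7)
    even = sum(person[1::2])  # even-numbered items (2, 4, 6, 8)
    totals.append(odd + even)
    diffs.append(odd - even)

rulon = 1 - statistics.pvariance(diffs) / statistics.pvariance(totals)
print(round(rulon, 3))  # 0.837
```

When the two halves are truly equivalent the differences carry only error variance, which is why this ratio estimates reliability without the half-to-full-length correction that the Spearman-Brown approach applies.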
91
Q

Kappa statistics are primarily used when dealing with

A. interval data
B. nominal data
C. ratio data
D. continuous scores

A

B. nominal data

92
Q

Fleiss Kappa is most appropriate when measuring agreement among

A. two test halves
B. two raters
C. three or more raters
D. ranked data sets

A

C. three or more raters

93
Q

Cohen’s Kappa is used to determine agreement between

A. test items and total scores
B. two raters
C. multiple test forms
D. several ranking judges

A

B. two raters

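Cohen's Kappa (card 93) corrects two raters' raw agreement on nominal categories for the agreement expected by chance: kappa = (p_o − p_e) / (1 − p_e). A sketch with illustrative ratings:

```python
from collections import Counter

# Cohen's Kappa for two raters assigning nominal categories.
# p_o = observed proportion of agreement; p_e = proportion of agreement
# expected by chance from each rater's marginal category frequencies.
# The rating lists below are made-up illustrative data.

def cohens_kappa(rater1: list[str], rater2: list[str]) -> float:
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum((c1[cat] / n) * (c2[cat] / n) for cat in c1.keys() | c2.keys())
    return (p_o - p_e) / (1 - p_e)

r1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
r2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]
print(round(cohens_kappa(r1, r2), 2))  # 0.5
```

A kappa of 1 is perfect agreement and 0 is chance-level agreement; Fleiss' Kappa generalizes the same idea to three or more raters.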
94
Q

Kendall’s W is specifically designed for use with

A. nominal data
B. dichotomous data
C. rankings or ordinal data
D. interval-scale scores

A

C. rankings or ordinal data

95
Q

A restricted range of scores typically results in a

A. lower correlation coefficient
B. higher correlation coefficient
C. perfect correlation
D. negative correlation

A

A. lower correlation coefficient

96
Q

An inflated range of scores generally leads to a

A. lower correlation coefficient
B. higher correlation coefficient
C. perfect correlation
D. negative correlation

A

B. higher correlation coefficient

97
Q

A power test is characterized by

A. items of uniform difficulty
B. extremely short time limits
C. a time limit long enough for test takers to attempt all items
D. all test takers obtaining perfect scores

A

C. a time limit long enough for test takers to attempt all items

98
Q

In a power test, some items are designed so that

A. all test takers can answer them correctly
B. no time pressure is applied
C. guessing is eliminated
D. no test takers can obtain a perfect score

A

D. no test takers can obtain a perfect score

99
Q

A speed test contains items that are

A. increasingly difficult
B. varied in complexity
C. of uniform level of difficulty
D. based on ranking tasks

A

C. of uniform level of difficulty

100
Q

When given generous time limits, a defining feature of a speed test is that

A. all test takers should be able to complete all items correctly
B. only high-ability test takers finish
C. scores depend on item difficulty
D. perfect scores are impossible

A

A. all test takers should be able to complete all items correctly

101
Q

The main distinction between speed tests and power tests lies in

A. scoring procedures
B. item format
C. time limits and item difficulty
D. number of test items

A

C. time limits and item difficulty

102
Q

Which concept directly affects whether a correlation coefficient becomes higher or lower?

A. Test length
B. Restriction or inflation of range
C. Number of raters
D. Item numbering

A

B. Restriction or inflation of range

103
Q

Which statistical method is most appropriate for measuring agreement using nominal categories with two evaluators?

A. Fleiss Kappa
B. Kendall’s W
C. Rulon’s Formula
D. Cohen’s Kappa

A

D. Cohen’s Kappa

104
Q

Which statement best describes the main purpose of Domain Sampling Theory?

A. To explain how test scores change due to testing conditions
B. To model the probability that a person with a given ability can perform a task
C. To estimate how specific sources of variation contribute to test scores under defined conditions
D. To identify observable and unobservable traits in test performance

A

C. To estimate how specific sources of variation contribute to test scores under defined conditions

105
Q

The Domain of Behavior is best described as:

A. The set of test scores obtained under identical testing conditions
B. The universe of items that could conceivably measure a behavior
C. The observable outcomes of a latent trait
D. The specific facets involved in test administration

A

B. The universe of items that could conceivably measure a behavior

106
Q

Within Domain Sampling Theory, the Domain of Behavior is considered a:

A. Hypothetical construct
B. Measurable variable
C. Manifest trait
D. Statistical estimate

A

A. Hypothetical construct

107
Q

Generalizability Theory primarily explains test score variation as resulting from:

A. Differences in latent traits among individuals
B. Errors in prediction and estimation
C. The number of test items used
D. Variables in the testing situation

A

D. Variables in the testing situation

109
Q

In Generalizability Theory, the Universe refers to:

A. All possible latent traits being measured
B. The observable behaviors shown during testing
C. The probability model used to predict performance
D. The details of the particular test situation

A

D. The details of the particular test situation

110
Q

Which of the following is identified as a facet in Generalizability Theory?

A. The number of items included in the test
B. The unobservable trait being measured
C. The standard error of the estimate
D. The predicted test score

A

A. The number of items included in the test

111
Q

Which option correctly identifies another facet within the testing universe?

A. The purpose of test administration
B. The universe score
C. The domain of behavior
D. The manifestation trait

A

A. The purpose of test administration

112
Q

The Universe Score represents:

A. The average score across different testing situations
B. The score predicted from a regression equation
C. The score obtained when facets vary
D. The score obtained when all facets remain exactly the same

A

D. The score obtained when all facets remain exactly the same

113
Q

A Generalizability Study is conducted to examine:

A. How well predicted scores match observed scores
B. Whether scores remain consistent across different testing situations
C. The discrimination power of individual test items
D. The relationship between latent and manifest traits

A

B. Whether scores remain consistent across different testing situations

114
Q

The primary focus of a Decision Study is to:

A. Determine how useful test scores are for decision-making
B. Model performance probabilities
C. Measure item discrimination
D. Identify sources of test score variation

A

A. Determine how useful test scores are for decision-making

115
Q

Item-Response Theory is also known as:

A. Domain Sampling Theory
B. Generalizability Theory
C. Classical Test Theory
D. Latent-Trait Theory

A

D. Latent-Trait Theory

116
Q

Item-Response Theory is designed to model:

A. How testing conditions affect observed scores
B. The probability that a person with a certain ability can perform at a given level
C. The consistency of scores across administrations
D. The difference between predicted and observed values

A

B. The probability that a person with a certain ability can perform at a given level

117
Q

A latent trait is defined as a trait that is:

A. Directly measurable
B. Observable during testing
C. Unobservable
D. A facet of test administration

A

C. Unobservable

118
Q

A manifestation trait differs from a latent trait because it is:

A. Hypothetical
B. Statistically predicted
C. Unobservable
D. Observable

A

D. Observable

119
Q

In Item-Response Theory, discrimination refers to an item’s ability to:

A. Predict future test performance
B. Distinguish between individuals with different levels of the measured trait
C. Increase test reliability across situations
D. Reduce the standard error of the estimate

A

B. Distinguish between individuals with different levels of the measured trait

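Cards 116 and 119 describe what an IRT item characteristic curve captures. One common form (a sketch, not the deck's own model) is the two-parameter logistic (2PL), where b is item difficulty and a is discrimination; all parameter values below are hypothetical:

```python
import math

# Two-parameter logistic (2PL) IRT model: probability that a person with
# ability theta answers a dichotomous item correctly, given the item's
# difficulty b and discrimination a. Parameter values are illustrative.

def p_correct(theta: float, a: float, b: float) -> float:
    return 1 / (1 + math.exp(-a * (theta - b)))

# At theta == b the probability is exactly 0.5; a larger discrimination a
# makes the curve steeper, separating test takers whose abilities lie
# just above and just below the item's difficulty more sharply.
print(p_correct(theta=0.0, a=1.5, b=0.0))           # 0.5
print(round(p_correct(theta=1.0, a=1.5, b=0.0), 3))
```

Polytomous items (card 121) use extensions of this idea, such as graded-response models, with one curve per response category.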
120
Q

Dichotomous test items are characterized by:

A. Only two possible responses
B. More than two response options
C. Continuous scoring
D. Variable testing conditions

A

A. Only two possible responses

121
Q

Polytomous test items are defined as items that:

A. Measure latent traits only
B. Have more than two possible responses
C. Produce universe scores
D. Are limited to correct or incorrect answers

A

B. Have more than two possible responses

122
Q

The Standard Error of the Difference is used to help determine:

A. Whether a predicted score is accurate
B. How observable a trait is
C. Whether a difference between scores is statistically significant
D. How many facets should be included

A

C. Whether a difference between scores is statistically significant

123
Q

The Standard Error of Estimate refers to the standard error of the difference between:

A. Two observed scores
B. Two predicted scores
C. Latent and manifest traits
D. Predicted and observed values

A

D. Predicted and observed values

124
Q

Which concept specifically focuses on the accuracy of predictions rather than score differences?

A. Standard Error of the Difference
B. Standard Error of Estimate
C. Universe Score
D. Domain of Behavior

A

B. Standard Error of Estimate

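The Standard Error of the Difference (card 122) is conventionally built from each test's standard error of measurement, SEM = SD·√(1 − r); a difference between two scores that is small relative to SE_diff may reflect nothing but measurement error. A sketch with hypothetical test parameters:

```python
import math

# Standard error of measurement (SEM) and the standard error of the
# difference between two scores:
#   SEM = SD * sqrt(1 - r)
#   SE_diff = sqrt(SEM1**2 + SEM2**2)
# The standard deviations and reliabilities below are illustrative.

def sem(sd: float, reliability: float) -> float:
    return sd * math.sqrt(1 - reliability)

def se_difference(sem1: float, sem2: float) -> float:
    return math.sqrt(sem1 ** 2 + sem2 ** 2)

sem1 = sem(sd=15, reliability=0.91)  # about 4.5
sem2 = sem(sd=15, reliability=0.84)  # about 6.0
print(round(se_difference(sem1, sem2), 1))  # 7.5
```

A score difference is typically judged significant only when it is large relative to SE_diff (e.g., exceeding it by a multiple tied to the chosen confidence level).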
125
Q

Which type of validity refers to the degree of control among variables within a study?

A. External validity
B. Conceptual validity
C. Internal validity
D. Face validity

A

C. Internal validity

126
Q

Increasing random assignment primarily strengthens which type of validity?

A. Internal validity
B. Face validity
C. External validity
D. Conceptual validity

A

A. Internal validity

127
Q

Which form of validity is most concerned with whether research results can be generalized?

A. Conceptual validity
B. Internal validity
C. Face validity
D. External validity

A

D. External validity

128
Q

Random selection is specifically associated with increasing which type of validity?

A. Internal validity
B. External validity
C. Face validity
D. Conceptual validity

A

B. External validity

129
Q

Which type of validity focuses on individuals and their unique histories and behaviors?

A. Face validity
B. Internal validity
C. Conceptual validity
D. External validity

A

C. Conceptual validity

130
Q

Which validity is based on how a test appears to measure what it is intended to measure to the person taking or viewing it?

A. Conceptual validity
B. Face validity
C. External validity
D. Internal validity

A

B. Face validity

131
Q

Lawshe’s Content Validity Ratio (CVR) relies on judgments provided by whom?

A. Study participants
B. Researchers only
C. Statistical software
D. Subject matter experts (SMEs)

A

D. Subject matter experts (SMEs)

132
Q

In Lawshe’s CVR method, how are individual items typically evaluated by SMEs?

A. By ranking items from best to worst
B. By scoring items on a numerical difficulty scale
C. By rating essentiality categories
D. By voting to accept or reject items

A

C. By rating essentiality categories

133
Q

Which of the following is one of the standard essentiality ratings used in the CVR process?

A. Highly valid
B. Useful but not essential
C. Strongly disagree
D. Moderately effective

A

B. Useful but not essential

134
Q

The CVR for an item is calculated using which key piece of information?

A. The average relevance score across items
B. The total number of test items
C. The proportion of items with universal agreement
D. The number of experts who rate the item as essential

A

D. The number of experts who rate the item as essential

135
Q

What does a positive CVR value indicate?

A. Exactly half of the experts rated the item as essential
B. Fewer than half of the experts rated the item as essential
C. The item lacks relevance
D. More than half of the experts rated the item as essential

A

D. More than half of the experts rated the item as essential

136
Q

A CVR value of zero occurs when which condition is met?

A. All experts rate the item as essential
B. No experts rate the item as essential
C. Exactly half of the experts rate the item as essential
D. The average relevance score equals three

A

C. Exactly half of the experts rate the item as essential

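Cards 134–136 can be tied together with Lawshe's formula, CVR = (n_e − N/2) / (N/2), where n_e is the number of SMEs rating the item "essential" and N is the panel size. A minimal sketch with hypothetical panel counts:

```python
# Lawshe's Content Validity Ratio:
#   CVR = (n_essential - N/2) / (N/2)
# Positive when more than half the SMEs rate the item essential, zero when
# exactly half do, negative when fewer than half do. Panel counts below
# are illustrative.

def cvr(n_essential: int, n_experts: int) -> float:
    half = n_experts / 2
    return (n_essential - half) / half

print(cvr(8, 10))   # 0.6  -> more than half rated the item essential
print(cvr(5, 10))   # 0.0  -> exactly half
print(cvr(3, 10))   # -0.4 -> fewer than half
```

In practice each item's CVR is compared against a critical value that depends on the number of experts, and items falling below it are dropped.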
137
The Content Validity Index (CVI) can be calculated at which levels? A. Participant and population levels B. Item and scale levels C. Variable and construct levels D. Predictor and criterion levels
B. Item and scale levels
138
The I-CVI represents what measurement? A. The proportion of experts who rate an item as highly relevant B. The number of items rated essential C. The average CVR across all items D. The percentage of items with zero CVR
A. The proportion of experts who rate an item as highly relevant
139
On a 4-point relevance scale, which ratings are typically considered “highly relevant” for calculating the I-CVI? A. 1 or 2 B. 2 or 3 C. 3 or 4 D. 4 only
C. 3 or 4
140
Which description best defines S-CVI/Ave? A. The proportion of items with zero CVR B. The average of the I-CVIs across all items C. The number of experts who agree on relevance D. The percentage of essential items only
B. The average of the I-CVIs across all items
141
S-CVI/UA refers to which calculation approach? A. The mean relevance score for each item B. The ratio of essential to nonessential items C. The proportion of items achieving universal agreement D. The number of SMEs rating items as useful
C. The proportion of items achieving universal agreement
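The I-CVI and the two scale-level indices above can be sketched as follows (the ratings are invented for illustration; one row per item, one column per expert):

```python
def i_cvi(ratings):
    """Item-level CVI: proportion of experts rating the item 3 or 4
    on a 4-point relevance scale."""
    return sum(r >= 3 for r in ratings) / len(ratings)

# Hypothetical panel of 4 experts rating 3 items
items = [[4, 3, 4, 4], [3, 4, 2, 4], [4, 4, 4, 4]]
icvis = [i_cvi(item) for item in items]               # [1.0, 0.75, 1.0]

s_cvi_ave = sum(icvis) / len(icvis)                   # average of the I-CVIs
s_cvi_ua = sum(v == 1.0 for v in icvis) / len(icvis)  # share of items with universal agreement
```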
142
Incremental validity evaluates what specific contribution? A. The appearance of a test to participants B. The generalizability of findings C. The control of variables in a study D. The added explanatory power of an additional predictor
D. The added explanatory power of an additional predictor
143
Incremental validity focuses on explaining variance in relation to what? A. Independent variables only B. Participant behavior histories C. Criterion measures D. Sampling methods
C. Criterion measures
144
Which situation best reflects the concept of incremental validity? A. Selecting participants randomly to improve generalizability B. Adding a new predictor that explains variance beyond existing predictors C. Asking experts to judge item relevance D. Designing a test that appears valid to respondents
B. Adding a new predictor that explains variance beyond existing predictors
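In the special case of standardized, uncorrelated predictors, each predictor's squared validity coefficient adds directly to R², which makes the incremental contribution easy to see. A toy illustration (all coefficient values are hypothetical):

```python
# Validity coefficients of two uncorrelated predictors against the criterion
r_y_x1 = 0.50   # existing predictor
r_y_x2 = 0.30   # candidate predictor being added

r2_base = r_y_x1 ** 2                    # variance explained by x1 alone
r2_full = r_y_x1 ** 2 + r_y_x2 ** 2      # variance explained by x1 and x2 together
delta_r2 = r2_full - r2_base             # incremental validity of x2
```

A nonzero `delta_r2` is what "explaining variance beyond existing predictors" means; in practice the increment is estimated with hierarchical regression rather than this shortcut.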
145
What best describes the multitrait-multimethod matrix (MTMM)? A. A statistical technique used to reduce data dimensionality B. A method for assessing the construct validity of a set of measures in a study C. A graphical tool for identifying latent variables D. A procedure for estimating factor loadings
B. A method for assessing the construct validity of a set of measures in a study
146
What does the MTMM provide researchers? A. A way to calculate eigenvalues B. A process for generating theories C. A structured way to evaluate convergent and discriminant validity simultaneously D. A method for selecting the number of factors
C. A structured way to evaluate convergent and discriminant validity simultaneously
147
How many types of correlation coefficients are analyzed in the MTMM? A. Two B. Three C. Four D. Five
C. Four
148
Which correlations form the reliability diagonal in the MTMM? A. Heteromethod-heterotrait correlations B. Monomethod-monotrait correlations C. Monomethod-heterotrait correlations D. Heteromethod-monotrait correlations
B. Monomethod-monotrait correlations
149
What do monomethod-monotrait correlations represent? A. The validity of different methods B. The agreement between different traits C. The reliability of each measure D. The variance explained by each factor
C. The reliability of each measure
150
How should monomethod-monotrait correlations compare to other correlations in the MTMM? A. They should be the lowest B. They should be moderate C. They should be statistically insignificant D. They should be the highest in the entire matrix
D. They should be the highest in the entire matrix
151
Which MTMM correlations form the validity diagonal? A. Monomethod-monotrait correlations B. Heteromethod-monotrait correlations C. Monomethod-heterotrait correlations D. Heteromethod-heterotrait correlations
B. Heteromethod-monotrait correlations
152
What do heteromethod-monotrait correlations represent? A. The same trait measured by different methods B. Different traits measured by the same method C. Different traits measured by different methods D. Reliability estimates of a single method
A. The same trait measured by different methods
153
Significant heteromethod-monotrait correlations provide evidence for which type of validity? A. Discriminant validity B. Criterion validity C. Face validity D. Convergent validity
D. Convergent validity
154
What does convergent validity indicate? A. Measures of different traits disagree B. Measures of the same trait converge or agree C. Methods produce unrelated results D. Traits are statistically independent
B. Measures of the same trait converge or agree
155
Which correlations are found in the heterotrait-monomethod triangles? A. The same trait measured by different methods B. Different traits measured by the same method C. Different traits measured by different methods D. Reliability estimates of a single method
B. Different traits measured by the same method
156
How should monomethod-heterotrait correlations typically appear? A. High, to show convergence B. Moderate, to show overlap C. Low, to demonstrate discrimination D. Zero, to show independence
C. Low, to demonstrate discrimination
157
What type of validity is supported by low monomethod-heterotrait correlations? A. Predictive validity B. Convergent validity C. Content validity D. Discriminant validity
D. Discriminant validity
158
What do heteromethod-heterotrait correlations involve? A. Same traits and same methods B. Different traits and different methods C. Same traits and different methods D. Different traits and same methods
B. Different traits and different methods
159
How should heteromethod-heterotrait correlations compare to other correlations in the MTMM? A. They should be the highest B. They should be moderate C. They should be the lowest in the matrix D. They should be equal to reliability estimates
C. They should be the lowest in the matrix
160
What does having the lowest heteromethod-heterotrait correlations further support? A. Reliability B. Internal consistency C. Convergent validity D. Discriminant validity
D. Discriminant validity
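A toy 2-trait × 2-method layout makes the four MTMM correlation classes concrete. The values below are invented purely to show the expected ordering from the cards above:

```python
# Traits A and B, each measured by methods 1 and 2 (hypothetical correlations)
reliability     = {"A1-A1": 0.90, "B1-B1": 0.88, "A2-A2": 0.91, "B2-B2": 0.89}  # monotrait-monomethod
convergent      = {"A1-A2": 0.70, "B1-B2": 0.68}  # monotrait-heteromethod (validity diagonal)
discriminant_mm = {"A1-B1": 0.25, "A2-B2": 0.23}  # heterotrait-monomethod triangles
discriminant_hh = {"A1-B2": 0.15, "B1-A2": 0.14}  # heterotrait-heteromethod triangles

# Expected pattern: reliabilities highest, then convergent validities,
# then heterotrait correlations, with heterotrait-heteromethod lowest.
assert min(reliability.values()) > max(convergent.values())
assert min(convergent.values()) > max(discriminant_mm.values())
assert min(discriminant_mm.values()) > max(discriminant_hh.values())
```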
161
What is factor analysis primarily designed to identify? A. Correlation coefficients B. Factors or specific variables on which people may differ C. Measurement errors D. Sampling distributions
B. Factors or specific variables on which people may differ
162
Which description best defines factors in factor analysis? A. Attributes B. Characteristics C. Dimensions D. All of the above
D. All of the above
163
What is the most common purpose of factor analysis? A. Hypothesis testing B. Construct validation C. Data reduction D. Reliability estimation
C. Data reduction
164
How does factor analysis simplify complex data? A. By increasing the number of variables B. By reducing the number of variables C. By maximizing correlations D. By eliminating variance
B. By reducing the number of variables
165
What benefit does data reduction provide? A. Easier interpretation of datasets B. More precise sampling C. Higher reliability coefficients D. Increased subjectivity
A. Easier interpretation of datasets
166
What does structure discovery in factor analysis help uncover? A. Measurement error B. Sample bias C. Underlying dimensions within observed variables D. External validity
C. Underlying dimensions within observed variables
167
How is factor analysis used in construct validation? A. To calculate eigenvalues B. To test hypotheses statistically C. To rank participants D. To ensure items measure the intended underlying construct
D. To ensure items measure the intended underlying construct
168
What are examples of constructs factor analysis may help validate? A. Age and gender B. Intelligence and anxiety C. Height and weight D. Reaction time and speed
B. Intelligence and anxiety
169
What does factor analysis provide regarding observed variables? A. Correlation matrices B. Reliability coefficients C. Estimates of factor loadings D. Norm-referenced scores
C. Estimates of factor loadings
170
What is a factor loading? A. A measure of sample size B. An estimate of explained variance across components C. A test of statistical significance D. Information about how strongly an observed variable relates to a factor
D. Information about how strongly an observed variable relates to a factor
171
What does a factor loading convey about test scores? A. Their reliability B. The extent to which the factor determines them C. Their norm group placement D. Their distribution shape
B. The extent to which the factor determines them
172
What is the primary goal of exploratory factor analysis (EFA)? A. To confirm a specific model B. To test theory C. To explore underlying factor structure D. To maximize variance
C. To explore underlying factor structure
173
When is exploratory factor analysis typically used? A. When factor structure is already known B. When no preconceived idea of factors exists C. When validating norms D. When conducting cross-validation
B. When no preconceived idea of factors exists
174
How is exploratory factor analysis often described conceptually? A. Theory-testing B. Error-correcting C. Norm-referencing D. Theory-generating
D. Theory-generating
175
What is the main purpose of confirmatory factor analysis (CFA)? A. To reduce dimensionality B. To discover factors C. To confirm a hypothesized factor structure D. To estimate reliability
C. To confirm a hypothesized factor structure
176
How is confirmatory factor analysis typically characterized? A. Theory-generating B. Theory-testing C. Data-mining D. Norm-building
B. Theory-testing
177
What is the main purpose of principal component analysis (PCA)? A. To identify latent traits B. To validate constructs C. To reduce data dimensionality D. To estimate reliability
C. To reduce data dimensionality
178
How does PCA summarize data? A. By increasing the number of variables B. By summarizing variance with fewer variables C. By eliminating correlations D. By estimating factor loadings
B. By summarizing variance with fewer variables
179
What is the primary goal of dimensionality reduction in PCA? A. Reducing the number of variables B. Increasing explained error C. Confirming theory D. Testing hypotheses
A. Reducing the number of variables
180
How are principal components selected in PCA? A. Based on reliability estimates B. Based on theory C. To capture maximum possible variance D. To minimize correlations
C. To capture maximum possible variance
181
What does orthogonality of principal components mean? A. They are linearly dependent B. They overlap substantially C. They are correlated D. They are uncorrelated
D. They are uncorrelated
182
What is a scree plot? A. A plot of factor loadings B. A plot of eigenvalues against component number C. A histogram of test scores D. A matrix of correlations
B. A plot of eigenvalues against component number
183
What is the purpose of a scree plot? A. To assess reliability B. To test hypotheses C. To decide how many components to retain D. To estimate norms
C. To decide how many components to retain
184
What feature of the scree plot is used to guide component retention? A. The highest eigenvalue B. The average variance C. The sample size D. The elbow or point of inflection
D. The elbow or point of inflection
185
What does the explained variance ratio represent? A. The reliability of components B. The proportion of variance along each principal component C. The correlation between variables D. The error variance
B. The proportion of variance along each principal component
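For a 2×2 correlation matrix the eigenvalues are simply 1 ± r, which makes the explained variance ratio easy to compute by hand. A minimal sketch with an invented correlation:

```python
r = 0.6                       # hypothetical correlation between two standardized variables
# Eigenvalues of the correlation matrix [[1, r], [r, 1]] are 1 + r and 1 - r
eigenvalues = [1 + r, 1 - r]  # [1.6, 0.4]

total = sum(eigenvalues)      # equals the number of variables (2)
explained_variance_ratio = [ev / total for ev in eigenvalues]   # [0.8, 0.2]
# The first principal component captures 80% of the variance here, and the
# two components are orthogonal (uncorrelated) by construction.
```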
186
What does the Kaiser Criterion (K1 Rule) recommend retaining? A. Factors with eigenvalues less than 1.0 B. Only the first factor C. Factors before the elbow D. Factors with eigenvalues greater than 1.0
D. Factors with eigenvalues greater than 1.0
187
What is a limitation of the Kaiser Criterion? A. It is highly subjective B. It underestimates factor numbers C. It tends to overestimate the number of factors D. It ignores eigenvalues
C. It tends to overestimate the number of factors
188
Under what condition does the Kaiser Criterion especially tend to overestimate factors? A. When sample size is small B. When reliability is low C. When variance is minimal D. When the number of variables is large
D. When the number of variables is large
189
How is the Kaiser Criterion generally regarded in terms of accuracy? A. The most accurate method B. Moderately accurate C. Highly precise D. Not the most accurate method
D. Not the most accurate method
190
What does the Elbow Method (Scree Test) involve plotting? A. Loadings against variables B. Eigenvalues against factor number C. Variance against sample size D. Scores against norms
B. Eigenvalues against factor number
191
What factors are retained using the Elbow Method? A. Those after the elbow B. Only the first factor C. Those before the elbow D. Those with the lowest variance
C. Those before the elbow
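Both retention rules can be applied to the same eigenvalue list (values invented for illustration):

```python
eigenvalues = [3.2, 1.8, 1.1, 0.6, 0.4, 0.3]   # hypothetical, sorted descending

# Kaiser criterion (K1 rule): retain factors with eigenvalue > 1.0
kaiser_retained = [ev for ev in eigenvalues if ev > 1.0]   # retains 3 factors

# Scree/elbow approach: inspect the successive drops and retain factors
# before the elbow; the judgment is subjective, since different readers
# may place the elbow at different drops.
drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]
```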
192
What is a key limitation of the Elbow Method? A. It requires large samples B. It is computationally complex C. It is highly subjective D. It always underestimates factors
C. It is highly subjective
193
Why can the elbow point be problematic to identify? A. It changes with rotation B. It is often ambiguous or hard to define C. It depends on reliability D. It requires confirmatory analysis
B. It is often ambiguous or hard to define
194
What consequence can arise from ambiguity in the elbow point? A. Reduced reliability B. Different researchers selecting different numbers of factors C. Loss of validity D. Increased error variance
B. Different researchers selecting different numbers of factors
195
What is cross-validation? A. Validation using the same sample B. Revalidation of a test using a different group C. Norming a test D. Estimating factor loadings
B. Revalidation of a test using a different group
196
What happens to validity after cross-validation in some cases? A. It increases B. It remains constant C. It disappears D. It decreases
D. It decreases
197
What is the decrease in validity after cross-validation called? A. Criterion contamination B. Validity shrinkage C. Reliability decay D. Sampling bias
B. Validity shrinkage
198
What does co-validation involve? A. Validating one test multiple times B. Validating tests across cultures C. Validation of more than one test from the same group D. Validation using multiple criteria
C. Validation of more than one test from the same group
199
What does co-norming refer to? A. Norming one test repeatedly B. Norming tests across populations C. Norming more than one test from the same group D. Norming after cross-validation
C. Norming more than one test from the same group
200
Which description best defines norms in the context of psychological or educational testing? A. Individual raw scores obtained from a single test administration B. Test performance data from a specific group used as a reference for interpreting individual scores C. Statistical formulas used to calculate test reliability D. The process of converting scores into percentile ranks
B. Test performance data from a specific group used as a reference for interpreting individual scores
201
What does norming refer to? A. Comparing individual scores to national averages B. Selecting test items for inclusion in an assessment C. The process of deriving norms D. Assigning grades based on test results
C. The process of deriving norms
202
Which group is described as the normative sample? A. Individuals who score above the average on a test B. Test developers who design the assessment C. People whose scores are excluded from analysis D. A group whose performance on a test is analyzed for reference
D. A group whose performance on a test is analyzed for reference
203
Why is a normative sample important? A. It determines how difficult the test items should be B. It provides a reference for evaluating individual test performance C. It eliminates the need for raw scores D. It guarantees equal performance across test takers
B. It provides a reference for evaluating individual test performance
204
Which type of norm converts raw scores from a standardization sample into a rank-based format? A. Developmental norms B. Local norms C. Percentile norms D. Subgroup norms
C. Percentile norms
205
What does a percentile express? A. The percentage of correct answers on a test B. The proportion of items answered incorrectly C. The percentage of people whose scores fall above a given raw score D. The percentage of people whose scores fall below a particular raw score
D. The percentage of people whose scores fall below a particular raw score
206
Which statement correctly describes percentage correct? A. The number of people scoring below a given raw score B. Raw score divided by the total number of test takers C. The number of correct responses divided by the total number of items, multiplied by 100 D. A comparison between two different tests
C. The number of correct responses divided by the total number of items, multiplied by 100
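The distinction between the two statistics can be sketched directly (function names are mine; the norm-group scores are invented):

```python
def percentile_rank(score, group_scores):
    """Percentage of people in the group scoring below the given raw score."""
    below = sum(s < score for s in group_scores)
    return 100 * below / len(group_scores)

def percentage_correct(n_correct, n_items):
    """Proportion of items answered correctly, expressed as a percentage."""
    return 100 * n_correct / n_items

group = [10, 12, 15, 18, 20, 22, 25, 28, 30, 35]   # hypothetical norm group
print(percentile_rank(25, group))      # 60.0 -> scored above 60% of the group
print(percentage_correct(25, 40))      # 62.5 -> answered 62.5% of items correctly
```

Note that the same raw score of 25 yields different numbers: the percentile compares the person to other people, while percentage correct compares the score to the test itself.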
207
Which category of norms is developed based on characteristics that change or are affected by stages of life? A. National norms B. Percentile norms C. Subgroup norms D. Developmental norms
D. Developmental norms
208
Grade norms are best described as norms based on: A. Age-equivalent scores B. Nationally representative samples C. Grade-level performance D. Local population performance
C. Grade-level performance
209
Which type of developmental norm uses age-equivalent scores? A. Age norms B. Grade norms C. Local norms D. National anchor norms
A. Age norms
210
Which norms are derived from a sample that represents the population at a national level? A. Subgroup norms B. National norms C. Local norms D. Percentile norms
B. National norms
211
What is the primary function of national anchor norms? A. To describe performance within a local population B. To measure developmental changes across age C. To provide an equivalency table for comparing scores on two tests D. To convert raw scores into percentages
C. To provide an equivalency table for comparing scores on two tests
212
Which type of norm involves dividing the normative sample based on criteria used during sample selection? A. Developmental norms B. Local norms C. National anchor norms D. Subgroup norms
D. Subgroup norms
213
What distinguishes subgroup norms from national norms? A. Subgroup norms use smaller test forms B. Subgroup norms are always locally developed C. Subgroup norms segment the normative sample using specific criteria D. Subgroup norms focus only on age differences
C. Subgroup norms segment the normative sample using specific criteria
214
Which norms are most often developed by test users themselves? A. National norms B. Percentile norms C. Local norms D. Developmental norms
C. Local norms
215
What do local norms provide? A. Comparisons between two equivalent tests B. Normative information relative to the performance of a local population C. Age-equivalent interpretations of scores D. National-level test comparisons
B. Normative information relative to the performance of a local population
216
Which pairing is correctly matched? A. Percentile norms – age-equivalent scores B. Developmental norms – traits affected by stages of life C. Local norms – nationally representative samples D. National norms – locally developed by test users
B. Developmental norms – traits affected by stages of life
217
A score interpretation that focuses on how many individuals scored lower than a specific raw score relies on: A. Percentage correct B. Grade norms C. Percentile norms D. Local norms
C. Percentile norms
218
Which statement accurately differentiates percentage correct from percentile? A. Percentage correct compares performance across age groups B. Percentile reflects the proportion of items answered correctly C. Percentage correct is based on rank ordering D. Percentile reflects how many people scored below a given score
D. Percentile reflects how many people scored below a given score
218
Which term refers specifically to the process rather than the data or group? A. Normative sample B. Norms C. Percentile D. Norming
D. Norming
219
What best describes the Fixed Reference Group Scoring System? A. Test scores are adjusted for each new group of test takers B. A predetermined passing score is applied across all administrations C. Scores from one group are used as the basis for future score calculations D. Individual performance determines score interpretation
C. Scores from one group are used as the basis for future score calculations
220
In the Fixed Reference Group Scoring System, which group influences future test scoring? A. One original group of test takers B. The most recent group of test takers C. All groups combined over time D. A randomly selected sample group
A. One original group of test takers
221
A reliability coefficient of 0.92 is interpreted as: A. Good B. Adequate C. Excellent D. May have limited applicability
C. Excellent
222
Which reliability coefficient range is labeled as “Good”? A. 0.70 – 0.79 B. 0.80 – 0.89 C. Below 0.70 D. 0.90 and up
B. 0.80 – 0.89
223
A reliability coefficient below 0.70 suggests the test: A. Is excellent B. Is adequate C. Has strong consistency D. May have limited applicability
D. May have limited applicability
224
A validity coefficient of 0.36 would be interpreted as: A. Likely to be useful B. Depends on the circumstances C. Very beneficial D. Unlikely to be useful
C. Very beneficial
225
Which validity coefficient range is considered “Likely to be Useful”? A. 0.11 – 0.20 B. Above 0.35 C. Below 0.11 D. 0.21 – 0.35
D. 0.21 – 0.35
226
A validity coefficient of 0.15 falls under which interpretation? A. Very beneficial B. Depends on the circumstances C. Likely to be useful D. Unlikely to be useful
B. Depends on the circumstances
227
A Cronbach’s alpha value of 0.91 is interpreted as: A. Good B. Acceptable C. Excellent D. Questionable
C. Excellent
227
What interpretation corresponds to a validity coefficient below 0.11? A. Very beneficial B. Likely to be useful C. Depends on the circumstances D. Unlikely to be useful
D. Unlikely to be useful
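The validity-coefficient cards above follow one rule-of-thumb table, which can be written as a lookup (function name is mine):

```python
def interpret_validity(r):
    """Rule-of-thumb labels for validity coefficients used in these cards."""
    if r > 0.35:
        return "Very beneficial"
    if r >= 0.21:
        return "Likely to be useful"
    if r >= 0.11:
        return "Depends on the circumstances"
    return "Unlikely to be useful"

print(interpret_validity(0.36))   # Very beneficial
print(interpret_validity(0.15))   # Depends on the circumstances
```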
228
Which Cronbach’s alpha range is classified as “Questionable”? A. 0.50 ≤ α < 0.60 B. 0.60 ≤ α < 0.70 C. 0.70 ≤ α < 0.80 D. α < 0.50
B. 0.60 ≤ α < 0.70
229
A Cronbach’s alpha value of 0.55 would be described as: A. Poor B. Unacceptable C. Acceptable D. Good
A. Poor
230
Which Cronbach’s alpha value indicates an unacceptable level? A. 0.65 B. 0.72 C. 0.48 D. 0.85
C. 0.48
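The Cronbach's alpha cards use the common George & Mallery-style cutoffs, which reduce to a simple lookup (function name is mine):

```python
def interpret_alpha(a):
    """Common rule-of-thumb labels for Cronbach's alpha."""
    if a >= 0.9:
        return "Excellent"
    if a >= 0.8:
        return "Good"
    if a >= 0.7:
        return "Acceptable"
    if a >= 0.6:
        return "Questionable"
    if a >= 0.5:
        return "Poor"
    return "Unacceptable"

print(interpret_alpha(0.91))   # Excellent
print(interpret_alpha(0.55))   # Poor
print(interpret_alpha(0.48))   # Unacceptable
```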
231
When the p-value is less than or equal to α, the correct decision is to: A. Accept the null hypothesis B. Reject the null hypothesis C. Modify the null hypothesis D. Delay the decision
B. Reject the null hypothesis
232
A p-value greater than α leads to which action? A. Reject the null hypothesis B. Revise the alternative hypothesis C. Ignore the hypothesis D. Accept the null hypothesis
D. Accept the null hypothesis
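The two decision cards reduce to a single comparison against the significance level α (values below are hypothetical; note that "accept the null" is conventionally phrased "fail to reject the null"):

```python
alpha = 0.05   # significance level chosen before the test (hypothetical)
decisions = {p: ("reject H0" if p <= alpha else "fail to reject H0")
             for p in (0.03, 0.20)}
```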
233
Measurement is best defined as: A. The interpretation of test scores B. Assigning numbers or symbols to characteristics C. Comparing observed and expected outcomes D. Eliminating error from testing
B. Assigning numbers or symbols to characteristics
234
Error refers to: A. Incorrect test scoring B. Random guessing by examinees C. Factors influencing a score beyond what is measured D. Poor test construction
C. Factors influencing a score beyond what is measured
235
Scales are described as: A. Methods for interpreting validity B. Statistical models of reliability C. Sets of numbers or symbols assigned to objects D. Errors affecting measurement
C. Sets of numbers or symbols assigned to objects
236
Which type of scale consists of a countable set of values that can be infinite? A. Discrete scale B. Nominal scale C. Ordinal scale D. Continuous scale
D. Continuous scale
237
A discrete scale is characterized by being: A. Countable in a finite amount of time B. Infinite and uncountable C. Based on equal intervals only D. Dependent on ratio properties
A. Countable in a finite amount of time
238
The property of magnitude refers to: A. The absence of a measured attribute B. Equal distances between values C. Moreness and comparison D. Zero point measurement
C. Moreness and comparison
239
Equal interval means that: A. Values can be ranked only B. Zero indicates absence of a trait C. Differences between scale points are consistent D. The scale has infinite values
C. Differences between scale points are consistent
240
The ratio property exists when the scale has a true zero point, meaning that at zero: A. Differences between values are equal B. Rankings can be established C. Measurement error is eliminated D. Nothing of the property being measured exists
D. Nothing of the property being measured exists
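The three properties defined in the last cards (magnitude, equal interval, and a true zero) combine into the standard levels of measurement. A summary table, drawn from standard textbook treatments rather than the cards themselves:

```python
# magnitude: values can be ordered ("moreness")
# equal_interval: differences between adjacent scale points are consistent
# ratio: a true zero exists at which nothing of the property is present
scales = {
    "nominal":  {"magnitude": False, "equal_interval": False, "ratio": False},
    "ordinal":  {"magnitude": True,  "equal_interval": False, "ratio": False},
    "interval": {"magnitude": True,  "equal_interval": True,  "ratio": False},
    "ratio":    {"magnitude": True,  "equal_interval": True,  "ratio": True},
}
```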