Assessment Flashcards

(571 cards)

1
Q

Clinical interviewing can be…

A

Structured, semi-structured, and unstructured

2
Q

Informal assessment

A

Observation of behavior, rating scales, classification techniques, records, and personal documents

3
Q

Personality assessment

A

Standardized tests (e.g., MMPI), projective tests (e.g., TAT), and interest inventories (e.g., Strong Interest Inventory)

4
Q

Examples of ability assessment

A

Achievement tests (e.g., WRAT), aptitude tests (e.g., SAT), and intelligence tests (e.g., WISC)

5
Q

According to the history of assessment, what civilization developed the first widely used tests around 2300 B.C.E.?

A

The ancient Chinese, who used physical fitness and endurance tests to screen candidates for government civil service positions.

6
Q

What was the primary purpose of the ancient Chinese assessment system (circa 2300 B.C.E.)?

A

To select qualified individuals for government service using physically demanding and often brutal examinations.

7
Q

Why are the ancient Chinese civil service exams significant in assessment history?

A

They represent the first large-scale, systematic use of testing to make decisions about selection and placement.

8
Q

How did the 19th century influence the development of modern assessment practices?

A

The 19th century introduced scientific measurement, standardization, and early psychometric principles that shaped modern testing.

9
Q

What role did early pioneers of assessment play during the 19th century?

A

They laid the foundation for intelligence, aptitude, personality, and interest testing, influencing current assessment practices.

10
Q

What major focus dominated assessment development during the early 20th century?

A

The scientific measurement of intelligence, driven by a growing interest in objectively quantifying human abilities.

11
Q

What major limitation of early intelligence tests became apparent in the 20th century?

A

They failed to account for the diversity of human intelligence, often reflecting cultural and contextual bias.

12
Q

What types of assessments emerged in response to the limitations of intelligence testing?

A

Tests measuring aptitude, personality, and interests, allowing for a broader understanding of individual differences.

13
Q

Why did aptitude testing become important in modern assessment?

A

Aptitude tests assess potential for learning or future performance, rather than general intelligence alone.

14
Q

Why were personality and interest assessments developed in the 20th century?

A

To better understand behavioral tendencies, preferences, and vocational fit, which intelligence tests could not capture.

15
Q

How did 20th-century developments shape present-day assessment practices?

A

They established the use of multiple assessment domains (intelligence, aptitude, personality, interests) to form a comprehensive evaluation.

16
Q

What is the overall historical trend in the evolution of assessment?

A

A shift from single-trait, physically demanding tests to multidimensional, scientifically grounded psychological assessments.

20
Q

Jean Esquirol

A

Jean Esquirol (1772–1840) used language development to identify varying levels of intelligence. His work is considered a forerunner of verbal IQ. He is credited with recognizing that intellectual disability (at the time called mental retardation) was related to developmental deficiencies rather than mental illness.

21
Q

Edouard Seguin

A

Edouard Seguin (1812–1880) developed the form board, which improved the motor skills of individuals with intellectual disability. The form board is considered a predecessor to performance IQ testing.

22
Q

Sir Francis Galton

A

Sir Francis Galton (1822–1911) was a biologist credited with launching the testing movement and developing the first test of intelligence. He pioneered the use of rating-scale and questionnaire methods and developed the correlation coefficient through his work in examining the relationship between reaction time, grip strength, and intelligence.

23
Q

Wilhelm Wundt

A

Wilhelm Wundt (1832–1920) founded one of the first psychological laboratories to conduct experimental research.

24
Q

James Cattell

A

James Cattell (1860–1944) was one of the first to apply statistical concepts to psychological assessment. Cattell popularized the term mental test.

25
Hermann Ebbinghaus
Hermann Ebbinghaus (1850–1909) studied human memory and is well known for his work on the forgetting curve. He administered mental tests to school-age children and was able to show that his sentence completion test was related to scholastic achievement.
26
Alfred Binet
(1857–1911) developed the first modern intelligence test, the Binet-Simon scale, with Théodore Simon.
27
Lewis Terman
(1877–1956) revised the Binet-Simon scale, naming the enhanced version the Stanford-Binet Intelligence Test.
28
Stanford-Binet Intelligence Test
The first intelligence test to incorporate the intelligence quotient (ratio IQ), which is mental age divided by chronological age, multiplied by 100.
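As a quick numeric sketch, the ratio IQ (mental age over chronological age, times 100) can be computed like this (the function name and the sample ages are hypothetical illustrations, not from the source):

```python
# Ratio IQ from the early Stanford-Binet:
# mental age divided by chronological age, multiplied by 100.
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    return mental_age / chronological_age * 100

# A 10-year-old performing at the level of a typical 12-year-old:
print(ratio_iq(12, 10))  # 120.0
```

A child performing exactly at age level scores 100, which is why 100 is the conventional midpoint of IQ scales.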
29
Arthur Otis
(1886–1964) devised the first scientifically reliable measure for testing the intelligence of individuals in groups. The assessment was called the Otis Group Intelligence Scale.
30
Robert Yerkes
(1876–1956) used Otis’s group intelligence instrument to develop the Army Alpha and Army Beta group intelligence tests.
31
Army Alpha
Designed to screen the cognitive ability of military recruits. The intelligence measure was eventually revised for civilian use.
32
Army Beta
The language-free version of the test designed for recruits who could not read or were foreign-born.
33
Charles Spearman
(1863–1945) and L. L. Thurstone (1887–1955) developed a statistical test known as factor analysis, which led to the development of multiple aptitude testing.
34
James Bryant Conant
(1893–1978), in conjunction with the Educational Testing Service (ETS), developed the Scholastic Aptitude Test (SAT).
35
Edward Thorndike
(1874–1949) developed the first achievement test battery, the Stanford Achievement Test (also abbreviated SAT, not to be confused with the Scholastic Aptitude Test), which provided an objective measure of academic performance and could be administered to large groups of students.
36
Robert Sessions Woodworth
(1869–1962) developed Woodworth’s Personal Data Sheet, an emotional-stability screening test for World War I military recruits. It was the first standardized personality inventory.
37
Starke Hathaway
(1903–1984) and J. Charnley McKinley (1891–1950) developed the Minnesota Multiphasic Personality Inventory (MMPI), an objective measure of personality structure.
38
MMPI-2
The second version of the Minnesota Multiphasic Personality Inventory, now the personality test most widely used to identify and diagnose psychopathology.
39
Carl Jung
(1875–1961), Hermann Rorschach (1884–1922), and Henry Murray (1893–1988) developed projective techniques (Jung’s word associations, Rorschach’s inkblots, and Murray’s Thematic Apperception Test, respectively) to assess personality.
40
Frank Parsons
(1854–1908) was the father of vocational guidance and counseling. His work gave birth to the development of vocational and interest inventories.
41
Edward Strong
(1884–1963) devised the Strong Vocational Interest Blank, which is known today as the Strong Interest Inventory.
42
Strong Interest Inventory
It remains among the most widely used and researched vocational measures in career counseling.
43
Definition of Measurement
Measurement is the process of defining and estimating the magnitude of human attributes and behavioral expressions using standardized instruments.
44
Assumption 1 of Measurement
Human attributes and behaviors are distinct enough to be objectively defined and quantified.
45
Assumption 2 of Measurement
All human attributes and behavioral expressions exist in all people.
46
Assumption 3 of Measurement
The presence or absence of attributes or behaviors in certain situations indicates normalcy or deficiency.
47
Instruments Used in Measurement
Measurement instruments such as tests, surveys, and inventories.
48
Definition of Assessment
Assessment is a broad, systematic process of gathering and documenting client information.
49
Difference Between a Test and an Assessment
• A test is a subset of assessment, providing data from responses to test items. • Assessment encompasses the entire process of collecting and integrating information.
50
Definition of a Test
A test is a standardized instrument used to yield data about an examinee’s responses to specific items.
51
Definition of Interpretation
Interpretation is the process by which the counselor assigns meaning to test or assessment data using norms, criteria, or professional judgment.
52
What are the three different bases for interpretation?
1. Comparing to a peer group (norm-referenced) 2. Using predetermined criteria (criterion-referenced) 3. Applying professional judgment
53
Definition of Evaluation
Evaluation is the process of determining worth, significance, or progress based on measurement results.
54
Example of Evaluation
Examining a client’s monthly Beck Depression Inventory scores to determine progress over time.
55
Purpose of Evaluation
To assess client progress and determine the effectiveness of interventions, programs, or services.
56
Purpose of Limiting Perfect Scores
To differentiate between individuals by highlighting differences in ability, performance, or characteristics.
57
Power Tests
A power test limits perfect scores by including very difficult items and measures how well a test-taker performs regardless of time limits.
58
What Power Tests Measure
The level of ability or knowledge a test-taker has when given sufficient time.
59
Speed Tests
A speed test limits perfect scores by imposing strict time limits, not item difficulty.
60
What Speed Tests Measure
How quickly a test-taker can understand questions and respond accurately.
61
Power vs. Speed Tests
• Power tests → difficulty limits scores, time is not emphasized • Speed tests → time limits scores, items are usually easy
62
Maximal Performance Tests
A test designed to measure a client’s best possible or highest attainable performance.
63
Examples of Maximal Performance Tests
Achievement tests and aptitude tests.
64
Typical Performance Tests
A test that measures characteristic or usual behavior, not one’s best effort.
65
Example of Typical Performance Testing
Personality tests, which assess normal patterns of behavior, thoughts, and emotions.
66
Standardized Tests
A test with uniform administration, scoring, and interpretation procedures.
67
Key Features of Standardized Tests
• Predetermined instructions • Objective scoring • Established reliability and validity • Comparison to a norm group
68
Examples of Standardized Tests
The SAT and GRE.
69
Nonstandardized Tests
A test that allows flexibility in administration, scoring, and interpretation.
70
Limitation of Nonstandardized Tests
Scores cannot be compared to a norm group, requiring reliance on professional judgment.
71
Examples of Nonstandardized Tests
Projective tests such as the Rorschach Inkblot Test and the Thematic Apperception Test (TAT).
72
Individual Tests
A test administered to one examinee at a time.
73
Advantages of Individual Tests
• Builds rapport • Allows close observation • Counselor can monitor fatigue, anxiety, and motivation
74
Disadvantages of Individual Tests
They are time-consuming and more costly.
75
Group Tests
A test administered to two or more examinees at the same time.
76
Advantages of Group Tests
• Economical • Efficient administration • Objective scoring • Established norms
77
Disadvantages of Group Tests
• Limited flexibility • Restricted responses • Less opportunity for individual observation
78
Objective Tests
A test with clear correct answers and consistent scoring, minimizing examiner bias.
79
Examples of Objective Tests
Multiple-choice, true/false, and matching questions.
80
Subjective Tests
A test involving open-ended responses that are influenced by examiner and examinee interpretation.
81
Example of Subjective Testing
Essay questions or open-ended responses.
84
What is the primary purpose of assessment in counseling?
To gather systematic information that supports diagnosis, treatment planning, placement, selection, monitoring progress, and evaluating outcomes.
85
How does assessment support diagnosis and treatment planning in counseling?
Assessment helps counselors identify symptoms, evaluate their severity, and determine their impact on functioning, which guides diagnosis and treatment decisions.
86
How is the Beck Depression Inventory–II (BDI-II) used in diagnosis and treatment planning?
It helps determine the severity of depressive symptoms, supports diagnosis of mood disorders, and informs treatment recommendations.
87
Why are diagnostic systems important in counseling assessment?
They provide standardized terminology that allows mental health professionals to communicate clearly about diagnosis and treatment.
88
Which diagnostic system is most widely used in counseling assessment?
The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, published by the American Psychiatric Association.
89
Why is assessment increasingly important in managed care settings?
Managed care organizations often require formal diagnoses and documentation to authorize and reimburse treatment.
90
How is assessment used for placement services in counseling?
Counselors use assessment data to determine the most appropriate program, service, or setting for a client.
91
What assessments might be used to determine classroom placement for a child?
Behavioral records, observations, and individualized achievement tests.
92
How is assessment used for admission purposes?
Assessments help determine eligibility for educational programs or institutions.
93
What is an example of an assessment used for admission decisions?
The GRE, which is often required for postgraduate program admission.
94
How is assessment used for selection purposes?
Assessments are used to select candidates for specific programs, positions, or jobs.
95
What type of assessment might be used to select an auto mechanic?
A battery of mechanical aptitude tests to evaluate suitability for the position.
96
Why is assessment important for monitoring client progress?
It allows counselors to determine whether clients are moving toward counseling goals over time.
97
How can the BDI-II be used to monitor client progress?
It can be administered periodically (e.g., intake, session 3, 5, 7) to track changes in depression severity.
98
What is an example of informal assessment used to monitor progress?
Asking clients to rate symptoms (e.g., depression or anxiety) on a 1–10 scale at each session.
99
Why is client self-report useful in progress monitoring?
It provides ongoing, real-time insight into the client’s subjective experience and symptom changes.
100
What does it mean to evaluate counseling outcomes?
Determining whether counseling interventions are effective overall, not just whether an individual client improves.
101
Why are counselors increasingly required to evaluate outcomes?
To demonstrate accountability and provide evidence that counseling leads to positive client change, especially for managed care.
102
Is assessment limited to the initial counseling session?
No. Assessment is an ongoing process used throughout counseling for diagnosis, monitoring, and evaluation.
103
What are the six primary purposes of assessment in counseling?
1. Diagnosis & treatment planning 2. Placement services 3. Admission decisions 4. Selection decisions 5. Monitoring client progress 6. Evaluating counseling outcomes
104
What does outcome research in counseling evaluate?
Outcome research evaluates the effectiveness of counseling by examining (a) the degree of client change and (b) the factors that contribute to client change.
105
Why is outcome research important for professional counselors?
It helps counselors determine whether counseling works, guides program improvement, and supports accountability to stakeholders and managed care.
106
Which theorist proposed a five-step process for evaluating counseling outcomes?
Whiston (2016).
107
What are examples of evaluation study focuses in outcome research?
A specific counseling service, a particular intervention, or an entire counseling program.
108
What is the most common quantitative design used in outcome research?
A pretest–intervention–posttest design, comparing scores before and after counseling.
109
How does a qualitative evaluation design assess counseling outcomes?
By interviewing participants and analyzing their experiences and narratives about counseling.
110
What are common ways to select participants in outcome research?
• All clients • A random sample • A specific subgroup (e.g., adolescents, women, Latino clients)
111
Why should counselors involve as many participants as feasible in outcome research?
To increase variation in perspectives and improve experimental validity.
112
What types of assessments are commonly used in outcome research?
Established instruments with strong validity and reliability, often symptom-based, measuring change over time.
113
Can counselors create their own assessments for outcome research?
Yes, counselors may develop study-specific surveys when appropriate.
114
How is quantitative data analyzed in outcome research?
By determining whether changes are statistically significant, indicating the intervention was effective.
115
How is qualitative data analyzed in counseling outcome research?
Through coding narratives and transcripts and identifying themes related to client change.
116
What two key questions does outcome research answer for counselors?
1. Do clients change? 2. What factors contribute to that change?
117
How does outcome research support ethical counseling practice?
It promotes evidence-based practice, responsible decision-making, and continuous improvement of services.
118
On exams, how can you recognize an outcome research question?
Look for language about effectiveness, client change over time, pretest–posttest, or program evaluation.
119
According to Whiston (2016), what are the five steps of outcome research in counseling?
1. Define the evaluation study focus (what service, intervention, or program is being evaluated) 2. Determine the evaluation design (e.g., pretest–posttest or qualitative interviews) 3. Select participants (all clients, random sample, or subgroup) 4. Select assessments (valid and reliable instruments or study-specific tools) 5. Analyze data (statistical analysis for quantitative data or thematic analysis for qualitative data)
120
What is the overall purpose of following Whiston’s five-step outcome research process?
To determine whether counseling is effective, assess client change over time, and identify factors that contribute to change, supporting evidence-based practice and accountability.
121
What is the most comprehensive source for commercially available English-language assessments?
The Mental Measurements Yearbook (MMY), published by the Buros Institute of Mental Measurements, is the most comprehensive source for commercially available English-language assessments. It provides: • Test purpose and population • Administration and scoring procedures • Reliability and validity data • Norming information • Pricing and forms • Expert critical reviews (key distinction)
122
Tests in Print (TIP)
Tests in Print (TIP), also published by the Buros Institute of Mental Measurements, provides: • Titles of all published tests • Intended population • Author and publisher • Publication date and acronym ❌ Does NOT include: • Reliability or validity data • Norms • Expert critiques
123
Tests (PRO-ED)
Tests is a quick-reference directory of thousands of assessments that includes: • Purpose and major features • Target population • Administration time • Scoring method • Cost and availability ❌ Does NOT include: • Reliability • Validity • Norms • Expert critiques Used mainly for initial screening and selection of assessments.
124
Test Critiques
Test Critiques, published by PRO-ED, provides: • Detailed descriptions of assessments • Administration and interpretation guidance • Reliability and validity information • In-depth expert reviews (≈8 pages) It is user-friendly, written for both professionals and non-experts, and is updated annually.
125
Validity
Validity refers to how accurately a test measures the construct it claims to measure. It answers the question: 👉 “Does this test measure what it is supposed to measure?”
126
What Validity Is Concerned With
Validity addresses: 1. What the instrument measures 2. How well it measures the construct 3. Whether meaningful inferences can be made from the test scores
127
Validity Depends on
Validity depends on: • Purpose of testing • Population being tested A test may produce valid scores for one group or purpose and invalid scores for another.
128
Population and Purpose Matter
Validity varies depending on the test-taker and the intended use. Example: • An anxiety measure may show high validity for anxious adults • The same test may show low validity for disruptive children Therefore, validity must always be reported relative to the target population and purpose.
129
Exam Caution Card
❌ Saying “This test is valid.” ✅ Correct phrasing: “Scores from this test demonstrate validity for this purpose with this population.”
130
Content Validity
Content validity is the extent to which a test’s items adequately represent all important areas of the construct’s domain. It is established by: • Clearly defining the domain • Ensuring all major content areas are included • Weighting items so more important areas have more items 📌 Judgment-based, not statistical
131
Content Validity Example
A depression test must include items representing: • Physical symptoms (e.g., sleep, appetite) • Psychological symptoms (e.g., sadness, loss of interest) • Cognitive symptoms (e.g., guilt, worthlessness, suicidal thoughts) If psychological symptoms are more central to depression, more items must measure those symptoms.
132
Criterion Validity (General)
Criterion validity is the extent to which test scores are related to an external criterion of performance. It answers: 👉 “Does this test relate to a meaningful real-world outcome?” There are two types: • Concurrent validity • Predictive validity
133
Concurrent Validity
Concurrent validity measures the relationship between: • Test scores now • A criterion measured at the same time 📌 Both collected simultaneously
134
Concurrent Validity Example
Administer a depression test to adults while simultaneously collecting: • Hospital admission data for suicidal ideation If higher depression scores correlate with more admissions, concurrent validity is supported.
135
Predictive Validity
Predictive validity examines how well test scores predict a future outcome. • Test administered now • Criterion measured later 📌 Time delay is the key
136
Predictive Validity Example
Depression scores collected today are correlated with: • Number of hospitalizations for suicidal ideation two years later If scores predict future hospitalizations → predictive validity is supported.
137
Construct Validity
Construct validity is the extent to which a test measures a theoretical construct (abstract concept). It is especially important for constructs like: • Personality • Intelligence • Depression • Anxiety
138
Ways to Establish Construct Validity
Construct validity is supported through: • Experimental design • Factor analysis • Convergent validity • Discriminant (divergent) validity
139
Experimental Design Validity
If a test truly measures a construct: • Scores should change in expected directions after treatment Example: • Depression scores decrease after effective counseling • If not → problem may be the test, design, or treatment
140
Factor Analysis
Factor analysis is a statistical technique that identifies latent (hidden) factors underlying test items. For construct validity: • Subscales must relate to the overall construct • Subscales must be related but not redundant
141
Convergent Validity
Convergent validity exists when a test correlates strongly with other measures of the same construct. 📌 “Measures that should be related are related.”
142
Convergent Validity Example
A new depression test shows a strong positive correlation with: • Beck Depression Inventory–II (BDI-II) This supports convergent validity.
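As a rough numeric sketch of convergent validity, a Pearson correlation can be computed between two sets of scores (all data below are invented for illustration, not real BDI-II values):

```python
# Pearson correlation between a hypothetical new depression test
# and BDI-II scores for the same five clients (made-up data).
def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

new_test = [10, 14, 18, 22, 30]
bdi_ii   = [12, 15, 20, 24, 28]
print(round(pearson_r(new_test, bdi_ii), 2))  # close to +1.0 → convergent evidence
```

A coefficient near +1.0 between two measures of the same construct is exactly the pattern convergent validity requires.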
143
Discriminant (divergent) validity
Discriminant validity is established when a test does NOT correlate with measures of theoretically unrelated constructs.
144
How is discriminant validity demonstrated?
By showing little to no correlation between the test and measures of unrelated traits or constructs.
145
Give an example of discriminant validity using a depression test.
A depression inventory shows no relationship with an achievement test, supporting discriminant validity.
146
What is face validity?
Face validity refers to whether a test appears to measure what it claims to measure.
147
Why is face validity NOT considered true validity?
Because it is: • Superficial • Based on appearance only • Lacks empirical support
148
When does a depression test have face validity?
When the test items look like they measure depression (e.g., sadness, sleep problems, hopelessness).
149
What provides the strongest evidence of validity for an assessment?
Establishing multiple types of validity, including content, criterion, and construct validity.
150
How does convergent validity differ from discriminant validity?
• Convergent validity → related constructs are correlated • Discriminant validity → unrelated constructs are not correlated
151
True or False: Face validity alone is sufficient to establish test validity.
False. Face validity does not provide empirical evidence of accuracy.
152
Which type of validity is most likely tested on the NCE when the question states, 'the test does NOT correlate with unrelated constructs'?
Discriminant validity
153
How is validity typically reported in test manuals and reports?
Validity is reported as a correlation coefficient between test scores and a criterion.
154
What is a validity coefficient?
A validity coefficient is the correlation between a test score and a criterion measure. It indicates how well the test predicts or relates to a specific outcome (criterion), and thus its usefulness. Values range from –1.0 to +1.0; the closer the value is to 1.0, the stronger the validity (e.g., a test score predicting job performance).
155
What does a higher validity coefficient indicate?
A stronger relationship between test scores and the criterion, meaning better predictive accuracy.
156
Besides correlation coefficients, how else can validity be reported?
Validity can be reported using a regression equation.
157
What is the purpose of a regression equation in validity reporting?
To predict a future criterion score from a current test score. A regression equation models the relationship between variables, allowing prediction of a dependent variable (ŷ) from one or more independent variables (x) by finding the best-fit line through the data points: ŷ = b₀ + b₁x, where b₀ is the intercept (where the line crosses the y-axis) and b₁ is the slope (how much ŷ changes per unit change in x).
158
Give an example of using a regression equation in practice.
Predicting a college GPA from a student’s SAT score.
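That GPA prediction can be sketched with a simple regression line ŷ = b₀ + b₁x (the coefficients below are invented placeholders, not from any real SAT/GPA study):

```python
# Hypothetical regression coefficients for illustration only.
B0 = 0.5    # intercept
B1 = 0.002  # slope: predicted GPA points per SAT point

def predict_gpa(sat_score: float) -> float:
    """Predict a criterion score (GPA) from a test score (SAT)."""
    return B0 + B1 * sat_score

print(round(predict_gpa(1200), 2))  # 2.9
```

Because validity is never perfect, any such prediction must be reported with its standard error of estimate.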
159
Why are predictions based on regression equations never 100% accurate?
Because measurement error and imperfect validity are always present.
160
What statistic must be reported along with regression-based predictions?
The standard error of estimate.
161
What is the standard error of estimate?
A statistic that indicates the expected margin of error in a predicted criterion score.
162
What causes the standard error of estimate?
The imperfect validity of the test being used for prediction.
163
What does a smaller standard error of estimate mean?
Predictions are more accurate and closer to actual criterion scores.
164
Conceptually, how is the standard error of estimate calculated?
By examining the squared differences between actual scores and predicted scores, averaged across cases.
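That conceptual calculation can be sketched as follows (the GPA values are made up; note that textbooks often divide by n − 2 rather than n, but this sketch follows the card's "averaged across cases" description):

```python
import math

def standard_error_of_estimate(actual, predicted):
    # Root of the mean squared difference between actual and
    # predicted criterion scores.
    sq = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(sq) / len(sq))

actual_gpa    = [3.0, 2.5, 3.6, 2.9]
predicted_gpa = [2.9, 2.7, 3.4, 3.0]
print(round(standard_error_of_estimate(actual_gpa, predicted_gpa), 3))  # 0.158
```

The smaller this value, the closer predicted criterion scores fall to the actual ones.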
165
On the NCE, if a question mentions 'margin of prediction error,' what is the answer?
Standard error of estimate
166
True or False: A test with perfect validity would have a standard error of estimate of zero.
True (theoretically—but this never occurs in practice).
167
Which two statistics are most commonly associated with predictive validity reports?
• Validity coefficient • Standard error of estimate
168
What is decision accuracy?
The degree to which a test correctly supports counselor decisions about diagnosis, treatment, or placement.
169
Why is decision accuracy important for counselors?
Because counselors make real-world decisions (diagnosis, treatment, placement) that impact client outcomes.
170
What does sensitivity measure?
An instrument’s ability to correctly identify the presence of a condition.
171
Example of sensitivity?
A depression inventory correctly identifies a depressed client as depressed.
172
What does specificity measure?
An instrument’s ability to correctly identify the absence of a condition.
173
Example of specificity?
A depression inventory correctly identifies a non-depressed client as not depressed.
174
What is a false positive error?
When a test incorrectly identifies the presence of a condition.
175
Example of a false positive?
A depression inventory says a non-depressed client is depressed.
176
What is a false negative error?
When a test incorrectly identifies the absence of a condition.
177
Example of a false negative?
A depression inventory says a depressed client is not depressed.
178
Which error is usually more dangerous in mental health screening?
False negatives (missing a real problem).
179
What is efficiency in decision accuracy?
The ratio of total correct decisions to the total number of decisions.
180
In simple terms, what does efficiency tell you?
How accurate the test is overall.
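The sensitivity, specificity, and efficiency cards above can be sketched with a small Python example. The counts below are hypothetical screening results, not data from any real instrument.

```python
# Hypothetical depression-screening results: test decisions vs. actual status.
true_positives = 40   # test says depressed, client is depressed
false_negatives = 10  # test says not depressed, client is depressed
true_negatives = 35   # test says not depressed, client is not depressed
false_positives = 15  # test says depressed, client is not depressed

# Sensitivity: proportion of clients WITH the condition correctly identified.
sensitivity = true_positives / (true_positives + false_negatives)

# Specificity: proportion of clients WITHOUT the condition correctly identified.
specificity = true_negatives / (true_negatives + false_positives)

# Efficiency: total correct decisions over total decisions made.
total = true_positives + false_negatives + true_negatives + false_positives
efficiency = (true_positives + true_negatives) / total

print(sensitivity)  # 0.8
print(specificity)  # 0.7
print(efficiency)   # 0.75
```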
181
What is incremental validity?
The extent to which a test adds predictive power beyond existing information or assessments.
182
Example of incremental validity?
A new aptitude test improves prediction of college GPA beyond SAT scores alone.
183
If a test does not improve prediction beyond existing data, what is its incremental validity?
Low or none.
184
On exams, sensitivity is MOST associated with which phrase?
“Correctly identifies those WITH the condition”.
185
On exams, specificity is MOST associated with which phrase?
“Correctly identifies those WITHOUT the condition”.
186
Which decision accuracy term answers: “Does this test help me make better decisions than before?”
Incremental validity.
187
One-sentence NCE summary of decision accuracy?
Decision accuracy evaluates how well a test correctly identifies, excludes, and improves decisions about client conditions.
188
What is reliability?
The consistency of scores obtained by the same person across repeated test administrations.
189
What question does reliability answer?
“Does the test give consistent results?”
190
What is the difference between a true score and an observed score?
Observed score = true score + error
191
Formula for observed score?
X = T + e (X = observed score, T = true score, e = error)
192
What causes measurement error?
• Instrument problems • Test-taker factors (anxiety, fatigue) • Testing environment (noise, distractions)
193
What is reliability MOST concerned with?
The amount of error in test scores.
194
What is test–retest reliability?
The consistency of scores across time using the same test.
195
Another name for test–retest reliability?
Temporal stability
196
Best use of test–retest reliability?
For stable traits (e.g., intelligence).
197
Major problems with test–retest reliability?
• Memory effects • Practice effects • Longer time intervals lower the correlation
198
What is alternative (parallel) form reliability?
Comparing scores from two equivalent versions of the same test.
199
Advantage of alternative form reliability?
Eliminates memory and practice effects.
200
Disadvantage of alternative form reliability?
True equivalence between forms is hard to achieve.
201
What is internal consistency?
How consistently test items measure the same construct within one administration.
202
What is split-half reliability?
Correlation between two halves of the same test.
203
Main limitation of split-half reliability?
Each half is only half the full test length, which lowers the reliability estimate.
204
Which formula corrects split-half reliability?
The Spearman–Brown Prophecy Formula
205
What does the Spearman–Brown formula do?
Estimates reliability for a full-length test from split-half data.
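The Spearman–Brown correction can be sketched in Python. The split-half value of .60 is hypothetical.

```python
def spearman_brown(part_r, n=2):
    """Estimate reliability of a lengthened test from a part-test correlation.

    n is the factor by which the test is lengthened; n=2 gives the classic
    split-half correction for a full-length test.
    """
    return (n * part_r) / (1 + (n - 1) * part_r)

# A split-half correlation of .60 projects to about .75 for the full test.
print(spearman_brown(0.60))
```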
206
What is inter-item consistency?
Correlation among all test items and the total score.
207
Which reliability formula is used for dichotomous items?
Kuder–Richardson Formula 20 (KR-20)
208
Which reliability formula is used for Likert-type scales?
Cronbach’s alpha
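Cronbach's alpha can be computed by hand from item variances and total-score variance. This is a minimal sketch using a tiny, made-up 3-item Likert-type data set.

```python
# Rows are respondents, columns are item scores on a 3-item scale (made-up data).
scores = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 3],
]

def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

k = len(scores[0])                     # number of items
items = list(zip(*scores))             # per-item score lists
totals = [sum(row) for row in scores]  # per-person total scores

# alpha = (k / (k-1)) * (1 - sum of item variances / variance of total scores)
alpha = (k / (k - 1)) * (1 - sum(variance(i) for i in items) / variance(totals))
print(round(alpha, 3))
```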
209
What is inter-scorer (inter-rater) reliability?
Consistency of scores between two or more raters.
210
When is inter-scorer reliability especially important?
When scoring involves judgment or subjectivity.
211
Example of inter-scorer reliability?
Multiple clinicians independently scoring open-ended responses.
212
How is reliability reported?
As a reliability coefficient (correlation).
213
What does a reliability coefficient close to 1.00 indicate?
High reliability (low error).
214
What does a reliability coefficient below 1.00 indicate?
Presence of measurement error.
215
Typical acceptable reliability range?
.80 – .95
216
Reliability expectations for aptitude/achievement tests?
Usually >.90
217
Reliability expectations for personality inventories?
Can be below .90 and still acceptable.
218
One-sentence NCE summary of reliability?
Reliability reflects the consistency of test scores and freedom from measurement error.
219
Standard Error of Measurement (SEM)
A statistic that estimates how an individual’s repeated test scores are distributed around their true score.
220
Why is the standard error of measurement (SEM) needed?
Because a person’s true score is unknown and all test scores contain measurement error.
221
What does the SEM represent in simple terms?
The standard deviation of an individual’s repeated scores on the same test.
222
Formula for the SEM?
SEM = SD × √(1 − r), where SD = standard deviation and r = reliability coefficient
223
What is the relationship between reliability and the SEM?
An inverse relationship: higher reliability means a smaller SEM.
224
What happens to the SEM if reliability equals 1.00?
SEM equals 0 (no measurement error).
225
How is the SEM commonly reported?
As a confidence interval around the observed score.
226
What percentage of scores fall within ±1 SEM?
68%
227
What percentage of scores fall within ±2 SEM?
95%
228
If a person’s observed score is 93 and the SEM is 2, what is the 68% confidence interval?
91–95
229
If the SEM is 2, what is the 95% confidence interval?
±4 points from the observed score (±2 SEM)
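The SEM formula and its confidence intervals can be sketched in Python. The SD and reliability values are hypothetical, chosen so the SEM comes out near the 2-point example used in the cards.

```python
import math

sd = 10.0        # test standard deviation (assumed)
r = 0.96         # reliability coefficient (assumed)
observed = 93.0  # observed score from the cards' example

sem = sd * math.sqrt(1 - r)                       # SEM = SD * sqrt(1 - r), about 2
ci_68 = (observed - sem, observed + sem)          # +/- 1 SEM covers about 68%
ci_95 = (observed - 2 * sem, observed + 2 * sem)  # +/- 2 SEM covers about 95%

print(round(sem, 2), ci_68, ci_95)
```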
230
How does test length affect reliability?
Longer tests are generally more reliable than shorter tests.
231
How does homogeneity of test items affect reliability?
More homogeneous (similar-content) items increase reliability.
232
What is range restriction, and how does it affect reliability?
A limited spread of scores in the sample; it lowers reliability estimates.
233
How does heterogeneity of the test group affect reliability?
More diverse test-takers increase reliability estimates.
234
Why do speed tests often show artificially high reliability?
Because test-takers answer nearly all attempted items correctly, which inflates consistency estimates.
235
Can a test be reliable but not valid?
Yes.
236
Can a test be valid but not reliable?
No.
237
One-sentence rule for validity and reliability?
Valid test scores are always reliable, but reliable scores are not always valid.
238
What is item analysis?
A statistical process examining individual test items to evaluate test quality.
239
Why is item analysis used?
To remove confusing, too easy, or too difficult items from future tests.
240
What is item difficulty?
The percentage of test-takers who answer an item correctly.
241
How is item difficulty calculated?
Number correct ÷ total test-takers = p value
242
What does an item difficulty p value of .90 indicate?
The item is very easy.
243
What item difficulty p value yields the most score variability?
.50
244
What is item discrimination?
A measure of how well a test item separates high performers from low performers.
245
How is item discrimination calculated?
Performance of top 25% minus performance of bottom 25%.
246
What does positive item discrimination indicate?
More high scorers answer correctly than low scorers.
247
What do zero or negative item discrimination values indicate?
Poor test items that should be revised or removed.
248
NCE one-liner for item discrimination?
Good items separate people who have the trait from those who do not.
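Item difficulty and item discrimination can be sketched in Python. The response data below are made up (1 = correct, 0 = incorrect) for a single hypothetical item.

```python
all_responses = [1, 1, 1, 0, 1, 0, 1, 0, 1, 0]  # every test-taker's result on the item
upper_group = [1, 1, 1, 1]  # item results for the top-scoring group
lower_group = [1, 0, 0, 0]  # item results for the bottom-scoring group

# Item difficulty (p value): proportion of all test-takers answering correctly.
p = sum(all_responses) / len(all_responses)

# Item discrimination (D): upper-group difficulty minus lower-group difficulty.
d = sum(upper_group) / len(upper_group) - sum(lower_group) / len(lower_group)

print(p)  # 0.6
print(d)  # 0.75 -> positive: high scorers pass the item more often than low scorers
```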
249
What is test theory?
A framework holding that psychological constructs must be measurable in quality and quantity in order to be studied empirically.
250
What is the primary goal of test theory?
To reduce test error and improve reliability and validity of scores.
251
What must professional counselors know about test theory?
The major models used to develop, evaluate, and interpret assessment instruments.
252
What is Classical Test Theory (CTT)?
A psychometric theory stating that an observed score equals a true score plus error.
253
Core equation of Classical Test Theory (CTT)?
Observed score = True score + Error
254
Central aim of Classical Test Theory (CTT)?
To increase the reliability of test scores.
255
What aspect of testing does Classical Test Theory (CTT) primarily focus on?
Total test scores rather than individual items.
256
What is Item Response Theory (IRT)?
A modern test theory that uses mathematical models to evaluate individual test items and test performance.
257
Another name for Item Response Theory (IRT)?
Modern test theory
258
What is the primary focus of Item Response Theory (IRT)?
How individual test items function across different levels of ability.
259
Name three uses of Item Response Theory (IRT).
• Detecting item bias • Equating scores across different tests • Tailoring test items to individual test-takers
260
How does Item Response Theory (IRT) detect item bias?
By examining whether items function differently for different groups (e.g., males vs. females).
261
How does Item Response Theory (IRT) differ from Classical Test Theory?
It focuses on individual items rather than total test scores.
262
Who proposed the construct-based validity model?
Samuel Messick (1995)
263
What is the core idea of the construct-based validity model?
Validity is a single, holistic construct—not separate types.
264
How does the construct-based validity model differ from Classical Test Theory?
It rejects separating validity into content, criterion, and construct components.
265
What two major aspects does Messick emphasize in validity?
• Internal structural aspects • External aspects of validity
266
According to Messick, how should validity be understood?
As an integrated evaluation of score meaning and score use.
267
What is a scale?
A group of items combined to produce a composite score on a single variable.
268
What does a scale measure?
A specific construct or variable.
269
What types of variables can scales measure?
Discrete variables and continuous variables.
270
What is the difference between discrete and continuous variables?
• Discrete: distinct, separate categories • Continuous: measured along a range
271
What is quantitative data?
Data represented numerically.
272
What is qualitative data?
Data represented in nonnumeric forms (e.g., Yes/No responses).
273
How are scales typically scored?
By summing or averaging responses across items.
274
Why are scales important in assessment?
They improve measurement reliability and allow constructs to be quantified.
275
NCE one-liner for test theory?
Test theory provides the scientific foundation for reducing error and improving measurement quality.
276
NCE one-liner for Item Response Theory (IRT)?
IRT evaluates how individual test items function across levels of ability.
277
NCE one-liner for Classical Test Theory (CTT)?
CTT focuses on total test scores and increasing reliability by reducing error.
278
NCE one-liner for scales?
Scales combine multiple items to measure a single construct.
279
What are scales of measurement?
Systems used to classify or measure characteristics of data.
280
What are the four scales of measurement?
1. Nominal 2. Ordinal 3. Interval 4. Ratio
281
What is a nominal scale?
A scale that names or categorizes data without order or equal intervals.
282
What does a nominal scale NOT provide?
• Rank order • Equal intervals • Meaningful magnitude
283
Example of a nominal scale variable?
Gender
284
Can numbers be used in a nominal scale?
Yes, but only as labels (e.g., male = 0, female = 1).
285
NCE clue for nominal scale?
“Name only, no order.”
286
What is an ordinal scale?
A scale that ranks data in order but does not assume equal intervals.
287
What does an ordinal scale provide?
Rank order
288
What does an ordinal scale NOT provide?
Equal spacing between values.
289
Common example of an ordinal scale?
Likert-type scale
290
Ordinal example interpretation?
A rating of 4 indicates more satisfaction than 3, but not twice as much.
291
NCE clue for ordinal scale?
“Ranked, unequal spacing.”
292
What is an interval scale?
A scale with rank order and equal intervals but no true zero.
293
Key feature of an interval scale?
Equal distances between points.
294
What is missing from an interval scale?
An absolute zero point.
295
Classic example of an interval scale?
Temperature in Fahrenheit
296
Why can’t ratios be used with interval scales?
Zero does not represent absence of the construct.
297
Common data type for counseling assessments?
Interval scale
298
NCE clue for interval scale?
“Equal intervals, no true zero.”
299
What is a ratio scale?
A scale with rank order, equal intervals, and a true zero.
300
What makes the ratio scale the most advanced?
It includes all properties of nominal, ordinal, and interval scales.
301
What does a true zero mean on a ratio scale?
Complete absence of the measured variable.
302
Example of a ratio scale?
Height
303
Why are ratios meaningful on a ratio scale?
Because zero is absolute (e.g., 6 feet is twice 3 feet).
304
Common fields using ratio scales?
Natural sciences (e.g., weight, time, length).
305
NCE clue for ratio scale?
“Equal intervals + true zero.”
306
Which scale allows meaningful ranking but not equal intervals?
Ordinal scale
307
Which scale allows equal spacing but not true ratios?
Interval scale
308
Which scale allows multiplication and division comparisons?
Ratio scale
309
One-sentence NCE summary of scales of measurement?
Nominal names, ordinal ranks, interval measures equal spacing, and ratio adds a true zero.
310
Likert Scale (Likert-type scale)
A scale commonly used to measure attitudes or opinions using graded response options.
311
Typical format of a Likert scale (Likert-type scale) item
A statement followed by response options ranging from Strongly Disagree to Strongly Agree.
312
What construct is most often measured by a Likert scale (Likert-type scale)
Attitudes or opinions.
313
Example response options for a Likert scale (Likert-type scale)
Strongly Disagree – Disagree – Neutral – Agree – Strongly Agree
314
Measurement level typically associated with a Likert scale (Likert-type scale)
Ordinal scale (often treated as interval in practice).
315
NCE clue for a Likert scale (Likert-type scale)
“Strongly agree to strongly disagree.”
316
Semantic Differential Scale (Self-Anchored scale)
A scale that measures attitudes by asking respondents to rate a concept between two opposite adjectives.
317
What assumption underlies the semantic differential scale (self-anchored scale)
People think dichotomously (in opposites).
318
Typical format of a semantic differential scale (self-anchored scale)
A line or continuum anchored by two opposing adjectives (e.g., Bad — Good).
319
Example of a semantic differential scale (self-anchored scale) item
“How do you feel about your NCE scores?” Bad __________ Good
320
What does the respondent do on a semantic differential scale (self-anchored scale)
Places a mark along the continuum between two adjectives.
321
NCE clue for a semantic differential scale (self-anchored scale)
“Opposite adjectives with a line between them.”
322
Thurstone Scale (Equal-appearing interval scale)
A scale that measures multiple dimensions of an attitude using agree/disagree responses.
323
Key feature of a Thurstone scale (equal-appearing interval scale)
Items are scaled to represent equal-appearing intervals of attitude strength.
324
What type of responses are used in a Thurstone scale (equal-appearing interval scale)
Agree / Disagree
325
What method is associated with a Thurstone scale (equal-appearing interval scale)
Paired comparison method.
326
What does a Thurstone scale (equal-appearing interval scale) attempt to measure
Attitudes across multiple dimensions.
327
NCE clue for a Thurstone scale (equal-appearing interval scale)
“Agree/disagree statements with equal-appearing intervals.”
328
Guttman Scale (Cumulative scale)
A scale designed to measure the intensity or extremity of a variable.
329
How are items arranged in a Guttman scale (cumulative scale)
From least extreme to most extreme.
330
Key principle of a Guttman scale (cumulative scale)
Agreement with an extreme item implies agreement with all previous items.
331
What does the Guttman scale (cumulative scale) measure best
Intensity or strength of an attitude.
332
Example context for a Guttman scale (cumulative scale)
Increasing levels of tolerance or acceptance.
333
NCE clue for a Guttman scale (cumulative scale)
“If you agree with the last item, you agree with all before it.”
334
Which scale uses graded agreement levels
Likert scale (Likert-type scale)
335
Which scale uses opposing adjectives on a continuum
Semantic differential scale (self-anchored scale)
336
Which scale uses agree/disagree with equal-appearing intervals
Thurstone scale (equal-appearing interval scale)
337
Which scale measures intensity through cumulative agreement
Guttman scale (cumulative scale)
338
One-sentence NCE summary of types of scales
Likert scales rate agreement, semantic differential scales rate between opposites, Thurstone scales measure attitudes with equal intervals, and Guttman scales measure intensity cumulatively.
345
What shape does a normal distribution form when graphed?
A bell-shaped curve.
346
What is another name for the bell-shaped curve?
The normal curve (bell curve).
347
What does it mean that a normal curve is symmetrical?
The left and right sides of the curve are mirror images.
348
Where is the highest point of a normal curve located?
At the center of the distribution.
349
Where are the lowest points of a normal curve located?
At the extreme ends (tails) of the distribution.
350
What does it mean that a normal curve is asymptotic?
The tails approach the horizontal axis but never touch it.
351
What does “asymptotic” imply about extreme scores?
Extremely high or low scores are possible but very rare.
352
What statistical concepts characterize a normal distribution?
• Measures of central tendency • Measures of variability
353
Which measures of central tendency apply to a normal distribution?
Mean, median, and mode.
354
Relationship among mean, median, and mode in a normal distribution?
They are equal and located at the center.
355
Why are normal distributions important in assessment?
They provide the mathematical foundation for score interpretation.
356
What is the relationship between normal distributions and derived scores?
Derived scores are based on the mathematical properties of normal distributions.
357
Why are normal distributions essential for comparing test scores?
They allow meaningful comparisons across individuals and tests.
358
What kinds of comparisons do normal distributions allow?
• Comparing different clients on the same test • Comparing one client across multiple tests
359
What type of assessments rely on normal distributions?
Norm-referenced assessments.
360
Which derived scores originate from normal distributions?
• Percentile ranks • Normal curve equivalents • Stanines (standard nines) • z-scores (standard scores expressed in standard deviation units) • T scores (standard scores with a mean of 50 and a standard deviation of 10)
361
Why can derived scores exist only because of normal distributions?
Because normal distributions provide predictable mathematical relationships.
362
NCE cue for normal distributions and test scores?
“Bell curve makes score comparison possible.”
363
One-sentence NCE summary of the normal distribution?
The normal distribution’s symmetry and mathematical properties allow raw scores to be converted into meaningful derived scores.
364
What are norms?
Norms are typical scores or performances used as a comparison standard for evaluating test scores.
365
What is a norm-referenced assessment?
A norm-referenced assessment compares an individual’s score to the average score (mean) of a norm group.
366
What question does a norm-referenced assessment answer?
“How did this person perform compared to others?”
367
What is the reference point in a norm-referenced assessment?
The norm group’s average score (mean).
368
Why are derived scores important in norm-referenced assessment?
They indicate an individual’s relative position within the norm group.
369
What information does knowing a person’s relative position provide?
How well the individual performed compared to peers.
370
Why is a raw score alone insufficient in norm-referenced assessment?
Raw scores lack meaning without comparison to a norm group.
371
Example of interpreting Ivan’s score using norms?
Ivan’s score (67) is above the group mean (63), indicating above-average performance.
372
Examples of norm-referenced college admissions exams?
• GRE (Graduate Record Examination) • SAT (Scholastic Assessment Test) • ACT (American College Testing) • MCAT (Medical College Admission Test) • GMAT (Graduate Management Admission Test)
373
Examples of norm-referenced intelligence tests?
• Stanford–Binet Intelligence Scales • Wechsler intelligence tests
374
Examples of norm-referenced personality inventories?
• MBTI (Myers–Briggs Type Indicator) • CPI (California Psychological Inventory)
375
What is a criterion-referenced assessment?
An assessment that measures performance against specific, predetermined standards or skills (criteria) rather than against other test-takers, showing whether the content has been mastered (e.g., a driver’s test or a state exam).
376
What question does a criterion-referenced assessment answer?
“Did the person meet the standard?”
377
Example of a criterion-referenced assessment?
• Driver’s licensing exams • Professional licensure exams such as the NCE (National Counselor Examination) • High school graduation exams • CPCE (Counselor Preparation Comprehensive Examination)
378
What is an ipsative assessment?
An assessment that compares an individual’s current score to their own previous scores.
379
What type of reference frame does an ipsative assessment use?
An internal (self-referenced) frame of reference.
380
How do norm-referenced and criterion-referenced assessments differ from ipsative assessments?
Norm- and criterion-referenced assessments use external standards, while ipsative assessments use self-comparison.
381
Common settings where ipsative assessments are used?
• Physical education classes • Computer games • Fitness tracking
382
One-sentence NCE summary of assessment reference frames?
Norm-referenced assessments interpret scores by comparing individuals to a norm group, unlike criterion-referenced assessments (standard-based) and ipsative assessments (self-based).
383
What is a percentage score?
A percentage score is the raw score divided by the total number of test items.
384
What does a percentage score tell you?
The number or proportion of test items answered correctly.
385
Ivan answered 67 out of 100 questions correctly. What is his percentage score?
67 percent (67%).
386
Why does a percentage score lack interpretive meaning by itself?
Because it must be compared to a criterion or a norm group to be meaningful.
387
What is a percentile rank (also called a percentile)?
A percentile rank is the percentage of scores that fall at or below a given score in a norm group.
388
What question does a percentile rank answer?
“What percentage of people scored the same as or lower than this person?”
389
How is a percentile rank different from a percentage score?
• A percentage score = percent of items answered correctly • A percentile rank = percent of people scoring at or below a given score
390
What is the possible range of percentile ranks?
Less than 1 to greater than 99.
391
Why can percentile ranks never be 0 or 100?
Because percentile ranks represent the percentage of scores below a given score, and the normal curve is asymptotic.
392
What is the mean (average) percentile rank?
50.
393
Are percentile ranks equal units of measurement?
No, percentile ranks are not equal units of measurement.
394
How do percentile ranks behave near the mean of a normal distribution?
They exaggerate small differences in raw scores near the mean.
395
How do percentile ranks behave at the tails of the distribution?
They minimize differences in raw scores at the extremes.
396
Why are percentile ranks considered ordinal, not interval, data?
Because the distances between percentile ranks are not equal across the scale.
397
What statistical information is needed to calculate a percentile rank using the normal distribution?
• Mean • Standard deviation (SD) • Individual raw score
398
What percentile rank corresponds to +1 standard deviation on a normal curve?
The 84th percentile.
399
How can the 84th percentile be calculated without a table?
• 50% of scores fall below the mean • 34% fall between the mean and +1 standard deviation • 50% + 34% = 84%
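The mean-plus-34% reasoning above generalizes to any z-score via the standard normal cumulative distribution function, which can be sketched in Python without a lookup table.

```python
import math

def percentile_from_z(z):
    """Percentage of the normal curve at or below z (standard normal CDF x 100)."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(percentile_from_z(1.0)))   # 84 -> +1 SD lands near the 84th percentile
print(round(percentile_from_z(0.0)))   # 50 -> the mean is the 50th percentile
print(round(percentile_from_z(-1.0)))  # 16 -> -1 SD lands near the 16th percentile
```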
400
One-sentence NCE summary of percentiles?
Percentile ranks indicate the percentage of scores at or below a given score, are norm-referenced, and are not equal-interval measurements.
402
What does standardization mean in assessment?
Standardization is the process of converting raw scores into standard scores using a norm group.
403
What is the purpose of standardization?
To create a typical (average) score that serves as a reference point for interpreting future test results.
404
What is a standard group (also called a norm group)?
A group of test-takers whose scores are used to establish norms for comparison.
405
Why must the standard group be representative of future test-takers?
Because scores are only meaningful when compared to a relevant and similar population.
406
Example of improper norm comparison?
Comparing a third-grade student to a fifth-grade norm group.
407
What are standardized scores?
Converted raw scores that show how an individual performed relative to a norm group.
408
In what type of assessment are standardized scores used?
Norm-referenced assessments.
409
What do standardized scores indicate statistically?
The number of standard deviations a score is above or below the mean.
410
Why are standardized scores more useful than raw scores?
They allow comparison across different tests and test administrations.
411
What key statistical concept underlies standardized scores?
The standard deviation (SD).
412
What is a z-score (standard score)?
A standardized score indicating how many standard deviations a raw score is from the mean.
413
What is the mean and standard deviation of z-scores?
Mean = 0, standard deviation = 1.
414
What is a T score (standard score)?
A standardized score derived from a z-score with a mean of 50 and a standard deviation of 10.
415
Why are T scores often preferred over z-scores?
They eliminate negative numbers and decimals.
416
What is deviation IQ (deviation intelligence quotient)?
An intelligence score standardized with a mean of 100 and a standard deviation of 15.
417
What does deviation IQ replace historically?
The ratio intelligence quotient (ratio IQ).
418
What is a stanine score (standard nine)?
A standardized score that divides the normal distribution into nine categories.
419
What are the mean and standard deviation of stanine scores?
Mean = 5, standard deviation = 2.
420
What is a normal curve equivalent (NCE) score?
A standardized score with a mean of 50 and a standard deviation of approximately 21.06.
421
What is a key advantage of normal curve equivalent scores?
They have equal-interval properties, unlike percentile ranks.
422
How do standardized scores differ from percentile ranks?
Standardized scores are equal-interval measures; percentile ranks are not.
423
What is the most fundamental standardized score from which others are derived?
The z-score (standard score).
424
One-sentence NCE / CPCE summary of standardized scores?
Standardized scores convert raw scores into equal-interval measures that reflect distance from the mean in standard deviation units.
425
What is a z-score (standard score)?
The most basic type of standardized score that expresses how many standard deviations a raw score is from the mean.
426
What are the mean and standard deviation of a z-score distribution?
Mean = 0; standard deviation = 1.
427
What does a z-score represent conceptually?
The number of standard deviation units a score is above or below the mean.
428
What does a positive z-score indicate?
The raw score is above the mean.
429
What does a negative z-score indicate?
The raw score is below the mean.
430
What does a z-score of 0 indicate?
The raw score is exactly at the mean.
431
Formula for calculating a z-score (standard score)?
z = (X − M) / SD (X = raw score, M = mean, SD = standard deviation)
432
What information is required to calculate a z-score?
The raw score, the mean, and the standard deviation.
433
What does X represent in the z-score formula?
X represents the individual’s raw score.
434
What does M represent in the z-score formula?
M represents the sample mean.
435
What does SD represent in the z-score formula?
SD represents the sample standard deviation.
436
How can z-scores be used with the normal curve (normal distribution)?
They show where a score falls relative to the mean in standard deviation units.
437
What percentile rank corresponds to a z-score of +1?
Approximately the 84th percentile rank.
438
What percentile rank corresponds to a z-score of 0?
The 50th percentile rank.
439
What percentile rank corresponds to a z-score of −1?
Approximately the 16th percentile rank.
440
If a student’s z-score is +1.00, how did they perform relative to peers?
They scored one standard deviation above the mean and above approximately 84% of peers.
441
If a student’s z-score is 0, what does this indicate about performance?
The student scored exactly at the mean and at the 50th percentile rank.
442
Why are z-scores considered the foundation of other derived scores?
Because most other standardized scores are calculated by transforming z-scores.
443
Name standardized scores commonly derived from z-scores.
T scores (standard scores), deviation IQs, stanine scores (standard nines), and normal curve equivalent (NCE) scores.
444
One-sentence NCE / CPCE summary of z-scores?
A z-score expresses a raw score as the number of standard deviations it lies above or below the mean.
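The z-score formula from the cards can be sketched in Python. The raw score, mean, and SD values are hypothetical.

```python
# z = (X - M) / SD, the foundation of the other derived scores.
def z_score(raw, mean, sd):
    return (raw - mean) / sd

# Assumed values: raw score 67 on a test with mean 63 and SD 4.
print(z_score(67, 63, 4))  # 1.0 -> one standard deviation above the mean
print(z_score(63, 63, 4))  # 0.0 -> exactly at the mean
print(z_score(55, 63, 4))  # -2.0 -> two standard deviations below the mean
```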
445
What is a T score (T score, standard score)?
A standardized score with a mean of 50 and a standard deviation of 10.
446
What types of assessments commonly use T scores (T scores, standard scores)?
Personality, interest, and aptitude measures.
447
What are the mean and standard deviation of a T score (T score, standard score)?
Mean = 50; standard deviation = 10.
448
How are T scores (T scores, standard scores) derived?
By transforming a z-score (z-score, standard score).
449
Formula for calculating a T score (T score, standard score)?
T = 10(z) + 50 (z = z-score, standard score)
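The T score formula above as a one-line Python function, with the z-score values used in the surrounding cards:

```python
def t_score(z):
    """Convert a z-score to a T score (mean 50, SD 10)."""
    return 10 * z + 50

print(t_score(1))   # 60 -> one SD above the mean
print(t_score(-2))  # 30 -> two SDs below the mean
```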
450
What does a T score above 50 indicate?
The raw score is above the mean.
451
What does a T score below 50 indicate?
The raw score is below the mean.
452
How many standard deviations above the mean is a T score of 60?
One standard deviation above the mean.
453
What T score corresponds to the mean?
A T score of 50.
454
If a person has a z-score (z-score, standard score) of −2, what is the T score?
T = 10(−2) + 50 = 30.
455
What does a T score of 30 indicate?
The score is two standard deviations below the mean.
456
What approximate percentile rank corresponds to a T score of 30?
Approximately the 2nd percentile rank.
457
One-sentence NCE summary of T scores?
T scores express standardized performance with a mean of 50 and standard deviation of 10 and are commonly used in personality and interest testing.
458
What is a deviation IQ (deviation intelligence quotient)?
A standardized score used primarily in intelligence testing with a mean of 100 and a standard deviation of 15.
459
Why are deviation IQs often called standard scores (standard scores, SS)?
Because they use the same standard-score metric (mean 100, SD 15) commonly used to report achievement and aptitude test results.
460
What are the mean and standard deviation of deviation IQ scores (deviation intelligence quotient scores)?
Mean = 100; standard deviation = 15.
461
Formula for calculating a deviation IQ or standard score (standard score, SS)?
SS = 15(z) + 100 (z = z-score, standard score)
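The deviation IQ formula above follows the same pattern as the T score transformation, just with a mean of 100 and SD of 15:

```python
def deviation_iq(z):
    """Convert a z-score to a deviation IQ / standard score (mean 100, SD 15)."""
    return 15 * z + 100

print(deviation_iq(1))   # 115 -> one SD above the mean
print(deviation_iq(-2))  # 70  -> two SDs below the mean
```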
462
What does a deviation IQ score above 100 indicate?
The raw score is above the mean.
463
What does a deviation IQ score below 100 indicate?
The raw score is below the mean.
464
If a person has a z-score (z-score, standard score) of +1, what is the deviation IQ score?
SS = 15(1) + 100 = 115.
465
How many standard deviations above the mean is a deviation IQ of 115?
One standard deviation above the mean.
466
How are deviation IQ scores interpreted relative to z-scores and T scores?
In the same way—by how many standard deviations they fall above or below the mean.
467
One-sentence NCE summary of deviation IQ scores?
Deviation IQ scores are standardized scores with a mean of 100 and a standard deviation of 15 used primarily in intelligence and achievement testing.
468
What are developmental scores?
Scores that place an individual’s raw score along a developmental continuum to derive meaning.
469
How do developmental scores differ from standard scores?
Developmental scores describe location on a developmental continuum, whereas standard scores transform raw scores onto a scale with a fixed mean and standard deviation.
470
What do developmental scores compare?
An individual’s performance relative to others of the same age or grade level.
471
For which populations are developmental scores most commonly used?
Children and young adolescents.
472
What is an age-equivalent score?
A developmental score that compares an individual’s performance to the average performance of individuals of the same age.
473
How are age-equivalent scores reported?
In chronological years and months.
474
How should an age-equivalent score be interpreted?
As the age at which the average individual earns the same score.
475
Example: What does an age-equivalent score of 8 years 2 months mean for a 7-year-5-month-old child?
The child is performing at the average level of children aged 8 years 2 months.
476
Do age-equivalent scores indicate readiness for advanced placement?
No.
477
What is a grade-equivalent score?
A developmental score that compares an individual’s performance to the average performance of students at a given grade level.
478
How are grade-equivalent scores reported?
As a decimal representing grade level and months completed in that grade.
479
What does a grade-equivalent score of 5.6 mean?
Performance equivalent to the average student who has completed 6 months of fifth grade.
480
Example: A first-grader who has completed 2 months scores a grade equivalent of 1.2. What does this mean?
The student is performing at the mean for her grade level.
481
Are grade-equivalent scores useful for measuring growth over time?
Yes, they can show individual growth from year to year.
482
Do grade-equivalent scores indicate mastery of higher-grade material?
No.
483
Can grade-equivalent scores be used to justify grade skipping or retention?
No.
484
Why is it incorrect to move a student to a higher grade based solely on a high grade-equivalent score?
Because the student was not compared to students in the higher grade and grade-equivalent scores do not analyze specific skills.
485
What does a grade-equivalent score actually tell us?
Where an individual’s score falls relative to peers at the same grade level.
486
Example: What can we correctly conclude about a seventh-grader with a grade-equivalent score of 10.2 in math?
The student is performing higher than most seventh-grade peers in math.
487
What can we NOT conclude about that seventh-grader?
That the student is ready for tenth-grade math.
488
Key limitation of developmental scores on exams?
They are often misinterpreted as indicators of readiness or ability.
489
One-sentence NCE/CPCE summary of developmental scores?
Developmental scores describe where a person falls on an age or grade continuum but do not measure skill mastery or placement readiness.
490
What are survey batteries?
A collection of tests that measure across broad content areas rather than one subject.
491
Primary purpose of survey batteries?
To assess general academic progress.
492
Typical setting for survey batteries?
School settings.
493
Key limitation of survey batteries?
They do not assess any single subject in depth.
494
Stanford Achievement Test, Tenth Edition (SAT-10): purpose?
Measures academic knowledge across multiple subject areas.
495
Iowa Test of Basic Skills (ITBS)
A series of nationally standardized achievement tests for students in kindergarten through eighth grade (and formerly high school), measuring core subjects such as reading, math, science, and language arts against national norms.
496
Iowa Test of Educational Development (ITED) is designed for whom?
High school students.
497
Metropolitan Achievement Test, Eighth Edition (MAT-8): scope?
A broad series of standardized achievement tests spanning kindergarten through twelfth grade, measuring knowledge and skills in core subjects such as reading, math, and language arts, and giving educators and parents data to track progress, identify strengths and weaknesses, and inform instruction.
498
TerraNova, Third Edition: key feature?
Broad-based achievement testing with multiple versions, including Common Core-aligned and Spanish-language editions. Spanish supports (e.g., text-to-speech for math sections) help English Language Learners access content; it is a tool for Spanish speakers within the general assessment framework, not a Spanish language proficiency test.
499
What are diagnostic tests?
Tests designed to identify learning disabilities and specific academic skill deficits.
500
How do diagnostic tests differ from survey batteries?
Diagnostic tests provide in-depth analysis of specific strengths and weaknesses.
501
Wide Range Achievement Test, Fourth Edition (WRAT-4): purpose?
A brief, reliable assessment for ages 5 to 94 that measures foundational academic skills (word decoding, sentence comprehension, spelling, and math computation), used by educators and clinicians to screen for learning disabilities, track progress, and guide interventions; it offers a quick snapshot of basic academic functioning.
502
Key Math Diagnostic Test, Third Edition (KeyMath-3): focus?
Comprehensive assessment of math-related learning disabilities.
503
Woodcock–Johnson IV Tests of Achievement (WJ IV ACH): strength?
Detailed assessment of reading, writing, and math aligned with Individuals with Disabilities Education Act (IDEA) categories.
504
Peabody Individual Achievement Test–Revised (PIAT-R): main use?
Screening for learning disabilities in reading, math, and spelling. A standardized, individually administered test of academic skills (reading, math, spelling, and general knowledge) for students in grades K-12 (or up to age 22 with the Normative Update, PIAT-R/NU); its low-pressure, conversational, multiple-choice format identifies academic strengths and weaknesses to help educators and parents understand overall achievement and specific learning needs.
505
Test of Adult Basic Education (TABE): target population?
Adults aged 16 years and older seeking to improve basic skills.
506
What are readiness tests?
Criterion-referenced achievement tests indicating minimum skills needed to advance to the next level.
507
Common criticism of readiness tests?
Cultural and language bias affecting students from lower socioeconomic status and non-English-speaking homes.
508
Cognitive Abilities Test, Form 6 (CogAT): measures what?
Verbal, quantitative, and nonverbal reasoning abilities.
509
Otis–Lennon School Ability Test, Eighth Edition (OLSAT-8): focus?
Abstract thinking and reasoning abilities.
510
ACT Assessment: purpose?
Predicts readiness for college-level academic work.
511
Scholastic Assessment Test (SAT) Reasoning Test: assesses what?
Critical reading, mathematical reasoning, and writing skills.
512
Graduate Record Examination (GRE) Revised General Test: predicts what?
Graduate school success.
513
Miller Analogies Test (MAT): primary method?
Analogy-based assessment of analytical reasoning.
514
Law School Admission Test (LSAT): assesses which skills?
Reading comprehension, analytical reasoning, and logical reasoning.
515
Medical College Admission Test (MCAT): focus?
Scientific knowledge, problem solving, and critical thinking.
516
What is vocational aptitude testing?
Predictive testing measuring potential for occupational success.
517
Two categories of vocational aptitude tests?
Multiple aptitude tests and special aptitude tests.
518
Armed Services Vocational Aptitude Battery (ASVAB): key feature?
Measures multiple abilities for military and civilian job placement.
519
Differential Aptitude Test, Fifth Edition (DAT): intended population?
Students in grades seven through twelve.
520
Special aptitude tests measure what?
One specific, homogeneous area of aptitude.
521
Stanford–Binet Intelligence Scales, Fifth Edition (SB-5): age range?
Ages 2 through 90 years.
522
Stanford–Binet Intelligence Scales scoring system?
Mean of 100 and standard deviation of 15.
523
Wechsler scales: defining feature?
Most widely used intelligence tests with age-specific versions.
524
Wechsler Adult Intelligence Scale, Fourth Edition (WAIS-IV): age range?
Ages 16 through 89 years.
525
Wechsler Intelligence Scale for Children, Fifth Edition (WISC-V): age range?
Ages 6 through 16 years.
526
Wechsler Preschool and Primary Scale of Intelligence, Fourth Edition (WPPSI-IV): age range?
Ages 2 years 6 months through 7 years 3 months.
527
Kaufman Assessment Battery for Children, Second Edition (KABC-II): special population focus?
Minority children and children with learning disabilities.
528
Kaufman Assessment Battery for Children theoretical models used?
Luria neuropsychological model and Cattell–Horn–Carroll (CHC) theory.
529
One-sentence NCE/CPCE summary of Section 7.4?
Assessment batteries differ by purpose—survey for breadth, diagnostic for depth, readiness for minimum competency, aptitude for prediction, and intelligence for cognitive functioning.
530
No Child Left Behind Act (NCLB) testing
Test results were used to evaluate school progress and accountability.
531
Advanced Placement (AP) exams
Scores can determine college credit and placement.
532
high school exit exams
Passing is required to receive a diploma.
533
driver’s license tests
Passing determines legal permission to drive.
534
professional licensure and certification exams
Passing determines entry into a profession.
535
Key feature #1 of high-stakes testing
A single defined assessment determines the outcome.
536
Key feature #2 of high-stakes testing
A clear pass–fail cutoff score.
537
Key feature #3 of high-stakes testing
Test results have direct, real-world consequences.
538
NCE / CPCE one-sentence summary of high-stakes testing
High-stakes testing uses criterion-referenced standardized tests as the sole basis for major educational or professional decisions with significant consequences.
539
What is clinical assessment?
A “whole person” assessment that evaluates clients using multiple methods such as testing, observation, interviewing, and performance.
540
What is the primary goal of clinical assessment?
To increase client self-awareness and assist counselors with case conceptualization and treatment planning.
541
What domains are typically included in clinical assessment?
• Personality • Behavior • Affect • Cognition • Functioning • Risk (e.g., suicide)
542
Why is clinical assessment considered a “whole person” approach?
Because it integrates multiple data sources rather than relying on a single test.
543
What do personality tests assess?
The affective realm, including stable traits such as temperament and behavior patterns.
544
What aspects of personality are considered stable?
Traits and patterns that remain consistent through adulthood.
545
What are the two major categories of personality tests?
• Objective personality tests • Projective personality tests
546
What are objective personality tests?
Standardized self-report instruments using structured response formats such as multiple-choice or true/false.
547
What are the main purposes of objective personality tests?
• Identify personality traits, types, and states • Assess self-concept • Detect psychopathology • Assist with treatment planning
548
Key characteristic that distinguishes objective personality tests?
They have standardized administration, scoring, and interpretation.
549
What is the Minnesota Multiphasic Personality Inventory–2 (MMPI-2 — Minnesota Multiphasic Personality Inventory–Second Edition)?
A test used to identify adult psychopathology and assist in diagnosis.
550
Key features of the Minnesota Multiphasic Personality Inventory–2 (MMPI-2 — Minnesota Multiphasic Personality Inventory–Second Edition)?
• 567 true/false items • Adult population • 10 clinical scales • Multiple validity scales
551
What do the validity scales on the Minnesota Multiphasic Personality Inventory–2 (MMPI-2 — Minnesota Multiphasic Personality Inventory–Second Edition) measure?
Response distortion such as lying, defensiveness, exaggeration, or inconsistency.
552
What is the Millon Clinical Multiaxial Inventory–Fourth Edition (MCMI-IV — Millon Clinical Multiaxial Inventory–Fourth Edition)?
A test that assesses Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) personality disorders and clinical syndromes in adults.
553
What is the Myers-Briggs Type Indicator (MBTI — Myers-Briggs Type Indicator)?
A personality inventory based on Carl Jung’s psychological types, often used for self-awareness and career counseling.
554
What are the four dimensions measured by the Myers-Briggs Type Indicator (MBTI — Myers-Briggs Type Indicator)?
• Extraversion vs. Introversion • Sensing vs. Intuition • Thinking vs. Feeling • Judging vs. Perceiving
555
What is the California Psychological Inventory–Form 434 (CPI 434 — California Psychological Inventory)?
A measure of normal, nonpathological personality traits.
556
What population is best suited for the California Psychological Inventory–Form 434 (CPI 434 — California Psychological Inventory)?
Well-adjusted individuals; often used for vocational prediction.
557
What is the Sixteen Personality Factors Questionnaire (16PF — Sixteen Personality Factors Questionnaire)?
A test measuring 16 basic personality traits in normal populations, based on Raymond Cattell’s theory.
558
What is the NEO Personality Inventory–Third Edition (NEO PI-3 — NEO Personality Inventory–Third Edition)?
A measure of normal personality based on the Big Five personality traits.
559
What are the Big Five personality traits measured by the NEO Personality Inventory–Third Edition (NEO PI-3 — NEO Personality Inventory–Third Edition)?
• Neuroticism • Extraversion • Openness to experience • Agreeableness • Conscientiousness
560
What is the Coopersmith Self-Esteem Inventory (SEI — Self-Esteem Inventory)?
A test designed to measure self-esteem in children and adolescents.
561
What are projective personality tests?
Assessments that interpret responses to ambiguous stimuli to reveal unconscious thoughts and motivations.
562
What theoretical orientation underlies projective personality tests?
Psychoanalytic theory.
563
Why are projective tests considered indirect measures?
Clients are unaware of what is being assessed, reducing conscious response distortion.
564
What is the Rorschach Inkblot Test?
A projective test using 10 inkblot cards to assess personality and thought processes.
565
What are the three scoring components of the Rorschach Inkblot Test?
• Location • Determinants • Content
566
What is the Thematic Apperception Test (TAT — Thematic Apperception Test)?
A projective test where clients create stories about ambiguous pictures.
567
Major limitation of the Thematic Apperception Test (TAT — Thematic Apperception Test)?
Lack of a standardized scoring system.
568
What is the House-Tree-Person (HTP — House-Tree-Person) test?
A projective drawing technique used to interpret personality characteristics.
569
What are sentence completion tests?
Projective tests requiring clients to complete unfinished statements.
570
Do sentence completion tests have objective scoring systems?
No, interpretation is subjective.
571
One-sentence NCE / CPCE summary of projective personality tests?
Projective tests use ambiguous stimuli to uncover unconscious processes but lack standardization and reliability.