CAP 2.3: Reliability & Validity (REVISE) Flashcards

(21 cards)

1
Q

What is prevalence?

A

Prevalence = Number of existing cases / Total population

Often expressed as:

  • Percentage (%)
  • Per 1,000 people
  • Per 100,000 people
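The formula can be sketched in a few lines of Python (a minimal illustration; the case counts are hypothetical):

```python
# Minimal sketch of the prevalence formula, with the common reporting units.
# The case counts below are hypothetical.

def prevalence(existing_cases: int, total_population: int) -> float:
    """Proportion of the population with the condition at a given time."""
    return existing_cases / total_population

p = prevalence(50, 1_000)     # 50 existing cases in a population of 1,000
print(f"{p:.1%}")             # percentage: 5.0%
print(p * 1_000)              # per 1,000 people: 50.0
print(p * 100_000)            # per 100,000 people: 5000.0
```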

📊 What Is Prevalence?

🧠 Simple Definition

Prevalence is:

> The proportion of a population that has a disease (or condition) at a specific time.

| Prevalence | Incidence |
| -------------- | --------------------------- |
| Existing cases | New cases |
| Snapshot | New cases over time |
| “How common?” | “How fast is it occurring?” |

🎯 **Types of Prevalence**

1️⃣ *Point Prevalence*

Number of cases at a specific moment in time

Example:

> On 1 January 2026, 50 out of 1,000 people have depression.
> Point prevalence = 5%.

2️⃣ *Period Prevalence*

Number of people who had the condition at any time during a specified period

Example:
> During 2025, 120 out of 1,000 people had at least one depressive episode.
> Period prevalence = 12%.

3️⃣ *Lifetime Prevalence*

Proportion who have ever had the condition in their lifetime.

Example:
> 15% of adults report ever having major depressive disorder.

🩺 Clinical Examples

1️⃣ Major Depressive Disorder

If 80 out of 1,000 people in a community currently meet criteria:

Prevalence = 8%.

This tells us how common depression is right now.

2️⃣ Schizophrenia

If 1 in 100 people have schizophrenia at a given time:

Prevalence = 1%.

3️⃣ Hypertension

If 300 out of 1,000 adults in a clinic have hypertension:

Prevalence = 30%.

🧠 Prevalence vs Incidence (Very Important)

Example

A rare cancer:

  • Few new cases per year (low incidence)
  • People live long with it → many existing cases (high prevalence)

Cholera:

  • Many new cases during outbreak (high incidence)
  • Rapid recovery or death → low prevalence

👶 10-Year-Old Explanation

Imagine 100 kids in a school.

If 10 kids have the flu today:

Prevalence = 10%.

It just tells you:

> “How many kids are sick right now?”

It doesn’t tell you when they got sick.

🧠 Why It Matters Clinically

Prevalence helps with:

  • Healthcare planning
  • Resource allocation
  • Public health policy
  • Understanding disease burden

High prevalence = common condition
Low prevalence = rare condition

⚡ Ultra-Short Exam Summary

Prevalence is the proportion of a population that has a condition at a specific time or during a specified period, reflecting how common the disease is.

| Prevalence | Incidence |
| --------------- | ------------------- |
| Existing cases | New cases |
| Measures burden | Measures risk |
| Snapshot | Has time component |
| “How common?” | “How fast?” |

2
Q

What is incidence?

A

📈 What Is Incidence?

🧠 Simple Definition

Incidence is:

> The number of new cases of a disease that develop in a population over a specified period of time.

It tells you how fast a disease is occurring.

📐 Two Main Types of Incidence

1️⃣ Cumulative Incidence (Risk)

Cumulative incidence = New cases during a time period / Population at risk at start

It answers:

> “What is the probability that someone develops the disease during this time?”

2️⃣ Incidence Rate (Incidence Density)

Incidence rate = New cases / Total person-time at risk

It accounts for different follow-up times.

Used often in cohort studies.
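The two measures can be sketched in Python (the cohort figures are hypothetical):

```python
# Sketch of the two incidence measures; the cohort numbers are hypothetical.

def cumulative_incidence(new_cases: int, population_at_risk: int) -> float:
    """Risk: probability of developing the disease during the period."""
    return new_cases / population_at_risk

def incidence_rate(new_cases: int, person_time_at_risk: float) -> float:
    """Incidence density: new cases per unit of person-time at risk."""
    return new_cases / person_time_at_risk

# 50 new cases among 10,000 people followed for one year
print(cumulative_incidence(50, 10_000))     # 0.005, i.e. 0.5% one-year risk

# Same 50 cases, but follow-up varied; total follow-up was 9,800 person-years
rate = incidence_rate(50, 9_800) * 1_000
print(round(rate, 1))                       # ~5.1 cases per 1,000 person-years
```

The incidence rate uses person-time in the denominator, which is why it handles unequal follow-up in cohort studies.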

🩺 Clinical Examples

1️⃣ First Episode Psychosis

A town has 10,000 people.

Over 1 year, 50 people develop first episode psychosis.

Incidence = 50/10,000 = 0.5%

This tells us the risk of developing psychosis that year.

2️⃣ Postpartum Depression

Among 1,000 new mothers followed for 6 months,
120 develop postpartum depression.

Incidence over 6 months = 12%.

3️⃣ Medication Side Effect

In a clinical trial of 500 patients on a new drug,
25 develop liver toxicity during follow-up.

Incidence = 5%.

This tells us the risk of developing that side effect.

🧠 Incidence vs Prevalence (Very High-Yield)

Example

Rare cancer:

  • Few new cases per year → Low incidence
  • Patients survive long → Higher prevalence

Food poisoning outbreak:

  • Many new cases quickly → High incidence
  • Short duration → Low prevalence at a given time

👶 10-Year-Old Explanation

Imagine 100 kids in a school.

At the start of the week, none have the flu.

During the week, 10 kids catch it.

Incidence = 10%.

It answers:

> “How many NEW kids got sick this week?”

🎯 Why Incidence Matters Clinically

Incidence helps:

  • Identify risk factors
  • Study causes of disease
  • Measure treatment side effects
  • Plan prevention strategies

It is essential in cohort studies.

⚡ Ultra-Short Exam Summary

Incidence is the number of new cases of a disease occurring in a population over a specified period of time and reflects the risk of developing the condition.

| Incidence | Prevalence |
| —————— | ————— |
| New cases | Existing cases |
| Measures risk | Measures burden |
| Has time component | Snapshot |
| “How fast?” | “How common?” |

3
Q

Compare incidence and prevalence.

1) What cases do they focus on? (existing or new)
2) What do they measure?
3) How do they relate to time?
4) What is the clinical question they answer?

A

1) Incidence focuses on new cases; prevalence on existing cases.
2) Incidence measures risk; prevalence measures disease burden.
3) Incidence has a time component; prevalence is a snapshot at a point (or over a period).
4) Incidence answers “How fast is it occurring?”; prevalence answers “How common is it?”
4
Q

What is the crude mortality rate?

A

📐 Formula

Crude mortality rate = Number of deaths in a given time period and place / Mid-period population for the same time period and place

It is usually expressed per:

  • 1,000 people per year
  • or 100,000 people per year

☠️ What Is the Crude Mortality Rate?

🧠 Simple Definition

The crude mortality rate (also called the crude death rate) is:

> The total number of deaths in a population during a specified time period, divided by the total population.

It tells you how many people die in a population overall, without adjusting for age, sex, or other factors.

🩺 Clinical Examples

1️⃣ National Mortality

In a country with 5,000,000 people,
50,000 deaths occur in one year.

50,000 / 5,000,000 = 0.01

Crude mortality rate = 10 per 1,000 per year

This tells us overall death frequency in the population.

2️⃣ Hospital Mortality

In a hospital with 10,000 admissions in a year,
200 patients die.

200 / 10,000 = 0.02

Crude mortality rate = 2%

This gives a general idea of hospital mortality burden.

3️⃣ Psychiatric Cohort Study

You follow 1,000 patients with schizophrenia for 1 year.
15 die during that year.

Crude mortality rate = 15 per 1,000 per year.
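A sketch of the computation, reusing the hypothetical figures from the examples above:

```python
# Sketch: crude mortality rate = deaths / mid-period population, scaled to a
# standard denominator. Figures reuse the hypothetical examples above.

def crude_mortality_rate(deaths: int, mid_period_population: int,
                         per: int = 1_000) -> float:
    return deaths / mid_period_population * per

print(crude_mortality_rate(50_000, 5_000_000))   # 10.0 per 1,000 per year
print(crude_mortality_rate(15, 1_000))           # 15.0 per 1,000 per year
```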

⚠️ Why It’s Called “Crude”

It is called crude because:

It does not adjust for:

  • Age distribution
  • Sex distribution
  • Risk factors

Two countries could have different crude mortality rates simply because one has more elderly people, even if healthcare quality is identical.

🧠 Example of Why Adjustment Matters

Country A:

  • Very old population
  • Higher crude mortality

Country B:

  • Younger population
  • Lower crude mortality

The difference may reflect age structure — not healthcare quality.

That’s why we use age-standardised mortality rates when comparing populations.
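To make the age-structure point concrete, here is a hedged sketch (all rates and population counts invented): two countries share identical age-specific mortality but have different crude rates, and applying both to one standard population removes the difference.

```python
# Sketch of why crude rates can mislead. Both countries share IDENTICAL
# age-specific mortality rates; only their age structures differ.
# All numbers are hypothetical.

age_specific_rate = {"young": 2.0, "old": 50.0}   # deaths per 1,000 per year

def crude_rate(population: dict) -> float:
    """Crude mortality rate (per 1,000) for a given age structure."""
    deaths = sum(population[a] * age_specific_rate[a] / 1_000 for a in population)
    return deaths / sum(population.values()) * 1_000

country_a = {"young": 200_000, "old": 800_000}    # older population
country_b = {"young": 800_000, "old": 200_000}    # younger population

print(round(crude_rate(country_a), 1))   # 40.4 -- looks worse
print(round(crude_rate(country_b), 1))   # 11.6 -- looks better

# Direct standardisation: apply each country's (identical) age-specific rates
# to the SAME standard population -> the difference disappears.
standard = {"young": 500_000, "old": 500_000}
print(round(crude_rate(standard), 1))    # 26.0 for both countries
```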

👶 10-Year-Old Explanation

Imagine a school with 1,000 students.

If 5 students die this year (hypothetically):

Crude mortality rate = 5 per 1,000.

It just answers:

> “How many people died out of everyone?”

It doesn’t ask:

  • How old they were
  • Why they died
  • Whether some groups were at higher risk

It’s just the total.

🎯 Ultra-Short Exam Summary

The crude mortality rate is the total number of deaths in a population during a specified time period divided by the total population, typically expressed per 1,000 or 100,000 people, and does not adjust for demographic differences.

5
Q

Briefly compare crude mortality rate to:
1) Age specific mortality rate
2) Case fatality rate
3) Cause-specific mortality rate
4) Standardised mortality rate
5) Infant mortality rate

A

See table.

6
Q

What is validity? What is reliability? Compare the two.

A

🎯 What Is Validity?

🧠 Simple Definition

Validity means:

> Does the test measure what it is supposed to measure?

If a depression scale claims to measure depression, validity asks:

> “Is it actually measuring depression?”

🩺 Clinical Examples of Validity

1️⃣ Depression Rating Scale

If a questionnaire measures:

  • Low mood
  • Anhedonia
  • Guilt
  • Sleep disturbance

And correlates well with clinical diagnosis → good construct validity.

If it mostly measures anxiety instead → poor validity.

2️⃣ Blood Test for Lithium

If the test truly reflects serum lithium concentration → valid.

If it is affected by unrelated substances → not valid.

3️⃣ Cognitive Screening Tool

If a dementia screening test accurately identifies patients with Alzheimer’s disease → good validity.

If it misses most cases → poor validity.

🔁 What Is Reliability?

🧠 Simple Definition

Reliability means:

> Does the test give consistent results?

If you repeat it under the same conditions, do you get the same answer?

🩺 Clinical Examples of Reliability

1️⃣ Blood Pressure Measurement

If you measure BP three times and get similar readings → reliable.

If readings vary wildly → unreliable.

2️⃣ Psychiatric Interview

If two psychiatrists diagnose the same patient and agree → good inter-rater reliability.

3️⃣ Questionnaire

If a patient completes a depression scale today and tomorrow (with no clinical change) and scores are similar → good test–retest reliability.

📊 Types of Reliability

  • Test–retest reliability → stability over time
  • Inter-rater reliability → agreement between observers
  • Internal consistency → items measure same construct (e.g., Cronbach’s alpha)
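Internal consistency is commonly quantified with Cronbach’s alpha; a minimal stdlib sketch on hypothetical item scores:

```python
from statistics import pvariance

# Sketch: Cronbach's alpha for internal consistency.
# Rows = respondents, columns = questionnaire items (hypothetical scores).

def cronbach_alpha(scores):
    k = len(scores[0])                               # number of items
    items = list(zip(*scores))                       # column-wise item scores
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)

data = [
    [3, 4, 3],
    [2, 2, 1],
    [4, 5, 5],
    [1, 1, 2],
]
print(round(cronbach_alpha(data), 2))   # 0.95 -- items hang together well
```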

📊 Compare Validity vs Reliability

🎯 Key Concept

A test can be:

  • Reliable but not valid
  • But it cannot be valid if it is not reliable

Because:

If results jump around randomly,
they can’t be accurate.

🎯 Classic Dartboard Analogy

🎯 Reliable but not valid:
All darts land tightly grouped — but far from the bullseye.

🎯 Valid and reliable:
All darts land tightly at the bullseye.

🎯 Neither:
Darts scattered everywhere.

🧠 Clinical Scenario Example

Suppose a new depression scale:

  • Gives similar results each time (reliable)
  • But actually measures anxiety more than depression

→ Reliable but not valid.

👶 10-Year-Old Explanation

Validity =
“Are we hitting the right target?”

Reliability =
“Are we hitting the same spot every time?”

If you always hit the wrong target,
you’re reliable — but not valid.

⚡ Ultra-Short Exam Summary

Validity refers to whether a test accurately measures what it is intended to measure, whereas reliability refers to the consistency and reproducibility of the measurement; reliability is necessary but not sufficient for validity.

| Feature | Validity | Reliability |
| —————————— | ———————————– | ———————————– |
| Meaning | Accuracy | Consistency |
| Question | “Are we measuring the right thing?” | “Are we measuring it consistently?” |
| Can be high without the other? | No (must have reliability first) | Yes |
| Example | Correct diagnosis | Same diagnosis repeatedly |

7
Q

What is face/content validity?

A

🎭 What Is Face Validity?

🧠 Simple Definition

Face validity asks:

> On the surface, does this test look like it measures what it’s supposed to measure?

It is about appearance.

It is subjective and based on judgement.

🩺 Clinical Example

A depression questionnaire includes:

  • “I feel sad”
  • “I have lost interest in things”
  • “I feel hopeless”

That looks like a depression test → good face validity.

If a “depression test” asks:

  • “Do you enjoy spicy food?”

It has poor face validity.

📚 What Is Content Validity?

🧠 Clear Definition

Content validity is:

> The degree to which the test’s content is related to what the test is supposed to measure —
i.e., are the criteria used overtly related to the diagnosis?

In other words:

Does the test cover all the important parts of the condition?

🩺 Clinical Examples

1️⃣ Depression Scale

If DSM-5 criteria include:

  • Low mood
  • Anhedonia
  • Sleep disturbance
  • Appetite change
  • Guilt
  • Concentration problems

A test with questions covering all these domains → good content validity.

If it only asks about sleep and appetite → poor content validity.

2️⃣ ADHD Screening Tool

If it includes:

  • Inattention
  • Hyperactivity
  • Impulsivity

Good content validity.

If it only asks about hyperactivity → incomplete.

3️⃣ Cognitive Screening Tool

If dementia criteria include:

  • Memory
  • Executive function
  • Language
  • Visuospatial skills

A screening tool covering all domains → strong content validity.

🧠 Face vs Content Validity — Comparison

(See the table)

👶 10-Year-Old Explanation

Imagine a maths test.

Face validity:

> Does it look like a maths test?

Content validity:

> Does it test addition, subtraction, multiplication, and division —
or just addition?

🧠 Memory Hook

🎭 Face validity = “Does it look right to your face?”

📦 Content validity = “Does it cover the whole contents of the box?”

Think:

Face = appearance
Content = coverage

🎯 Ultra-Short Exam Summary

Face validity refers to whether a test appears to measure the intended construct, while content validity refers to the degree to which the test’s content comprehensively represents what it is supposed to measure — that is, whether the criteria used are overtly related to the diagnosis.

| Feature | Face Validity | Content Validity |
| -------- | --------------------- | ------------------------------------ |
| Focus | Surface appearance | Coverage of all important domains |
| Based on | Subjective impression | Comparison with diagnostic criteria |
| Question | “Does it look right?” | “Does it cover the whole construct?” |

8
Q

What is criterion validity?

A

🏆 What Is Criterion Validity?

🧠 Clear Definition

Criterion validity is:

> The extent to which test scores correlate with another direct and independent measure (criterion) of what the test is supposed to measure.

In simpler terms:

> Does this test agree with a trusted gold standard?

🎯 What Is the “Criterion”?

The criterion is:

  • A well-established, trusted measure
  • Often considered the “gold standard”

Examples:

  • Structured diagnostic interview (e.g., SCID)
  • Laboratory-confirmed test
  • Long-term clinical outcome

🩺 Clinical Examples

1️⃣ New Depression Scale

A new questionnaire is developed.

Researchers compare it with:

  • A psychiatrist’s structured clinical diagnosis

If high scores on the new test strongly correlate with formal diagnosis → good criterion validity.

2️⃣ Cognitive Screening Tool

A new dementia screen is compared with:

  • Comprehensive neuropsychological testing (gold standard)

If the scores strongly match → good criterion validity.

3️⃣ Suicide Risk Tool

A risk scale is developed.

Researchers test whether high scores predict:

  • Future suicide attempts

If it successfully predicts future behaviour → good criterion validity.

📊 Types of Criterion Validity

1️⃣ Concurrent Validity

Measured at the same time.

Example:
New anxiety scale compared with established anxiety inventory on the same day.

2️⃣ Predictive Validity

Predicts future outcomes.

Example:
Low MMSE score predicts dementia diagnosis 5 years later.

👶 10-Year-Old Explanation

Imagine you invent a new maths test.

You want to know:

> Is it any good?

So you compare it to the official national maths exam.

If kids who score high on your test also score high on the official exam — your test is good.

That’s criterion validity.

🧠 Why It Matters Clinically

Criterion validity tells us whether a new tool:

  • Actually works
  • Can replace an existing gold standard
  • Has clinical usefulness

Without it, a test might look good — but not actually measure the real condition.

🧠 Memory Hook

🏆 Criterion = Champion

Ask:

> “Does this test match the champion (gold standard)?”

Or:

Criterion validity =
“Does it agree with the king?”

If it correlates strongly with the gold standard → strong criterion validity.

🎯 Ultra-Short Exam Summary

Criterion validity refers to the extent to which a test’s scores correlate with an independent gold-standard measure of the same construct, demonstrating that the test accurately reflects the intended outcome.

9
Q

What is concurrent validity?

A

⏱️ What Is Concurrent Validity?

🧠 Clear Definition

Concurrent validity is a type of criterion validity.

It:

> Compares the measure being assessed with an external valid yardstick at the same time.

In simple terms:

> Does this new test agree with a trusted gold standard when both are measured now?

🎯 What Is the “Yardstick”?

The yardstick is:

  • An established, valid measure
  • A gold-standard assessment
  • A trusted diagnostic method

Both tests are administered at the same time.

🩺 Clinical Examples

1️⃣ New Depression Scale

A new depression questionnaire is developed.

Researchers give patients:

  • The new questionnaire
  • A structured clinical interview (SCID)

On the same day.

If the scores strongly correlate → good concurrent validity.

2️⃣ New Cognitive Screening Tool

A new short cognitive screen is compared with:

  • Full neuropsychological testing

Administered at the same appointment.

If results match closely → strong concurrent validity.

3️⃣ Anxiety Scale

A new anxiety inventory is compared with:

  • Hamilton Anxiety Rating Scale

Given at the same visit.

Strong correlation → good concurrent validity.

🧠 Why It Matters

Concurrent validity helps determine:

  • Whether a new test can replace a longer or more expensive one
  • Whether it works as well as the gold standard
  • Whether it is clinically useful right now

👶 10-Year-Old Explanation

Imagine you invent a new thermometer 🌡️

To check if it works:

You use your thermometer and a hospital thermometer at the same time.

If they show the same temperature → your thermometer is good.

That’s concurrent validity.

🧠 Memory Hook

⏱️ Concurrent = Current

Think:

> “Are they agreeing right now?”

Or:

📏 Concurrent validity = “Compare to the yardstick now.”

Same time
Same patients
Same condition

🎯 Ultra-Short Exam Summary

Concurrent validity is a type of criterion validity that assesses whether a new measure correlates with an established external gold-standard measure when both are assessed at the same time.

10
Q

What is incremental validity?

A

📈 What Is Incremental Validity?

🧠 Clear Definition

Incremental validity refers to:

> Whether a new test adds meaningful predictive value beyond existing measures.

Put another way:

> It indicates whether the test is superior to other measurements in approaching true validity.

In other words:

> Does this test improve prediction above and beyond what we already have?

🎯 The Core Question

Incremental validity asks:

> If we already have Test A, does adding Test B actually improve accuracy?

If it does → Test B has incremental validity.
If it doesn’t → Test B adds nothing useful.

🩺 Clinical Examples

1️⃣ Suicide Risk Assessment

You already have:

  • Clinical interview
  • History of prior attempts

A new suicide risk scale is developed.

Researchers ask:

> Does this scale predict suicide attempts better than clinical judgement alone?

If it significantly improves prediction → good incremental validity.

If it adds no improvement → poor incremental validity.

2️⃣ Cardiovascular Risk Prediction

You already measure:

  • Age
  • Blood pressure
  • Cholesterol

A new genetic marker is introduced.

Does adding the genetic test improve risk prediction?

If yes → incremental validity demonstrated.

3️⃣ ADHD Diagnosis

You already use:

  • Parent report
  • Teacher report

You add:

  • Neuropsychological testing

If diagnostic accuracy improves significantly → incremental validity present.

🧠 Why It Matters Clinically

Healthcare resources are limited.

If a new test:

  • Is expensive
  • Time-consuming
  • Complex

But does not improve prediction → it’s not worth using.

Incremental validity ensures:

> We are adding something useful — not redundant.

👶 10-Year-Old Explanation

Imagine you’re guessing someone’s test score.

You already know:

  • How much they studied

Then someone tells you:

  • Their favourite colour

Does knowing their favourite colour improve your prediction?

No → no incremental validity.

But if someone tells you:

  • How much sleep they got

And that improves your guess → that adds something useful.

That’s incremental validity.

🧠 Memory Hook

Incremental = “Does it add something?”

Think:

> “Does this test bring extra power?”

Or:

🧩 Incremental validity = “Does this new piece improve the puzzle?”

If it doesn’t improve prediction, it doesn’t earn its place.

🎯 Ultra-Short Exam Summary

Incremental validity refers to the extent to which a new test improves prediction or measurement beyond existing assessments, indicating whether it adds meaningful value in approaching true validity.

11
Q

What is cross-validity?

A

🔁 What Is Cross-Validity?

🧠 Clear Definition

Cross-validity refers to:

> Whether a test that has demonstrated criterion validity in one sample maintains that validity when applied to a different sample.


In plain terms:

> Does this test still work when we try it on new people?

🎯 Why Cross-Validity Matters

A test might work very well in:

  • The original research group
  • A specific hospital
  • A specific country

But:

Does it still work in a different population?

If yes → good cross-validity.
If no → it may have been overfitted.

🩺 Clinical Examples

1️⃣ Suicide Risk Prediction Model

Researchers develop a risk model in Hospital A.

It strongly predicts suicide attempts (good criterion validity).

Now they test it in Hospital B.

If it still predicts well → strong cross-validity.

If it performs poorly → weak cross-validity.

2️⃣ Depression Screening Tool

A new depression scale works well in a university sample.

Researchers then test it in:

  • Older adults
  • Primary care patients
  • Different countries

If it still correlates well with clinical diagnosis → good cross-validity.

3️⃣ Cardiovascular Risk Score

A risk score is developed in a European population.

It is then tested in:

  • Asian populations
  • Indigenous populations
  • Rural settings

If it maintains predictive accuracy → good cross-validity.

🧠 What Problem Does It Prevent?

It prevents:

  • Overfitting
  • Sample-specific bias
  • False confidence in a model

A model might look brilliant — but only in the original dataset.

Cross-validation tests:

> Is it robust?

👶 10-Year-Old Explanation

Imagine you invent a maths test.

You try it on your own class — it works great.

But when you try it in another school, it doesn’t work.

Cross-validity asks:

> “Does it still work in another classroom?”

If yes → it’s a good test.
If not → it only worked by luck.

🧠 Memory Hook

🔁 Cross-validity = “Does it cross over?”

Think:

> “Does the test survive outside its home turf?”

Or:

🏠 Developed here
🌍 Tested elsewhere

If it still works → strong cross-validity.

🎯 Ultra-Short Exam Summary

Cross-validity assesses whether a test that has demonstrated criterion validity in one sample maintains its predictive or diagnostic validity when applied to a different population.

12
Q

What is predictive validity?

A

🔮 What Is Predictive Validity?

🧠 Clear Definition

Predictive validity is a type of criterion validity.

It:

> Asks whether the diagnosis or test score can be used to accurately predict future outcomes or other patient-related events.


In simple terms:

> Does this test tell us what will happen later?

🎯 The Core Idea

You measure something now
And check whether it predicts something in the future.

If it does → strong predictive validity.

🩺 Clinical Examples

1️⃣ MMSE & Dementia

A low MMSE score today predicts:

  • Diagnosis of dementia 5 years later

If early score predicts later outcome → good predictive validity.

2️⃣ Suicide Risk Assessment

A suicide risk scale is administered in the emergency department.

Researchers follow patients for 12 months.

If higher scores predict future suicide attempts → strong predictive validity.

3️⃣ Cardiovascular Risk Score

A risk calculator predicts:

  • 10-year risk of heart attack

If high predicted risk corresponds to actual heart attacks → good predictive validity.

4️⃣ ADHD Diagnosis in Childhood

Children diagnosed with ADHD are followed into adulthood.

If diagnosis predicts:

  • Academic difficulties
  • Occupational impairment

→ strong predictive validity.
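A sketch of how predictive validity can be checked at follow-up: flag patients above a cut-off at baseline, then see who actually developed the outcome (all data invented, including the cut-off of 6):

```python
# Sketch: predictive validity checked by follow-up. Does a high baseline score
# identify the patients who later develop the outcome? Hypothetical data.

baseline_scores = [9, 2, 8, 1, 7, 3, 10, 2]
later_outcome   = [1, 0, 1, 0, 0, 0, 1, 0]   # 1 = event occurred at follow-up

threshold = 6
flagged = [int(s >= threshold) for s in baseline_scores]

tp = sum(1 for f, o in zip(flagged, later_outcome) if f == 1 and o == 1)
fn = sum(1 for f, o in zip(flagged, later_outcome) if f == 0 and o == 1)
fp = sum(1 for f, o in zip(flagged, later_outcome) if f == 1 and o == 0)
tn = sum(1 for f, o in zip(flagged, later_outcome) if f == 0 and o == 0)

print(f"sensitivity = {tp / (tp + fn):.2f}")   # 1.00: every future case flagged
print(f"specificity = {tn / (tn + fp):.2f}")   # 0.80: one false alarm
```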

🧠 Why It Matters Clinically

Predictive validity helps us:

  • Identify high-risk patients
  • Plan interventions
  • Guide treatment decisions
  • Allocate resources

Without predictive validity, a diagnosis may describe the present but not inform the future.

🆚 Predictive vs Concurrent Validity

Concurrent = agreement now
Predictive = prediction later

👶 10-Year-Old Explanation

Imagine a teacher gives a reading test.

If kids who score high today also do well in school next year,

The test has predictive validity.

It means:

> “This test tells us what will happen later.”

🧠 Memory Hook

🔮 Predictive = “Crystal Ball Validity”

Ask:

> “Can this test see the future?”

If today’s score predicts tomorrow’s outcome → predictive validity.

🎯 Ultra-Short Exam Summary

Predictive validity refers to the extent to which a test or diagnosis accurately predicts future outcomes or patient-related events, demonstrating its usefulness in forecasting clinical course.

| Type | Timing of comparison |
| ------------------- | -------------------- |
| Concurrent validity | Same time |
| Predictive validity | Future |

13
Q

What is convergent validity?

A

🔗 What Is Convergent Validity?

🧠 Clear Definition

Convergent validity is established when:

> Measures that are expected to be correlated are indeed found to be associated.

In simple terms:

> If two tests are supposed to measure the same thing, they should agree.

If they do → good convergent validity.

🎯 The Core Idea

If Test A and Test B both measure depression:

Their scores should correlate.

If they don’t → something is wrong.

🩺 Clinical Examples

1️⃣ Depression Scales

A new depression questionnaire is developed.

Researchers compare it with:

  • Hamilton Depression Rating Scale (HAM-D)
  • Beck Depression Inventory (BDI)

If the new test strongly correlates with both → good convergent validity.

2️⃣ Anxiety Measures

A new anxiety scale should correlate with:

  • Physiological arousal (e.g., heart rate under stress)
  • Established anxiety inventories

If correlations are strong → convergent validity demonstrated.

3️⃣ ADHD Assessment

A child ADHD rating scale should correlate with:

  • Teacher reports
  • Parent reports
  • Observed behavioural hyperactivity

If they move together → good convergent validity.

🧠 Why It Matters

Convergent validity helps show:

  • The test measures the intended psychological construct
  • It behaves the way theory predicts
  • It aligns with similar tools

It strengthens construct validity.

🆚 Convergent vs Discriminant Validity

Example:

A depression scale:

  • Should correlate with other depression scales (convergent)
  • Should not strongly correlate with intelligence (discriminant)

👶 10-Year-Old Explanation

Imagine two thermometers measuring temperature.

If both say it’s hot → they agree.

That’s convergent validity.

If one says hot and the other says freezing → something’s wrong.

🧠 Memory Hook

🤝 Convergent = “They Converge.”

Think:

> “If they measure the same thing, they should meet in the middle.”

Or:

🧲 Convergent validity = “Similar things stick together.”

🎯 Ultra-Short Exam Summary

Convergent validity is established when measures that are theoretically expected to correlate are indeed found to be associated, supporting that the test measures the intended construct.

| Convergent | Discriminant |
| —————— | ——————– |
| Should correlate | Should NOT correlate |
| Similar constructs | Different constructs |

14
Q

What is divergent validity?

A

Divergent validity (also called discriminant validity) is another subtype of construct validity.

🚫 What Is Divergent Validity?

🧠 Clear Definition

Divergent validity is demonstrated:

> When measures discriminate successfully between other measures of unrelated constructs.

In simple terms:

> If two tests are supposed to measure different things, they should NOT strongly correlate.

If they don’t correlate → good divergent validity.

🎯 The Core Idea

A test should:

  • Correlate with similar constructs (convergent validity)
  • Not correlate with unrelated constructs (divergent validity)

If it correlates with everything → it’s not specific.

🩺 Clinical Examples

1️⃣ Depression Scale

A depression questionnaire should:

  • Correlate with other depression measures (convergent)

But it should NOT strongly correlate with:

  • IQ tests
  • Height
  • Blood glucose

If it doesn’t correlate with unrelated variables → good divergent validity.

2️⃣ Anxiety Inventory

An anxiety scale should:

  • Correlate with physiological stress markers

But not strongly correlate with:

  • Extroversion
  • Academic ability

If it stays specific → good divergent validity.

3️⃣ ADHD Rating Scale

An ADHD scale should not strongly correlate with:

  • Depression severity
  • Physical fitness

If it clearly distinguishes attention problems from unrelated traits → strong divergent validity.

🧠 Why It Matters

Without divergent validity:

A test might just measure general distress, mood, or response bias.

Divergent validity shows:

> The test is measuring something specific, not everything.

🆚 Convergent vs Divergent

Convergent: measures of similar constructs should correlate.
Divergent: measures of unrelated constructs should not.

👶 10-Year-Old Explanation

Imagine you make a maths test.

It should:

  • Match other maths tests (convergent)

But it should NOT match:

  • A spelling test
  • A swimming test

If it only matches maths and not other subjects → good divergent validity.

🧠 Memory Hook

🚧 Divergent = “Drive Apart.”

Think:

> “Different things should stay different.”

Or:

🧲 Convergent sticks together
🚫 Divergent pushes apart

If unrelated tests don’t correlate → good divergent validity.

🎯 Ultra-Short Exam Summary

Divergent validity (discriminant validity) is demonstrated when a measure does not strongly correlate with measures of unrelated constructs, confirming that it assesses a distinct and specific concept.

| Convergent Validity | Divergent Validity |
| ——————- | ——————– |
| Should correlate | Should NOT correlate |
| Similar constructs | Unrelated constructs |
| Agreement | Separation |

15
Q

What is construct validity?

A


Construct validity is the umbrella that holds many other validity types together.

🧠 What Is Construct Validity?

📌 Clear Definition

Construct validity refers to:

> The extent to which scores on a test accord with one’s theory about what is being tested, including its predicted future consequences.

In simpler terms:

> Does this test behave the way theory says it should?

It relies on both:

  • Convergent validity (correlates with related constructs)
  • Divergent validity (does not correlate with unrelated constructs)

Most psychiatric diagnoses rely heavily on construct validity.

🎯 The Core Idea

A construct is something abstract:

  • Depression
  • Anxiety
  • Intelligence
  • Personality

We can’t directly see them.
So we test whether our measure behaves as theory predicts.

If it does → strong construct validity.

🩺 Clinical Examples

1️⃣ Major Depressive Disorder

Theory says depression should:

  • Correlate with low mood and anhedonia (convergent)
  • Not correlate strongly with IQ (divergent)
  • Predict poorer occupational functioning (future consequence)

If all this happens → good construct validity.

2️⃣ ADHD Diagnosis

Theory predicts ADHD should:

  • Correlate with inattention and hyperactivity scales
  • Not strongly correlate with depression scales
  • Predict academic impairment over time

If this pattern holds → good construct validity.

3️⃣ Anxiety Disorder

Theory predicts anxiety should:

  • Correlate with physiological arousal
  • Not strongly correlate with unrelated personality traits
  • Predict avoidance behaviours in the future

If results align → construct validity supported.

🔬 Why It Matters in Psychiatry

Psychiatric diagnoses:

  • Do not have blood tests
  • Are theoretical constructs

Their validity depends on whether:

  • Symptoms cluster as theory predicts
  • They predict expected outcomes
  • They relate appropriately to other constructs

That’s construct validity.

👶 10-Year-Old Explanation

Imagine you invent something called “brave-ness.”

Your theory says brave kids:

  • Try new things
  • Speak in class
  • Don’t avoid challenges

If your “brave test” scores match those behaviours
and don’t match unrelated things (like shoe size)
and predict future confidence —

Then your test works.

That’s construct validity.

🧠 How It Connects to Other Validity Types

Construct validity depends on:

  • Convergent validity (should correlate)
  • Divergent validity (should not correlate)
  • Predictive validity (future consequences)
  • Sometimes criterion validity

It’s the big framework.

🧠 Memory Hook

🏗️ Construct validity = “Does the theory hold up?”

Think:

> “Does this test build the construct properly?”

Or:

🧠 Construct validity =
“Does it behave like the theory says it should?”

If it aligns with theory and predicts expected outcomes → strong construct validity.

🎯 Ultra-Short Exam Summary

Construct validity refers to the extent to which a test’s scores align with theoretical expectations about the construct being measured, including appropriate correlations with related constructs, lack of correlation with unrelated constructs, and prediction of expected future outcomes.

16
Q

Review this comparison table for different types of validity & review this hierarchy.

A

| Validity Type            | Key Question                                      |
| ------------------------ | ------------------------------------------------- |
| Face                     | Does it look like it measures the construct?      |
| Content                  | Does it cover all aspects of the construct?       |
| Concurrent               | Does it agree with an established measure now?    |
| Predictive               | Does it predict future outcomes?                  |
| Convergent               | Does it correlate with related constructs?        |
| Divergent (discriminant) | Does it NOT correlate with unrelated constructs?  |
| Construct                | Does it behave as theory predicts overall?        |

Hierarchy Tree of Validity

VALIDITY
├── Face Validity (superficial appearance)
├── Content Validity (coverage of construct)
├── Criterion Validity
│   ├── Concurrent Validity
│   └── Predictive Validity
└── Construct Validity (big theoretical umbrella)
    ├── Convergent Validity
    ├── Divergent (Discriminant) Validity
    ├── Predictive relationships
    ├── Incremental Validity
    └── Cross-Validity

17
Q

What is test-retest reliability?

A

🔁 What Is Test–Retest Reliability?

🧠 Clear Definition

Test–retest reliability refers to:

> The degree to which a test produces consistent results when administered to the same people on different occasions.

Or, more formally:

> It is demonstrated when there is a high correlation between scores on the same test given on different occasions.

In simple terms:

> If nothing has changed, the scores should stay similar.

🎯 The Core Idea

You give:

  • The same test
  • To the same people
  • At two different times

If the scores strongly correlate → high test–retest reliability.

If they vary wildly → poor reliability.

🩺 Clinical Examples

1️⃣ Depression Questionnaire

A patient completes a depression scale today.

You repeat it one week later (no clinical change expected).

If scores are similar → good test–retest reliability.

If they swing dramatically → poor reliability.

2️⃣ Cognitive Screening Tool

A stable elderly patient completes the MMSE.

You repeat it two weeks later.

If scores remain similar → reliable test.

3️⃣ Personality Inventory

Personality traits are relatively stable.

If a personality test produces similar scores over time → strong test–retest reliability.

🧠 Why It Matters

If a test is not stable over time:

  • It may reflect random error
  • It may be poorly constructed
  • It may be too sensitive to minor fluctuations

Reliability is necessary before validity.

A test cannot be valid if it is inconsistent.

👶 10-Year-Old Explanation

Imagine you step on a weighing scale today.

It says 40 kg.

You step on it tomorrow (and didn’t eat 10 pizzas 🍕).

It should say about 40 kg again.

If it says 40 kg today and 55 kg tomorrow → the scale is unreliable.

That’s test–retest reliability.

🧠 Memory Hook

🔁 Retest = Repeat Test

Think:

> “If I test it again, do I get the same result?”

Or:

📅 Test today
📅 Test tomorrow
📈 High correlation = reliable

🎯 Ultra-Short Exam Summary

Test–retest reliability refers to the consistency of a measure over time and is demonstrated when there is a high correlation between scores on the same test administered on different occasions.

18
Q

What is alternate-form reliability?

A

🔀 Alternate-form (parallel-form) reliability refers to:

> The degree to which two equivalent versions of the same test produce consistent results when given to the same people.

If scores on Form A correlate highly with scores on Form B → good alternate-form reliability.

19
Q

What is split half reliability?

A

✂️ What Is Split-Half Reliability?

🧠 Clear Definition

Split-half reliability is a measure of internal consistency.

It assesses whether:

> Different parts of the same test produce similar results.

In simple terms:

> If you split the test into two comparable halves, a person’s score on one half should correlate strongly with their score on the other half.

📊 How It Works

  • The test is divided into two comparable halves (e.g., odd vs even questions).
  • A correlation coefficient is calculated between a person’s scores on the two halves.
  • If the correlation is high → strong split-half reliability.

To refine this further:

> Cronbach’s alpha (α) reflects the average correlation between all the items in the test — it is equivalent to the average of all possible split-half coefficients.
It therefore indicates the internal consistency of the test.

So:

Cronbach’s alpha tells us how well all the items in a test “hang together.”

🎯 What Is Internal Consistency?

Internal consistency asks:

> Do all the questions measure the same underlying construct?

If they do → high Cronbach’s alpha.

If questions are unrelated → low alpha.

🩺 Clinical Examples

1️⃣ Depression Questionnaire

A 20-item depression scale includes questions about:

  • Mood
  • Sleep
  • Guilt
  • Energy

If someone scores high on half the items, they should score high on the other half.

If Cronbach’s alpha is high (e.g., α > 0.8) → strong internal consistency.

2️⃣ Anxiety Inventory

If anxiety items correlate strongly with each other → good split-half reliability.

If some items behave completely differently → poor internal consistency.

3️⃣ ADHD Rating Scale

Questions about inattention and hyperactivity should correlate within their domain.

High internal consistency supports reliability.

🧠 Why It Matters

If a test is meant to measure one construct (e.g., depression):

All items should “pull in the same direction.”

If half the items measure anxiety and half measure sleep quality, internal consistency will be low.

👶 10-Year-Old Explanation

Imagine a maths test.

You split it into:

  • First half
  • Second half

If you’re good at maths, you should do well on both halves.

If you score 90% on the first half and 20% on the second half, something’s wrong.

That’s split-half reliability.

🧠 Memory Hook

✂️ Split-Half = “Cut it and Compare.”

Think:

> “If I cut the test in half, do both sides agree?”

Or:

🧵 Cronbach’s alpha =
“Do all the threads weave the same fabric?”

If all items stick together → high internal consistency.

🎯 Ultra-Short Exam Summary

Split-half reliability is a measure of internal consistency in which a test is divided into two comparable halves and the correlation between the two sets of scores is calculated; Cronbach’s alpha reflects the average inter-item correlation and indicates how consistently the test items measure the same construct.

20
Q

What is inter-rater reliability?

A

👥 What Is Inter-Rater Reliability?

🧠 Clear Definition

Inter-rater reliability refers to:

> The degree to which two or more raters give consistent ratings of the same material at roughly the same time.

Or, more formally:

> It is demonstrated when there is a high correlation between the results of two or more raters assessing the same material at roughly the same time.

In simple terms:

> If two clinicians assess the same patient, do they agree?

📊 How Is It Measured?

Inter-rater reliability is often measured using:

🔢 Intraclass Correlation Coefficient (ICC)

  • ICC ranges from 0 to 1
  • ≥ 0.7 is generally considered acceptable
  • Closer to 1 = stronger agreement

ICC measures how similar continuous ratings are between raters; for categorical judgements (e.g. diagnoses), Cohen’s kappa is commonly used instead.

🩺 Clinical Examples

1️⃣ Psychiatric Diagnosis

Two psychiatrists independently interview the same patient.

If both diagnose:

  • Major depressive disorder

→ high inter-rater reliability.

If one diagnoses depression and the other bipolar disorder → poor reliability.

2️⃣ HAM-D Scoring

Two clinicians score the same patient using the Hamilton Depression Rating Scale.

If their scores are very similar → high ICC → good reliability.

3️⃣ Radiology Interpretation

Two radiologists evaluate the same brain MRI.

If both identify the same lesion → strong inter-rater reliability.

If interpretations vary widely → poor reliability.

4️⃣ Risk Assessment

Two clinicians assess suicide risk at the same time.

If both rate risk similarly → strong reliability.

🧠 Why It Matters

In clinical medicine:

  • Different clinicians assess patients
  • Diagnostic criteria require interpretation

If agreement is poor:

  • The diagnosis is unreliable
  • Research findings become questionable
  • Treatment decisions vary unnecessarily

👶 10-Year-Old Explanation

Imagine two teachers marking the same essay.

If both give about the same score → good reliability.

If one gives 90% and the other 40% → poor reliability.

Inter-rater reliability asks:

> “Do different people agree?”

🧠 Memory Hook

👥 Inter-rater = “In agreement?”

Think:

> “If two experts rate it, do they match?”

Or:

🎤 Two judges
📊 Same performance
🤝 High agreement = good reliability

ICC ≥ 0.7 = acceptable teamwork.

🎯 Ultra-Short Exam Summary

Inter-rater reliability refers to the consistency of measurements made by two or more raters assessing the same material at roughly the same time and is commonly measured using the Intraclass Correlation Coefficient (ICC), where values ≥ 0.7 indicate acceptable agreement.
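
One common formulation — the one-way random-effects ICC(1,1) = (MSB − MSW) / (MSB + (k−1)·MSW) — can be computed directly; the HAM-D scores below are invented:

```python
from statistics import mean

# Hypothetical HAM-D scores: 5 patients, each rated by the same 2 clinicians.
ratings = [
    [18, 17],
    [9, 10],
    [24, 22],
    [14, 15],
    [6, 7],
]

n = len(ratings)      # patients (targets)
k = len(ratings[0])   # raters
grand = mean(s for row in ratings for s in row)

# MSB: between-patient mean square; MSW: within-patient mean square.
msb = k * sum((mean(row) - grand) ** 2 for row in ratings) / (n - 1)
msw = sum((s - mean(row)) ** 2 for row in ratings for s in row) / (n * (k - 1))
icc = (msb - msw) / (msb + (k - 1) * msw)
```

Because the two raters agree closely here, the ICC lands well above the 0.7 threshold quoted on the card.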

21
Q

What is intra-rater reliability?

A

🔄 What Is Intra-Rater Reliability?

🧠 Clear Definition

Intra-rater reliability refers to:

> The consistency of measurements made by the same rater when assessing the same material at different times.

In simple terms:

> If I rate something today and rate it again later, do I give the same score?

🎯 The Core Idea

One rater.
Same material.
Different times.

High agreement → good intra-rater reliability.
Large variation → poor reliability.

🩺 Clinical Examples

1️⃣ HAM-D Scoring

A psychiatrist scores a patient’s depression severity.

Later, they review the same recorded interview and rescore it.

If scores are similar → strong intra-rater reliability.

2️⃣ Radiology Review

A radiologist reads a scan today.

Two weeks later, they re-evaluate the same scan.

If their interpretation is consistent → good intra-rater reliability.

3️⃣ Essay Marking

A clinician marks a reflective assignment.

Later, they re-mark it without seeing the previous score.

If the scores match closely → strong intra-rater reliability.

🧠 Why It Matters

If the same assessor gives different ratings over time:

  • Measurement error is present
  • Judgement may be unstable
  • Clinical decisions may vary

Reliable raters should be internally consistent.

👶 10-Year-Old Explanation

Imagine you’re a judge in a talent show.

You watch a performance and give it 8/10.

Next week you watch the exact same performance again.

If you give it 8/10 again → you’re consistent.

If you give it 5/10 → something’s off.

That’s intra-rater reliability.

🧠 Inter vs Intra (Quick Contrast)

Inter = between people
Intra = inside one person

🧠 Memory Hook

🪞 Intra = Inside the same person

Think:

> “Am I consistent with myself?”

Or:

👤 One rater
📅 Two times
📊 Same score = reliable

🎯 Ultra-Short Exam Summary

Intra-rater reliability refers to the degree of consistency when the same assessor evaluates the same material on two or more occasions, indicating stability of that individual’s judgement over time.

| Term        | Meaning                  |
| ----------- | ------------------------ |
| Inter-rater | Between different raters |
| Intra-rater | Within the same rater    |