inter-rater reliability
-Applies when judgment must be exercised in scoring responses (e.g., WAIS-IV VCI subtests)
If there are more than two raters, take the mean of the correlations for each pair of raters (A & B, B & C, A & C)
inter-rater reliability: categorical decisions
When there is a finite number of categories to which each person being rated can be assigned
Two methods for assessing: percent agreement and chance-corrected agreement (Cohen's kappa)
inter-rater reliability: percent agreement
how to calculate percent agreement
Two raters independently decide whether an item score should be 0 or 1 for N individuals who complete the item
A = number of times item was scored 0 by both #1 and #2
B = number of times item was scored 0 by #1 and 1 by #2
C = number of times item was scored 1 by #1 and 0 by #2
D = number of times item was scored 1 by both #1 and #2
Percent agreement = percentage of cases for which both raters gave the same score (either both 0 or both 1) = (A+D)/N
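A minimal sketch of the calculation, using the A–D cell counts defined above (the counts in the example are hypothetical):

```python
def percent_agreement(A, B, C, D):
    """Proportion of cases where both raters gave the same score (0 or 1)."""
    N = A + B + C + D            # total number of individuals rated
    return (A + D) / N           # agreements: both scored 0, or both scored 1

# Hypothetical counts for 40 individuals:
# both 0: 18; #1=0, #2=1: 4; #1=1, #2=0: 3; both 1: 15
print(percent_agreement(18, 4, 3, 15))  # (18 + 15) / 40 = 0.825
```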
calculating chance agreement for a score = 0
Total scores of 0 given for Rater #1 = A + B
Total scores of 0 given for Rater #2 = A + C
Proportion of cases given a score of 0 by Rater #1 = (A+B)/N
Proportion of cases given a score of 0 by Rater #2 = (A+C)/N
Chance agreement for a score of 0 =
(A+B)/N times (A+C)/N
calculating chance agreement for score = 1
Total scores of 1 given for Rater #1 = C + D
Total scores of 1 given for Rater #2 = B + D
Proportion of cases given a score of 1 by Rater #1 = (C+D)/N
Proportion of cases given a score of 1 by Rater #2 = (B+D)/N
Chance agreement for a score of 1 =
(C+D)/N times (B+D)/N
calculating total chance agreement
Add the chance agreement for a score of 0 to the chance agreement for a score of 1
(A+B)/N times (A+C)/N
PLUS
(C+D)/N times (B+D)/N
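The chance-agreement steps can be combined in one function, and the same cell counts then give chance-corrected agreement (Cohen's kappa, which these chance-agreement terms feed into). Counts are hypothetical:

```python
def chance_agreement(A, B, C, D):
    """Total chance agreement: P(both 0 by chance) + P(both 1 by chance)."""
    N = A + B + C + D
    p0 = ((A + B) / N) * ((A + C) / N)   # chance agreement for a score of 0
    p1 = ((C + D) / N) * ((B + D) / N)   # chance agreement for a score of 1
    return p0 + p1

def kappa(A, B, C, D):
    """Cohen's kappa: agreement beyond chance, scaled by its maximum."""
    po = (A + D) / (A + B + C + D)       # observed percent agreement
    pc = chance_agreement(A, B, C, D)
    return (po - pc) / (1 - pc)

print(round(chance_agreement(18, 4, 3, 15), 4))  # 0.5025
print(round(kappa(18, 4, 3, 15), 3))             # 0.648
```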
implications of reliability
standard error of measurement
- The SEM permits us to estimate how much error is likely to be present in an individual examinee’s score
SEM in words
Multiply the standard deviation of the test (SD) by the square root of 1 minus the reliability (rtt)
SEM and reliability
The higher the reliability, the smaller the SEM; a perfectly reliable test (rtt = 1) would have an SEM of 0
standard error of measurement according to classical reliability theory
-Error is normally distributed around a mean of 0
-SEM = the standard deviation of the distribution of error scores
Using the probabilities associated with the normal curve
estimating error
We can use the SEM to make probability statements about the amount of error associated with an observed score
NOTE: To do this accurately, we have to use the exact values rather than the “approximate” values we used in Chapter 1 of the Manual
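A sketch of such a probability statement, using the classical formula SEM = SD √(1 − rtt) and the exact normal-curve value 1.96 for 95% (the SD and reliability values here are hypothetical):

```python
import math

def sem(sd, rtt):
    """Standard error of measurement: the SD of the error-score distribution."""
    return sd * math.sqrt(1 - rtt)

# Hypothetical test with SD = 15 and reliability = .90:
s = sem(15, 0.90)                        # ~4.74
# ~95% of the time, measurement error falls within ±1.96 SEM of zero,
# so a ±1.96 SEM band around an observed score of 100 is:
lower, upper = 100 - 1.96 * s, 100 + 1.96 * s
print(round(lower, 1), round(upper, 1))  # 90.7 109.3
```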
confidence intervals for estimated true score
We can also construct confidence intervals around the estimated true score
-We can’t know the actual true score, but we can estimate it.
These confidence intervals tell us the range in which the person’s true score is likely to fall with a specified degree of certainty (probability)
These are the CI’s that are given in the table in the WAIS-IV Manual
Step 1. Calculate the estimated true score
Step 2. Calculate the standard error of estimate
Step 3. Calculate the desired confidence interval
estimating the true score formula in words
Step 1. Subtract the Mean (M) from the observed score (Xo)
Step 2. Multiply Step 1 by the reliability of the test (rtt)
Step 3. Add the Mean to Step 2.
standard error of estimate
Standard Error of Estimate (SEE) = SEM times the square root of the reliability (SEE = SEM √rtt)
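The three steps (estimated true score, SEE, confidence interval) can be sketched together, using the classical SEE = SEM × √rtt. M = 100 and SD = 15 mirror WAIS-IV composite scaling; the observed score and reliability are hypothetical:

```python
import math

def true_score_ci(x_obs, mean, sd, rtt, z=1.96):
    """95% confidence interval around the estimated true score."""
    t_hat = mean + rtt * (x_obs - mean)       # Step 1: estimated true score
    sem = sd * math.sqrt(1 - rtt)             # SD of the error distribution
    see = sem * math.sqrt(rtt)                # Step 2: standard error of estimate
    return t_hat - z * see, t_hat + z * see   # Step 3: CI around the estimate

# Observed score of 120 with M = 100, SD = 15, rtt = .90:
lo, hi = true_score_ci(120, 100, 15, 0.90)
print(round(lo, 2), round(hi, 2))  # 109.18 126.82 -- asymmetrical around 120
```

Note that the interval is centered on the estimated true score (118), not the observed score (120), which is why it sits asymmetrically around the obtained score.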
CI’s around estimated score….
…..will sometimes be asymmetrical around the obtained score
Reason: regression towards the mean
difference between estimated true scores and observed scores
GREATER when
-Reliability is LOWER
-Observed Score is farther from the Mean
LESS when
-Reliability is HIGHER
-Observed Score is closer to the Mean
standard error of difference
how to find SED in words
Square the SEM of each score, add the two squared SEMs, and take the square root: SED = √(SEM1² + SEM2²)
The SED will always be larger than the larger of the two SEMs
using SED
Compare the observed difference between two scores to the SED: the difference is considered significant (p < .05) if it exceeds 1.96 times the SED
example of using SED
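A sketch of the usual significance check (the difference must exceed z × SED; the scores and SEMs here are hypothetical, not WAIS-IV table values):

```python
import math

def sed(sem1, sem2):
    """Standard error of the difference between two scores."""
    return math.sqrt(sem1 ** 2 + sem2 ** 2)

def difference_is_significant(score1, score2, sem1, sem2, z=1.96):
    """True if the observed difference exceeds z times the SED (p < .05)."""
    return abs(score1 - score2) > z * sed(sem1, sem2)

# Hypothetical: two index scores of 112 and 101 with SEMs of 3.0 and 4.0
print(sed(3.0, 4.0))                                  # 5.0 -- larger than either SEM
print(difference_is_significant(112, 101, 3.0, 4.0))  # True: 11 > 1.96 * 5.0 = 9.8
```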
SED and WAIS-IV
For most purposes, the differences given in the WAIS-IV Interpretation Manual are sufficient