What is reliability?
The consistency or reproducibility of test scores
- does my test give me the same, accurate measurement each time?
Test score theory
Every person has a true score that we can measure, but no test is free from error
X = T + e (X = observed score, T = true score, e = error)
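A minimal sketch of the X = T + e model in Python; the true score of 50 and the size of the error term are hypothetical numbers chosen for illustration:

```python
import random

random.seed(0)  # reproducible example

def observed_score(true_score, error_sd=2.0):
    # X = T + e: the observed score is the true score plus random error
    return true_score + random.gauss(0, error_sd)

# One person with a true score of 50, tested five times
scores = [observed_score(50) for _ in range(5)]
# Each administration yields a different observed score, but all of
# them scatter around the unchanging true score of 50
```
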
Classical test theory: 4 assumptions
Classical test theory: the domain sampling model
If we construct a test on something, we can’t ask all possible questions
Formula: reliability = variance of true scores / variance of observed scores
As the sample gets larger, estimate is more accurate
Other things can affect performance…
Types of reliability
Test-retest reliability
Source of error in test-retest reliability
time sampling
Issues with test-retest reliability
Can we use it when measuring things like mood, stress, etc.?
Won’t the person’s score increase the 2nd time because of practice effect?
What if we want to measure changes between 1st and 2nd administration?
Can the actual experience of being tested change the thing being tested?
What if some event happens in between the 1st and 2nd administration to change the thing being tested?
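Despite these caveats, the coefficient itself is just a Pearson correlation between the two administrations; a sketch with made-up scores for eight people:

```python
from statistics import mean

def pearson(x, y):
    # Pearson product-moment correlation between two score lists
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Hypothetical scores for the same eight people at two testing times
time1 = [12, 15, 9, 20, 17, 11, 14, 18]
time2 = [13, 14, 10, 19, 18, 12, 13, 17]

r_test_retest = pearson(time1, time2)  # high r = stable scores over time
```
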
Parallel forms reliability
Parallel forms reliability- source of error
item sampling
Parallel forms reliability - ways to change the form of the test
Parallel forms reliability: issues
What if we give the different forms to people at two different times?
Do we give the different forms to the same people, or different people?
What if people work out how to answer the one form from doing the other form?
Difficult to generate a big enough item pool
Internal consistency reliability
Do the different items within one test all measure the same thing to the same extent?
I.e., Are items within a single test highly correlated?
Split-half reliability
Coefficient alpha
Internal consistency reliability: source of error
- internal consistency/reliability of one test administered on one occasion
Split-half reliability
A test is split in half
Each half scored separately
Total scores for each half correlated
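The three steps above, sketched with hypothetical 1/0 (right/wrong) responses for six people on an 8-item test; an odd/even split is one common way to form the halves:

```python
from statistics import mean

def pearson(x, y):
    # Pearson correlation between two score lists
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Hypothetical right/wrong (1/0) responses: 6 people x 8 items
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 1],
]

# 1. Split the test in half (odd- vs even-numbered items)
# 2. Score each half separately
half_a = [sum(person[0::2]) for person in responses]
half_b = [sum(person[1::2]) for person in responses]

# 3. Correlate the total scores of the two halves
r_hh = pearson(half_a, half_b)
```
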
Split-half reliability- advantage
Split-half reliability- disadvantage
-challenging to divide the test into equal halves
Spearman-Brown correction
Solves the problem that each half-test, being shorter, has lower reliability than the full-length test
r(sb) = 2r(hh) / (1 + r(hh))
r(sb) = predicted reliability of the full-length test
r(hh) = reliability of the current half-test (correlation between halves)
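The correction is a one-line function; the half-test correlation of 0.70 below is just an illustrative number:

```python
def spearman_brown(r_hh):
    # r(sb) = 2 * r(hh) / (1 + r(hh)):
    # predicted reliability of the full-length test from the
    # correlation between its two halves
    return 2 * r_hh / (1 + r_hh)

# If the two halves correlate at 0.70, the full-length test is
# predicted to be more reliable than either half alone
r_full = spearman_brown(0.70)  # 1.4 / 1.7, roughly 0.82
```
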
Split-half reliability: issues
Example: we have a test of 20 items, split it in half, and correlate the two halves
This is similar to having 2 tests of 10 items each
The fewer items we have, the lower our reliability
Does it matter how we split? Yes – we will get a different reliability coefficient for each different split
Ideally the halves should be equivalent
Coefficient (Cronbach's) alpha
Takes the average of all possible split-half correlations for a test
alpha = kr / (1 + (k - 1)r)
k = number of items
r = mean inter-item correlation
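This version of the formula (based on the mean inter-item correlation) is easy to compute directly; the 10 items and mean correlation of 0.30 are hypothetical values:

```python
def standardized_alpha(k, r_bar):
    # alpha = k * r / (1 + (k - 1) * r), where r is the mean
    # inter-item correlation and k is the number of items
    return k * r_bar / (1 + (k - 1) * r_bar)

# 10 items whose mean inter-item correlation is 0.30
alpha = standardized_alpha(10, 0.30)  # 3.0 / 3.7, roughly 0.81
```
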
cronbach’s alpha
-number of items
Rapid increase in internal consistency reliability from 2 to 10 items
Steady increase from 11 to 30
Tapers off after about 40 items
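That pattern falls straight out of the alpha formula: holding the mean inter-item correlation fixed (0.25 here, an arbitrary illustrative value) and varying only the number of items k shows the steep early gains and the later flattening.

```python
def standardized_alpha(k, r_bar=0.25):
    # alpha = k * r / (1 + (k - 1) * r)
    return k * r_bar / (1 + (k - 1) * r_bar)

alphas = {k: round(standardized_alpha(k), 2) for k in (2, 10, 30, 40, 60)}
# Gains are large from 2 to 10 items, smaller from 10 to 30,
# and nearly flat beyond about 40 items
```
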
Interpreting Cronbach's alpha