Lecture 6 Flashcards

Midterm Study (22 cards)

1
Q

Construct

A

Hypothetical factor that cannot be observed directly; its existence is inferred from certain behaviors and assumed to follow from certain circumstances

2
Q

Operational Definition

A

A way of attaching a replicable system of measurement to a construct so that the measurement serves as a faithful PROXY of the construct.

We use different kinds of measures (e.g. questionnaires, tests) to act as faithful proxies of what we really want to measure

3
Q

Reliability

A

Reproducibility of a measurement; the extent to which measures of the same phenomenon are consistent and repeatable; measures that are high in reliability will contain a minimum measurement error.

4
Q

Validity

A

The extent to which a measure of a construct truly measures that construct and not something else.

5
Q

Classical Test Theory

A

AKA “Classical Reliability Theory”
1. True score is constant
2. Error is random
3. Correlation between the true scores and error is 0
4. Errors from different measurement occasions are uncorrelated with each other (because error is assumed to be random)

X (observed score) = TRUE SCORE + ERROR

Random Error and True Score
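The CTT model above can be sketched with a quick simulation (hypothetical numbers): with a constant true score and purely random error, the average of many observed scores converges on the true score.

```python
import random

random.seed(0)
true_score = 50  # assumed constant true score
# Observed score = true score + random error (the CTT model)
observations = [true_score + random.gauss(0, 5) for _ in range(10_000)]
mean_observed = sum(observations) / len(observations)
# With purely random error, the mean observed score lands close to the true score
print(round(mean_observed, 1))
```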

6
Q

Generalizability Theory

A

Error is separated into pieces, each of which can be estimated (if we collect the data properly)
1. Lets us identify which aspect of measurement is producing the most error, so we can understand it better
2. Quickly identifies where the error is coming from and shows how good the measure is
3. Explicitly connects measurement operations to the purpose of measurement

Random Error, True Score, Test-Retest Error, Rater Error, and Other identifiable sources of Error.

7
Q

Temporal Reliability

A

Reproducibility of values of a variable when you measure the same subjects twice or more. A form of reliability in which a test is administered on two separate occasions and the correlation between them is calculated.
1. Test-Retest
2. Parallel Test

8
Q

Test-Retest

A

Administer the same test two or more different times and calculate the correlation between the scores. This helps us understand how stable the test is over time.

9
Q

Parallel Test

A

Giving two different versions of a test that are supposed to assess the same construct (or constructs), then calculating the correlation between the two versions.

  • Minimizes a threat to internal validity: test-takers do not end up remembering items from previous administrations
10
Q

Internal Consistency Reliability

A

Measures the extent to which a measure yields the same score each time it is administered, assuming everything else is equal.
(Reliability of a 10-item test would be higher than that of 5 similar items)

11
Q

Split-half reliability

A

A form of reliability in which one half of the items on a test is correlated with the remaining items.

For example, taking questions from one already-established tool, merging them with those of another established tool, and then determining the reliability.

12
Q

Cronbach’s Alpha

A

The internal reliability within a test, assessed by comparing each item to all other items.

Each item is compared with every other item to ensure that the rating scale is consistent.

Used with Likert-scale type items (e.g., ranging in severity from 1-10)

See formula in notes (can be calculated in R)
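The notes point to R, but the same alpha formula can be sketched in Python on hypothetical Likert data:

```python
from statistics import variance

# Hypothetical Likert responses: rows = respondents, columns = items
data = [
    [4, 5, 4, 4],
    [2, 3, 3, 2],
    [5, 5, 4, 5],
    [1, 2, 2, 1],
    [3, 4, 3, 3],
]

k = len(data[0])                                   # number of items
item_vars = [variance(col) for col in zip(*data)]  # variance of each item
total_var = variance([sum(row) for row in data])   # variance of total scores

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance)
alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
print(round(alpha, 3))
```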

13
Q

Kuder-Richardson coefficient (KR-20)

A

Used for variables that are dichotomous in nature (yes/no or true/false)
Use the KR-20 equation when the items are dichotomous rather than continuous in nature

See notes for formula
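The KR-20 formula from the notes can be sketched in Python on hypothetical right/wrong data:

```python
from statistics import pvariance

# Hypothetical right/wrong responses: rows = test takers, columns = items
data = [
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

k = len(data[0])
n = len(data)
# p = proportion correct per item, q = 1 - p; p*q is each item's variance
pq = [(sum(col) / n) * (1 - sum(col) / n) for col in zip(*data)]
total_var = pvariance([sum(row) for row in data])

# KR-20: k/(k-1) * (1 - sum(p*q) / variance of total scores)
kr20 = k / (k - 1) * (1 - sum(pq) / total_var)
print(round(kr20, 3))
```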

14
Q

Interrater Reliability

A

The reliability of a test’s results across multiple test administrators (that is, the stability of test results for the same test takers but across different test administrators or scorers)

Can be determined by:
(number of agreements) / (number of agreements + disagreements)
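The agreement ratio above, computed for two hypothetical raters:

```python
# Two hypothetical raters coding the same 10 observations
rater1 = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
rater2 = ["yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes", "yes"]

agreements = sum(a == b for a, b in zip(rater1, rater2))
# Percent agreement = agreements / (agreements + disagreements)
percent_agreement = agreements / len(rater1)
print(percent_agreement)  # the raters agree on 8 of 10 observations -> 0.8
```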

15
Q

Interobserver or Interrater Reliability

A
  1. The degree of agreement between two or more observers of the same event
  2. This type of reliability is estimated by having two or more observers watching the same event and independently recording the variables according to a predetermined coding system.
  3. A correlation coefficient is computed to demonstrate the strength of the relationship between one observer’s rating and the other’s.
16
Q

Cohen’s Kappa

A

Used to determine the similarity of diagnoses made by two doctors (agreement between two raters), corrected for chance
WHERE
Po = proportion of observed agreements between the raters
Pe = probability of agreement occurring by chance
K = (Po - Pe) / (1 - Pe)
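A sketch of the kappa formula on hypothetical diagnoses, with Pe computed from the marginal proportions:

```python
from collections import Counter

# Hypothetical diagnoses by two doctors for the same 10 patients
doc1 = ["dep", "anx", "dep", "dep", "anx", "dep", "anx", "dep", "anx", "dep"]
doc2 = ["dep", "anx", "anx", "dep", "anx", "dep", "dep", "dep", "anx", "dep"]

n = len(doc1)
po = sum(a == b for a, b in zip(doc1, doc2)) / n  # Po: observed agreement

# Pe: for each category, multiply the two raters' marginal proportions, then sum
m1, m2 = Counter(doc1), Counter(doc2)
pe = sum((m1[c] / n) * (m2[c] / n) for c in m1)

kappa = (po - pe) / (1 - pe)  # K = (Po - Pe) / (1 - Pe)
print(round(kappa, 3))
```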

17
Q

How do you calculate the probability of agreement by chance?

A

Probability that the raters would agree simply by random chance:
Calculated by taking the marginal (row and column) totals in a contingency table, multiplying each category’s pair of marginal proportions, and summing the products

Cohen’s Kappa is used for nominal or other categorical data

18
Q

Standards of Reliability

A
  • .80 is a good benchmark for research
  • .90 - .95 is better for application (e.g., diagnosis, personnel decisions)
19
Q

Factors that affect reliability

A
  1. Length of test: the more items, the better the reliability
  2. Timing of the test: the faster or less time spent on a test, the less reliable it is
  3. Group heterogeneity/homogeneity: the greater the variance of scores (the more varied the sample), the more reliable the test. VARIANCE IS NEEDED TO HAVE RELIABILITY
  4. Item difficulty: the more the test items vary in difficulty, the more reliable the test
20
Q

Item Response Theory (IRT)

A

Framework that relates a test taker’s performance on specific test items to their underlying abilities or traits. LATENT TRAIT

Examples:
Adaptive Testing: the computerized SAT is adaptive in nature, with questions becoming subsequently harder or easier based on your responses to previous questions

Clinical Assessment: often focused on finding solid diagnostic items, which could include the following questions from the PHQ-9:
-PHQ-9, Item 5: Poor appetite or overeating
-PHQ-9, Item 9: Thoughts that you would be better off dead, or of hurting yourself

21
Q

Item Response Theory: the parameters on which items are characterized include:

A
  1. Their difficulty (known as “location,” for their location on the difficulty range)
  2. Discrimination, representing how steeply the rate of success of individuals varies with their ability
  3. “Pseudo-guessing”: the probability of scoring correct due to guessing (a 33% pure-chance rate on a multiple-choice item with 3 possible responses)

With all three of these parameters, IRT can do a reliable job of estimating ability at the low, middle, and high levels of ability.
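The three-parameter logistic (3PL) model is the standard IRT form that uses all three of these parameters; a minimal sketch with a hypothetical item:

```python
import math

# 3PL IRT model: probability of a correct response given ability theta and
# item parameters a (discrimination), b (difficulty/location),
# c (pseudo-guessing floor)
def p_correct(theta, a, b, c):
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

# Hypothetical item: moderate discrimination, average difficulty,
# guessing floor of 1/3 (three response options)
a, b, c = 1.2, 0.0, 1 / 3

# Probability of success rises with ability but never drops below the guess rate
for theta in (-3, 0, 3):
    print(theta, round(p_correct(theta, a, b, c), 2))
```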