Methods of Self-Report Test Construction
Rational
Factorial
Empirical
Methods of Self-Report Test Construction: Rational
items are written to capture understanding of what a trait is
characteristics:
- tend to be face valid
- susceptible to response biases (easily faked, know what you are being asked)
- may not be internally consistent or valid
ex. I feel blue
ex. I feel manic
Methods of Self-Report Test Construction: Factorial
items are selected on the basis of factor analysis
characteristics:
- highly internally consistent
- tend to be face valid
- somewhat susceptible to response bias
Can be combined with the rational method; the two tend to produce similar questions
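A minimal sketch of how "internally consistent" is usually quantified: Cronbach's alpha, computed from item variances and total-score variance. The function name and the toy data are assumptions for illustration, not part of the lecture material.

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Assumes every respondent answered the same k >= 2 items and that
    total scores are not all identical (otherwise the variance is zero).
    """
    k = len(responses[0])
    items = list(zip(*responses))  # regroup scores item by item
    item_var_sum = sum(pvariance(item) for item in items)
    total_var = pvariance([sum(person) for person in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical data: three respondents, three items that move together.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(consistent)
```

Items that rise and fall together across respondents push alpha toward 1; unrelated items pull it toward 0.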
Methods of Self-Report Test Construction: Empirical
items are selected on their ability to empirically distinguish one group from another
characteristics:
- often have low internal consistency
- often items are not face valid
- may be less susceptible to response biases
Questions are randomly generated; then see which item responses differ between two groups (one group being normal, the other having a diagnosis)
ex. I like blue shoes
ex. the first MMPI
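The selection step described above can be sketched roughly as follows. This is an illustration only: the function names, the true/false response format, and the `min_gap` cutoff are all assumptions, and it is not the procedure actually used to build the MMPI.

```python
def endorsement_rate(group, item):
    """Fraction of respondents in `group` who answered True to `item`."""
    return sum(person[item] for person in group) / len(group)

def select_items(normal, diagnosed, min_gap=0.25):
    """Keep items whose endorsement rates differ by at least `min_gap`.

    Returns {item_index: keyed_direction}, where True means a "true"
    answer is scored toward the diagnosed group. Item content is never
    inspected, which is why selected items need not be face valid.
    """
    keyed = {}
    for item in range(len(normal[0])):
        gap = endorsement_rate(diagnosed, item) - endorsement_rate(normal, item)
        if abs(gap) >= min_gap:
            keyed[item] = gap > 0
    return keyed

# Hypothetical responses: 4 people per group, 3 true/false items.
normal = [[True, False, True], [True, False, True],
          [True, True, True], [True, False, False]]
diagnosed = [[True, True, False], [True, True, False],
             [True, True, True], [True, False, False]]
keyed = select_items(normal, diagnosed)
```

Item 0 is dropped because both groups endorse it equally; items 1 and 2 are kept, keyed in opposite directions.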
Clinical judgment
are we better at understanding people through algorithms (MMPI) or interacting with them?
Dawes, Faust, and Meehl (1989) “Clinical Versus Actuarial Judgment”
AKA USE ALGORITHMS
OVERVIEW: APA Report, June 1998, “Benefits and Costs of Psychological Assessment in Healthcare Delivery”
(specific information in slides to follow)
Clinical judgment is prone to certain kinds of errors
The assessment process provides some checks on these potential errors
Clinical judgment can sometimes be as good as statistical decision rules, but it never exceeds them.
APA Report: Nature of some errors in clinical judgment:
APA Report: The assessment process provides some checks on these potential errors:
APA Report: Clinical judgment can sometimes be as good as statistical decision rules, but it never exceeds them.
BUT there are problems:
SUMMARY of clinical decision making
Clinical intuition is very fallible, but we tend to ignore this fact
Actuarial algorithms are better than clinical judgment
Diversity considerations: Language
In clinical work, use the term diversity (or idiographic) over multicultural, because:
-> therapeutic assessment is a powerful agent of bias reduction and of understanding the person in front of us
Diversity considerations: Privilege
Socially, privilege generally refers to the advantages enjoyed by majority social groups (e.g., white, heterosexual, cisgender people), which no doubt have effects that influence everyone psychologically
But psychologically, privilege may be understood, for example, as the advantages of having parents who made possible a cohesive sense of self and secure attachment status, because these lead to resilience, successful relationships, productive careers, and satisfying lives. This psychological level of our lives captures a universal concern, perhaps a unifying force, and is certainly the focus of clinical work.
Diversity considerations: nomothetic tests
Nomothetic tests/methods can be extremely useful when used wisely
BUT they can be misused. Thoughtless application of tests can:
- unfairly discriminate
- misdiagnose those from cultural groups not captured by the normative group as well as the idiographically different
- unfairly deny opportunities
One of the most compelling reasons FOR testing: to rise above our own biases/limitations
(ACTS AS BIAS MITIGATOR)
Overall diversity considerations
We all have a multitude of unique life influences, and we make sense of them in a myriad of unique ways
> Must consider the individual’s life influences AND what effect they have had on that particular individual
Psychological assessment, especially Collaborative / Therapeutic Assessment, is a powerful agent of bias reduction and of understanding the person in front of us
HEXACO
six-factor model
works across 16 languages (comparable)
Self-report: unidimensional vs multidimensional
Unidimensional:
- used for quick assessment of specific issue
- ex. Beck depression inventory
Multidimensional:
- personality tests
- batteries (contain multiple scales)
- often include validity scales
- ex. MMPI
Things that are difficult to measure accurately with self-report data
Impulsivity
Maturity
Behavior change
Self-report is based only on one’s theory about oneself
(aka people don’t have a good sense of how impulsive or mature they are)
Types of psychometric data
Observational data
Life data (grades in school, felony record, etc.)
Self-report data
Performance-based data
Informant data
- revealing when self-report does not match the informant’s report
Report measures from collateral sources (examples)
Parent/teacher reports
- multidimensional, like CBCL, BASC
- unidimensional, like Conners 3, Beck Depression Inventory
Clinician report
- SWAP (Shedler-Westen Assessment Procedure)
History of the MMPI
Originally developed using empirical methods
Was intended for diagnosis, but turned out to be better for assessing personality
MMPI-2: more personality focused, for clinical settings
MMPI-3: more symptom based (research based, more psychometric), used in hospital and forensic settings
Overview of major scale sets
Clinical or Basic scales
- Harris-Lingoes subscales
- Martin-Finn subscales
- Si Scales
Validity scales
- Ex. L, F, K
Content Scales
- Content Component scales
- Ex. ANX, FRS, DEP
Supplementary scales
- Ex. A, R, etc.
PSY-5 scales (Personality Psychopathology Five)
- Ex. AGGR, PSYC
RC scales
Clinical Scales
Scale 1 or Hypochondriasis scale
Scale 2 or Depression scale
Scale 3 or Hysteria scale
Scale 4 or Psychopathic Deviate scale
Scale 5 or Masculinity-Femininity scale
(added after original development)
Scale 6 or Paranoia scale
Scale 7 or Psychasthenia scale
Scale 8 or Schizophrenia scale
Scale 9 or Hypomania scale
Scale 0 or Social Introversion scale
(added after original development)
Percentile Ranks for Uniform T-scores
Magic number is 65: only 8% of the norm sample scored higher
38 marks the low range
General interpretive guidelines
T-score above 65 generally considered elevated
T-score of 60 to 65 is interpretable on validity and content scales
Do not pay as much attention to low scores, but consider the scale (roughly below 38 was mentioned in lecture as the low range)
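The cutoffs above can be written as a small lookup, as a study aid only. The function name, the `scale_type` labels, and the "within normal limits" label are assumptions added for illustration; real interpretation depends on the specific scale and the full profile.

```python
def interpret_t_score(t, scale_type="clinical"):
    """Rough reading of a T-score using the cutoffs from these notes.

    scale_type is a hypothetical label: "clinical", "validity", or "content".
    """
    if t > 65:
        return "elevated"
    # 60 to 65 is treated as interpretable only on validity and content scales
    if t >= 60 and scale_type in ("validity", "content"):
        return "interpretable"
    if t < 38:
        return "low"  # low scores get less weight; check the specific scale
    return "within normal limits"

examples = [
    interpret_t_score(70),
    interpret_t_score(62, "content"),
    interpret_t_score(62, "clinical"),
    interpret_t_score(35),
]
```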