What are missing data?
Observations that could have been made but were not.
How can values be missed?
By design (intentionally), or unintentionally
In randomised controlled trials how are missing values most likely to occur?
In outcome variable(s)
In observational studies, what are the explanatory variables (covariates) just as likely to contain as the outcome measures?
Missing values
Why can’t we simply analyse the observed data using an appropriate analysis method?
There are a number of potential problems:
What is the main issue with missing data?
Bias
What is the best solution to dealing with missing data?
Avoid it
A valid estimator is one that is what?
What is an example of providing a valid estimator?
If there were no missing data, fitting an ANCOVA model to an RCT with a before-after design would provide a valid estimator of the therapy difference.
What is a problem that affects the use of valid estimators?
In the presence of missing data, analysis methods that would provide valid estimators for the complete data do not necessarily provide valid estimators when applied to the observed data.
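A minimal simulation sketch of this point. It uses a plain difference in means rather than the ANCOVA model mentioned above, and the effect size, score scale and dropout rule are all hypothetical assumptions chosen for illustration.

```python
import random
import statistics

random.seed(1)
n_per_arm = 5000
true_difference = -5.0  # assumed therapy effect on a symptom score

control = [random.gauss(50, 10) for _ in range(n_per_arm)]
therapy = [random.gauss(50 + true_difference, 10) for _ in range(n_per_arm)]

# Assumed dropout rule: in the therapy arm only, participants with a
# high (poor) score are lost to follow-up with probability 0.6.
therapy_observed = [y for y in therapy
                    if not (y > 50 and random.random() < 0.6)]

# The same estimator applied to the complete and to the observed data.
full_estimate = statistics.mean(therapy) - statistics.mean(control)
observed_estimate = statistics.mean(therapy_observed) - statistics.mean(control)

print(f"complete data estimate: {full_estimate:.2f}")
print(f"observed data estimate: {observed_estimate:.2f}")
```

Under these assumptions the complete-data estimate lands close to the assumed effect, while the observed-data estimate overstates the benefit because poor outcomes are selectively missing in one arm.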
What issue may still persist even under circumstances where a valid estimator can be obtained from the observed data?
This might not be the most efficient estimator.
What is an example?
As implemented in most software packages, repeated measures (M)ANOVA uses only subjects for which the response has been observed at all time points.
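The loss of information from using only fully observed subjects can be sketched with a small made-up data set. The subject labels and values below are hypothetical, and the filter only mimics listwise deletion; it is not the code of any particular software package.

```python
# Hypothetical repeated-measures data: one entry per subject, one value
# per time point; None marks a missing response.
responses = {
    "s1": [12.0, 11.5, 10.0],
    "s2": [14.0, None, 12.5],
    "s3": [13.0, 12.0, None],
    "s4": [15.5, 14.0, 13.0],
    "s5": [None, 11.0, 10.5],
}

# Listwise deletion: keep only subjects observed at every time point.
complete_cases = {s: v for s, v in responses.items() if None not in v}

# Count the observed values overall and those that survive the deletion.
n_observed = sum(x is not None for v in responses.values() for x in v)
n_used = sum(len(v) for v in complete_cases.values())

print(f"subjects analysed: {len(complete_cases)} of {len(responses)}")
print(f"observations used: {n_used} of {n_observed} observed")
```

Three subjects with a single missing value each are dropped entirely, so half of the values that were actually observed go unused.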
What do we ideally want?
An analysis method that provides valid inferences in the presence of missing data and uses all the available information.
What does the intention to treat (ITT) principle refer to?
A principle of analysis specific to RCTs, stating that all subjects should be analysed as part of the treatment group to which they were originally assigned, irrespective of the level of treatment received and protocol adherence.
What is the purpose of the intention to treat principle?
To maintain the benefits of randomisation, that is, to avoid confounding of the group effect (selection bias).
What leads to a departure from the ITT principle which can introduce selection bias?
Missing values
Example:
* Less chronically mentally ill patients may be less likely to adhere to intensive management and are then more likely to be lost to follow-up in the intensive management group.
What is generalisability?
Extent to which study results apply to the target population.
What can missing data affect?
The generalisability of the results from a trial or an observational study.
What is an example from an RCT?
Suppose the most severely ill were most likely to be lost to follow-up (in both randomisation groups).
Then the observed results would be representative of a population in which the less severely ill are over-represented.
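The RCT example can be illustrated with a short simulation. The 0 to 10 severity scale and the dropout rule are assumptions made up for this sketch, not figures from any trial.

```python
import random
import statistics

random.seed(2)

# Hypothetical severity scores on a 0-10 scale for everyone randomised.
severity = [random.uniform(0, 10) for _ in range(10_000)]


def dropout_probability(s):
    # Assumed rule: the chance of loss to follow-up rises with severity,
    # from 0.1 at severity 0 to 0.7 at severity 10, in both groups alike.
    return 0.1 + 0.06 * s


followed_up = [s for s in severity if random.random() >= dropout_probability(s)]

mean_all = statistics.mean(severity)
mean_followed_up = statistics.mean(followed_up)

print(f"mean severity, all randomised: {mean_all:.2f}")
print(f"mean severity, followed up:    {mean_followed_up:.2f}")
```

The followed-up sample has a lower mean severity than the randomised sample, so results based on it describe a milder population than the one targeted.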
What is the aim of data analysis?
Inference for a target population.
All data analyses are based on model assumptions about what?
* The target population
* The sampling process
What sampling method is typically used?
Random sampling
When data are missing and analyses are based on observed data, further assumptions are being made for what reason?
To describe how the observed data came about
Formally, the missing value generating mechanism is the probability of what?
Missing value pattern given the values taken by the (later observed or missing) observations
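In symbols, the definition can be written as below. The notation is an assumption introduced here and does not come from the notes: R is the vector of missingness indicators (the missing value pattern), r is one particular pattern, and Y_obs and Y_mis are the values that end up observed and missing respectively.

```latex
\Pr\left(R = r \mid Y_{\mathrm{obs}},\, Y_{\mathrm{mis}}\right)
```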
Examples:
* A lab sample is dropped.
* The interviewer overlooks a question by accident.
This mechanism, in which missingness does not depend on the data values (missing completely at random), is also known as uniform non-response.
A complete case analysis, albeit less precise, remains valid.
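A small simulation sketch of the last statement. It assumes a normally distributed outcome and a fixed 40% chance that any value is lost regardless of its size; all of the numbers are hypothetical.

```python
import random
import statistics

random.seed(3)
TRUE_MEAN, SD, N, P_MISSING, REPLICATES = 50.0, 10.0, 200, 0.4, 2000

full_means, complete_case_means = [], []
for _ in range(REPLICATES):
    outcome = [random.gauss(TRUE_MEAN, SD) for _ in range(N)]
    # Each value is lost with the same probability, whatever its value.
    observed = [y for y in outcome if random.random() >= P_MISSING]
    full_means.append(statistics.mean(outcome))
    complete_case_means.append(statistics.mean(observed))

# Average of the estimates (validity) and their spread (precision).
print(f"full data:      mean {statistics.mean(full_means):.2f}, "
      f"sd {statistics.stdev(full_means):.2f}")
print(f"complete cases: mean {statistics.mean(complete_case_means):.2f}, "
      f"sd {statistics.stdev(complete_case_means):.2f}")
```

Across replicates both estimators centre on the assumed true mean, but the complete-case estimates vary more because each one rests on fewer observations.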