What two bits of information should every research paper include in terms of missing information?
2. The procedures used to manage the missing data, including the rationale for using the method selected
What are three patterns of missingness when it comes to missing data?
MCAR - missing completely at random
MAR - missing at random (a dummy variable coding missing vs. nonmissing is used to determine this)
NMAR - not missing at random (when there is a pattern - nonignorable nonresponse)
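The dummy-variable check mentioned under MAR can be sketched as follows. This is a minimal illustration, assuming made-up toy data and hypothetical variable names (`ages`, `incomes`); it only compares group means and does not run a significance test.

```python
from statistics import mean

# Hypothetical toy data: income has missing values (None), age is fully observed.
ages = [23, 25, 31, 38, 44, 52, 58, 61]
incomes = [30, None, 41, None, 55, 60, None, 72]

# Dummy variable: 1 if income is missing for that case, 0 otherwise.
income_missing = [1 if v is None else 0 for v in incomes]

# Compare the fully observed variable across the two groups.
age_when_missing = mean(a for a, d in zip(ages, income_missing) if d == 1)
age_when_observed = mean(a for a, d in zip(ages, income_missing) if d == 0)

print(income_missing)
print(age_when_missing, age_when_observed)
```

If the two group means differed clearly (in practice, judged with a t-test or similar), missingness would be related to an observed variable, which argues against MCAR.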
What is listwise deletion?
Cases with any missing values are deleted from analysis (complete case analysis)
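A minimal sketch of listwise deletion, assuming hypothetical toy records in which `None` marks a missing value (the variable names are made up for illustration):

```python
# Hypothetical toy data: None marks a missing value.
rows = [
    {"age": 25, "income": 40, "score": 3.1},
    {"age": 31, "income": None, "score": 2.7},
    {"age": None, "income": 55, "score": 3.9},
    {"age": 42, "income": 61, "score": 3.4},
]

def listwise_delete(records):
    """Keep only the cases that have no missing value on any variable."""
    return [r for r in records if all(v is not None for v in r.values())]

complete_cases = listwise_delete(rows)
print(len(rows), "cases before,", len(complete_cases), "cases after")
```

Half of the cases are lost here even though only two individual values are missing, which is the main cost of this method.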
What is pairwise deletion?
The maximum amount of available data is retained. Cases are excluded only from the operations that require the data they are missing (available case analysis).
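A minimal sketch of pairwise deletion, assuming the same kind of hypothetical toy records (`None` marks a missing value). Each covariance uses every case that is complete on that particular pair of variables, so the sample size can differ from one statistic to the next.

```python
# Hypothetical toy data: None marks a missing value.
rows = [
    {"age": 25, "income": 40, "score": 3.1},
    {"age": 31, "income": None, "score": 2.7},
    {"age": None, "income": 55, "score": 3.9},
    {"age": 42, "income": 61, "score": 3.4},
]

def pairwise_cov(records, a, b):
    """Covariance of a and b using only cases observed on both variables."""
    pairs = [(r[a], r[b]) for r in records
             if r[a] is not None and r[b] is not None]
    n = len(pairs)
    mean_a = sum(x for x, _ in pairs) / n
    mean_b = sum(y for _, y in pairs) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in pairs) / (n - 1)
    return n, cov

for a, b in [("age", "income"), ("age", "score"), ("income", "score")]:
    n, cov = pairwise_cov(rows, a, b)
    print(a, b, "n =", n, "cov =", round(cov, 3))
```

The age-income covariance is based on 2 cases while the other two are based on 3, which shows why results from pairwise deletion do not all refer to the same sample.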
What is mean substitution?
Missing values are imputed with the mean value of that variable (this method reduces the variance of the variable, which also attenuates covariances that the variable has with other variables).
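The variance reduction can be seen in a small sketch, assuming a hypothetical list of scores in which `None` marks a missing value:

```python
from statistics import mean, variance

# Hypothetical toy variable: None marks a missing value.
values = [2.0, 4.0, None, 6.0, None, 8.0]

observed = [v for v in values if v is not None]
observed_mean = mean(observed)

# Replace every missing value with the mean of the observed values.
imputed = [observed_mean if v is None else v for v in values]

print("variance of observed values:", variance(observed))
print("variance after mean substitution:", variance(imputed))
```

The imputed values sit exactly at the mean, so they add cases without adding any spread, and the sample variance shrinks.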
What is regression substitution?
A regression equation based on the nonmissing data is used to predict expected values for the missing data (it's a best guess, but it produces biases in the variances and covariances).
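A minimal sketch of regression substitution with one predictor, assuming hypothetical toy data where `x` is fully observed and `y` has missing values (`None`). The helper `fit_line` is an illustrative name for an ordinary least-squares fit, not a library function.

```python
# Hypothetical toy data: x is fully observed, y has missing values (None).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.5, None, 7.5, 10.5, None]

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Fit the regression on the complete cases only.
complete = [(a, b) for a, b in zip(x, y) if b is not None]
slope, intercept = fit_line([a for a, _ in complete], [b for _, b in complete])

# Replace each missing y with its predicted value.
y_imputed = [intercept + slope * a if b is None else b for a, b in zip(x, y)]
print(y_imputed)
```

Every imputed value falls exactly on the regression line, so the imputed cases show no residual scatter. That is the source of the biased variances and covariances.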
What is the difference between stochastic and nonstochastic imputation methods?
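Stochastic means having a random probability distribution or pattern that may be analysed statistically but may not be predicted precisely. Stochastic imputation methods add a random component to the imputed values; nonstochastic methods do not, so the same data always produce the same imputed values.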
What is pattern-matching imputation (two types)?
Two types
What is stochastic regression?
A random value is added to each predicted value before it is imputed (this reduces the bias in the variance estimates).
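A minimal sketch of stochastic regression, assuming hypothetical toy data and one common choice for the random value: a draw from a normal distribution with mean zero and the residual standard deviation of the fitted regression. The helper name `fit_line` and the fixed seed are illustrative assumptions.

```python
import math
import random

# Hypothetical toy data: x is fully observed, y has missing values (None).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.5, None, 7.5, 10.5, None]

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

complete = [(a, b) for a, b in zip(x, y) if b is not None]
slope, intercept = fit_line([a for a, _ in complete], [b for _, b in complete])

# Residual standard deviation from the complete cases (n - 2 degrees of freedom).
sse = sum((b - (intercept + slope * a)) ** 2 for a, b in complete)
resid_sd = math.sqrt(sse / (len(complete) - 2))

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Predicted value plus a random normal residual for each missing y.
y_imputed = [
    intercept + slope * a + rng.gauss(0.0, resid_sd) if b is None else b
    for a, b in zip(x, y)
]
print(y_imputed)
```

Because the imputed values now scatter around the regression line instead of sitting on it, the variance of the completed variable is closer to what fully observed data would show.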
What is expectation maximisation (EM)?
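Two steps, repeated until the estimates stop changing: an expectation (E) step, in which the missing values are estimated from the current parameter estimates, and a maximisation (M) step, in which the parameters are re-estimated from the completed data.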
What is maximum likelihood?
Strategies where observed data are used to estimate parameters, which are then used to estimate the missing scores.
What is multiple imputation?
Several imputed data sets are created. The analysis is carried out on each data set, producing a set of parameter estimates. Final results are obtained by averaging the parameter estimates across the multiple analyses. The pooled estimates and their standard errors are then used to construct confidence intervals around the parameter estimates.
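The pooling step can be sketched with Rubin's rules. The estimates and standard errors below are made-up numbers standing in for the results of the same analysis run on five imputed data sets. The 1.96 multiplier is a normal approximation assumed here for simplicity; a full treatment would use a t distribution with adjusted degrees of freedom.

```python
import math
from statistics import mean, variance

# Hypothetical results from m = 5 imputed data sets (made-up numbers).
estimates = [0.50, 0.54, 0.46, 0.52, 0.48]
std_errors = [0.10, 0.10, 0.10, 0.10, 0.10]
m = len(estimates)

# Pooled estimate: the average of the per-data-set estimates.
pooled = mean(estimates)

# Within-imputation variance: average squared standard error.
within = mean(se ** 2 for se in std_errors)

# Between-imputation variance: how much the estimates vary across data sets.
between = variance(estimates)

# Total variance combines both sources of uncertainty.
total = within + (1 + 1 / m) * between
pooled_se = math.sqrt(total)

ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
print(pooled, pooled_se, ci)
```

The pooled standard error is larger than any single-data-set standard error because it also carries the uncertainty due to the missing values.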
What is full information maximum likelihood (FIML)?
It estimates parameters on the basis of the available complete data as well as the implied values of the missing data given the observed data.
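What is the central limit theorem (CLT)?
As the sample size increases, the sampling distribution of the sample mean gets closer to a normal distribution, whatever the shape of the population distribution (provided it has finite variance).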
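A small simulation can illustrate this. The sketch below assumes a strongly skewed population (an exponential distribution with mean 1), a fixed seed, and illustrative helper names (`sample_means`, `skewness`). It draws many samples of size 2 and of size 50 and compares the distributions of their means.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def sample_means(n, replications=2000):
    """Means of many samples of size n from an exponential(1) population."""
    return [mean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(replications)]

def skewness(values):
    """Simple moment-based skewness; 0 for a symmetric distribution."""
    m = mean(values)
    s = stdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

small = sample_means(2)
large = sample_means(50)

print("n = 2:  sd of means", stdev(small), "skewness", skewness(small))
print("n = 50: sd of means", stdev(large), "skewness", skewness(large))
```

With the larger sample size the means cluster more tightly around the population mean and their distribution is much less skewed, that is, closer to normal.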