Characteristics of count data
What are the Count Data Models
Poisson Distribution characteristics
Issues with Poisson
Overdispersion (variance > mean) => handled with robust standard errors (the vce(robust) option in Stata)
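A quick way to see overdispersion is to compare the raw mean and variance of the outcome. The sketch below uses made-up counts (an assumption, not data from these notes) and ignores covariates, so it is only a first look rather than a formal test.

```python
import statistics

# Hypothetical counts for illustration only (not data from these notes).
counts = [0, 0, 0, 1, 0, 2, 5, 0, 3, 9]

mean = statistics.mean(counts)
variance = statistics.variance(counts)  # sample variance (n - 1 denominator)
dispersion_ratio = variance / mean      # close to 1 if the Poisson assumption holds

print(f"mean={mean:.2f} variance={variance:.2f} ratio={dispersion_ratio:.2f}")
```

A ratio well above 1 suggests the variance exceeds the mean, which is the situation the Poisson model does not allow for.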
Negative Binomial
NB regression estimates an overdispersion parameter α
if α = 0, use Poisson
if α > 0, use NB because variance is greater than mean
NB will likely give more precise estimates (narrower CIs)
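For reference, the role of α can be written out as below. This is a sketch that assumes the mean-dispersion (NB2) form of the negative binomial; the symbol μ for the conditional mean is introduced here and is not in the original notes.

```latex
\[
\text{Poisson: } \operatorname{Var}(y \mid x) = \mu,
\qquad
\text{NB: } \operatorname{Var}(y \mid x) = \mu + \alpha \mu^{2},
\qquad
\mu = \exp(x'\beta)
\]
```

Setting α = 0 makes the NB variance collapse to the Poisson variance, which is why the choice between the two models comes down to a test on α.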
Poisson and NB interpretations
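Regression coefficients are semielasticities: a one-unit change in x changes the expected count by roughly 100*b percent
For an exact percent change, exponentiate: 100*(exp(b) - 1)
Margins output is interpreted in levels (change in the expected count)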
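The difference between reading a coefficient directly and exponentiating it can be sketched as follows. The coefficient value is hypothetical and is not taken from any regression output in these notes.

```python
import math

def percent_change(beta: float) -> float:
    """Exact percent change in the expected count for a one-unit increase in x."""
    return 100.0 * (math.exp(beta) - 1.0)

# Hypothetical coefficient for illustration only.
beta = 0.25
approx = 100.0 * beta          # semielasticity read straight off the coefficient
exact = percent_change(beta)   # after exponentiating

print(f"approx: {approx:.1f}%  exact: {exact:.1f}%")
```

For small coefficients the two numbers are close; the gap widens as the coefficient gets larger in absolute value, which is when exponentiating matters.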
When to use Zero-Inflated Models
Two methods to predict a zero
Logit/Poisson or NB
Inflation options
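The two routes to a zero can be made concrete with the zero-inflated Poisson probability function. This is a minimal sketch: the names pi (probability of a structural zero from the logit part) and lam (Poisson mean of the count part) and their values are assumptions chosen for illustration.

```python
import math

def zip_pmf(k: int, lam: float, pi: float) -> float:
    """P(y = k) under a zero-inflated Poisson.

    pi  : probability of a structural ("always") zero, from the logit part
    lam : Poisson mean of the count part
    """
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # A zero can come from either process: a structural zero,
        # or a Poisson draw that happens to be zero.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

# Hypothetical parameter values for illustration only.
lam, pi = 2.0, 0.3
p_zero = zip_pmf(0, lam, pi)
total = sum(zip_pmf(k, lam, pi) for k in range(60))

print(f"P(y=0) = {p_zero:.3f}, probabilities sum to {total:.6f}")
```

The probability of a zero is larger than a plain Poisson with the same lam would give, which is the excess-zeros pattern these models are meant for.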
Interpretation of ZIP
Still in semielasticities (%)
Why shouldn't we multiply the coefficients of a ZIP inflated on a CONSTANT by the mean to get level effects?
Because of the zero-inflation factor, multiplying by the mean would overstate the positive average marginal effects and understate the negative average marginal effects
How to interpret ZIP CONSTANT margins
In levels; the margins command incorporates the inflation factor for us
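A sketch of where the inflation factor enters, assuming the inflation equation contains only a constant (so the probability of a structural zero, written π here, is the same for everyone) and the count part uses a log link. The symbols are introduced for illustration.

```latex
\[
E[y \mid x] = (1 - \pi)\,\exp(x'\beta),
\qquad
\frac{\partial E[y \mid x]}{\partial x_j} = \beta_j\,(1 - \pi)\,\exp(x'\beta)
\]
```

The level effect is the coefficient times the count-part mean scaled down by (1 − π). Using the count-part mean alone would make every effect too large in magnitude by the factor 1/(1 − π).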
How to interpret ZIP inflated on X vars?
How to interpret the inflate coefficients?
The inflate coefficients come from the logit equation: they are effects on the log-odds of being a certain zero (nonuse)
Exponentiate to get odds ratios
Cannot interpret on their own
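To illustrate why the inflate coefficients are hard to read on their own, the sketch below converts hypothetical logit estimates into an odds ratio and into probabilities of nonuse. The coefficient values and variable names are assumptions, not output from these notes.

```python
import math

def inverse_logit(z: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical inflate-equation estimates for illustration only.
gamma_const = -0.5   # constant of the logit part
gamma_x = 0.8        # coefficient on a binary X variable

odds_ratio = math.exp(gamma_x)               # multiplicative change in the odds of nonuse
p0 = inverse_logit(gamma_const)              # P(nonuse) when x = 0
p1 = inverse_logit(gamma_const + gamma_x)    # P(nonuse) when x = 1

print(f"odds ratio: {odds_ratio:.2f}, P(nonuse): {p0:.3f} -> {p1:.3f}")
```

The change in probability depends on the baseline set by the other terms, so the same coefficient can imply different level effects. That is one reason to rely on margins for interpretation.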
ZIP X var margins interpretation
In level!
ZINB inflated on a CONSTANT
Interpretations are still semielastic!
ZIP vs ZINB selection
Look at the p-value of the chi2 test of H0: α = 0
If the p-value is low (< 0.05), reject H0, therefore rejecting ZIP
low p-value (p < 0.05) => use ZINB
high p-value (p > 0.1) => use ZIP
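The selection rule above can be sketched as a likelihood-ratio test computed by hand. This assumes the reported chi2 is the boundary-adjusted LR test of α = 0 (the p-value is halved because α cannot be negative); the function name and log-likelihood values are hypothetical.

```python
import math

def lr_test_alpha(ll_poisson: float, ll_nb: float) -> tuple[float, float]:
    """LR statistic and boundary-adjusted p-value for H0: alpha = 0."""
    stat = max(0.0, 2.0 * (ll_nb - ll_poisson))
    if stat == 0.0:
        return stat, 1.0
    # The chi2(1) upper tail equals erfc(sqrt(stat / 2)); it is halved
    # because alpha = 0 lies on the boundary of the parameter space.
    p_value = 0.5 * math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods from a ZIP fit and a ZINB fit.
ll_zip, ll_zinb = -1250.4, -1247.1
stat, p_value = lr_test_alpha(ll_zip, ll_zinb)
choice = "ZINB" if p_value < 0.05 else "ZIP"

print(f"LR stat = {stat:.2f}, p = {p_value:.4f}, prefer {choice}")
```

The same calculation applies to the uninflated Poisson versus NB comparison, since both tests ask whether α differs from zero.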
ZINB inflated on some X vars
Inflating on all X vars will not converge
Interpretations still in semielasticities
Model selection: Alpha
Models                      Chi^2 p-value   Conclusion
Standard (uninflated)       0.001           NB outperforms Poisson
Inflated on Constant        0.035           NB outperforms Poisson (at the 95% confidence level)
Inflated on Birth Control   0.416           Poisson outperforms NB
Model selection: Correlations
Higher correlation between predicted and actual counts => better fit
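A minimal sketch of the comparison, assuming the correlation in question is the Pearson correlation between each model's predicted counts and the observed counts. All numbers are made up for illustration.

```python
import math

def pearson_corr(actual, predicted):
    """Pearson correlation between observed and predicted counts."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Hypothetical observed counts and predictions from two candidate models.
actual = [0, 1, 0, 3, 2, 0, 5]
pred_model_a = [0.4, 1.1, 0.3, 2.6, 1.9, 0.5, 4.2]
pred_model_b = [1.5, 1.6, 1.4, 1.9, 1.2, 1.6, 2.0]

corr_a = pearson_corr(actual, pred_model_a)
corr_b = pearson_corr(actual, pred_model_b)
print(f"model A: {corr_a:.3f}, model B: {corr_b:.3f}")
```

The model whose predictions track the observed counts more closely gets the higher correlation.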
Information criteria: AIC, BIC
Smaller ICs are preferred
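Both criteria can be computed from a model's log-likelihood, its number of estimated parameters, and the sample size. The sketch below uses the standard definitions; the model names and input values are hypothetical.

```python
import math

def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * k - 2.0 * log_lik

def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2.0 * log_lik

# Hypothetical fits: (log-likelihood, number of parameters), with n observations.
n_obs = 500
fits = {"Poisson": (-1320.5, 4), "NB": (-1251.2, 5), "ZINB": (-1247.1, 7)}

for name, (ll, k) in fits.items():
    print(f"{name}: AIC={aic(ll, k):.1f}  BIC={bic(ll, k, n_obs):.1f}")

best_by_bic = min(fits, key=lambda m: bic(fits[m][0], fits[m][1], n_obs))
```

BIC penalizes extra parameters more heavily than AIC once the sample is moderately large, so the two criteria can disagree when a richer model improves the fit only slightly.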