Statistical modelling
Trying to decide which distribution fits best
Statistical modelling process
Maximum likelihood
Find parameter values that maximise the probability of the data
Maximum likelihood estimate
The estimate of the parameter that maximises the likelihood
Obtain L(p) - likelihood function
Convert to l(p) - log-likelihood function
Set the derivative l'(p) = 0 and solve for p
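A worked illustration of the three steps, assuming (this model is not in the notes) that the data are n independent Bernoulli(p) trials x_1, ..., x_n with s successes in total:

```latex
\begin{align*}
L(p) &= \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} = p^{s}(1-p)^{n-s},
        \qquad s = \sum_{i=1}^{n} x_i \\
l(p) &= \log L(p) = s\log p + (n-s)\log(1-p) \\
l'(p) &= \frac{s}{p} - \frac{n-s}{1-p} = 0
        \;\Longrightarrow\; \hat{p} = \frac{s}{n}
\end{align*}
```

For 0 < s < n the second derivative l''(p) = -s/p^2 - (n-s)/(1-p)^2 is negative, so the stationary point is a maximum.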
Maximum likelihood: general procedure
Likelihood function - L(θ) = ∏_{i=1}^{n} f(x_i; θ), where f is the pdf/pmf
MLE - θ̂ = arg max_θ L(θ)
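When the derivative cannot be solved by hand, the arg max can be found numerically. A minimal sketch, assuming an Exponential(rate) model and simulated data (both are illustrative choices, not from the notes). The helper names `exp_log_likelihood` and `maximise` are hypothetical, and the golden-section search is just one simple optimiser that works because this log-likelihood has a single peak.

```python
import math
import random


def exp_log_likelihood(rate, data):
    """Log-likelihood l(rate) = n*log(rate) - rate*sum(x) for Exponential(rate)."""
    if rate <= 0:
        return float("-inf")
    return len(data) * math.log(rate) - rate * sum(data)


def maximise(f, lo, hi, tol=1e-8):
    """Golden-section search for the maximiser of a single-peaked f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # the peak lies in [a, d]
        else:
            a = c  # the peak lies in [c, b]
    return (a + b) / 2


# Simulated data, assumed for illustration only.
random.seed(0)
data = [random.expovariate(2.0) for _ in range(500)]

# Numerical arg max of the log-likelihood over the rate.
rate_numeric = maximise(lambda r: exp_log_likelihood(r, data), 1e-6, 50.0)

# Closed-form MLE for this model, for comparison: rate_hat = n / sum(x).
rate_closed_form = len(data) / sum(data)

print(rate_numeric, rate_closed_form)
```

Maximising l(θ) rather than L(θ) gives the same θ̂ because log is increasing, and it avoids numerical underflow from multiplying many small densities.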
Invariance property
If θ̂ is the MLE of θ, then g(θ̂) is the MLE of g(θ) for any function g. If θ̂ is a consistent estimator of θ and g is continuous, then g(θ̂) is also a consistent estimator of g(θ)
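A short example of the invariance property, assuming an Exponential model with rate λ (an illustrative choice, not from the notes), for which the MLE of the rate is one over the sample mean:

```latex
\begin{align*}
\hat{\lambda} &= \frac{1}{\bar{x}} \\
\mu = g(\lambda) = \frac{1}{\lambda}
  &\;\Longrightarrow\; \hat{\mu} = g(\hat{\lambda}) = \bar{x} \\
P(X > t) = e^{-\lambda t}
  &\;\Longrightarrow\; \widehat{P(X > t)} = e^{-t/\bar{x}}
\end{align*}
```

No new maximisation is needed: the transformed MLE is obtained by plugging λ̂ into g.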
MLE properties
CI for MLE
Similar approach
- Can use: estimate ± quantile × se (se = standard error of the estimate)
OR
- Bootstrap method
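Both options can be sketched side by side. This is a rough sketch under stated assumptions, not the notes' own method: the Exponential(rate) model and simulated data are illustrative, the standard error rate_hat / sqrt(n) comes from the Fisher information of that model, the quantile is taken from the standard normal, and the bootstrap interval uses the simple percentile method. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def rate_mle(data):
    """MLE of the rate for an Exponential model: n / sum(x)."""
    return len(data) / sum(data)


def wald_ci(data, level=0.95):
    """estimate +/- quantile * se, with se = rate_hat / sqrt(n)."""
    est = rate_mle(data)
    se = est / math.sqrt(len(data))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    return est - z * se, est + z * se


def bootstrap_ci(data, level=0.95, n_boot=2000, seed=1):
    """Percentile bootstrap: resample with replacement, re-estimate, take quantiles."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(rate_mle(rng.choices(data, k=n)) for _ in range(n_boot))
    alpha = 1 - level
    lo_idx = math.floor(alpha / 2 * (n_boot - 1))
    hi_idx = math.ceil((1 - alpha / 2) * (n_boot - 1))
    return estimates[lo_idx], estimates[hi_idx]


# Simulated data, assumed for illustration only.
random.seed(0)
data = [random.expovariate(2.0) for _ in range(500)]

est = rate_mle(data)
wald_lo, wald_hi = wald_ci(data)
boot_lo, boot_hi = bootstrap_ci(data)
print(est, (wald_lo, wald_hi), (boot_lo, boot_hi))
```

The first interval relies on the MLE being approximately normal in large samples; the bootstrap interval avoids that assumption at the cost of more computation.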
Quantile-Quantile (QQ) plots
Help assess if plausible that data came from specified distribution
(e.g. a distribution from MLE fit)
Create scatter plot of ordered data (y-axis) against theoretical quantiles (x-axis)
If both sets of quantiles from same distribution ⇒ points should lie on a straight line
Use a “thick marker” judgment approach: if a thick marker drawn along the straight line would cover the points, the fit is plausible; only points left uncovered stand out as outliers
A bootstrap can also be applied: resample the data, then draw a QQ plot of each resample against the chosen distribution to see how much the points vary from sample to sample
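The QQ construction and its bootstrap version can be sketched as follows. Assumptions, none of them from the notes: the data are simulated, the chosen distribution is a Normal fitted by MLE (sample mean and the divide-by-n standard deviation), and the plotting positions (i - 0.5) / n are one common convention among several. The function `qq_points` is a hypothetical helper that only computes the coordinates; any plotting library could draw them.

```python
import random
from statistics import NormalDist, mean, pstdev


def qq_points(data, dist):
    """Return (theoretical quantile, ordered observation) pairs for a QQ plot."""
    ordered = sorted(data)
    n = len(ordered)
    # Plotting positions (i - 0.5) / n keep the probabilities strictly inside (0, 1).
    probs = [(i - 0.5) / n for i in range(1, n + 1)]
    return [(dist.inv_cdf(p), x) for p, x in zip(probs, ordered)]


# Simulated data, assumed for illustration only.
random.seed(42)
sample = [random.gauss(10.0, 2.0) for _ in range(200)]

# Normal distribution fitted by MLE: mean and divide-by-n standard deviation.
fitted = NormalDist(mean(sample), pstdev(sample))

# x = theoretical quantiles, y = ordered data; near the line y = x if the fit is good.
points = qq_points(sample, fitted)

# Bootstrap: one QQ curve per resample, each against the same chosen distribution.
rng = random.Random(7)
boot_curves = [
    qq_points(rng.choices(sample, k=len(sample)), fitted) for _ in range(20)
]

print(points[:3], len(boot_curves))
```

Overlaying the bootstrap curves on one plot gives a band around the original points, which serves a similar purpose to the thick marker: deviations that fall inside the band are within ordinary sampling variability.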