What is a summary statistic T(D) and how can it vary in its representation
A function of the observed data $D = \{x_1, \ldots, x_n\}$
designed to describe key characteristics of the data. It can take various forms, including scalar values, vectors, matrices, etc.
Summary statistics typically focus on important data properties like:
* Location: For example, the mean ($\bar{x}$) or the median.
* Scale: Such as the standard deviation or the interquartile range.
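As a minimal sketch, the location and scale summaries above can be computed with Python's standard statistics module. The sample values are made up for illustration only.

```python
import statistics

# Hypothetical sample, chosen only for illustration
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 11.0]

# Location
mean = statistics.mean(data)      # arithmetic mean
median = statistics.median(data)  # middle value of the sorted data

# Scale
sd = statistics.stdev(data)                  # sample standard deviation (n - 1 denominator)
q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles (default "exclusive" method)
iqr = q3 - q1                                # interquartile range

# A summary statistic T(D) may be a scalar or, as here, a vector
T = (mean, sd)
print(mean, median, iqr, T)
```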
When is a statistic sufficient
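A standard way to state this is the Fisher-Neyman factorisation criterion. The function names h and k below are notation assumed here, not taken from these notes.

```latex
L(\boldsymbol{\theta} \mid D) = h\bigl(T(D), \boldsymbol{\theta}\bigr)\, k(D)
```

T(D) is sufficient for $\boldsymbol{\theta}$ if the likelihood can be written in this form, where k(D) does not depend on $\boldsymbol{\theta}$. The parameter then enters the likelihood only through T(D).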
How does the existence and uniqueness of the MLE relate to sufficient statistics
If the MLE exists and is unique, then $\hat{\theta}_{ML}$ is a unique function of the sufficient statistic.
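A small sketch of this, assuming a normal model (an assumption made here for illustration): the MLE of the mean and variance can be computed from T(D) = (n, sum of x, sum of x^2) alone, so two different data sets with the same T(D) give the same estimate. The function names are hypothetical.

```python
def sufficient_stat(data):
    """T(D) = (n, sum of x, sum of x^2) for the normal model."""
    return (len(data), sum(data), sum(x * x for x in data))

def normal_mle(T):
    """MLE of (mean, variance), computed from T(D) alone."""
    n, s1, s2 = T
    mu_hat = s1 / n
    var_hat = s2 / n - mu_hat ** 2  # the ML variance uses the 1/n denominator
    return mu_hat, var_hat

# Two different data sets that share the same sufficient statistic
D1 = [0.0, 3.0, 3.0]
D2 = [1.0, 1.0, 4.0]

T1, T2 = sufficient_stat(D1), sufficient_stat(D2)
print(T1 == T2)        # True
print(normal_mle(T1))  # (2.0, 2.0)
print(normal_mle(T2))  # (2.0, 2.0)
```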
How does a sufficient statistic partition the space of data sets
A sufficient statistic partitions the space of all possible data sets into clusters, where each cluster contains the data sets that yield the same value of T(D). This partitioning is represented by:
$\mathcal{X}_t = \{D : T(D) = t\}$
The data sets in $\mathcal{X}_t$ are equivalent in terms of the sufficient statistic.
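The partition can be made concrete with a sketch, assuming a Bernoulli model with n = 3 observations and T(D) equal to the number of successes (both choices are assumptions made here for illustration).

```python
from collections import defaultdict
from itertools import product

n = 3  # number of binary observations per data set

# Group every possible data set D by the value t = T(D) = sum(D)
partition = defaultdict(list)
for D in product([0, 1], repeat=n):
    partition[sum(D)].append(D)

# Each key t indexes one cluster of data sets with T(D) = t
for t, cluster in sorted(partition.items()):
    print(t, cluster)

sizes = [len(partition[t]) for t in sorted(partition)]
print(sizes)  # [1, 3, 3, 1]
```

The 8 possible data sets fall into 4 clusters, one per value of t.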
What does it mean for two data sets to be likelihood equivalent
Two data sets $D_1$ and $D_2$ for which the ratio of the corresponding likelihoods $L(\boldsymbol{\theta}|D_1)/L(\boldsymbol{\theta}|D_2)$ does not depend on $\boldsymbol{\theta}$
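Are the data sets in $\mathcal{X}_t$ likelihood equivalent
Yes, all data sets in $\mathcal{X}_t$ are likelihood equivalent: the part of the likelihood that depends on the parameter is the same for every data set sharing a value of T(D), so it cancels in the ratio.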
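A numerical check of likelihood equivalence, assuming a Poisson model with T(D) equal to the sum of the counts (an assumption made here for illustration). The data sets D1 and D2 share the same sum, while D3 does not.

```python
import math

def poisson_likelihood(lam, data):
    """Likelihood of rate lam for i.i.d. Poisson counts."""
    return math.prod(math.exp(-lam) * lam ** x / math.factorial(x) for x in data)

D1 = (0, 2)  # T(D1) = 2
D2 = (1, 1)  # T(D2) = 2, same cluster as D1
D3 = (1, 2)  # T(D3) = 3, different cluster

lams = [0.5, 1.0, 2.0, 4.0]

# Same T(D): the ratio is constant in lam (about 0.5, from the factorial terms)
ratios_same = [poisson_likelihood(l, D1) / poisson_likelihood(l, D2) for l in lams]

# Different T(D): the ratio equals 1 / lam and so changes with lam
ratios_diff = [poisson_likelihood(l, D1) / poisson_likelihood(l, D3) for l in lams]

print(ratios_same)
print(ratios_diff)
```

The constant ratio need not equal 1; it only has to be free of the parameter.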
What defines a minimal sufficient statistic
A minimal sufficient statistic is defined as a sufficient statistic for which all likelihood equivalent data sets are also equivalent under this statistic
What is a trivial example of a minimal sufficient statistic
The likelihood function itself, since by definition it can be computed from any set of sufficient statistics
What are the differences between forward KL and reverse KL divergence minimisation
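A minimal sketch of the asymmetry, assuming F is the reference distribution and G the approximation (names and numbers chosen here for illustration). Forward KL minimises D_KL(F || G) over G and becomes infinite if G has no mass where F does, so it favours approximations that cover all of F. Reverse KL minimises D_KL(G || F) and becomes infinite if G places mass where F has none, so it favours approximations that stay inside the support of F.

```python
import math

def kl(p, q):
    """D_KL(p || q) for discrete distributions given as probability lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue         # a term with p = 0 contributes 0
        if qi == 0.0:
            return math.inf  # p has mass where q has none
        total += pi * math.log(pi / qi)
    return total

F = [0.5, 0.0, 0.5]        # reference: two separated modes
G_broad = [0.4, 0.2, 0.4]  # covers both modes, but also the gap
G_mode = [1.0, 0.0, 0.0]   # sits on a single mode

# Forward KL, D_KL(F || G): prefers the broad approximation
print(kl(F, G_broad), kl(F, G_mode))  # finite, inf

# Reverse KL, D_KL(G || F): prefers the single-mode approximation
print(kl(G_broad, F), kl(G_mode, F))  # inf, finite
```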
How does a small sample size n affect the reliability of MLE and what are alternative strategies
MLE can overfit: it matches the particularities of the small data set too closely, leading to poor generalisation to the broader population.
Alternative methods:
* Regularised/penalised likelihood
* Bayesian methods
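A sketch of the small-sample problem, assuming a Bernoulli model with a Beta(1, 1) prior for the Bayesian alternative (both are assumptions made here for illustration, and the function names are hypothetical). With 3 successes in 3 trials the MLE claims that failure is impossible, while the posterior mean is pulled towards 0.5.

```python
def bernoulli_mle(k, n):
    """Maximum likelihood estimate of the success probability."""
    return k / n

def posterior_mean(k, n, a=1.0, b=1.0):
    """Posterior mean of the success probability under a Beta(a, b) prior."""
    return (k + a) / (n + a + b)

k, n = 3, 3  # three successes in three trials

print(bernoulli_mle(k, n))   # 1.0
print(posterior_mean(k, n))  # 0.8
```

As n grows with k / n fixed, the two estimates approach each other, so the prior matters most when the sample is small.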