What is Pointwise Mutual Information?
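It measures how far the joint probability of two outcomes x and y (values of the random variables X and Y) departs from the probability expected if X and Y were independent, i.e. from the product of their marginal probabilities. PMI is 0 under independence, positive when the outcomes co-occur more often than independence predicts, and negative when they co-occur less often.
PMI(x, y) = log[ P(x, y) / (P(x) P(y)) ]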
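A minimal sketch of the formula, estimating the probabilities from co-occurrence counts. The function name pmi and the counts are made up for illustration, and the natural log is an assumption (log base 2 is also common).

```python
import math

def pmi(joint_count, x_count, y_count, total):
    """PMI of outcomes x and y, with probabilities estimated from counts."""
    p_xy = joint_count / total   # estimate of P(x, y)
    p_x = x_count / total        # estimate of P(x)
    p_y = y_count / total        # estimate of P(y)
    return math.log(p_xy / (p_x * p_y))

# Hypothetical counts: 1000 observations, x seen 100 times, y seen 50 times.
independent = pmi(5, 100, 50, 1000)    # joint equals P(x)P(y), so PMI is about 0
associated = pmi(20, 100, 50, 1000)    # joint is 4x the independent value, so PMI = log(4)
```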
Sensitivity
TPR = TP / (TP + FN)
Specificity
TNR = TN / (TN + FP)
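The two rates can be computed directly from confusion-matrix counts. This is a small sketch; the counts below are hypothetical.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives that were predicted positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: share of actual negatives that were predicted negative."""
    return tn / (tn + fp)

# Hypothetical confusion matrix: 50 actual positives, 50 actual negatives.
tp, fn, tn, fp = 40, 10, 45, 5
tpr = sensitivity(tp, fn)   # 40 / 50
tnr = specificity(tn, fp)   # 45 / 50
```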
ROC
Skewness
Kurtosis
‘skewed to normal’ transformation
rule of thumb: skewness within −0.8 to 0.8 and kurtosis within −3.0 to 3.0 is usually treated as close enough to normal; outside those ranges, use a log or Box-Cox transformation to bring the data closer to a normal (symmetric) distribution
- make sure to transform the results back afterwards, e.g. using exp() after a log transformation
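A sketch of the log transformation and its inverse on right-skewed data, using only the standard library. The log-normal sample, the seed, and the helper name skewness are assumptions for illustration; Box-Cox is not shown (the log transform is its lambda = 0 case).

```python
import math
import random

def skewness(xs):
    """Sample skewness: third central moment divided by variance ** 1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

random.seed(0)
# Hypothetical right-skewed data: log-normal samples (strictly positive).
data = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]

logged = [math.log(x) for x in data]        # forward transform
restored = [math.exp(z) for z in logged]    # back-transform to the original scale

skew_before = skewness(data)     # strongly positive for log-normal data
skew_after = skewness(logged)    # close to 0 once the data look normal
```

The log only works for strictly positive values; data containing zeros or negatives would need a shift or a different transformation.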
multicollinearity
detect multicollinearity
Perturbing the data: multicollinearity can be detected by adding random noise to the data, re-running the regression many times, and seeing how much the coefficients change; large swings in the coefficients under small perturbations point to multicollinearity (see Wikipedia)
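A rough sketch of the perturbation check for a regression with two predictors, using only the standard library. The data are synthetic, and the noise level (0.05), the number of refits (200), and the helper names are arbitrary assumptions, not recommended settings.

```python
import random
import statistics

def ols_two_predictors(x1, x2, y):
    """Least-squares slopes for y ~ b0 + b1*x1 + b2*x2 via the 2x2 normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * t for a, t in zip(c1, cy))
    s2y = sum(b * t for b, t in zip(c2, cy))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def coef_spread(x1, x2, y, rng, noise_sd=0.05, runs=200):
    """Std. dev. of the x1 coefficient across refits on noise-perturbed predictors."""
    b1s = []
    for _ in range(runs):
        p1 = [v + rng.gauss(0, noise_sd) for v in x1]
        p2 = [v + rng.gauss(0, noise_sd) for v in x2]
        b1s.append(ols_two_predictors(p1, p2, y)[0])
    return statistics.stdev(b1s)

rng = random.Random(0)
n = 200
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2_collinear = [v + rng.gauss(0, 0.05) for v in x1]    # nearly a copy of x1
x2_independent = [rng.gauss(0, 1) for _ in range(n)]   # unrelated to x1
y_col = [a + b + rng.gauss(0, 0.1) for a, b in zip(x1, x2_collinear)]
y_ind = [a + b + rng.gauss(0, 0.1) for a, b in zip(x1, x2_independent)]

spread_collinear = coef_spread(x1, x2_collinear, y_col, rng)
spread_independent = coef_spread(x1, x2_independent, y_ind, rng)
```

With the collinear predictors the x1 coefficient should vary far more from refit to refit than with the independent ones, which is the signal the check looks for.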
transform a random var to a uniform distribution
Using the probability integral transform: if X is a continuous random variable with cumulative distribution function F (an invertible F is sufficient), then the random variable U = F(X) follows a uniform distribution on the unit interval [0,1].
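A quick numerical illustration of the transform, assuming X is exponential so that F(x) = 1 - exp(-rate * x) is known in closed form. The rate, seed, and sample size are arbitrary choices.

```python
import math
import random

random.seed(0)
rate = 2.0  # hypothetical rate parameter of the exponential distribution
samples = [random.expovariate(rate) for _ in range(10000)]

def exp_cdf(x, lam):
    """CDF of the exponential distribution with rate lam."""
    return 1.0 - math.exp(-lam * x)

# Apply U = F(X) to every sample.
u = [exp_cdf(x, rate) for x in samples]

mean_u = sum(u) / len(u)   # Uniform(0, 1) has mean 1/2

# Fraction of values in each of 10 equal-width bins; each should be near 0.1.
bins = [0] * 10
for v in u:
    bins[min(int(v * 10), 9)] += 1
fractions = [b / len(u) for b in bins]
```

When F is not known in closed form, the empirical CDF (ranks divided by n) is a common stand-in, at the cost of only approximate uniformity.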
coefficient of determination, R-squared