What is the flaw of PCA?
PCA looks for the components with the largest variance (by eigendecomposition), but the directions of largest variance are not always the most interesting: we may be more interested in directions that reveal clusters.
How does entropy measure “interestingness” for a multimodal distribution (unlike a Gaussian)?
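Compare the entropy of densities g(.) with mean 0 and the same variance σ^2.
Entropy is minimised by the optimal (most interesting) g(.) and maximised by the Gaussian (= most boring), so a lower entropy means a density further from Gaussian, e.g. a multimodal one.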
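A minimal numerical sketch of this comparison, assuming a hypothetical bimodal density (an equal mixture of two Gaussians centred at ±0.9, chosen only for illustration) and a simple Riemann-sum approximation of the entropy. Both densities have mean 0 and variance 1:

```python
import numpy as np

def gaussian_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def differential_entropy(pdf_values, dx):
    """Riemann-sum approximation of -∫ g(x) log g(x) dx on an even grid."""
    p = pdf_values[pdf_values > 0]  # skip zeros to avoid log(0)
    return float(-np.sum(p * np.log(p)) * dx)

x = np.linspace(-8.0, 8.0, 20001)
dx = x[1] - x[0]

# Standard Gaussian: mean 0, variance 1.
gauss = gaussian_pdf(x, 0.0, 1.0)

# Illustrative bimodal density: modes at -mu and +mu.
# The component sd is chosen so the total variance sd^2 + mu^2 equals 1.
mu = 0.9
sd = np.sqrt(1.0 - mu ** 2)
bimodal = 0.5 * gaussian_pdf(x, -mu, sd) + 0.5 * gaussian_pdf(x, mu, sd)

h_gauss = differential_entropy(gauss, dx)
h_bimodal = differential_entropy(bimodal, dx)
print(f"Gaussian entropy: {h_gauss:.4f}")
print(f"Bimodal entropy:  {h_bimodal:.4f}")
```

The bimodal density should come out with the lower entropy, which is the sense in which it is more "interesting" than the Gaussian.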
Explain sphering
The aim is to transform the data so that its variance matrix is the identity I.
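One common way to do this, sketched here under the assumption that the sample variance matrix is full rank, is to centre the data and multiply by the inverse symmetric square root of the variance matrix. The function name `sphere` and the toy data are illustrative only:

```python
import numpy as np

def sphere(X):
    """Centre X (rows = observations) and rescale so its sample variance matrix is I."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    # Eigendecomposition of the symmetric variance matrix: cov = E diag(lam) E^T.
    lam, E = np.linalg.eigh(cov)
    # Inverse symmetric square root: cov^(-1/2) = E diag(lam^(-1/2)) E^T.
    W = E @ np.diag(lam ** -0.5) @ E.T
    return Xc @ W

# Toy correlated data, for illustration only.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
X = rng.normal(size=(500, 3)) @ mixing
Z = sphere(X)

# Should be the 3x3 identity up to rounding error.
print(np.round(np.cov(Z, rowvar=False), 6))
```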
Explain Exploratory Projection Pursuit (computational technique)
Explain Independent Component Analysis
Using entropy minimisation: make the columns of S statistically independent and non-Gaussian.
Explain flaw of eigendecomposition for ICA
Using eigendecomposition:
starting from X = UDV^T, write X = SA^T where S = √n U and A^T = DV^T/√n. The columns of S are uncorrelated, but the decomposition is not unique: S can be rotated. For any orthogonal R, X = (SR)(AR)^T and the columns of SR are still uncorrelated.
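The non-uniqueness can be checked numerically. The sketch below uses random toy data and an arbitrary orthogonal matrix R obtained from a QR factorisation. Both choices are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3

# Toy centred data matrix (rows = observations).
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
X = X - X.mean(axis=0)

# X = U D V^T, then S = sqrt(n) U and A^T = D V^T / sqrt(n).
U, d, Vt = np.linalg.svd(X, full_matrices=False)
S = np.sqrt(n) * U
At = np.diag(d) @ Vt / np.sqrt(n)

# An arbitrary orthogonal matrix R (the Q factor of a random matrix).
R, _ = np.linalg.qr(rng.normal(size=(p, p)))
S_rot = S @ R        # rotated sources
At_rot = R.T @ At    # compensating change to A^T

# Both factorisations reproduce X, and both have uncorrelated unit-variance columns.
print(np.allclose(S @ At, X), np.allclose(S_rot @ At_rot, X))
print(np.allclose(S.T @ S / n, np.eye(p)), np.allclose(S_rot.T @ S_rot / n, np.eye(p)))
```

Uncorrelatedness alone therefore cannot pick out a particular S, which is why ICA adds the independence and non-Gaussianity requirement.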
Express the Projection Pursuit Regression
f(X) = sum_{m=1}^{M} g_m(w_m^T X)
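A small sketch of how this sum of ridge functions is evaluated once the directions w_m and functions g_m are known. It does not fit them. The function name `ppr_predict`, the two directions and the two ridge functions (a square and a tanh) are hypothetical choices for illustration:

```python
import numpy as np

def ppr_predict(X, directions, ridge_functions):
    """Evaluate f(x) = sum_m g_m(w_m^T x) for each row x of X."""
    total = np.zeros(X.shape[0])
    for w, g in zip(directions, ridge_functions):
        total += g(X @ w)  # project onto w_m, then apply g_m
    return total

# Two observations in two dimensions (toy values).
X = np.array([[1.0, 2.0],
              [0.0, -1.0]])

# M = 2 terms: unit-length directions and smooth ridge functions.
directions = [np.array([1.0, 0.0]), np.array([0.6, 0.8])]
ridge_functions = [np.square, np.tanh]

pred = ppr_predict(X, directions, ridge_functions)
print(pred)
```

In an actual fit, both the w_m and the g_m would be estimated from data. Here they are fixed by hand only to show the structure of the model.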
Explain structure of neural networks
Compare PPR and ANN
What are the disadvantages of ANN?