Why would you want to use dimensionality reduction techniques to transform your data before training?
Dimensionality reduction can allow you to:
- Train models faster and with a smaller memory footprint by shrinking the feature space.
- Reduce overfitting by removing noisy or redundant (highly correlated) features.
- Visualize high-dimensional data in 2 or 3 dimensions.
Why would you want to avoid dimensionality reduction techniques to transform your data before training?
Dimensionality reduction can:
- Discard information, hurting model performance when the dropped dimensions carry signal.
- Make the transformed features harder to interpret than the original ones.
- Add computational cost and an extra preprocessing step to the pipeline.
Name four popular dimensionality reduction algorithms and briefly describe them.
By far the most popular is PCA (Principal Component Analysis) and similar eigendecomposition-based variations: the data is projected onto the eigenvectors of its covariance matrix, and the most important vectors (those with the highest eigenvalues) are selected to represent the features in the transformed space. Other widely used algorithms include, for example:
- LDA (Linear Discriminant Analysis): a supervised method that projects the data onto the directions that best separate the classes.
- t-SNE (t-distributed Stochastic Neighbor Embedding): a nonlinear method that preserves local neighborhood structure, mainly used for visualization.
- Autoencoders: neural networks trained to reconstruct their input through a low-dimensional bottleneck layer, which serves as the reduced representation.
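The PCA recipe described above (eigendecompose the covariance matrix, keep the top eigenvectors) can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not a production implementation; the data and the choice of k = 2 are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))  # synthetic data: 100 samples, 5 features

# Center the data, then eigendecompose its covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: the covariance matrix is symmetric

# Sort eigenvectors by descending eigenvalue and keep the top k.
order = np.argsort(eigvals)[::-1]
k = 2
components = eigvecs[:, order[:k]]

# Project the centered data onto the top-k principal components.
X_reduced = Xc @ components  # shape (100, 2)
```

In practice you would use a library implementation such as scikit-learn's `PCA`, which also handles centering, whitening, and the inverse transform for you.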
After doing dimensionality reduction, can you transform the data back to the original feature space? If so, how?
Yes and no.
Most dimensionality reduction methods have inverse transformations, but signal is often lost when reducing dimensions, so the inverse transformation is usually only an approximation of the original data.
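With scikit-learn's `PCA`, for instance, the round trip looks like this; the synthetic data and component count here are assumptions for the example. Because some components are dropped, `inverse_transform` returns an approximation, not the original data:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))  # synthetic data: 200 samples, 10 features

pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X)  # shape (200, 3)

# inverse_transform maps back to the original 10-dimensional space,
# but only approximately: variance in the dropped components is lost.
X_restored = pca.inverse_transform(X_reduced)

reconstruction_error = np.mean((X - X_restored) ** 2)
```

The reconstruction error shrinks as you retain more components and reaches zero only when no components are dropped.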
How do you select the number of principal components needed for PCA?
Selecting the number of latent features to retain is typically done by inspecting the eigenvalues of the eigenvectors (each eigenvalue is proportional to the variance explained by its component). As eigenvalues decrease, the amount of variance the corresponding latent feature captures also decreases.
This means that principal components with small eigenvalues contribute little information and can be removed.
There are various rules of thumb, but one general rule is to keep the most significant principal components that together account for at least 95% of the variance of the features.
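The 95%-variance rule can be applied with scikit-learn's `PCA` via its `explained_variance_ratio_` attribute; the synthetic data below is an assumption for the example:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))  # synthetic data: 300 samples, 20 features

pca = PCA().fit(X)  # fit with all components retained

# Cumulative fraction of variance explained by the first k components.
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest k whose components together explain at least 95% of the variance.
n_components = int(np.argmax(cumulative >= 0.95)) + 1
```

As a shortcut, scikit-learn also accepts a fraction directly: `PCA(n_components=0.95)` retains just enough components to explain 95% of the variance.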