What is meant by dimensionality reduction?
The reduction of feature count within a data set.
In what 3 ways does dimensionality reduction improve the creation and running of ML models?
Saves time, saves money, removes irrelevant data.
What are the 2 methods of dimensionality reduction? Define each…
What are the 3 methods for feature selection?
Explain the Filter Method…
Explain the Wrapper Method…
Explain how Forward Search works in the Wrapper Method of Feature Selection…
Explain how Recursive Feature Elimination works in the Wrapper Method of Feature Selection…
Explain the Embedded Method
What is a Random Forest?
An ensemble of decision trees whose individual predictions are aggregated (by vote or average) into a single prediction.
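The aggregation idea can be sketched in a few lines. This is a heavily simplified illustration, not a full random forest: each "tree" is only a one-split decision stump, and the toy data, `train_stump`, `train_forest` and `predict` are hypothetical names made up for this example. It assumes the usual recipe of training each tree on a bootstrap sample and combining the trees by majority vote.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature):
    """Find the single threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        for left, right in ((0, 1), (1, 0)):
            errors = sum(
                (left if r[feature] <= threshold else right) != y
                for r, y in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda row: left if row[feature] <= threshold else right

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))
        forest.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return forest

def predict(forest, row):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]

# Made-up toy data: class 0 has small feature values, class 1 has large ones.
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.0), (6.0, 5.5), (6.5, 6.2), (7.0, 5.8)]
labels = [0, 0, 0, 1, 1, 1]

forest = train_forest(rows, labels)
low_prediction = predict(forest, (0.9, 0.7))
high_prediction = predict(forest, (6.8, 6.0))
print(low_prediction, high_prediction)
```

Individual stumps trained on an unlucky bootstrap sample can be wrong; the vote across many of them is what makes the combined prediction stable.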
What are the 2 types of methods for Feature Extraction?
What is the main Linear method for feature extraction? Explain it…
What is the worst-case scenario of PCA?
What are the steps of PCA?
What do the Eigenvalues represent in PCA?
The variance captured by each principal component.
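A small numeric check may make this concrete. The sketch below uses made-up 2-D data and the closed-form eigenvalues of a symmetric 2x2 matrix (so no linear algebra library is needed), and compares each eigenvalue of the covariance matrix with the variance of the data projected onto the matching eigenvector. All variable names are illustrative.

```python
import math

# Toy 2-D data set; the values are made up for illustration.
data = [(2.0, 1.9), (0.5, 0.7), (3.1, 3.0), (1.2, 0.9), (4.0, 4.2), (2.6, 2.4)]
n = len(data)

# Centre the data so each feature has zero mean.
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n
centred = [(x - mean_x, y - mean_y) for x, y in data]

# Sample covariance matrix [[a, b], [b, c]].
a = sum(x * x for x, _ in centred) / (n - 1)
b = sum(x * y for x, y in centred) / (n - 1)
c = sum(y * y for _, y in centred) / (n - 1)

# Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
half_trace = (a + c) / 2
spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
eigenvalues = [half_trace + spread, half_trace - spread]

def unit_eigenvector(lam):
    """Unit eigenvector for eigenvalue lam; valid here because b != 0."""
    vx, vy = b, lam - a
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

# Variance of the data projected onto each principal component.
projected_variances = []
for lam in eigenvalues:
    vx, vy = unit_eigenvector(lam)
    scores = [x * vx + y * vy for x, y in centred]
    projected_variances.append(sum(s * s for s in scores) / (n - 1))

# Share of the total variance captured by each component.
explained_ratio = [lam / sum(eigenvalues) for lam in eigenvalues]

for lam, var, ratio in zip(eigenvalues, projected_variances, explained_ratio):
    print(f"eigenvalue={lam:.4f}  projected variance={var:.4f}  share={ratio:.3f}")
```

Each eigenvalue matches the projected variance, and dividing by the sum of the eigenvalues gives the fraction of total variance that component explains.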
What are the input and output of PCA?
Input -> High-dimensional data
Output -> Low-dimensional data
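The input/output shapes can be illustrated with a minimal PCA sketch. It assumes one common way of finding the components, power iteration with deflation on the covariance matrix, which is only one of several possible approaches (libraries typically use an SVD instead). The data and the helper names `mat_vec`, `top_eigenvector` and `pca` are made up for this example.

```python
import random

def mat_vec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def top_eigenvector(cov, iterations=500, seed=0):
    """Power iteration: repeatedly apply cov and renormalise."""
    rng = random.Random(seed)
    v = [rng.random() for _ in cov]
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(vi * wi for vi, wi in zip(v, mat_vec(cov, v)))
    return eigenvalue, v

def pca(rows, k):
    """Project rows (n x d) onto their first k principal components (n x k)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(k):
        lam, v = top_eigenvector(cov)
        components.append(v)
        # Deflation: remove the component just found so the next one can emerge.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [[sum(x * c for x, c in zip(r, comp)) for comp in components]
            for r in centred]

# Input: 6 samples with 4 features each (made-up values).
high_d = [
    [2.5, 2.4, 0.5, 1.0],
    [0.5, 0.7, 1.9, 0.2],
    [2.2, 2.9, 0.4, 1.1],
    [1.9, 2.2, 0.8, 0.9],
    [3.1, 3.0, 0.1, 1.4],
    [2.3, 2.7, 0.6, 1.0],
]

# Output: the same 6 samples, now with 2 features each.
low_d = pca(high_d, k=2)
for row in low_d:
    print([round(value, 3) for value in row])
```

The number of samples is unchanged; only the number of features per sample drops from 4 to 2.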
What are the 2 main Non-linear methods for feature extraction?
Explain t-SNE
A non-linear method for feature extraction, typically used to visualise data in 2 or 3 dimensions.
1 - Calculate the distribution of pairwise distances between the N points in the original space and call this D
2 - Scatter the N points randomly in 2 or 3 dimensions
3 - Move the N points around until their distance distribution resembles D
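The three steps can be sketched roughly as follows. This is a stripped-down illustration rather than the full algorithm: it assumes a single fixed Gaussian bandwidth `sigma` instead of the perplexity-based calibration used by real implementations, and it swaps the usual momentum-based optimiser for plain gradient descent that halves its step size whenever a step fails to improve the fit. "Resembles D" is measured here with the KL divergence between the high-dimensional and low-dimensional similarity distributions. The data and all function names are made up for this example.

```python
import math
import random

def pairwise_sq_dists(points):
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]

def normalise(weights):
    """Scale pairwise weights so the off-diagonal entries sum to 1."""
    total = sum(w for i, row in enumerate(weights) for j, w in enumerate(row) if i != j)
    return [[(w / total if i != j else 0.0) for j, w in enumerate(row)]
            for i, row in enumerate(weights)]

def high_d_similarities(points, sigma=1.0):
    # Step 1: Gaussian similarities in the original space (the target D).
    d2 = pairwise_sq_dists(points)
    return normalise([[math.exp(-d / (2 * sigma ** 2)) for d in row] for row in d2])

def low_d_similarities(embedding):
    # Heavy-tailed Student-t similarities in the low-dimensional space.
    d2 = pairwise_sq_dists(embedding)
    return normalise([[1.0 / (1.0 + d) for d in row] for row in d2])

def kl_divergence(p, q):
    return sum(pij * math.log(pij / qij)
               for prow, qrow in zip(p, q)
               for pij, qij in zip(prow, qrow) if pij > 0)

def tsne(points, dims=2, steps=500, learning_rate=10.0, seed=0):
    rng = random.Random(seed)
    p = high_d_similarities(points)
    # Step 2: scatter the N points randomly in the low-dimensional space.
    y = [[rng.gauss(0, 1e-2) for _ in range(dims)] for _ in points]
    history = [kl_divergence(p, low_d_similarities(y))]
    for _ in range(steps):
        # Step 3: move the points so the low-D similarities approach the target.
        q = low_d_similarities(y)
        d2 = pairwise_sq_dists(y)
        new_y = []
        for i, yi in enumerate(y):
            grad = [0.0] * dims
            for j, yj in enumerate(y):
                if i == j:
                    continue
                coeff = 4 * (p[i][j] - q[i][j]) / (1 + d2[i][j])
                for k in range(dims):
                    grad[k] += coeff * (yi[k] - yj[k])
            new_y.append([yi[k] - learning_rate * grad[k] for k in range(dims)])
        new_kl = kl_divergence(p, low_d_similarities(new_y))
        if new_kl < history[-1]:
            y = new_y
            history.append(new_kl)
        else:
            learning_rate /= 2  # the step overshot, so retry with a smaller one
    return y, history

# Made-up 4-D data: two loose groups of three points each.
points = [
    (0.0, 0.0, 0.0, 0.0), (0.5, 0.1, 0.2, 0.0), (0.1, 0.4, 0.0, 0.3),
    (3.0, 3.0, 3.0, 3.0), (3.2, 3.1, 2.8, 3.0), (3.0, 2.7, 3.1, 3.3),
]
embedding, history = tsne(points)
print(f"KL before: {history[0]:.4f}  KL after: {history[-1]:.4f}")
```

The printed KL divergence shrinks as the scattered points are moved, which is the sense in which the low-dimensional layout comes to resemble D.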
What are some issues with t-SNE?
Explain UMAP
What are some issues with both t-SNE and UMAP?