What is Dimensionality Reduction?
It is the process of reducing the number of variables under consideration by obtaining a smaller set of principal variables.
Methods to implement dimensionality reduction?
It can be implemented in two ways:
1. Feature Selection
2. Feature Extraction
What is Feature Selection?
Here we are interested in finding k of the total n features that give us the most information, and we discard the other (n-k) dimensions.
e.g., subset selection methods (forward and backward selection)
What is Feature Extraction?
Here, we are interested in finding a new set of k features that are combinations of original features.
How is the error measured in machine learning problems?
In regression:
We usually use the Mean Squared Error (MSE) or the Root Mean Squared Error (RMSE).
In Classification:
Here, we may use the misclassification rate as the measure of error.
What is MSE?
It is the sum of the squared differences between the predicted and the actual target values, divided by the number of data points.
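As a minimal sketch, the definition above translates directly into Python (the `mse` function name and the sample values are illustrative):

```python
def mse(y_true, y_pred):
    # Sum of squared differences between actual and predicted values,
    # divided by the number of data points.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# RMSE is simply the square root of the MSE.
def rmse(y_true, y_pred):
    return mse(y_true, y_pred) ** 0.5
```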
What is the misclassification rate?
It is the ratio of the number of misclassified examples to the total number of examples.
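A minimal sketch of this ratio in Python (the function name and labels are illustrative):

```python
def misclassification_rate(y_true, y_pred):
    # Count examples whose predicted label differs from the true label,
    # then divide by the total number of examples.
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)
```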
Why is dimensionality reduction useful?
It reduces the time and memory needed for training and prediction, removes redundant or noisy features (which can improve model performance), and makes it possible to visualize high-dimensional data in two or three dimensions.
What is Subset Selection?
It is also known as feature selection, variable selection, or attribute selection. It is the process of selecting a subset of relevant features for use in model construction.
What are the two approaches in subset selection?
1. Forward selection
2. Backward selection
What is Forward selection? Explain in detail.
Here, we start with no variables and add them one at a time: at each step, we add the variable that decreases the error the most. Additions continue until no remaining variable decreases the error any further.
Explain the algorithm.
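The greedy loop described above can be sketched as follows. Here `error_of` is an assumed caller-supplied function that trains a model on a given feature subset and returns its validation error; the function and feature names are illustrative, not a library API:

```python
def forward_selection(features, error_of):
    # error_of(subset) must accept any subset of features,
    # including the empty list, and return a validation error.
    selected = []
    best_err = error_of(selected)
    remaining = list(features)
    while remaining:
        # Evaluate adding each remaining feature to the current subset.
        trial_errs = {f: error_of(selected + [f]) for f in remaining}
        best_f = min(trial_errs, key=trial_errs.get)
        if trial_errs[best_f] >= best_err:
            break  # no addition decreases the error further
        selected.append(best_f)
        remaining.remove(best_f)
        best_err = trial_errs[best_f]
    return selected
```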
What is Backward Selection? Explain in detail.
Here, we start with the set containing all the variables (features), and at each step we remove the variable whose removal increases the error the least (or decreases it the most). Removals continue until any further removal would increase the error.
Explain the algorithm.
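The mirror image of forward selection can be sketched as follows; as before, `error_of` is an assumed caller-supplied evaluation function, not part of any library:

```python
def backward_selection(features, error_of):
    # Start with all features; repeatedly drop the feature whose
    # removal leaves the lowest error, while doing so does not
    # increase the error.
    selected = list(features)
    best_err = error_of(selected)
    while len(selected) > 1:
        trial_errs = {f: error_of([g for g in selected if g != f])
                      for f in selected}
        drop_f = min(trial_errs, key=trial_errs.get)
        if trial_errs[drop_f] > best_err:
            break  # every removal increases the error
        selected.remove(drop_f)
        best_err = trial_errs[drop_f]
    return selected
```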
What is a Principal Component Analysis (PCA)?
It is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
The number of principal components is always less than or equal to the number of original variables (and the number of observations).
What are the various steps in PCA? Explain.
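The usual steps are: (1) center the data by subtracting the mean of each variable, (2) compute the covariance matrix of the centered data, (3) eigendecompose the covariance matrix, (4) sort the eigenvectors by decreasing eigenvalue (explained variance), and (5) project the data onto the top k eigenvectors. A minimal NumPy sketch of these steps (the `pca` helper is illustrative, not a library function):

```python
import numpy as np

def pca(X, k):
    # 1. Center the data (subtract the mean of each variable).
    Xc = X - X.mean(axis=0)
    # 2. Compute the covariance matrix of the centered data.
    cov = np.cov(Xc, rowvar=False)
    # 3. Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # 4. Sort components by decreasing eigenvalue and keep the top k.
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order[:k]]
    # 5. Project the centered data onto the top-k principal components.
    return Xc @ components
```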