What is the purpose of boosting methods in ensemble techniques?
To train a sequence of simple predictors that correct the errors of previous ones
Boosting methods improve the performance of weak classifiers.
Name the two boosting methods discussed.
AdaBoost and Gradient Boosting
Both are used to enhance the performance of weak classifiers.
In AdaBoost, what happens to the training data instances when training the ith predictor?
Instances are reweighted to increase the influence of those misclassified by the previous predictors
This helps improve the accuracy of the ensemble model.
What is the AdaBoost model based on?
A weighted voting classifier
The predictions are made based on the performance of the predictors.
In AdaBoost, what does a weight of 2 for a training instance imply?
The instance contributes twice as much to the objective function
This means it has a greater influence on the training process.
What is the initial weight of each training instance in AdaBoost?
w_1^(i) = 1/m for all i
Here, m is the number of training instances.
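The cards above can be sketched in code: a minimal discrete AdaBoost with threshold stumps on 1-D inputs, starting from uniform weights w_1^(i) = 1/m and upweighting misclassified instances each round. This is an illustration under simplifying assumptions (1-D data, brute-force stump search, labels in {-1, +1}), not scikit-learn's implementation; the `adaboost_*` names are ours.

```python
import numpy as np

def adaboost_train(X, y, n_estimators=10):
    """Discrete AdaBoost with threshold stumps on 1-D inputs (y in {-1, +1})."""
    m = len(X)
    w = np.full(m, 1.0 / m)           # initial weights: w_1^(i) = 1/m
    stumps = []                       # list of (threshold, sign, alpha)
    for _ in range(n_estimators):
        best = None
        # search over thresholds and polarities for the lowest weighted error
        for thr in X:
            for sign in (+1, -1):
                pred = np.where(X >= thr, sign, -sign)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, thr, sign)
        err, thr, sign = best
        err = max(err, 1e-10)         # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(X >= thr, sign, -sign)
        # increase the weight of misclassified instances, then renormalise
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        stumps.append((thr, sign, alpha))
    return stumps

def adaboost_predict(stumps, X):
    """Weighted vote of the stumps (the weighted voting classifier)."""
    score = sum(alpha * np.where(X >= thr, sign, -sign)
                for thr, sign, alpha in stumps)
    return np.sign(score)
```

Note that each round's weights depend on the previous round's predictor, which is why the estimators cannot be trained in parallel.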
What is a key characteristic of Gradient Boosting?
Each predictor is trained on the residual errors of the ensemble built so far
This sequential training improves the overall model accuracy.
In Gradient Boosting for Regression, what is the initial value of F0(x)?
F0(x) = 0
This serves as the starting point for the predictions.
What does the learning rate parameter (η) do in Gradient Boosting?
Controls the contribution of each predictor to the final prediction
Smaller values of η typically require more predictors but generalize better.
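The gradient-boosting cards can be sketched as follows: starting from F0(x) = 0, each step fits a predictor to the current residuals and adds its output scaled by η. A minimal illustration assuming 1-D inputs, squared loss, and single threshold stumps as base predictors; the `gb_*` names are ours.

```python
import numpy as np

def gb_fit(X, y, n_estimators=50, eta=0.5):
    """Gradient boosting for regression on 1-D inputs, starting from F0(x) = 0.
    Each base predictor is a threshold stump fitted to the residuals."""
    F = np.zeros_like(y, dtype=float)   # F0(x) = 0
    model = []                          # list of (threshold, left_value, right_value)
    for _ in range(n_estimators):
        r = y - F                       # residuals = negative gradient of squared loss
        # fit a stump to the residuals: pick the split with the lowest SSE
        best = None
        for thr in X:
            left, right = r[X < thr], r[X >= thr]
            lv = left.mean() if left.size else 0.0
            rv = right.mean() if right.size else 0.0
            sse = ((left - lv) ** 2).sum() + ((right - rv) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, thr, lv, rv)
        _, thr, lv, rv = best
        F += eta * np.where(X >= thr, rv, lv)   # shrink each step by eta
        model.append((thr, lv, rv))
    return model

def gb_predict(model, X, eta=0.5):
    """Final ensemble: the (shrunken) sum of the base predictors' outputs."""
    return sum(eta * np.where(X >= thr, rv, lv) for thr, lv, rv in model)
```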
True or false: In AdaBoost, the estimators can be trained in parallel.
FALSE
Estimators in AdaBoost must be trained sequentially.
What is a common base classifier used in AdaBoost?
Small decision trees
However, any classifier can be used.
What is a drawback of ensemble methods like AdaBoost and Gradient Boosting?
Loss of interpretability
The complexity of the model makes it harder to understand.
What is the final ensemble in Gradient Boosting based on?
The sum of the predictions of the base predictors
This aggregation improves the overall prediction accuracy.
What is the role of validation in boosting methods?
To determine the learning rate and the number of predictors to use
Helps in tuning the model for better performance.
What is the main goal of ensemble learning?
To combine the predictions of many weak predictors to make a strong predictor
This enhances the overall predictive performance.
What is the curse of dimensionality?
The phenomenon where algorithms and ML models degrade in performance as the dimensionality of the data grows
This can affect computational cost and model performance.
List two issues caused by the curse of dimensionality.
Data becomes sparse, and distances between points become less meaningful
High-dimensional space has counter-intuitive geometry leading to these issues.
What does dimensionality reduction aim to achieve?
Creating a new representation of a data set in fewer dimensions while preserving its structural properties
This process can help mitigate issues related to high-dimensional data.
Name two benefits of dimensionality reduction.
Lower computational and storage cost; some models perform better in lower dimensions.
What is a major drawback of dimensionality reduction?
Loses information
Ground truth decision boundaries may become more complex.
What are projection techniques in the context of dimensionality reduction?
A class of linear dimensionality reduction techniques
Examples include principal component analysis (PCA) and random projection.
What does Principal Component Analysis (PCA) find?
The dimensions of maximum variation in the data
Finding principal components is equivalent to computing a singular value decomposition (SVD).
What is the equation for the PCA decomposition?
X = UΣV⊺
Where U and V are orthogonal matrices and Σ is a diagonal matrix of singular values.
How can you reduce dimensions using PCA?
By projecting to the first k principal components: Tk = XVk
Vk contains the first k columns of V.
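The PCA cards above can be reproduced in a few lines of NumPy: compute the SVD of the centred data matrix, take the first k columns of V, and project. The toy data set and variable names here are ours for illustration.

```python
import numpy as np

# Toy data: 100 points in 3-D whose third dimension carries almost no variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[:, 2] *= 0.01

Xc = X - X.mean(axis=0)             # PCA assumes centred data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)   # Xc = U Σ Vᵀ

k = 2
Vk = Vt[:k].T                       # first k principal components (columns of V)
Tk = Xc @ Vk                        # projection: Tk = X Vk
```

Since the third dimension contributes almost no variation, projecting onto the first two principal components loses very little information, as reconstructing Xc from Tk confirms.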