Purpose of tree-based algorithms
Prediction
( applicable to regression and classification )
Impurity measures
Used to determine the quality of the split
- Gini Index
- Entropy
- Re-substitution error
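The three impurity measures above can be sketched in plain Python for a binary node; the function name and the example proportions are illustrative, not from the notes:

```python
import math

def impurities(p):
    """Impurity of a node with class proportions p (must sum to 1).
    Returns (Gini index, entropy, re-substitution/misclassification error)."""
    gini = 1.0 - sum(q * q for q in p)
    entropy = -sum(q * math.log2(q) for q in p if q > 0)
    misclass = 1.0 - max(p)  # re-substitution error
    return gini, entropy, misclass

g, e, m = impurities([0.5, 0.5])  # a maximally impure two-class node
```

All three are maximised at equal class proportions and are zero for a pure node, which is why any of them can score split quality.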
Explain the decision tree algorithm
(0. Pre-processing, e.g. binarization of categorical variables; for a continuous response use a regression tree instead of a classification tree)
1. Recursive binary splitting and determine splits via impurity measures
2. Improve with cost-complexity pruning -> grow a large tree, then prune it back.
( Select tuning parameter using (k-fold) cross-validation )
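Step 1 above (choosing splits via an impurity measure) can be sketched as a greedy threshold search over one feature; this is a hypothetical minimal version using the Gini index, not a full recursive implementation:

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(x, y):
    """One greedy step of recursive binary splitting: pick the threshold
    on feature x minimising the weighted Gini of the two children."""
    best_t, best_score = None, float("inf")
    for t in sorted(set(x)):
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

x = [1, 2, 3, 10, 11, 12]
y = [0, 0, 0, 1, 1, 1]
t, score = best_split(x, y)
```

The full algorithm applies this search over all features and recurses into both children until a stopping rule fires; pruning then trades subtree size against fit.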
Decision Tree advantages
Decision Tree disadvantages
Effect of Bagging, Boosting, RF
+ Increase predictive accuracy (lower variance)
RF: Adds random feature selection at each split to bagging, producing decorrelated trees -> less risk of overfitting
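The random selection RF adds on top of bagging can be sketched in one line: at every split, only a random subset of the predictors is eligible. The function name and sizes below are illustrative assumptions:

```python
import random

def features_for_split(n_features, m, rng):
    """RF's twist on bagging: at each split only a random subset of
    m out of n_features predictors is considered, which decorrelates
    the bagged trees. (Names here are illustrative, not from the notes.)"""
    return rng.sample(range(n_features), m)

subset = features_for_split(n_features=10, m=3, rng=random.Random(0))
```

Because strong predictors cannot dominate every tree, the trees' errors are less correlated and averaging reduces variance more effectively than plain bagging.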
Explain Bagging
Fit a model to each of many bootstrap samples and aggregate their predictions.
( Bags chosen at random with replacement = bootstrap samples )
+ Improves performance and reduces variance
Explain Boosting
Ensemble of weak learners trained sequentially: each iteration fits a new model to the observations that previous iterations modelled poorly (gradient boosting fits each new learner to the current residuals).
Boosting vs. RF vs. Bagging
Boosting:
— Selects from all predictive variables at each split
— Sequential: each iteration depends on the errors of the previous one
— 3 tuning parameters ( number of trees, shrinkage/learning rate, tree depth )
Benefit: learns from the previous iterations' errors
RF:
— Selects from a random subset of the predictive variables at each split
— Trees are built independently at each iteration
— 2 tuning parameters ( number of trees, size m of the predictor subset )
Bagging:
— Aggregation (mean for regression, majority vote for classification) of trees grown on bootstrap samples
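Boosting's sequential dependence on the previous iteration's errors can be sketched with the simplest possible learner, a constant fit to the current residuals; names and constants below are illustrative assumptions, not from the notes:

```python
def boost_constant(y, n_rounds=200, lr=0.1):
    """Toy boosting mechanics: each round fits a constant to the CURRENT
    residuals, so every step depends on the previous step's errors.
    lr is the shrinkage (learning-rate) tuning parameter.
    (A real booster would fit a small tree each round.)"""
    pred = [0.0] * len(y)
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]  # previous errors
        step = sum(resid) / len(resid)                # best constant fit
        pred = [pi + lr * step for pi in pred]        # shrunken update
    return pred

pred = boost_constant([1.0, 3.0])  # predictions approach the mean, 2.0
```

Contrast with bagging/RF: there the trees are fit independently and only combined at the end, so no iteration sees another's errors.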