what are Decision Trees (or CARTS)
they partition the predictor space by simple binary splitting rules. Think ‘flow chart’
What does an excessive amount of splitting lead to?
Overfitting
Goal of growing a Regression Tree?
Predict the mean response for the observations that fall into each region Rj from that training set
Goal of growing a Classification Tree?
Predict the categorical response as the most commonly occurring in. each region Rj from the training set
How do we ‘best’ partition the predictor space for regression?
find binary splits which minimize the RSS