Equation for Entropy
SUM over each outcomeN
(
-P(outcomeN)*log2(P(outcomeN))
)
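A minimal sketch of the entropy formula in Python, assuming the outcomes are given as a plain list of class labels and each probability is estimated as a relative frequency. The function name `entropy` is chosen here for illustration only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Add up -p * log2(p) over every distinct outcome,
    # where p is the fraction of labels with that outcome.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# A 50/50 split is maximally uncertain for two classes.
print(entropy(["yes", "yes", "no", "no"]))  # 1.0
```

A set in which every label is the same has entropy 0, since the only outcome has probability 1 and log2(1) = 0.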
Equation for Information Gain
Entropy(S) -
SUM over each Value of Feature
(
P(Rows with Feature=Value in S)
*
Entropy(Rows with Feature=Value in S)
)
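The formula above can be sketched in Python as follows. This is an illustration under assumptions, not a reference implementation: rows are assumed to be dicts, and the column names `windy` and `play` are made up for the example.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty list of class labels."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())


def information_gain(rows, feature, target):
    """Entropy(S) minus the weighted entropy of each Feature=Value subset."""
    parent_entropy = entropy([row[target] for row in rows])
    # Group the target labels by the value the feature takes in each row.
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[target])
    # Weight each subset's entropy by the fraction of rows it contains.
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent_entropy - weighted


# Hypothetical toy data: "windy" perfectly predicts "play".
rows = [
    {"windy": "yes", "play": "no"},
    {"windy": "yes", "play": "no"},
    {"windy": "no", "play": "yes"},
    {"windy": "no", "play": "yes"},
]
print(information_gain(rows, "windy", "play"))  # 1.0
```

Here the split removes all uncertainty, so the gain equals the full entropy of the set (1 bit).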
What are the two types of Decision Trees?
Classification (target is discrete)
Regression (target is continuous)
What are the stopping criteria for decision trees?
1) All or nearly all of the data at a node has the same class label
2) There are no features left to split on
3) The tree reaches a predefined maximum depth
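The three criteria above could be checked with a helper like the one below. It is only a sketch: the name `should_stop` and the defaults `max_depth=5` and `purity_threshold=0.95` (what counts as "nearly all") are assumptions made for illustration, not standard values.

```python
from collections import Counter


def should_stop(labels, remaining_features, depth,
                max_depth=5, purity_threshold=0.95):
    """Return True if any of the three stopping criteria holds at a node."""
    # Assumption: a node with no data cannot be split, so stop.
    if not labels:
        return True
    # 1) All or nearly all of the data has the same class label.
    majority_count = Counter(labels).most_common(1)[0][1]
    if majority_count / len(labels) >= purity_threshold:
        return True
    # 2) There are no features left to split on.
    if not remaining_features:
        return True
    # 3) The tree has reached the predefined maximum depth.
    return depth >= max_depth


print(should_stop(["a", "b"], ["color"], depth=1))  # False
```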
What are feature selection measures?
Criteria for deciding which feature to split on at the root node and at each subsequent level.
E.g. Gini index, information gain
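As a rough illustration of a feature selection measure, the sketch below uses Gini impurity (1 minus the sum of squared class probabilities) and picks the feature whose split gives the lowest weighted impurity. The function names and the toy columns `windy`, `humid` and `play` are hypothetical.

```python
from collections import Counter, defaultdict


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def weighted_gini(rows, feature, target):
    """Average Gini impurity of the subsets produced by splitting on feature."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[target])
    return sum(len(g) / len(rows) * gini(g) for g in groups.values())


def best_feature(rows, features, target):
    """Pick the feature whose split leaves the least impurity."""
    return min(features, key=lambda f: weighted_gini(rows, f, target))


# Hypothetical toy data: "windy" separates the classes, "humid" does not.
rows = [
    {"windy": "yes", "humid": "yes", "play": "no"},
    {"windy": "yes", "humid": "no", "play": "no"},
    {"windy": "no", "humid": "yes", "play": "yes"},
    {"windy": "no", "humid": "no", "play": "yes"},
]
print(best_feature(rows, ["humid", "windy"], "play"))  # windy
```

The same selection loop works with information gain by choosing the feature with the highest gain instead of the lowest impurity.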