What’s the most common splitting criterion?
information gain
What’s the role of Decision Trees?
Create a formula/algorithm that evaluates how well each attribute splits a set of examples into segments, with respect to a chosen target variable
What does disorder correspond to?
how mixed (impure) the segment is with respect to the values of the attribute of interest
Formula of Entropy
entropy = -p1 log(p1) - p2 log(p2) - ...
Define Pi
probability of value i within the set (relative percentage/share)
When is Pi = 1?
when all members of the set have attribute i
When is Pi = 0?
when no members of the set have attribute i
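The entropy formula and the definition of Pi above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `entropy` is hypothetical, and base-2 logarithms are an assumption, since the card does not state the base.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of values, in bits (log base 2 is an assumption)."""
    total = len(labels)
    counts = Counter(labels)
    # p_i is the share of value i within the set. Values that never occur
    # (p_i = 0) are not in the Counter, so they contribute nothing.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# A 50/50 mix is maximally impure for two values; a pure set has no disorder.
mixed = entropy(["yes", "yes", "no", "no"])
pure = entropy(["yes", "yes", "yes", "yes"])
```

With two equally likely values the result is 1 bit, and when Pi = 1 for a single value the result is 0.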
What is the parent set?
the original set of examples
What does an attribute do?
It segments a set of instances into k subsets.
What are K children sets?
The result of splitting on the attribute values.
What does information gain measure?
Formula IG(parent)
IG(parent) = entropy(parent) - p(c1) entropy(c1) - p(c2) entropy(c2) - ...
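The information gain formula can be sketched as follows. This is a hedged illustration under stated assumptions: `entropy` and `information_gain` are hypothetical names, logarithms are base 2, and p(c) is taken to be the share of the parent's examples that fall into child c.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of target values, in bits (base 2 is an assumption)."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the weighted entropy of the children.

    parent:   list of target values in the original set
    children: list of lists, the target values in each child set
    """
    total = len(parent)
    # p(c) is assumed to be the fraction of parent examples in child c.
    weighted = sum(len(c) / total * entropy(c) for c in children)
    return entropy(parent) - weighted


# Toy data (made up for illustration): one split separates the classes
# perfectly, the other leaves each child as mixed as the parent.
parent = ["yes", "yes", "no", "no"]
perfect = information_gain(parent, [["yes", "yes"], ["no", "no"]])
useless = information_gain(parent, [["yes", "no"], ["yes", "no"]])
```

A split that produces pure children recovers all of the parent's entropy as gain, while a split that leaves the children equally mixed gains nothing.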
Formula Entropy (HS = square)
Formula Entropy (HS = circle)
Formula IG = entropy (Write-off)..
What reduces entropy substantially?
splitting the parent data set on the body shape attribute
How do you find the best attribute to partition the sets?
recursively apply attribute selection
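Recursive attribute selection can be sketched roughly as below. This is a simplified illustration, not the ID3 algorithm as specified anywhere in these notes: the names `split`, `information_gain`, and `build_tree` are hypothetical, rows are assumed to be dictionaries, and the stopping rule (pure segment or no attributes left, then majority vote) is an assumption.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of target values, in bits (base 2 is an assumption)."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def split(rows, attribute):
    """Group rows into child sets, one per value of the attribute."""
    children = {}
    for row in rows:
        children.setdefault(row[attribute], []).append(row)
    return children


def information_gain(rows, attribute, target):
    """Parent entropy minus the size-weighted entropy of the child sets."""
    parent = [r[target] for r in rows]
    weighted = sum(
        len(child) / len(rows) * entropy([r[target] for r in child])
        for child in split(rows, attribute).values()
    )
    return entropy(parent) - weighted


def build_tree(rows, attributes, target):
    """Pick the attribute with the highest gain, split, and recurse."""
    labels = [r[target] for r in rows]
    # Assumed stopping rule: a pure segment, or nothing left to split on.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    return {best: {value: build_tree(child, remaining, target)
                   for value, child in split(rows, best).items()}}
```

Called as `build_tree(rows, ["color", "shape"], "label")` on a list of dictionaries, it returns either a predicted label or a nested dictionary keyed by the chosen attribute and its values.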
Disadvantages of ID3
List ANN (artificial neural networks)
Define neurons
cells (processing elements) of a biological or artificial neural network
Define the nucleus
the central processing portion of a neuron
Define the dendrite
the part of a biological neuron that provides inputs to the cell
Define the axon
an outgoing connection (i.e., terminal) from a biological neuron
Define synapse
the connection (where the weights are) between processing elements in a neural network