What is the Goal of Classification
Previously unseen records should be assigned a class from a given set of classes as accurately as possible
Explain the Approach of Classification
Learn a model for the class attribute as a function of the other attributes from a training set of records with known class labels; then apply the model to previously unseen records
Which variants of Classification exist
What are the steps in the Model Learning and Model Application Process?
Training Set -> (Induction) -> Learn Model -> Model
Unseen Records -> (Deduction) -> Apply Model -> Predicted Class Labels
Give some classification examples
- Detecting credit card fraud
- Classifying tumor cells as benign or malignant
- Filtering spam e-mails
- Categorizing news stories (finance, weather, sports, ...)
List seven classification techniques
K-Nearest-Neighbors, Decision Trees, Rule Learning, Naïve Bayes, Support Vector Machines, Artificial Neural Networks, Ensemble Methods
Explain the approach of K-nearest-neighbor
It requires:
- the set of stored training records
- a distance measure between records
- the value of k, the number of neighbors to retrieve
Approach:
For each unknown record:
1. Compute distance to each training record
2. Identify k-nearest neighbors
3. Use class labels of nearest neighbors to determine class of unknown record
- majority vote or
- weighting vote according to distance
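The three steps above can be sketched in plain Python (a minimal illustration; `knn_classify` and the toy records are made up for this example):

```python
from collections import Counter
import math

def knn_classify(unknown, training, k=3):
    """Classify `unknown` by majority vote among its k nearest
    training records (Euclidean distance). `training` is a list of
    (point, label) pairs; all names here are illustrative."""
    # 1. Compute the distance to each training record
    dists = [(math.dist(unknown, point), label) for point, label in training]
    # 2. Identify the k nearest neighbors
    dists.sort(key=lambda d: d[0])
    neighbors = dists[:k]
    # 3. Majority vote over the neighbors' class labels
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [((1, 1), "A"), ((1, 2), "A"), ((5, 5), "B"), ((6, 5), "B")]
print(knn_classify((1.5, 1.5), train, k=3))  # two "A" neighbors vs. one "B" -> "A"
```

A distance-weighted vote would replace the `Counter` with per-label sums of 1/distance.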
What is a k-nearest neighbor?
Given an unknown record x, the k-nearest neighbors are the data points with the k smallest distances to x
How to choose a good value for K?
Rule of Thumb: Test k values between 1 and 20
If k too small: result is sensitive to noise points
If k is too large: neighborhood may include points from other classes
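The rule of thumb can be turned into a small validation loop (a sketch; `choose_k`, `knn_classify`, and the toy data are illustrative, not a fixed API):

```python
import math
from collections import Counter

def knn_classify(x, training, k):
    # majority vote among the k nearest training records
    nearest = sorted(training, key=lambda rec: math.dist(x, rec[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def choose_k(training, validation, k_values=range(1, 21)):
    """Try each candidate k on a held-out validation set and keep
    the most accurate one (function names and data are illustrative)."""
    def accuracy(k):
        hits = sum(knn_classify(x, training, k) == y for x, y in validation)
        return hits / len(validation)
    return max(k_values, key=accuracy)

training = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
            ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B"),
            ((0.5, 0.5), "B")]  # the last record acts as a noise point
validation = [((0.4, 0.4), "A"), ((5.5, 5.5), "B")]
print(choose_k(training, validation))  # k = 1 is fooled by the noise point
```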
What are the pros and cons of k-nearest-neighbor classification
+ Often very accurate (e.g. for optical character recognition)
- But slow: each unseen record must be compared to all training records
Neutral aspects:
Describe Lazy Learning
The training data is simply stored; generalization is deferred until an unseen record has to be classified
Single goal: classify unseen records as accurately as possible
Example: KNN
Describe Eager Learning
Compared to lazy learning, eager learning has two goals:
1. Classify unseen records as accurately as possible
2. Learn a model that is understandable for humans
Examples: decision tree learning, rule learning
What are the components of a decision tree classifier
- Internal nodes: each tests a single attribute (test condition)
- Branches: one per outcome of the test
- Leaf nodes: each assigns a class label
Why are the decision boundaries of a decision tree parallel to the axes?
Because each test condition involves only a single attribute at a time
-> It is a step-by-step division of the space into regions whose boundaries are parallel to the axes
How to learn a decision tree from training data?
Example Algorithms: Hunt's Algorithm, CART, ID3, C4.5
Explain Hunt’s algorithm (decision tree)
Let Dt be the set of training records that reach node t.
(1) If Dt contains only records of the same class, then t is a leaf node labeled with that class.
(2) If Dt contains records of more than one class, use an attribute test to split the data into subsets with higher purity.
(3) Recursively apply the procedure to each subset.
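This recursion can be sketched as follows (records are hypothetical `(attribute_dict, label)` pairs, and for brevity the split simply uses the next attribute instead of searching for the purest test):

```python
from collections import Counter

def hunt(records, attributes):
    """Sketch of Hunt's algorithm. Returns a class label (leaf node)
    or a tuple (attribute, {value: subtree}) for an internal node."""
    labels = [label for _, label in records]
    # (1) Dt contains only records of one class -> t becomes a leaf node
    if len(set(labels)) == 1:
        return labels[0]
    # No attributes left to test -> leaf node with the majority class
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # (2) More than one class -> split the data with an attribute test
    attr, remaining = attributes[0], attributes[1:]
    branches = {}
    for value in {rec[attr] for rec, _ in records}:
        subset = [(rec, label) for rec, label in records if rec[attr] == value]
        # (3) Recursively apply the procedure to each subset
        branches[value] = hunt(subset, remaining)
    return (attr, branches)

data = [({"outlook": "sunny"}, "no"),
        ({"outlook": "rain"}, "yes"),
        ({"outlook": "sunny"}, "no")]
print(hunt(data, ["outlook"]))
```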
What are the design issues for learning decision trees?
1. How should the records be split (choice of the attribute test condition)?
2. When should the splitting stop?
How can you split nominal attributes in decision trees
- Multi-way split: Use as many partitions as distinct values
- Binary split: Divides values into two subsets
How can you split ordinal attributes in decision trees
- Multi-way split: Use as many partitions as distinct values, keeping the order
- Binary split: Divides values into two subsets while keeping the order
How can you split continuous attributes in decision trees
- Discretization: form ordinal categories via binning (e.g. equal-interval or equal-frequency binning)
- Binary decision: (A < v) or (A >= v) for some split point v
Explain equal-interval binning (decision tree splitting based on continuous attributes)
For values of the attribute age of a person:
0, 4, 12, 16, 16, 18, 24, 26, 28
Specify a bin width, e.g. 10:
Bin 1: (-∞,10)
Bin 2: [10,20)
Bin 3: [20,+∞)
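Equal-interval binning with width 10 can be sketched as (function name is illustrative):

```python
def interval_bin(value, width=10):
    """Map a value to its equal-interval bin, e.g. width 10 ->
    [0,10), [10,20), [20,30), ..."""
    lower = (value // width) * width
    return (lower, lower + width)

ages = [0, 4, 12, 16, 16, 18, 24, 26, 28]
bins = {}
for age in ages:
    bins.setdefault(interval_bin(age), []).append(age)
print(bins)  # {(0, 10): [0, 4], (10, 20): [12, 16, 16, 18], (20, 30): [24, 26, 28]}
```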
Explain equal-frequency binning (decision tree splitting based on continuous attributes)
For values of the attribute age of a person:
0, 4, 12, 16, 16, 18, 24, 26, 28
Specify a bin frequency (number of values per bin), e.g. 3:
Bin 1: (-∞,14)
Bin 2: [14,21)
Bin 3: [21,+∞)
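Equal-frequency binning with 3 values per bin can be sketched as (function name is illustrative; the boundaries 14 and 21 above are then drawn between adjacent bins):

```python
def frequency_bins(values, per_bin=3):
    """Split the sorted values into consecutive bins of
    `per_bin` elements each."""
    ordered = sorted(values)
    return [ordered[i:i + per_bin] for i in range(0, len(ordered), per_bin)]

ages = [0, 4, 12, 16, 16, 18, 24, 26, 28]
print(frequency_bins(ages))  # [[0, 4, 12], [16, 16, 18], [24, 26, 28]]
```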
How to find the best attribute split for decision trees?
Greedy approach: Test all splits and choose the one yielding the most homogeneous (pure) nodes, i.e. the lowest impurity measure after splitting
Requires: measure of node impurity
Common measures: GINI index, entropy (information gain), misclassification error
Which bucket has the higher degree of impurity?
Left bucket: C0: 5, C1: 5
Right bucket: C0: 9, C1: 1
The left one: a 5/5 distribution is maximally impure, while 9/1 is almost pure
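The comparison can be checked numerically with the GINI index, one common impurity measure (a minimal sketch):

```python
def gini(class_counts):
    """GINI impurity 1 - sum(p_i^2): 0 for a pure node,
    0.5 is the maximum for two classes."""
    total = sum(class_counts)
    return 1.0 - sum((c / total) ** 2 for c in class_counts)

print(gini([5, 5]))            # 0.5 -> the left bucket is maximally impure
print(round(gini([9, 1]), 2))  # 0.18 -> the right bucket is almost pure
```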