What are two models for supervised learning?
What is the K-nearest neighbours (KNN) algorithm?
A supervised machine learning algorithm that assigns a new data point to a target class based on the classes of the k training points closest to it in feature space, usually by majority vote.
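As a rough illustration of the idea, here is a minimal pure-Python sketch. It assumes Euclidean distance and a simple majority vote; the function name `knn_predict` and the toy data are made up for this example and are not from any particular library.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the class of `query` from its k nearest training points."""
    # Distance from the query to every training point (Euclidean, by assumption).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the labels of the k closest points.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Majority vote; on a tie, the label seen first among the neighbours wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.0), (6.5, 8.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(train_points, train_labels, (2.0, 2.0), k=3)
```

In practice the features are usually scaled first, since distances are otherwise dominated by the feature with the largest range.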
What are the steps in using the KNN algorithm?
What is Kappa in the KNN output?
Kappa = (accuracy - random accuracy) / (1 - random accuracy), where random accuracy is the accuracy we would expect from chance agreement alone.
A negative value means that random assignment would do better than our prediction.
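A small worked sketch of the formula, assuming "random accuracy" is the chance agreement computed from the row and column totals of a confusion matrix (the usual definition for Cohen's kappa). The function name `cohen_kappa` and the example matrices are illustrative only.

```python
def cohen_kappa(confusion):
    """Kappa from a square confusion matrix (rows = actual, columns = predicted)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed accuracy: share of cases on the diagonal.
    accuracy = sum(confusion[i][i] for i in range(n_classes)) / total
    # Random accuracy (assumed definition): agreement expected if predictions
    # were assigned at random with the same class totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)]
    random_accuracy = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (accuracy - random_accuracy) / (1 - random_accuracy)


# Hypothetical two-class results: accuracy 0.85, random accuracy 0.5.
good = cohen_kappa([[40, 10], [5, 45]])
# Hypothetical classifier that is worse than chance: accuracy 0.2.
bad = cohen_kappa([[10, 40], [40, 10]])
```

With these made-up numbers the first matrix gives (0.85 - 0.5) / (1 - 0.5) = 0.7, while the second gives a negative kappa, matching the interpretation above.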
What are decision trees used for?
Trees are a way to split data into progressively purer subsets. At each step we look for the split that groups the data into the most homogeneous categories, then repeat the process on each resulting subset.
What does the process of decision trees look like?
How do we find the best split for decision trees?
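By choosing the split with the lowest Gini impurity, a measure of how mixed the classes are within the resulting subsets.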
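A minimal sketch of how this could work for a single numeric feature, assuming the common definition of Gini impurity (1 minus the sum of squared class proportions) and a size-weighted average over the two subsets. The helper names `gini_impurity` and `best_threshold` and the data are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """1 - sum of squared class proportions; 0 means a perfectly pure subset."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Try each candidate threshold and keep the lowest weighted impurity."""
    best = (None, float("inf"))
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        # Impurity of the split: subset impurities weighted by subset size.
        weighted = (
            len(left) * gini_impurity(left) + len(right) * gini_impurity(right)
        ) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical toy feature that separates the two classes cleanly.
values = [1, 2, 3, 10, 11, 12]
labels = ["a", "a", "a", "b", "b", "b"]
threshold, impurity = best_threshold(values, labels)
```

Here the search only looks at splits of one node, which is the same node-by-node evaluation a tree repeats as it grows.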
What is a limitation of the Gini impurity?
It is computed only for the candidate splits of one node at a time, so the search is greedy and can only reach a local minimum (the algorithm might not find the optimal tree overall).
What are some limitations of decision tree classifiers?