Non-Parametric Classifiers Flashcards

(19 cards)

1
Q

How is a non-parametric classifier different to a parametric classifier

A

Non-parametric classifiers do not make assumptions about the underlying data distribution and can adapt more flexibly as more data becomes available

2
Q

5 characteristics of non-parametric classifiers

A

Non-parametric classifiers use the data directly at classification time

No explicit model/distribution of the data, i.e. they are not governed by a fixed set of parameters; the complexity of the model grows with the size of the dataset

No explicit learning stage

Infinite parameters (in theory): the number of parameters grows as more data is added, which allows for greater flexibility in fitting the data

Rely heavily on the training data to define the decision boundary between classes

3
Q

Advantages of parametric classifiers

A

The number of parameters is fixed, and typically less training data is needed

Once parameters are known, the training data can be discarded

4
Q

Advantages and disadvantages of non-parametric classifiers

A

More flexible

Often expensive and often require a lot of data to learn things that were assumed by parametric approaches

Poor choice when data is known to come from simple distributions

5
Q

What is K-NN

A

K Nearest Neighbour assigns a class to a new input based on the majority class of its k nearest neighbours in the feature space
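A minimal sketch of the idea in plain Python (the toy dataset and the `knn_classify` helper are illustrative, not from the card):

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.
    `train` is a list of (features, label) pairs; distance is Euclidean."""
    neighbours = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy 2-D dataset: two well-separated clusters labelled 'A' and 'B'.
data = [((1, 1), 'A'), ((1, 2), 'A'), ((2, 1), 'A'),
        ((8, 8), 'B'), ((8, 9), 'B'), ((9, 8), 'B')]

print(knn_classify(data, (2, 2), k=3))  # 'A' — query sits in the 'A' cluster
```

Note that all training points must be kept around and scanned at prediction time, which is exactly the "no explicit learning stage, data used directly at classification time" property described earlier.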

6
Q

Strengths and weaknesses of KNN

A

S: simple to understand, no need for training, flexible with more data

W: computationally expensive for large datasets, performance can degrade with high-dimensional data

7
Q

How does changing k affect the decision boundaries

A

The greater the value of k, the smoother and less jagged the decision boundaries. The smaller the value of k, the more aggressive the decision boundaries and the more they are affected by outliers, though they can follow the training data more closely.

The smaller the value, the greater the chance of overfitting; the larger the value, the greater the chance of underfitting.
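The overfitting effect of a small k can be seen directly: a single stray point flips the prediction when k = 1 but is outvoted when k is larger. A minimal sketch (the toy data and helper are illustrative):

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    neighbours = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

# An 'A' cluster around (1, 1) with a single stray 'B' point inside it.
data = [((1, 1), 'A'), ((1, 2), 'A'), ((2, 1), 'A'), ((2, 2), 'A'),
        ((1.5, 1.5), 'B'),                       # the outlier
        ((9, 9), 'B'), ((9, 8), 'B'), ((8, 9), 'B')]

query = (1.6, 1.6)                               # sits right next to the outlier
print(knn_classify(data, query, k=1))  # 'B' — k=1 follows the outlier (overfits)
print(knn_classify(data, query, k=5))  # 'A' — a larger k smooths it out
```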

8
Q

What is a decision tree

A

Builds a tree structure where each internal node represents a decision based on an attribute, and each leaf node represents a class label
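A tiny hand-built tree makes the structure concrete (the attributes and labels here are made-up examples, not from the card): internal nodes are dicts that test one attribute, leaves are plain class labels.

```python
# Internal nodes test an attribute; leaves hold a class label.
tree = {
    'attr': 'outlook',
    'branches': {
        'sunny': {'attr': 'windy',
                  'branches': {True: 'stay in', False: 'go out'}},
        'rainy': 'stay in',
        'overcast': 'go out',
    },
}

def classify(node, sample):
    while isinstance(node, dict):             # descend until we hit a leaf
        node = node['branches'][sample[node['attr']]]
    return node                               # leaf = class label

print(classify(tree, {'outlook': 'sunny', 'windy': False}))  # 'go out'
```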

9
Q

Strengths of a decision tree

A

Intuitive, can handle both numerical and categorical data, no need for data scaling

10
Q

Weaknesses of a decision tree

A

Can easily overfit, especially with small datasets, unless pruning or regularisation techniques are used

11
Q

What is a random forest

A

Combines multiple decision trees (each trained on different samples of the data) to improve classification performance
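The "different samples" part is bootstrap sampling (bagging). A simplified sketch, using depth-1 trees (stumps) on 1-D data so it stays short; a real random forest also samples features at each split, which is omitted here:

```python
import random

def train_stump(sample):
    """Fit a depth-1 tree: the single threshold on x that best splits the labels."""
    labels = [y for _, y in sample]
    best = None
    for t in sorted({x for x, _ in sample}):
        left = [y for x, y in sample if x <= t]
        right = [y for x, y in sample if x > t]
        if not left or not right:
            continue
        l_lab = max(set(left), key=left.count)       # majority label each side
        r_lab = max(set(right), key=right.count)
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    if best is None:                                 # degenerate bootstrap sample
        lab = max(set(labels), key=labels.count)
        return lambda x: lab
    _, t, l_lab, r_lab = best
    return lambda x: l_lab if x <= t else r_lab

def random_forest(data, n_trees=25, seed=0):
    rng = random.Random(seed)
    trees = [train_stump([rng.choice(data) for _ in data])   # bootstrap sample
             for _ in range(n_trees)]
    def predict(x):
        votes = [tree(x) for tree in trees]
        return max(set(votes), key=votes.count)              # majority vote
    return predict

data = [(1, 'A'), (2, 'A'), (3, 'A'), (7, 'B'), (8, 'B'), (9, 'B')]
forest = random_forest(data)
print(forest(2), forest(8))   # the vote should recover 'A' and 'B'
```

Because each tree sees a different resampling of the data, individual trees' mistakes tend to cancel out in the vote, which is where the reduced overfitting risk comes from.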

12
Q

Strengths of a random forest

A

Reduces the risk of overfitting, highly accurate, handles large datasets and features well

13
Q

Weaknesses of a random forest

A

Less interpretable than a single decision tree, requires careful tuning

14
Q

What is kernel density estimation

A

Estimates the probability density function (pdf) of the data by summing the influence of a kernel placed at each data point

15
Q

What is a kernel

A

Think of placing a sandbag at each data point: where points lie very close together, the sandbags pile up into a hill. This is where the kernels have the most combined influence

16
Q

How is the pdf calculated in Kernel density estimation

A

The pdf is computed using a kernel function K, which is a function of the distance from each training point scaled by a width (bandwidth) parameter h. The formula:

p(x) = (1 / (n · h)) · Σ_{i=1..n} K((x − x_i) / h)

where n is the number of training points and K is commonly a Gaussian.
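A minimal sketch of this estimate with a Gaussian kernel (toy data points and the `gaussian_kde` name are illustrative):

```python
import math

def gaussian_kde(x, points, h=0.5):
    """p(x) = (1/(n*h)) * sum_i K((x - x_i)/h), K = standard-normal density."""
    kernel = lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)
    return sum(kernel((x - xi) / h) for xi in points) / (len(points) * h)

points = [1.0, 1.2, 1.4, 4.0]
print(gaussian_kde(1.2, points))  # high density: three points pile up here
print(gaussian_kde(8.0, points))  # near zero: far from all the data
```

This is the sandbag picture in code: the three points near 1.2 pile up into a hill, while the density far from any data falls to almost nothing.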

17
Q

Strengths of KDE

A

Can model complex distributions, no need for a fixed number of parameters

18
Q

Weaknesses of KDE

A

High computational cost with large datasets, kernel choice and bandwidth (h) selection are very important

19
Q

How is KDE used for classification

A

We use KDE to estimate, for each class, how likely a given data point is under that class; then, using these likelihoods with Bayes’ theorem, we classify the point as the class with the highest posterior probability
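A sketch of the whole pipeline on 1-D toy data (class names, data, and the helper names are illustrative). Since the evidence term of Bayes' theorem is the same for every class, it cancels in the argmax, so comparing prior × likelihood is enough:

```python
import math

def gaussian_kde(x, points, h=0.5):
    """KDE with a standard-normal kernel and bandwidth h."""
    kernel = lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)
    return sum(kernel((x - xi) / h) for xi in points) / (len(points) * h)

def kde_classify(x, classes):
    """Pick the class maximising prior * KDE likelihood (Bayes' rule);
    the shared evidence term cancels, so it is ignored in the argmax."""
    total = sum(len(pts) for pts in classes.values())
    scores = {c: (len(pts) / total) * gaussian_kde(x, pts)
              for c, pts in classes.items()}
    return max(scores, key=scores.get)

classes = {'A': [1.0, 1.5, 2.0], 'B': [6.0, 6.5, 7.0]}
print(kde_classify(1.2, classes))  # 'A'
print(kde_classify(6.8, classes))  # 'B'
```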