Introduction To Machine Learning Flashcards

(17 cards)

1
Q

What is the input/feature

A

The variables you use to make a prediction

2
Q

What is the output/target

A

What you are trying to predict

3
Q

What is a data matrix

A

A table of many observations: each row is one observation and each column is one feature

4
Q

What is a target vector

A

The column of correct answers/labels for each row
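
As a small illustration of cards 3 and 4, a data matrix and its target vector can be written as plain Python lists. The feature names and values below are made up for the example.

```python
# Hypothetical toy data: each row of X is one observation,
# each column is one feature (height in cm, weight in kg).
X = [
    [25.0, 4.0],
    [24.0, 3.5],
    [60.0, 25.0],
    [55.0, 20.0],
]

# Target vector: one correct label per row of X, in the same order.
y = ["Cat", "Cat", "Dog", "Dog"]

n_observations = len(X)   # number of rows
n_features = len(X[0])    # number of columns
```

Row i of X and entry i of y always describe the same observation, so the two must have the same length.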

5
Q

What is a model

A

A mathematical function that maps inputs to outputs

6
Q

What are parameters

A

The numbers inside a model that get adjusted during training

7
Q

What is training

A

Tuning the parameters so the model fits the data
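
A minimal sketch tying cards 5 to 7 together, assuming a straight-line model and gradient descent as the tuning method. Neither choice comes from the cards, and the names `model`, `train`, `w`, and `b` are illustrative only.

```python
def model(x, w, b):
    """A model: a function that maps an input x to a predicted output."""
    return w * x + b


def train(xs, ys, learning_rate=0.05, steps=2000):
    """Training: repeatedly adjust the parameters w and b to reduce the error."""
    w, b = 0.0, 0.0  # parameters start at arbitrary values
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum(2 * (model(x, w, b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (model(x, w, b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # toy data generated by y = 2x + 1
w, b = train(xs, ys)
```

On this toy data the tuned parameters end up very close to w = 2 and b = 1, the values that generated the data.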

8
Q

What is machine learning about fundamentally

A

Finding patterns in data to make predictions on new, unseen data

9
Q

What is the idea behind the 1-Nearest Neighbour algorithm

A

To classify a new point, find the closest training example and assign the same label
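
The idea can be sketched in a few lines of Python. This is one possible implementation, assuming 2-D points and Euclidean distance; the function name and the toy data are hypothetical.

```python
import math


def nearest_neighbour_predict(train_points, train_labels, query):
    """Return the label of the training point closest to `query`."""
    best_index = min(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    return train_labels[best_index]


# Toy training set: two points of class "X" and two of class "Y".
train_points = [(1.0, 1.0), (2.0, 1.5), (8.0, 8.0), (9.0, 7.5)]
train_labels = ["X", "X", "Y", "Y"]

prediction = nearest_neighbour_predict(train_points, train_labels, (1.5, 1.2))
```

Here the query sits next to the two "X" points, so the closest training example, and therefore the prediction, is "X".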

10
Q

What is the decision boundary in 1-Nearest Neighbour

A

If you apply 1-NN to every point in the feature space, you can draw a line separating the regions where the algorithm predicts class X from those where it predicts class Y. This line is the decision boundary
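
One way to see a decision boundary is to run 1-NN on every cell of a grid and print the predicted class. The sketch below assumes two made-up training points and a 6 by 6 integer grid; the helper name `predict_1nn` is illustrative.

```python
import math


def predict_1nn(train_points, train_labels, query):
    """Label of the training point closest to `query`."""
    i = min(range(len(train_points)),
            key=lambda i: math.dist(train_points[i], query))
    return train_labels[i]


train_points = [(1, 1), (4, 3)]
train_labels = ["X", "Y"]

# Classify every grid point; the top row has the largest y value.
rows = []
for gy in range(5, -1, -1):
    row = "".join(predict_1nn(train_points, train_labels, (gx, gy))
                  for gx in range(6))
    rows.append(row)

print("\n".join(rows))
```

In the printed grid, the place where the letters switch from X to Y traces the decision boundary between the two regions.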

11
Q

What is the main problem with 1-NN

A

It is sensitive to outliers: a single outlying training point creates a small ‘island’ of the wrong class in the middle of the correct region.

12
Q

What is the fix to the outlier problem in 1-NN

A

k-Nearest Neighbours

Instead of 1 neighbour, use the k nearest neighbours and take a majority vote
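
A minimal sketch of the majority vote, assuming Euclidean distance and a made-up data set in which one "Y" outlier sits inside a cluster of "X" points. The function name `knn_predict` is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Four "X" points with one "Y" outlier in their middle, plus a "Y" cluster.
train_points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (1.5, 1.5),
                (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
train_labels = ["X", "X", "X", "X", "Y", "Y", "Y", "Y"]

query = (1.6, 1.6)  # right next to the outlier
with_k1 = knn_predict(train_points, train_labels, query, k=1)
with_k3 = knn_predict(train_points, train_labels, query, k=3)
```

With k=1 the outlier decides the answer ("Y"); with k=3 the two surrounding "X" points outvote it. An odd k avoids tied votes when there are two classes.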

13
Q

What is the tradeoff when choosing the size of k

A

k too small = sensitive to noise
k too big = ignores local structure
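
The "k too big" side of the tradeoff can be shown on a made-up data set with five "X" points and three "Y" points. This is only a sketch under those assumptions; `knn_predict` is a hypothetical helper.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Five "X" points far away, three "Y" points close to the query.
train_points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (1.5, 1.5),
                (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
train_labels = ["X", "X", "X", "X", "X", "Y", "Y", "Y"]

query = (8.5, 8.5)  # sits inside the "Y" cluster
predictions = {k: knn_predict(train_points, train_labels, query, k)
               for k in (1, 3, 5, 7)}
```

For k = 1, 3, and 5 the nearby "Y" points win the vote. At k = 7 the vote includes four distant "X" points, which outnumber the three local "Y" points, so the local structure is ignored.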

14
Q

Euclidean distance formula

A

For two points (x1, y1) and (x2, y2): Distance = sqrt((x1 - x2)^2 + (y1 - y2)^2)
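
The formula translates directly into Python. The function name is illustrative; the standard library's `math.dist` computes the same quantity.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between 2-D points p = (x1, y1) and q = (x2, y2)."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)


# A 3-4-5 right triangle: the distance from (0, 0) to (3, 4) is 5.
d = euclidean_distance((0.0, 0.0), (3.0, 4.0))
```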

15
Q

What are the 2 causes of uncertainty in ML outcomes

A

1. True randomness (noise)
2. Lack of information

16
Q

Difference between classification and regression

A

In classification, the target is discrete (labels like Cat/Dog)

In regression, the target is a real number
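
To make the difference concrete, the sketch below reuses the nearest-neighbour idea from the earlier cards on a single feature. The same lookup serves both tasks; only the type of the target changes. All names and values are made up for illustration.

```python
def nearest_target(xs, targets, query):
    """1-NN on one feature: return the target of the closest training x."""
    i = min(range(len(xs)), key=lambda i: abs(xs[i] - query))
    return targets[i]


weights_kg = [3.5, 4.0, 20.0, 25.0]       # one feature per observation

species = ["Cat", "Cat", "Dog", "Dog"]    # classification target: discrete labels
heights_cm = [24.0, 25.0, 55.0, 60.0]     # regression target: real numbers

predicted_species = nearest_target(weights_kg, species, 22.0)
predicted_height = nearest_target(weights_kg, heights_cm, 22.0)
```

The first prediction is a label ("Dog"), the second is a number (55.0).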

17
Q

What is the goal of minimising the sum of squared distances

A

To train (optimise) the model by tuning its parameters so that the predicted function fits the observed data as closely as possible
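
As a sketch of what this looks like in practice, assume the model is a straight line with one input feature and the distances are the vertical gaps between predictions and observations. The cards do not specify either assumption. For that case the best slope and intercept have a closed-form solution; the function names and data are illustrative.

```python
def fit_line(xs, ys):
    """Slope and intercept that minimise the sum of squared vertical distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared distances between predictions and observed targets."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]  # made-up noisy observations

slope, intercept = fit_line(xs, ys)
best = sum_squared_errors(xs, ys, slope, intercept)
worse = sum_squared_errors(xs, ys, slope + 0.5, intercept)
```

Moving either parameter away from the fitted values, as in `worse`, increases the sum of squared distances, which is what makes the fitted line the closest fit under this measure.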