Introduction To Machine Learning Flashcards

Question 1

Q

What is the input/feature

Answer

A

The variables you use to make a prediction

Question 2

Q

What is the output/target

Answer

A

What you are trying to predict

Question 3

Q

What is a data matrix

Answer

A

A table of many observations, with one row per observation and one column per feature

Question 4

Q

What is a target vector

Answer

A

The column of correct answers/labels for each row
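
As a minimal sketch of how a data matrix and a target vector line up, the toy example below uses made-up values; the feature meanings and the labels are assumptions chosen only for illustration.

```python
# Hypothetical toy data: each row of X is one observation,
# each column is one feature (here: two invented measurements).
X = [
    [5.1, 3.5],
    [4.9, 3.0],
    [6.7, 3.1],
]

# Target vector: one correct label per row of X, in the same order.
y = ["cat", "cat", "dog"]

n_observations = len(X)   # number of rows
n_features = len(X[0])    # number of columns
```

Row i of X and entry i of y describe the same observation, so the two must always have the same length.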

Question 5

Q

What is a model

Answer

A

A mathematical function that maps inputs to outputs

Question 6

Q

What are parameters

Answer

A

The numbers inside a model that get adjusted during training
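
A small sketch may help separate the model from its parameters. The function `linear_model` and the parameter names `w` and `b` are hypothetical, picked only to illustrate the idea with a one-feature linear model.

```python
def linear_model(x, w, b):
    """Hypothetical one-feature model: maps an input x to a prediction."""
    return w * x + b

# Same input, different parameter values, different predictions.
pred_a = linear_model(2.0, w=1.0, b=0.0)
pred_b = linear_model(2.0, w=3.0, b=1.0)
```

The function stays the same throughout; only the values of `w` and `b` would change during training.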

Question 7

Q

What is training

Answer

A

Tuning the parameters so the model fits the data

Question 8

Q

What is machine learning about fundamentally

Answer

A

Finding patterns in data to make predictions on new, unseen data

Question 9

Q

What is the idea behind the 1-Nearest Neighbour algorithm

Answer

A

To classify a new point, find the closest training example and assign it the same label
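
The idea can be sketched in a few lines. The function name, the toy points, and the labels "X" and "Y" are assumptions for illustration; distance is taken to be Euclidean via the standard library's `math.dist`.

```python
import math

def nearest_neighbour_predict(train_points, train_labels, query):
    """Return the label of the training point closest to the query."""
    best_index = min(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    return train_labels[best_index]

# Hypothetical training data: two small clusters.
train_points = [(1.0, 1.0), (2.0, 1.5), (8.0, 8.0), (9.0, 7.5)]
train_labels = ["X", "X", "Y", "Y"]

label_near_x = nearest_neighbour_predict(train_points, train_labels, (1.5, 1.2))
label_near_y = nearest_neighbour_predict(train_points, train_labels, (8.5, 8.0))
```

Note that there is no training step here: the "model" is simply the stored training data.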

Question 10

Q

What is the decision boundary in the 1-Nearest Neighbour

Answer

A

If you apply 1-NN to every point in the feature space, you can draw a boundary separating the regions where the algorithm predicts one class (X) from the regions where it predicts another (Y). This is the decision boundary
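
To make this concrete, the sketch below applies 1-NN along a one-dimensional grid and reports where the predicted label changes. The training values, labels, and grid are assumptions chosen so that no grid point is equally close to both training examples.

```python
def predict_1nn_1d(train_xs, train_labels, query):
    """1-NN on a single feature: pick the label of the closest value."""
    best = min(range(len(train_xs)), key=lambda i: abs(train_xs[i] - query))
    return train_labels[best]

# Hypothetical training data: one example of each class.
train_xs = [2.0, 7.0]
train_labels = ["X", "Y"]

# Apply 1-NN "everywhere" on a coarse grid from 0 to 9.
grid = [float(g) for g in range(10)]
predictions = [predict_1nn_1d(train_xs, train_labels, g) for g in grid]

# The decision boundary lies between neighbouring grid points
# whose predictions differ.
changes = [
    (grid[i], grid[i + 1])
    for i in range(len(grid) - 1)
    if predictions[i] != predictions[i + 1]
]
```

With these values the prediction flips between 4.0 and 5.0, which brackets the midpoint 4.5 of the two training examples.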

Question 11

Q

What is the main problem with 1-NN

Answer

A

It is sensitive to outliers: a single outlier creates a small ‘island’ of the wrong class in the middle of the correct region.

Question 12

Q

What is the fix to the outlier problem in 1-NN

Answer

A

k-Nearest Neighbours

Instead of 1 neighbour, use the k nearest neighbours and take a majority vote
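
A minimal sketch of the majority vote follows. The function name `knn_predict` and the toy data, including the deliberately mislabelled outlier, are assumptions for illustration; distance is Euclidean via `math.dist`.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Majority vote among the k training points closest to the query."""
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical data: a cluster of X, a cluster of Y,
# and one outlier labelled Y sitting inside the X cluster.
train_points = [(1, 1), (1, 2), (2, 1), (2, 2), (1.5, 1.5), (8, 8), (8, 9), (9, 8)]
train_labels = ["X", "X", "X", "X", "Y", "Y", "Y", "Y"]

query = (1.6, 1.6)  # right next to the outlier
label_k1 = knn_predict(train_points, train_labels, query, k=1)
label_k3 = knn_predict(train_points, train_labels, query, k=3)
```

With k=1 the outlier decides the label; with k=3 its two X neighbours outvote it. Using an odd k avoids tied votes when there are two classes.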

Question 13

Q

What is the tradeoff when choosing the size of k

Answer

A

k too small = sensitive to noise
k too big = ignores local structure
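
The "k too big" case can be sketched by setting k to the size of the whole training set, so that every query receives the overall majority class. The function name and the toy data are assumptions for illustration.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Majority vote among the k training points closest to the query."""
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical data: five X points in one cluster, three Y points in another.
train_points = [(1, 1), (1, 2), (2, 1), (2, 2), (1.5, 1.5), (8, 8), (8, 9), (9, 8)]
train_labels = ["X", "X", "X", "X", "X", "Y", "Y", "Y"]

query = (8.5, 8.5)  # clearly inside the Y cluster
label_small_k = knn_predict(train_points, train_labels, query, k=3)
label_huge_k = knn_predict(train_points, train_labels, query, k=len(train_points))
```

With k=3 the query is labelled by its local neighbourhood; with k equal to the dataset size, the larger X class wins regardless of where the query sits.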

Question 14

Q

Euclidean distance formula

Answer

A

Distance = sqrt((x1 - x2)^2 + (y1 - y2)^2), for two points (x1, y1) and (x2, y2)
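
The formula translates directly into code. The function name `euclidean_distance` is an assumption, and the example points are chosen so the result is easy to check by hand (a 3-4-5 right triangle).

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two 2D points p and q."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)

d = euclidean_distance((0.0, 0.0), (3.0, 4.0))  # sqrt(9 + 16) = 5.0
```

For points with more than two features, the same pattern extends by summing one squared difference per feature; the standard library's `math.dist` does this for any number of dimensions.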

Question 15

Q

What are the 2 causes of uncertainty in ML outcomes

Answer

A

True randomness (noise)
Lack of information

Question 16

Q

Difference between classification and regression

Answer

A

In classification, the target is discrete (labels like Cat/Dog)

In regression, the target is a real number

Question 17

Q

What is the goal of minimising the sum of squared distances

Answer

A

To train (optimise) the model by tuning its parameters so that the predicted function fits the observed data as closely as possible
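
As a sketch, assuming the model is a straight line y = w*x + b and the "distances" are the vertical gaps between predictions and observed values, the parameters that minimise the sum of squares can be computed in closed form. The function names and the toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope w and intercept b for y = w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b

def sum_squared_errors(xs, ys, w, b):
    """Sum of squared vertical distances between predictions and data."""
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = fit_line(xs, ys)
sse_fitted = sum_squared_errors(xs, ys, w, b)
sse_other = sum_squared_errors(xs, ys, 1.0, 1.0)  # an arbitrary worse guess
```

The fitted parameters give a smaller sum of squared errors than the arbitrary guess, which is what "fits the observed data as closely as possible" means under this criterion.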

(17 cards)