What does the cost function measure
Mismatch between model and data
What are the two types of supervised learning
Regression
Classification
What is the core assumption behind k-NN (smoothness assumption)
If observations are close in the input space, they are also close in the output space
What are the steps in k-NN
Compute distances from new point to all training points.
Pick the k closest.
Classification → majority vote
Regression → average their outputs
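The steps above could be sketched in pure Python as follows (the function name `knn_predict` and the choice of Euclidean distance are illustrative assumptions, not from the source):

```python
from collections import Counter
import math

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest training points.
    Illustrative sketch: assumes Euclidean distance and points as tuples."""
    # 1. Compute distances from the new point to all training points.
    dists = [(math.dist(x, x_new), y) for x, y in zip(X_train, y_train)]
    # 2. Pick the k closest.
    nearest = sorted(dists)[:k]
    # 3. Classification: take the majority vote of their labels.
    labels = [y for _, y in nearest]
    return Counter(labels).most_common(1)[0][0]

# Two well-separated classes
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(X, y, (0.5, 0.5), k=3))  # → a
```

For regression, the last step would instead average the outputs of the k neighbours.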
What is overfitting
Tuning the model parameters too closely to the noise in the measurements, which prevents the model from generalising well
Low training error, high test error
What is underfitting
Using a model that is too simple (e.g. a straight line for non-linear data) to make good predictions
High training error, high test error
How do we test how well a model generalises
The model is trained on the training set, and its performance is evaluated by how well it predicts outputs for the test set.
Low test error = good generalisation
What is Ockham’s razor
A principle suggesting the simplest solution is usually the best
What is the goal of unsupervised learning
Goal is not prediction, but gaining insight into the phenomenon itself by finding internal relationships and patterns
What is clustering
Finding groups of similar observations
What is dimensionality reduction
Compressing data while preserving structure
K-Means clustering process
Choose the number of clusters (K).
Select K initial “centroids”.
Assign points to the nearest centroid.
Update centroid positions.
Repeat until convergence.
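The five steps above could be implemented in pure Python roughly as follows (a sketch for 2-D points; the function name and random initial-centroid choice are illustrative assumptions):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal K-Means sketch on 2-D points given as (x, y) tuples."""
    rng = random.Random(seed)
    # 1-2. Choose K and select K initial centroids from the data.
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # 3. Assign each point to its nearest centroid (squared distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: (p[0] - centroids[j][0]) ** 2
                                          + (p[1] - centroids[j][1]) ** 2)
            clusters[i].append(p)
        # 4. Update each centroid to the mean of its assigned points.
        new = [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
               if c else centroids[j] for j, c in enumerate(clusters)]
        # 5. Repeat until convergence (centroids stop moving).
        if new == centroids:
            break
        centroids = new
    return centroids, clusters

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(pts, k=2)
```

On these two well-separated groups, the algorithm recovers two clusters of three points each.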
What does supervised learning mean
The dataset contains inputs and correct outputs
How to find test error
∑(y_test − y_pred)²
What are the sources of bias in machine learning
Biased datasets
Wrong labels
Missing data
Unbalanced data
Poor feature choice
What matters more, data quality or quantity
Quality
Supervised vs Unsupervised Learning
Supervised - Have labels, predict target
Unsupervised - No labels, understand data structure