L7: Machine Learning Flashcards

Question 1

Q

Generally, what is Machine Learning?

Question 2

Q

What are the key categories of ML?

Question 3

Q

What are the two main tasks of ML?

What are some other tasks?

Question 4

Q

What are the common ways to measure performance for classification?

Answer

A

Accuracy
Precision
Recall
F₁, or more generally Fβ (F-beta)
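
As a rough illustration of the four measures above, here is a minimal sketch for binary labels, assuming 1 marks the positive class. The function name `classification_metrics` and the toy labels are made up for this example and are not part of the original card.

```python
def classification_metrics(y_true, y_pred, beta=1.0):
    """Accuracy, precision, recall and F-beta for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F-beta weights recall beta times as much as precision; beta = 1 gives F1.
    b2 = beta ** 2
    denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_beta": f_beta}


# Hypothetical toy labels, chosen only to exercise the function.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With the default `beta=1.0` the last entry is the F₁ score. Passing a larger `beta` favours recall, and a smaller one favours precision.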

Question 5

Q

What are the common ways to measure performance for regression?

Answer

A

Mean Squared Error (MSE)
Root Mean Squared Error (RMSE)
Mean Absolute Error (MAE)
R-squared (R²)
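
The four regression measures above can be sketched in a few lines of plain Python. The function name `regression_metrics` and the toy values are assumptions made for this illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE and R-squared for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e ** 2 for e in errors) / n
    rmse = math.sqrt(mse)                 # same units as the target
    mae = sum(abs(e) for e in errors) / n

    # R-squared compares the residual error with the error of always
    # predicting the mean of y_true.
    mean_y = sum(y_true) / n
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical toy values, chosen only to exercise the function.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
reg_metrics = regression_metrics(y_true, y_pred)
print(reg_metrics)
```

MSE penalises large errors more heavily than MAE because the errors are squared before averaging.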

Question 6

Q

What is our basic notation?

a single example
a feature or attribute
dataset
sample or example
label or target
design matrix X

Question 7

Q

What does a design matrix look like? What is the notation?

With this notation, what is the goal of supervised learning?

Question 8

Q

What is a train-test split?

Question 9

Q

What is the standard pipeline for machine learning?

Question 10

Q

What model will we generally use for regression ML?

Question 11

Q

In our regression models, what do we seek to minimise?

Question 12

Q

How do we go about minimising the MSE in practice?

Question 13

Q

How can I visualise this algebra?

Question 14

Q

What is the intercept in linear regression models referred to as?

Question 15

Q

What are training errors, and what are test/generalisation errors?

What are underfitting and overfitting, and how do they relate to these errors?

Question 16

Q

What is a way to control the capacity?

Answer

A

While the fitted curve is non-linear in the inputs (the x's), it is still a linear model in the weights.
There is an optimal capacity at which the generalisation error is minimised.
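
A small sketch of this idea, assuming capacity is varied through the degree of a polynomial feature expansion (the card does not name the mechanism, so this is a guess at the intended example). The helper names and the synthetic sine data are made up for the illustration. The model stays linear in the weights because it is fitted by ordinary least squares on the expanded design matrix.

```python
import numpy as np


def polynomial_design_matrix(x, degree):
    """Columns are x**0, x**1, ..., x**degree (non-linear in x)."""
    return np.vander(x, degree + 1, increasing=True)


def fit_least_squares(X, y):
    """Ordinary least squares: the prediction X @ w is linear in the weights w."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


# Synthetic data, assumed only for illustration.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 20)
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal(x.shape)

train_mse = {}
for degree in (1, 3, 9):
    X = polynomial_design_matrix(x, degree)
    w = fit_least_squares(X, y)
    train_mse[degree] = float(np.mean((X @ w - y) ** 2))

print(train_mse)
```

Training error can only go down as the degree grows, since each model contains the previous one. The error on held-out data typically falls and then rises again, which is why an intermediate capacity is preferred.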

Question 17

Q

What is Regularisation?

Answer

A

Question 18

Q

How does Regularisation apply to linear regression?

Answer

A

Ridge regression is the standard baseline (like using a Gaussian).
Lasso is useful because it can set some weights exactly to zero -> this could allow you to drop those variables from the model, making your algorithm more efficient.
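
The contrast between the two penalties can be sketched on a tiny example. Ridge is shown in its closed form. Lasso is shown only for the special case of a design matrix with orthonormal columns (X.T @ X / n equal to the identity), where the solution reduces to soft-thresholding. A general lasso needs an iterative solver, which is not shown. The function names, the objective scaling and the data are all assumptions made for this illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form minimiser of (1/2n)||y - Xw||^2 + (lam/2)||w||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)


def lasso_fit_orthonormal(X, y, lam):
    """Minimiser of (1/2n)||y - Xw||^2 + lam*||w||_1, valid ONLY when
    X.T @ X / n is the identity: soft-threshold the least-squares weights."""
    n = X.shape[0]
    rho = X.T @ y / n
    return np.sign(rho) * np.maximum(np.abs(rho) - lam, 0.0)


# Three mutually orthogonal +/-1 columns, so X.T @ X / n is the identity.
X = np.array([[1.0,  1.0,  1.0],
              [1.0, -1.0,  1.0],
              [1.0,  1.0, -1.0],
              [1.0, -1.0, -1.0]])
# Hypothetical targets: one strong feature, one very weak, one irrelevant.
y = 2.0 * X[:, 0] + 0.05 * X[:, 1]

lam = 0.1
w_ridge = ridge_fit(X, y, lam)
w_lasso = lasso_fit_orthonormal(X, y, lam)
print("ridge:", w_ridge)
print("lasso:", w_lasso)
```

Ridge shrinks every weight by the same factor but leaves the weak weight non-zero, while the lasso sets it exactly to zero.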

Question 19

Q

What are hyperparameters?

Answer

A

Question 20

Q

What is the standard training cross-validation protocol?

Answer

A

Question 21

Q

What is cross-validation? How do we use it to pick the best hyperparameters for our model?

Answer

A

We don't know which hyperparameters are best for the model, i.e. which ones minimise our errors or maximise our performance scores, hence the need for cross-validation -> test how the model performs on the data for each hyperparameter setting (using grid search), then use the best one to train the model.

Question 22

Q

What is the k-fold cross-validation method?

What is grid search for hyperparameter search?

Answer

A

Split the data into k folds (e.g. k = 5). For a given hyperparameter value, train on k-1 of the folds, then test on the remaining fold. Repeat this k times so that each fold is used as the test fold exactly once.
There is usually a single best value of the hyperparameter, so we use grid search to find which value, or order of magnitude, we want. The grid can use logarithmic or even spacing, and we plot the hyperparameter against the score on a graph (grid search is more efficient in low dimensions).
Alternatively, you can take a random "shotgun" approach using random search.
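
The procedure above can be sketched as follows, assuming ridge regression as the model and its penalty strength as the single hyperparameter. The function names, the synthetic data and the choice of a 7-point logarithmic grid are assumptions made for this illustration, not details from the card.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge weights for penalty strength lam."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def k_fold_mse(X, y, lam, k=5, seed=0):
    """Average validation MSE over k folds for one hyperparameter value."""
    rng = np.random.default_rng(seed)          # same folds for every lam
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        val_idx = folds[i]                      # held-out fold
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        w = ridge_fit(X[train_idx], y[train_idx], lam)
        scores.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
    return float(np.mean(scores))


# Synthetic training data, assumed only for illustration.
rng = np.random.default_rng(42)
X = rng.standard_normal((60, 5))
true_w = np.array([1.5, -2.0, 0.0, 0.0, 0.5])
y = X @ true_w + 0.3 * rng.standard_normal(60)

# Grid search over orders of magnitude (logarithmic spacing).
lambda_grid = np.logspace(-3, 3, 7)
cv_scores = {float(lam): k_fold_mse(X, y, float(lam)) for lam in lambda_grid}
best_lambda = min(cv_scores, key=cv_scores.get)

# Refit on all the training data with the selected value.
w_final = ridge_fit(X, y, best_lambda)
print(best_lambda, cv_scores[best_lambda])
```

Plotting `cv_scores` against `lambda_grid` on a logarithmic axis gives the hyperparameter-versus-score graph mentioned above. A random search would draw the candidate values at random instead of taking them from a fixed grid.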

(22 cards)