Evaluating ML/DL models Flashcards

(16 cards)

1
Q

How do you avoid overfitting and find the best hyperparameter combination?

A

Data-splitting methods:
1. Holdout: train, validation, and test sets;
2. Cross-validation: k-fold cross-validation;
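As a minimal sketch of the holdout method (pure Python, with hypothetical 70/15/15 proportions), a split might look like:

```python
import random

def holdout_split(data, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle the data and cut it into train/validation/test subsets."""
    items = list(data)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remaining 15% is held out for the final test
    return train, val, test

train, val, test = holdout_split(range(100))
```

The model is fit on `train`, hyperparameters are tuned on `val`, and `test` is touched only once for the final estimate.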

2
Q

What is k-fold cross-validation?

A

K-fold cross-validation (KCV) splits a dataset into k subsets; then, iteratively, some of them are used to train the model, while the others are used to assess its performance.
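A minimal sketch of how the k folds can be generated (pure Python; libraries such as scikit-learn provide ready-made equivalents):

```python
def kfold_indices(n, k):
    """Yield (train_idx, val_idx) pairs: each fold is held out exactly once."""
    # Distribute n indices into k folds as evenly as possible.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    for i, val_idx in enumerate(folds):
        # Train on all folds except the i-th, validate on the i-th.
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train_idx, val_idx

splits = list(kfold_indices(10, 5))  # 5 train/validation splits over 10 samples
```

Averaging the validation score over the k splits gives a less noisy performance estimate than a single holdout split.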

3
Q

What is hyperparameter optimisation

A

❑ Hyperparameters influence model performance and generalisation.
❑ Examples: learning rate, batch size, number of layers, activation functions, etc.
❑ Goal: find the best combination to maximise validation accuracy and minimise overfitting.

4
Q

What is grid search

A

❑ Exhaustive search over a manually specified set of hyperparameters.
❑ Evaluates all possible combinations systematically.

5
Q

Give an example of grid search: If learning rate = [0.01, 0.001] and optimiser = [adam, sgd]

A

Grid Search will test 2 × 2 = 4 combinations.
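The card's example can be sketched in pure Python with `itertools.product`, which enumerates every combination:

```python
from itertools import product

# Grid matching the card's example: 2 learning rates x 2 optimisers.
grid = {"learning_rate": [0.01, 0.001], "optimiser": ["adam", "sgd"]}

# One dict per combination; each would be used to train and evaluate a model.
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]
```

Each entry in `combinations` is one full training run, which is why the cost grows multiplicatively with every added hyperparameter.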

6
Q

What are the pros of grid search

A

❑ Systematic.
❑ Suitable for small hyperparameter spaces.

7
Q

What are the cons of grid search

A

❑ Computationally expensive as dimensions increase.
❑ Not suitable for high-dimensional or continuous spaces.
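The cost explosion is easy to quantify: the number of trials is the product of the grid sizes. A small hypothetical example:

```python
from math import prod

# Hypothetical grid: 4 hyperparameters, 10 candidate values each.
grid_sizes = [10, 10, 10, 10]

# Every combination is one full training run.
n_trials = prod(grid_sizes)  # 10 * 10 * 10 * 10 = 10,000 runs
```

Adding a fifth hyperparameter with 10 values would multiply the cost again, to 100,000 runs.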

8
Q

What is random search

A

❑ Randomly samples hyperparameters from specified distributions.
❑ Number of trials is predefined.
❑ More efficient as it avoids exhaustive combinations.
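A minimal random-search sampler (pure Python; the search space and distributions below are hypothetical):

```python
import random

rng = random.Random(0)  # seeded for reproducibility

def sample_config():
    """Draw one hyperparameter configuration from the search space."""
    return {
        # Log-uniform learning rate over the continuous range [1e-4, 1e-1].
        "learning_rate": 10 ** rng.uniform(-4, -1),
        # Categorical choice of optimiser.
        "optimiser": rng.choice(["adam", "sgd"]),
    }

n_trials = 10  # the number of trials is fixed in advance
trials = [sample_config() for _ in range(n_trials)]
```

Note that the learning rate is drawn from a continuous distribution, something a grid can only approximate with a finite set of values.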

9
Q

Example of random search: with 10 trials, Random Search might sample:

A

(0.01, adam), (0.001, sgd), (0.01, sgd), …

10
Q

What are the pros of random search

A

❑ More efficient than Grid Search in high dimensions.
❑ Works well with continuous hyperparameters and larger hyperparameter spaces.

11
Q

What are the cons of random search

A

❑ No guarantee of finding the absolute best combination.
❑ Performance depends on the number of trials.

12
Q

Grid search vs random search vs advanced methods

A

❑ Grid Search: Thorough but expensive. Use for small parameter sets.
❑ Random Search: Efficient for large or continuous spaces.
❑ Consider advanced methods (e.g., Bayesian) for complex problems.

13
Q

What does a classification model aim to provide

A

A classification model aims to provide the correct label for each input.
* Binary classification.
* Multi-class.
* Output is commonly the probability of an input belonging to a class.
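A common way to turn a model's raw scores into class probabilities is the softmax function; a minimal sketch (the logits below are made up):

```python
from math import exp

def softmax(logits):
    """Convert raw scores (logits) into class probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for a 3-class problem.
probs = softmax([2.0, 1.0, 0.1])
```

The predicted label is then the class with the highest probability.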

14
Q

What is the formula for accuracy

A

Accuracy = (TP + TN)/ (TP + TN + FP + FN)
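The formula translates directly to code; the confusion-matrix counts below are hypothetical:

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = correct predictions / all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts: 1 true positive, 90 true negatives,
# 1 false positive, 8 false negatives -> 91 correct out of 100.
score = accuracy(1, 90, 1, 8)
```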

15
Q

Why is accuracy not the best classification metric

A

While 91% accuracy may seem good at first glance, another tumor-classifier model that always predicts benign would
achieve the exact same accuracy (91/100 correct predictions) on our examples.

Accuracy alone doesn’t tell the full story when you’re working with a class-imbalanced data set, like this one, where
there is a significant disparity between the number of positive and negative labels.
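The always-benign baseline is easy to verify numerically (the 91/9 class split is the hypothetical one from this card):

```python
# Hypothetical imbalanced test set: 91 benign (0), 9 malignant (1).
labels = [0] * 91 + [1] * 9

# A useless baseline that predicts benign for every input.
predictions = [0] * len(labels)

correct = sum(p == y for p, y in zip(predictions, labels))
baseline_accuracy = correct / len(labels)  # 91/100 = 0.91
```

Despite 91% accuracy, this baseline catches zero malignant tumors, which is why metrics such as precision and recall matter on imbalanced data.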
