ML Model Analysis Flashcards

(96 cards)

1
Q

What does it mean to train / fit a model?

A

Letting an algorithm look at your data repeatedly and adjust its internal numbers so its predictions become more accurate

You provide examples, and the model learns the rule by itself.

2
Q

In the context of training a model, what is RMSE?

A

Root Mean Squared Error

RMSE summarizes errors into one number; smaller RMSE indicates a better fit.

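The definition in this card can be sketched in a few lines of Python (the data values are made up for illustration):

```python
import math

# Actual values and a model's predictions (made-up numbers)
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.0, 9.5]

# Square each error, average them, then take the square root
squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
mse = sum(squared_errors) / len(squared_errors)
rmse = math.sqrt(mse)
```

Squaring first and rooting at the end is what puts RMSE back in the same units as the target.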
3
Q

What are the parameters in the model equation y = mx + c?

A
  • m (slope)
  • c (intercept)

Parameters are values learned directly from the data during training.

4
Q

Fill in the blank: A hyperparameter is something the algorithm cannot learn from the data by itself and must be chosen by the _______.

A

data scientist

Examples include the number of clusters or the depth of a decision tree.

5
Q

True or false: The algorithm learns hyperparameters directly from the data.

A

FALSE

Hyperparameters are set by the human and cannot be learned by the algorithm.

6
Q

What is the process of how a machine learns during training?

A
  • Draw a line (choose m and c)
  • Check how wrong the line is
  • Change m and c a little
  • Try again
  • Repeat many times

This repeated adjusting is called training or fitting the model.

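The draw–check–adjust–repeat loop in this card can be sketched as plain gradient descent on a line (the dataset and learning rate are illustrative assumptions; real libraries handle this for you):

```python
# Tiny made-up dataset: y is roughly 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

m, c = 0.0, 0.0           # 1. draw a line (initial guess)
lr = 0.05                 # step size (a hyperparameter)

for _ in range(2000):     # 4./5. try again, many times
    # 2. check how wrong the line is (gradient of the mean squared error)
    grad_m = sum(2 * (m * x + c - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_c = sum(2 * (m * x + c - y) for x, y in zip(xs, ys)) / len(xs)
    # 3. change m and c a little, in the direction that reduces the error
    m -= lr * grad_m
    c -= lr * grad_c
```

After enough repetitions, m and c settle close to the true slope and intercept of the data.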
7
Q

What is the goal of the algorithm during training?

A

Find m and c that give the smallest RMSE

The smaller the RMSE, the better the fit.

8
Q

What is an example of a simple relationship used in training a model?

A

Temperature → ice-cream revenue

This relationship can be modeled using a straight line.

9
Q

What does the error represent in the context of model training?

A

The distance between the real value and the predicted value

These vertical distances are used to calculate RMSE.

10
Q

What is the first step in the training process of a model?

A

Choose a model (for example, a straight line)

This sets the foundation for adjusting parameters during training.

11
Q

What is the training set used for in model development?

A

Learning pile

This is where the model tries rules, makes mistakes, and fixes itself.

12
Q

What is the purpose of the validation set?

A

Choosing and tuning pile

It is used to answer questions about model performance and settings.

13
Q

True or false: Once data is used for validation, it can still be considered unseen data.

A

FALSE

Validation data is no longer ‘unseen’ after it is used.

14
Q

What is the test set meant to simulate?

A

Real-world check

It is used to evaluate the model after training and validation.

15
Q

Why do we keep some data hidden during model training?

A

To prevent overfitting

Models can perform well on training data but poorly on new data.

16
Q

In supervised learning, what is a common split for the data?

A
  • 60–70% for training
  • 10–20% for validation
  • 20–30% for test

Exact numbers may vary, but the concept remains the same.

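A split like the one in this card can be sketched by hand (in practice scikit-learn's `train_test_split` does the shuffling and slicing; the 60/20/20 ratio here is one of the common choices):

```python
import random

data = list(range(100))                     # stand-in for 100 labelled examples
random.seed(42)
random.shuffle(data)                        # shuffle before splitting

n = len(data)
train = data[: int(0.6 * n)]                # 60% training (learning pile)
val = data[int(0.6 * n) : int(0.8 * n)]     # 20% validation (tuning pile)
test = data[int(0.8 * n) :]                 # 20% test (final honest check)
```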
17
Q

What is the difference between validation and test sets?

A
  • Validation: used while building
  • Test: used only once at the end

Testing on validation data can lead to biased results.

18
Q

In unsupervised learning, what is the typical data split?

A
  • ~70–80% to fit the model
  • The rest to test generalization

Validation is often skipped in unsupervised learning.

19
Q

Ultra-short summary: What does each data set represent?

  1. Training set
  2. Validation set
  3. Test set
A
  1. Learn
  2. Tune and choose
  3. Final honest check

This summarizes the purpose of each data set in model development.

20
Q

What is a performance metric?

A

A number that tells you how wrong your model is or how often it is right

Performance metrics are essential for evaluating model effectiveness.

21
Q

Name the two cases for performance metrics.

A
  • Regression
  • Classification

Regression predicts a number, while classification predicts a class or label.

22
Q

In regression, what does MAE stand for?

A

Mean Absolute Error

MAE answers the question: ‘On average, how wrong am I?’

23
Q

What is the purpose of MSE in regression?

A

Mean Squared Error

Because errors are squared, MSE penalizes large errors much more than small ones.

24
Q

What does RMSE represent?

A

Root Mean Squared Error

RMSE is often preferred in practice because it is expressed in the same units as the target variable.

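The three regression metrics from cards 22–24 side by side (the numbers are made up; note how the single large miss dominates MSE and RMSE much more than MAE):

```python
import math

actual = [10.0, 12.0, 11.0, 13.0]
predicted = [10.0, 12.0, 11.0, 17.0]   # one large miss of 4

errors = [a - p for a, p in zip(actual, predicted)]
mae = sum(abs(e) for e in errors) / len(errors)    # average size of error
mse = sum(e ** 2 for e in errors) / len(errors)    # squaring punishes big misses
rmse = math.sqrt(mse)                              # back in the target's units
```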
25
Q

What does **R² (R-squared)** indicate?

A

How well the line fits the dots

A value close to 1 indicates a good fit, while a value close to 0 indicates a poor fit.

26
Q

In regression, what does an R² value above ~0.8 indicate?

A

Decent fit

This is a rule of thumb for evaluating model performance.
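R² can be sketched from the same ingredients (made-up numbers): compare the model's squared error against a baseline that always predicts the mean.

```python
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

mean_y = sum(actual) / len(actual)
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # model's error
ss_tot = sum((a - mean_y) ** 2 for a in actual)                # mean-baseline error
r2 = 1 - ss_res / ss_tot   # 1 = perfect fit, 0 = no better than the mean
```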
27
Q

What is the **confusion matrix** used for in classification?

A

Counts predicted vs actual

It helps derive important metrics for evaluating classification models.

28
Q

What does **TP** stand for in the confusion matrix?

A

True Positive

TP indicates predicted sunny days that were actually sunny.

29
Q

What does **Accuracy** measure in classification?

A

How many did I get right overall?

Accuracy is useful only if classes are balanced.

30
Q

What is **Recall** in classification?

A

Of all real sunny days, how many did I actually catch?

Recall measures the ability to identify true positives.

31
Q

What does **Precision** indicate?

A

When I said 'sunny', how often was I right?

Precision assesses the reliability of positive predictions.

32
Q

What is the **F1 score**?

A

A balance between precision and recall

Use the F1 score when both precision and recall are important.

33
Q

What does **Specificity** measure?

A

Of all real NOT-sunny days, how many did I correctly detect?

Specificity evaluates the model's ability to identify true negatives.
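All of the metrics in cards 29–33 come from the four confusion-matrix counts; a sketch with made-up counts for the sunny/not-sunny example:

```python
tp, fp, fn, tn = 40, 10, 5, 45                 # made-up counts

accuracy = (tp + tn) / (tp + fp + fn + tn)     # overall right
precision = tp / (tp + fp)                     # trust my positives
recall = tp / (tp + fn)                        # did I miss real positives
specificity = tn / (tn + fp)                   # correctly detected negatives
f1 = 2 * precision * recall / (precision + recall)  # balance of the two
```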
34
Q

True or false: There is a universal 'good' error metric.

A

FALSE

Acceptable error metrics depend on the specific problem being addressed.

35
Q

In regression, what do **MAE** and **RMSE** measure?

A

How wrong the model is

They provide insight into the model's prediction accuracy.

36
Q

In classification, what do **Accuracy**, **Precision**, and **Recall** represent?

A
  • Accuracy: overall right
  • Precision: trust my positives
  • Recall: did I miss real positives

These metrics are crucial for evaluating classification performance.

37
Q

What is the ultra-short summary for regression metrics?

A
  • MAE, RMSE → how wrong
  • R² → how well it fits

This summarizes the key points for regression evaluation.

38
Q

What is the ultra-short summary for classification metrics?

A
  • Accuracy → overall right
  • Precision → trust my positives
  • Recall → did I miss real positives
  • F1 → balance

This summarizes the key points for classification evaluation.
39
Q

In **unsupervised learning**, what do you measure to determine if your groups make sense?

A

Silhouette score

It assesses how well data points fit within their clusters and how distinct those clusters are from one another.

40
Q

What are the **features** used in the fish data example?

A
  • Eye size
  • Length
  • Weight

These features are used to group fish into clusters without predefined labels.

41
Q

What is the **main metric** used in unsupervised learning to evaluate clustering?

A

Silhouette score

It measures the compactness of clusters and their separation from each other.

42
Q

What does a **silhouette score** close to +1 indicate?

A

Very good clustering

It means points fit well within their cluster and clusters are well separated.

43
Q

What does a **silhouette score** close to 0 indicate?

A

Overlapping groups

It suggests that a point is sitting between clusters.

44
Q

What does a **negative silhouette score** indicate?

A

The point probably belongs to a different cluster

This suggests poor clustering performance.

45
Q

What does a **good cluster** mean in the context of silhouette score?

A
  • Points inside are close together
  • Different clusters are far apart

The silhouette score checks for these conditions to evaluate clustering quality.

46
Q

What is a **silhouette plot** used for?

A
  • Shows the silhouette score of every point
  • Grouped by cluster

It helps visualize clustering performance and identify issues with cluster sizes or averages.

47
Q

True or false: The silhouette score tells you the true grouping of data.

A

FALSE

It only indicates whether the grouping is compact and well separated, not the actual true grouping.

48
Q

In unsupervised learning, what does the silhouette score help check?

A
  • Fit inside your cluster
  • Separation from other clusters

This is essential for evaluating the effectiveness of clustering without labels.

49
Q

What is the range of the **silhouette score**?

A

−1 to +1

A score close to +1 indicates good clustering, while a negative score indicates poor clustering.
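The silhouette score from cards 39–49 can be computed directly from its definition; a pure-Python sketch for 1-D points (in practice you would use scikit-learn's `silhouette_score`):

```python
def silhouette(points, labels):
    """Mean silhouette score for 1-D points (each cluster needs >= 2 points)."""
    def dist(p, q):
        return abs(p - q)

    clusters = set(labels)
    scores = []
    for i, p in enumerate(points):
        # a = mean distance to the other points in my own cluster
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        a = sum(own) / len(own)
        # b = mean distance to the nearest other cluster
        b = min(
            sum(dist(p, q) for j, q in enumerate(points) if labels[j] == c)
            / sum(1 for l in labels if l == c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))   # in [-1, +1]
    return sum(scores) / len(scores)

# Two tight, well-separated groups -> score close to +1
score = silhouette([1.0, 1.2, 9.0, 9.2], [0, 0, 1, 1])
```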
50
Q

**Bias** refers to what aspect of a model?

A

How wrong the model is

Bias occurs because the model is too simple.

51
Q

High bias indicates what kind of learning issue?

A

Under-learning

This happens when the model cannot capture the underlying pattern of the data.

52
Q

**Variance** refers to what aspect of a model?

A

How much the performance changes

Variance measures the difference in performance between training data and new data.

53
Q

High variance indicates what kind of learning issue?

A

Over-learning

This occurs when the model is too complex and memorizes the training data.

54
Q

What happens to bias and variance as model complexity increases?

A
  • Bias goes down
  • Variance goes up

This relationship is crucial for understanding model performance.

55
Q

What is the **goal** in terms of bias and variance?

A
  • Low bias
  • Low variance

Achieving this means the model learns the real pattern and performs well on new data.

56
Q

What is the trade-off picture in model complexity?

A

Total error is smallest somewhere in the middle

As complexity increases, bias decreases and variance increases.

57
Q

What does **underfitting** mean?

A

Too simple, high bias

An example is a straight-line model that misses the pattern.

58
Q

What does **overfitting** mean?

A

Too complex, high variance

An example is a zig-zag line model that memorizes the data.

59
Q

In the ice-cream example, what does Model 1 represent?

A

A straight line

This model is too simple and cannot follow the real shape of the data.

60
Q

In the ice-cream example, what does Model 2 represent?

A

A crazy zig-zag line

This model is too complicated and performs poorly on new data.

61
Q

What is the **difference** between train performance and test performance indicative of?

A

A variance problem

This highlights overfitting: the model performs well on training data but poorly on test data.
62
Q

What is **underfitting**?

A

A model too simple to learn the real pattern

Underfitting results in bad performance on both training and test data.

63
Q

What is **overfitting**?

A

A model so complex that it memorizes the training data but fails on new data

Overfitting results in very good training performance but bad test performance.

64
Q

What is the real goal of a model in machine learning?

A

Working well on new (unseen) data

This is referred to as generalisation.

65
Q

What does **underfitting** indicate about a model's performance?

A
  • Training performance: bad
  • Test performance: bad

The model is not smart enough for the problem.

66
Q

What are common causes of **underfitting**? List them.

A
  • Not enough data
  • Model too simple
  • Not enough useful features
  • Bad hyperparameters

Underfitting is associated with a high-bias problem.

67
Q

What does **overfitting** indicate about a model's performance?

A
  • Training performance: very good
  • Test performance: bad

The model learned noise instead of the real pattern.

68
Q

What are common causes of **overfitting**? List them.

A
  • Model too complex
  • Too many features
  • Too little useful data
  • Bad hyperparameters
  • Not enough variety in data

Overfitting is associated with a high-variance problem.

69
Q

What is the most important test to remember in model evaluation?

A

Compare the training score and the test score

This helps identify underfitting, overfitting, or a good fit.

70
Q

What does a **good fit** look like in model evaluation?

A
  • Training performance: good
  • Test performance: slightly worse

This indicates a normal and healthy model.

71
Q

True or false: Your job is to make the training error as small as possible.

A

FALSE

The goal is good performance on unseen data, not a perfect training fit.

72
Q

What is a potential issue with some datasets in machine learning?

A

The data may not contain a useful pattern

Not every dataset can predict the future.

73
Q

Ultra-short summary: Underfitting = _______.

A

Too simple

Underfitting results in bad performance on both training and test data.

74
Q

Ultra-short summary: Overfitting = _______.

A

Too complex

Overfitting results in great performance on training but bad on test.

75
Q

Ultra-short summary: Best model = _______.

A

In the middle

A good model generalises well.
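The train-versus-test comparison from card 69 can be sketched as a rough rule of thumb (the thresholds here are illustrative assumptions, not standard values):

```python
def diagnose(train_score, test_score, low=0.7, gap=0.15):
    """Label a fit from accuracy-like scores in [0, 1] (heuristic sketch)."""
    if train_score < low:
        return "underfitting"    # bad on training data already
    if train_score - test_score > gap:
        return "overfitting"     # great on train, much worse on test
    return "good fit"            # good on train, slightly worse on test
```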
76
Q

What is the purpose of **cross-validation**?

A

To compare models more fairly by using multiple splits of the data

It helps avoid relying on a single validation split that may be misleading.

77
Q

What is the **risk** of training and validating on just one chunk of data?

A

You may choose the wrong model because of a lucky or weird validation chunk

This can lead to unreliable model selection.

78
Q

What is **hold-out cross-validation**?

A

Using a single validation set to test different models and settings

It is fast but has high variance because it relies on one try.

79
Q

Describe the **k-fold cross-validation** process.

A
  • Split the data into k equal parts
  • Use k−1 folds for training and 1 fold for validation
  • Rotate which fold is used for validation

This method averages the scores for a more reliable comparison.

80
Q

What are the advantages of **k-fold cross-validation**?

A
  • Every data point gets to be a validation point
  • Lower variance
  • More trustworthy comparison

It avoids reliance on a single lucky split.

81
Q

What is the **important rule** regarding the test set in cross-validation?

A

Keep a separate test set used only at the very end

Cross-validation is done on training and validation data only.

82
Q

What are typical values for **k** in k-fold cross-validation?

A
  • k = 5
  • k = 10

Larger k gives more reliability but increases computation time.

83
Q

What is **Leave-One-Out Cross-Validation (LOOCV)**?

A

Each run uses 1 data point for validation and all others for training

It has very low variance but is extremely expensive in terms of computation.

84
Q

What is the **Leave-P-Out Cross-Validation** method?

A

Leave out p points for validation each time

It is a compromise between LOOCV and k-fold.

85
Q

Compare **hold-out** and **k-fold** cross-validation.

A

| Method   | Good               | Bad                       |
|----------|--------------------|---------------------------|
| Hold-out | fast, cheap        | only one try → unreliable |
| k-fold   | much more reliable | slower (run k times)      |

This table summarizes the pros and cons of each method.

86
Q

What is cross-validation used for?

A
  • Choosing the best algorithm
  • Choosing the best hyperparameters

It is not used for final performance reporting.

87
Q

Ultra-short summary of cross-validation: Cross-validation = _______.

A

Compare models more fairly

k-fold involves rotating the validation set and averaging scores while keeping the test set untouched.
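The k-fold rotation from card 79 can be sketched by hand (scikit-learn's `KFold` and `cross_val_score` do this in practice; here we only generate the index splits):

```python
def kfold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    fold_size = n // k
    indices = list(range(n))
    for fold in range(k):
        # One fold is held out for validation...
        val = indices[fold * fold_size : (fold + 1) * fold_size]
        # ...and the remaining k-1 folds are used for training
        train = [i for i in indices if i not in val]
        yield train, val

# With 10 points and k = 5, every point is validated exactly once
folds = list(kfold_indices(10, 5))
```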
88
Q

What is the **big idea** behind an ML project?

A
  • Outcomes
  • Metrics
  • Outputs
  • Heuristics
  • Data

An ML project is not about models but about agreeing on these aspects with the client.

89
Q

What are the **outcomes** in an ML project?

A

What do we want to achieve?

Examples include predicting revenue, detecting spam, approving/rejecting requests, and grouping customers.

90
Q

How do we measure **success** in an ML project?

A

It depends on the type of problem:
  • Regression: MAE, MSE, RMSE, R²
  • Classification: accuracy, precision, recall, F1, specificity
  • Clustering: silhouette score

Each metric measures a different aspect of prediction quality.

91
Q

What are the **outputs** that a customer may want from an ML project?

A
  • A number (e.g. predicted revenue)
  • A label (go/no-go, spam/not spam)
  • A cluster ID

The output determines the model used and the necessary inputs.

92
Q

Define a **heuristic** in the context of ML.

A

A simple rule of thumb that is fast, cheap, and not perfect

Example: 'If an email is from this address → it's spam.'

93
Q

What is a **key idea** regarding heuristics in ML?

A

If your fancy ML system is no better than your heuristic, keep the heuristic

It is often cheaper.

94
Q

What are some potential **data sources** for an ML project?

A
  • Databases (SQL, NoSQL)
  • APIs
  • CSV/Excel
  • JSON
  • Text files
  • Images
  • Video
  • Audio
  • Cloud storage
  • GitHub

Data can be structured or unstructured.

95
Q

What are common **problems** you should expect in data?

A
  • Errors
  • Missing values
  • Bad formatting
  • Corrupt files
  • Inconsistent types
  • Out-of-date data

Always check data quality, as old data can be stale.

96
Q

Before building any model, what should be agreed upon with the client?

A
  • What outcome you want
  • Which metric defines success
  • What output the system must give
  • Whether a heuristic is already good enough
  • Where the data comes from and how reliable it is

These agreements are crucial for project success.