L8: Classification Problems Flashcards

(26 cards)

1
Q

What are the three important approaches for classification problems?

A
2
Q

SVM: what is the high-level intuition behind this?

A
  • Imagine each column of our design matrix X as a feature axis, so each data point sits in a feature space with two characteristics we want to separate the data on.
  • Ideally, the data would be linearly separable, with each class sitting on a different side of a hyperplane –> we are trying to find how to split this space so as to categorise our data correctly.
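The intuition above can be sketched numerically. A minimal example, assuming scikit-learn is available, with two made-up Gaussian clusters standing in for linearly separable classes:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two well-separated clusters, one row of X per data point, one column per feature
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="linear").fit(X, y)
print(clf.coef_, clf.intercept_)  # w and b of the hyperplane w.x + b = 0
print(clf.score(X, y))            # perfect accuracy on separable data
```

The fitted `coef_` and `intercept_` are exactly the hyperplane parameters the cards below discuss.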
3
Q

SVM what is the Affine function for our hyperplane?

A
4
Q

SVM: What is our support vector and the notion of the “best hyperplane”?

A
5
Q

SVM: How do you actually form the support vector?

A
6
Q

SVM: What do we do when our features are generally not linearly separable?

A
7
Q

SVM: What is our dual optimisation problem when we account for soft margins?

A
8
Q

What are some non-linearly separable problems?

A
9
Q

What are Kernels and the Kernel trick?

A
10
Q

What is the RBF Kernel?

How does SVM look on a graph with a linear kernel vs an RBF kernel?

A
11
Q

What is the Summary of SVM?

A
12
Q

What is Logistic Regression? What is the output of the model?

What does it estimate?

What is the log-loss function and what are we trying to do with it?

A
13
Q

Optimisation 1?

What is a convex and non-convex loss function?

What is the difference between a local minimum/global minimum, and a unique minimum?

A
14
Q

What is the optimisation problem in our linear regression?

A
15
Q

What 3 pieces of terminology are used interchangeably within optimisation problems?

Do they have any nuanced differences?

A
16
Q

Generally, what does constrained optimisation look like?

17
Q

What is the ordinal encoding of categorical features?

18
Q

What is one-hot and dummy encoding of categorical data?

19
Q

What are performance measures?

What are the common ones for classifications?

What are the common ones for regression?

20
Q

What is a confusion matrix?

A

Imagine we are testing for something bad (e.g. whether someone has an illness):

  • False positive = false alarm –> you say something is bad when it's actually fine (wastes resources and trust)
  • False negative = missed detection –> you say something is fine when it's actually bad (causes real-world harm and risk)
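The four cells can be read straight off a confusion matrix. A minimal sketch, assuming scikit-learn and made-up labels (1 = ill, 0 = healthy):

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 0, 1]  # actual condition
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]  # test result

# Rows = actual class, columns = predicted class:
# [[TN, FP],   FP = false alarm (healthy flagged as ill)
#  [FN, TP]]   FN = missed detection (ill flagged as healthy)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)
```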
21
Q

How do we calculate accuracy as a performance measure?

What is its importance and what are its limitations?

22
Q

What is precision and recall as performance measures?

23
Q

What is F1 or, more generally, F as a performance measure?

24
Q

What is the Receiver Operating Characteristic Curve?

What is the Area Under the Curve (AUC)?

25
Q

What is the summary of all the performance metrics?

26
Q

Examples of when we would want different F_β measures?

A

Example 1: Medical cancer screening (a false negative is worse!), so β > 1
  • True positive –> patient has cancer and the test detects it
  • False positive –> patient doesn't have cancer but tests positive (worrisome but not life-threatening)
  • False negative –> patient has cancer but tests negative –> thinks they are fine but could die

Example 2: Email spam filter (false positives are worse), so β < 1
  • True positive –> detects spam, deletes the email, and it was spam
  • False positive –> detects spam and deletes it, but it was an important personal email –> a missed legal document, which is bad!
  • False negative –> lets a spam email in, which is annoying but not as bad as a false positive

Example 3: Credit card fraud detection –> β = 1, as the dataset will be unbalanced towards normal people using their credit cards day to day
  • True positive –> detects someone is committing credit card fraud and they are
  • False positive –> shuts down normal people's accounts –> angry customers and loss of business
  • False negative –> direct financial loss and fraud growth
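The effect of β can be seen directly in scikit-learn's `fbeta_score` (a sketch with made-up predictions where precision is higher than recall): β > 1 weights recall more (cancer screening), β < 1 weights precision more (spam filtering).

```python
from sklearn.metrics import fbeta_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]  # precision = 2/3, recall = 1/2

f_half = fbeta_score(y_true, y_pred, beta=0.5)  # leans toward precision
f1     = fbeta_score(y_true, y_pred, beta=1.0)  # balanced harmonic mean
f2     = fbeta_score(y_true, y_pred, beta=2.0)  # leans toward recall
print(f_half, f1, f2)
```

Since precision exceeds recall here, the score falls as β rises: F_0.5 > F_1 > F_2.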