StatQuest Flashcards

(13 cards)

1
Q

What are the two main types of machine learning problems?

A

Regression and Classification

2
Q

Explain regression type problems

A

Predicting a continuous value: when an input changes, by how much does the dependent variable change?

3
Q

Explain classification type problems

A

Predicting a category: which category does ‘x’ fall into?

4
Q

Categories of data

A

Training Data
Test Data
Validation Data

5
Q

How do you decide which data to use for training vs. testing, and what is the process called?

A

Cross-validation
Split the data into blocks and let each block take a turn as the Testing Data; 10 blocks is a common choice

6
Q

For regression problems, how do you know if you should use a straight line through the graph or follow the points?

A

For each model, measure the differences between the predicted and actual values and total them up (for example, as a sum of squared differences so that positive and negative errors do not cancel). The model with the smallest total is the one to pick.
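
The comparison above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive method: the data points are invented, the "follow the points" model is stood in for by a hypothetical nearest-point lookup, and the totals are taken on held-out test points using squared differences.

```python
# Hypothetical toy data: (x, y) pairs invented purely for illustration.
train = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
test = [(1.4, 1.5), (2.6, 2.7), (3.6, 3.5)]

def fit_line(points):
    """Least-squares straight line through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_follow_points(points):
    """Assumed stand-in for a model that follows the points exactly:
    predict the y of the nearest training x."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def sum_squared_residuals(model, points):
    """Total of squared differences between actual and predicted values."""
    return sum((y - model(x)) ** 2 for x, y in points)

line = fit_line(train)
squiggle = fit_follow_points(train)

ssr_line = sum_squared_residuals(line, test)
ssr_squiggle = sum_squared_residuals(squiggle, test)

# Pick the model with the smallest total on the test points.
best = "line" if ssr_line <= ssr_squiggle else "squiggle"
print(best, ssr_line, ssr_squiggle)
```

On this toy data the point-following model has zero error on the points it was fit to but a larger total on the test points, which is why the totals are compared on data the models have not seen.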

7
Q

What is a trend line?

A

The line that goes through the center of the training data

8
Q

What is ‘Bias-Variance Tradeoff’?

A

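The tradeoff between fitting the Training Data well and making good predictions on new data. A model that fits the Training Data very closely but makes poor predictions is overfit.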

9
Q

What is cross validation?

A

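A way to compare many different ML methods: each method is trained on part of the data and tested on the rest, rotating so that every block is used for testing once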

10
Q

What’s another word for ‘estimating the parameters’?

A

Training the algorithm

11
Q

What do you call it if you split your data up into 10 chunks?

A

Ten-Fold Cross Validation
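
A minimal sketch of how the ten chunks might be formed, in plain Python. The function name `k_fold_splits` and the sample count of 50 are assumptions for illustration; in practice the samples are usually shuffled before splitting.

```python
def k_fold_splits(n_samples, k=10):
    """Yield (train_indices, test_indices) once for each of the k blocks."""
    indices = list(range(n_samples))
    # Spread any remainder over the first blocks so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# Hypothetical dataset of 50 samples split into 10 blocks of 5.
splits = list(k_fold_splits(n_samples=50, k=10))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Each block serves as the Testing Data exactly once while the other nine blocks are used for training.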

12
Q

What is ‘Leave One Out Cross Validation’?

A

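Cross validation in which each individual sample is treated as its own block: train on all the other samples, test on the one left out, and repeat for every sample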
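
As a rough sketch, the loop below leaves out one sample at a time. The measurements are invented, and the "model" is deliberately trivial (it predicts the mean of the remaining samples), an assumption made only to keep the example short.

```python
# Hypothetical measurements, invented for illustration.
samples = [2.0, 4.0, 6.0, 8.0]

def leave_one_out_errors(values):
    """Squared prediction error for each sample when it is the one left out."""
    errors = []
    for i, held_out in enumerate(values):
        rest = values[:i] + values[i + 1:]
        prediction = sum(rest) / len(rest)  # "train" on every other sample
        errors.append((held_out - prediction) ** 2)
    return errors

errors = leave_one_out_errors(samples)
mean_squared_error = sum(errors) / len(errors)
print(errors, mean_squared_error)
```

With n samples this trains the model n times, so it is the same idea as k-fold cross validation with k equal to n.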

13
Q

Confusion Matrix

A

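Summarizes a classifier's predictions as counts of true positives, false positives, true negatives and false negatives

‘Negative’ means the predicted class is 0 and ‘Positive’ means the predicted class is 1

‘True’ means the model got it right and ‘False’ means it got it wrong

The cells where the predicted and actual classes match (the diagonal) hold the number of correct predictions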
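
The four counts can be tallied directly. The labels below are invented for illustration, with 1 as the positive class and 0 as the negative class, and `confusion_counts` is a hypothetical helper name.

```python
# Hypothetical labels, invented for illustration (1 = positive, 0 = negative).
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            # Predicted positive: right if the actual label is also 1.
            counts["TP" if a == 1 else "FP"] += 1
        else:
            # Predicted negative: right if the actual label is also 0.
            counts["TN" if a == 0 else "FN"] += 1
    return counts

counts = confusion_counts(actual, predicted)
correct = counts["TP"] + counts["TN"]  # predictions that match the actual label
print(counts, correct)
```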
