Statistical Learning Flashcards

(37 cards)

1
Q

What are predictors (features)?

A

The input variables x1, x2, …, xp: the (ideally independent) variables used to predict y.

2
Q

What is the response variable?

A

The output y. Also called the target or dependent variable. The thing we are trying to predict.

3
Q

What is the goal of statistical learning?

A

Estimate a function f(x) that relates predictors to response.

4
Q

What is the general statistical model?

A

Y=f(X)+ε where ε is random error.

5
Q

Why can’t we perfectly predict Y?

A

Because of irreducible error ε.
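
A minimal simulation can make this concrete. It is a sketch under stated assumptions: the "true" function `f`, the noise level `sigma`, and the sample size are hypothetical choices for illustration, not part of the cards. Even when predictions come from the true f itself, the mean squared error stays near Var(ε).

```python
import math
import random

def f(x):
    # Hypothetical "true" relationship, chosen only for illustration.
    return math.sin(x)

random.seed(0)
sigma = 0.5          # assumed standard deviation of the noise ε
n = 20000

xs = [random.uniform(0.0, 6.0) for _ in range(n)]
ys = [f(x) + random.gauss(0.0, sigma) for x in xs]   # Y = f(X) + ε

# Predict with the true f itself: the best any model could do.
best_mse = sum((y - f(x)) ** 2 for x, y in zip(xs, ys)) / n

print(f"MSE using the true f: {best_mse:.3f}")
print(f"Var(ε) = sigma^2:     {sigma ** 2:.3f}")
```

The two printed values should be close: no choice of model can push the error below the noise variance.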

6
Q

What type of response variable gives a regression problem?

A

Quantitative (numeric).

7
Q

What type of response variable gives a classification problem?

A

Qualitative (categorical).

8
Q

Give 2 examples of regression problems.

A

House price prediction, life expectancy prediction.

9
Q

Give 2 examples of classification problems.

A

Fraud detection, disease diagnosis.

10
Q

What is a linear regression model with p predictors?

A

Y=β0+β1X1+⋯+βpXp

11
Q

What do the coefficients βj represent?

A

The change in Y for a 1-unit increase in Xj, holding the other predictors fixed (the slope for each feature).
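
The interpretation can be checked with a small sketch. The coefficient values and the helper name `predict` are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def predict(beta0, betas, x):
    """Linear model: beta0 + beta1*x1 + ... + betap*xp."""
    return beta0 + sum(b * xj for b, xj in zip(betas, x))

beta0 = 1.0
betas = [2.0, -0.5]          # hypothetical coefficients β1, β2

x = [3.0, 4.0]
x_bumped = [4.0, 4.0]        # X1 increased by 1, X2 held fixed

base = predict(beta0, betas, x)             # 1 + 6 - 2 = 5.0
bumped = predict(beta0, betas, x_bumped)    # 1 + 8 - 2 = 7.0
print(bumped - base)                        # 2.0, which equals β1
```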

12
Q

What assumption does linear regression make?

A

The relationship between predictors and response is linear.

13
Q

What is polynomial regression?

A

A regression model that includes powers of a predictor up to some degree d (e.g. Y = β0 + β1X + β2X^2 + ⋯ + βdX^d).
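
One way to see this is that polynomial regression is ordinary linear regression on the expanded features X, X^2, …, X^d. The sketch below assumes a single predictor; the helper names `poly_features` and `predict` and the coefficient values are hypothetical.

```python
def poly_features(x, degree):
    """Expand one predictor x into [x, x^2, ..., x^degree]."""
    return [x ** k for k in range(1, degree + 1)]

def predict(beta0, betas, features):
    """Linear in the coefficients, even though it is nonlinear in x."""
    return beta0 + sum(b * z for b, z in zip(betas, features))

# Hypothetical quadratic: Y = 1 + 0*X + 1*X^2
beta0, betas = 1.0, [0.0, 1.0]

features = poly_features(2.0, degree=2)   # [2.0, 4.0]
print(predict(beta0, betas, features))    # 5.0
```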

14
Q

What is Mean Squared Error (MSE)?

A

The average of the squared differences (errors) between outcomes and predictions. For u = (u_1, …, u_p) and v = (v_1, …, v_p):

MSE = (1/p) * ∑(u_i − v_i)^2
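
The formula translates directly into a few lines of Python. This is a minimal sketch; the function name `mse` and the sample values are illustrative only.

```python
def mse(u, v):
    """Mean squared error between outcomes u and predictions v."""
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v)) / len(u)

outcomes = [3.0, 5.0, 2.0]
predictions = [2.0, 5.0, 4.0]
print(mse(outcomes, predictions))   # (1 + 0 + 4) / 3
```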

15
Q

Why square the errors in MSE?

A

Squaring penalizes large errors more heavily and makes every error non-negative, so positive and negative errors cannot cancel out.

16
Q

What is MAE?

A

Mean Absolute Error: the average of the absolute differences between outcomes and predictions. For u = (u_1, …, u_p) and v = (v_1, …, v_p):

MAE = (1/p) * ∑|u_i − v_i|

17
Q

Which metric penalizes large errors more, MSE or MAE?

A

MSE, because the errors are squared.
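
A small comparison illustrates the difference. In this sketch the two prediction vectors are made up so that both have the same total absolute error: one spreads it over four small misses, the other concentrates it in a single large miss.

```python
def mse(u, v):
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v)) / len(u)

def mae(u, v):
    return sum(abs(ui - vi) for ui, vi in zip(u, v)) / len(u)

outcomes = [0.0, 0.0, 0.0, 0.0]
spread_out = [1.0, 1.0, 1.0, 1.0]   # four errors of size 1
one_big = [0.0, 0.0, 0.0, 4.0]      # one error of size 4

print(mae(outcomes, spread_out), mae(outcomes, one_big))   # 1.0 1.0
print(mse(outcomes, spread_out), mse(outcomes, one_big))   # 1.0 4.0
```

MAE treats the two cases identically, while MSE is four times larger for the single large error.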

18
Q

What is training data used for?

A

To fit (train) the model y = f(x), yielding a learned model y = f_hat(x).

19
Q

What is testing data used for?

A

Evaluating the accuracy of the learned model.

20
Q

What is training error used for?

A

To quantify the difference between the true response and the model's predicted response on the data the model was fit to.

21
Q

Why is testing error important?

A

It reflects how the model performs on data it has never seen before.

22
Q

We always expect training error to be ___ testing error.

A

Less than (lower than).

23
Q

What is overfitting?

A

When the model fits the training data so closely that it ends up with low training error but high testing error.

24
Q

Why does overfitting happen?

A

The model is too flexible and captures noise, picking up random trends in the data rather than only the underlying pattern.
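
The effect can be reproduced with a short simulation. This sketch uses k-nearest-neighbors regression, which is not mentioned in the cards and is swapped in here only because its flexibility is easy to control (k = 1 is very flexible, larger k is more rigid). The data-generating process Y = sin(X) + noise and all parameter values are assumptions for illustration.

```python
import math
import random

def knn_predict(train, x0, k):
    """Average the responses of the k training points nearest to x0."""
    nearest = sorted(train, key=lambda pt: abs(pt[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

def mse(model_data, eval_data, k):
    """MSE of a k-NN model built on model_data, evaluated on eval_data."""
    return sum((y - knn_predict(model_data, x, k)) ** 2
               for x, y in eval_data) / len(eval_data)

def sample(n, sigma=0.5):
    # Hypothetical data-generating process: Y = sin(X) + noise.
    xs = [random.uniform(0.0, 2 * math.pi) for _ in range(n)]
    return [(x, math.sin(x) + random.gauss(0.0, sigma)) for x in xs]

random.seed(1)
train, test = sample(200), sample(2000)

for k in (1, 15):
    print(f"k={k:2d}  train MSE={mse(train, train, k):.3f}  "
          f"test MSE={mse(train, test, k):.3f}")
```

With k = 1 each training point is its own nearest neighbor, so the training error is zero, yet the test error is worse than for the smoother k = 15 model: the flexible fit has memorized the noise.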

25
Flexible models have ___ bias.
Low
26
Rigid models have ___ bias.
High
27
Flexible models have ___ variance.
High
28
Rigid models have ___ variance.
Low
29
What is the bias-variance tradeoff?
Obtaining a small testing MSE requires a model f_hat with both low bias and low variance. More flexibility lowers bias but raises variance, so the two must be balanced.
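
For reference, the standard decomposition of the expected test MSE at a single point makes the tradeoff explicit. The notation x_0, y_0 for a test observation is an assumption here and does not appear in the cards.

```latex
E\left[\left(y_0 - \hat{f}(x_0)\right)^2\right]
  = \mathrm{Var}\left(\hat{f}(x_0)\right)
  + \left[\mathrm{Bias}\left(\hat{f}(x_0)\right)\right]^2
  + \mathrm{Var}(\varepsilon)
```

The first two terms are reducible; the last is the irreducible error, which sets a floor on the test MSE.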
30
What is irreducible error?
Error from the noise ε that cannot be eliminated or reduced, because noise in the data cannot be predicted by any model.
31
What is reducible error?
Error we can, in theory, reduce with a better modeling choice.
32
What is variance?
A measure of how much f_hat changes when it is fit to different training data.
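
Variance in this sense can be estimated by refitting on many independent training sets and watching how the prediction at one fixed point moves. The sketch below again uses k-nearest-neighbors regression as a stand-in (an assumption, not from the cards), with a hypothetical data-generating process Y = sin(X) + noise.

```python
import math
import random
import statistics

def knn_predict(train, x0, k):
    """Average the responses of the k training points nearest to x0."""
    nearest = sorted(train, key=lambda pt: abs(pt[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

def sample(n, sigma=0.5):
    # Hypothetical data-generating process: Y = sin(X) + noise.
    xs = [random.uniform(0.0, 2 * math.pi) for _ in range(n)]
    return [(x, math.sin(x) + random.gauss(0.0, sigma)) for x in xs]

random.seed(2)
x0 = 2.0                      # fixed point at which predictions are compared
preds = {1: [], 15: []}       # k -> predictions across training sets
for _ in range(200):          # 200 independent training sets
    train = sample(100)
    for k in preds:
        preds[k].append(knn_predict(train, x0, k))

for k, values in preds.items():
    print(f"k={k:2d}  variance of f_hat(x0) = {statistics.pvariance(values):.4f}")
```

The flexible k = 1 model's prediction swings much more from one training set to the next than the more rigid k = 15 model's does.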
33
What is bias?
The error introduced by estimating f with f_hat: the systematic deviation from the truth.
34
When the sample size n is large and there are few predictors (p small): ___
Prefer a flexible model. With many observations and few predictors, it is harder to overfit.
35
Many predictors (p large), few observations (n small): ___
Prefer an inflexible (rigid) model. A flexible model is more likely to overfit with so few observations.
36
When the relationship is highly nonlinear: ___
Prefer a flexible model. An inflexible model is less likely to capture the nonlinearity.
37
What is the ranking of methods from high to low interpretability?
1. Subset Selection, Lasso
2. Least Squares
3. Generalized Additive Models, Trees
4. Bagging, Boosting
5. Support Vector Machines, Deep Learning