Bias-Variance tradeoff
When searching for the optimal model we are in fact trying to find
the optimal tradeoff between bias and variance
Bias-Variance tradeoff
We can reduce variance
by putting many models together and aggregating their outcomes
Bagging (or bootstrap aggregation) creates
multiple data sets from the original training data by bootstrapping – re-sampling with replacement.
It then trains a model on each data set and aggregates their outputs with a voting system
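The bagging procedure can be sketched in a few lines of Python. The toy data set and the one-split "decision stump" base model below are hypothetical illustration choices, not part of the slides:

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical toy data: (feature value, label) pairs, with two noisy points.
data = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1), (3.5, 1), (4.5, 0)]

def bootstrap(sample):
    """Re-sample with replacement, same size as the original data set."""
    return [random.choice(sample) for _ in sample]

def train_stump(sample):
    """Base model: a one-split 'decision stump' (predict 1 if x > threshold)."""
    best_t, best_err = None, float("inf")
    for t in sorted({x for x, _ in sample}):
        err = sum((x > t) != y for x, y in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def bagged_predict(stumps, x):
    """Aggregate the stumps' outputs with a majority vote."""
    votes = Counter(int(x > t) for t in stumps)
    return votes.most_common(1)[0][0]

# Bagging: one model per bootstrapped data set, then vote.
stumps = [train_stump(bootstrap(data)) for _ in range(25)]
print(bagged_predict(stumps, 1.0), bagged_predict(stumps, 7.0))
```

Each stump sees a slightly different data set, so the noisy points affect only some of them; the vote averages that variance away.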
Other ensemble methods
Random Forest
combines bagging with random selection of features (or predictors)
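A minimal sketch of that combination, again on hypothetical toy data: each "tree" sees a bootstrap sample of the rows and a random subset of the features (here one of two):

```python
import random
from collections import Counter

random.seed(1)

# Hypothetical toy data: two features, both correlated with the label.
X = [[1, 2], [2, 1], [3, 3], [6, 7], [7, 6], [8, 8]]
y = [0, 0, 0, 1, 1, 1]

def train_tree(X, y, n_feats=1):
    """Bootstrap the rows, then split on a random subset of the features."""
    rows = [random.randrange(len(X)) for _ in X]       # bagging step
    feats = random.sample(range(len(X[0])), n_feats)   # random feature selection
    best = None                                        # (feature, threshold, error)
    for f in feats:
        for t in {X[i][f] for i in rows}:
            err = sum((X[i][f] > t) != y[i] for i in rows)
            if best is None or err < best[2]:
                best = (f, t, err)
    return best[:2]

def forest_predict(trees, x):
    """Majority vote over the ensemble."""
    votes = Counter(int(x[f] > t) for f, t in trees)
    return votes.most_common(1)[0][0]

trees = [train_tree(X, y) for _ in range(31)]
print(forest_predict(trees, [0, 0]), forest_predict(trees, [9, 9]))
```

The random feature selection de-correlates the trees: two trees that see different features cannot make the same mistake in the same way, which makes the vote more effective.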
Other ensemble methods
Boosting
applies classifiers sequentially, assigning higher weights to observations that have been mis-classified by the previous models
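An AdaBoost-style sketch of this reweighting, on a hypothetical toy example (labels in {-1, +1}, as is conventional for AdaBoost):

```python
import math

# Hypothetical toy data, separable at x = 3.
xs = [1, 2, 3, 4, 5, 6]
ys = [-1, -1, -1, 1, 1, 1]
w = [1 / len(xs)] * len(xs)            # start with uniform observation weights

def weighted_stump(xs, ys, w):
    """Threshold classifier minimizing the weighted error."""
    best = None                        # (threshold, sign, weighted error)
    for t in xs:
        for sign in (+1, -1):          # predict `sign` if x > t, else `-sign`
            err = sum(wi for x, yv, wi in zip(xs, ys, w)
                      if (sign if x > t else -sign) != yv)
            if best is None or err < best[2]:
                best = (t, sign, err)
    return best

ensemble = []
for _ in range(3):                     # three boosting rounds
    t, sign, err = weighted_stump(xs, ys, w)
    err = max(err, 1e-10)              # avoid log(0) on a perfect fit
    alpha = 0.5 * math.log((1 - err) / err)
    ensemble.append((alpha, t, sign))
    # Up-weight mis-classified observations so the next stump focuses on them.
    w = [wi * math.exp(-alpha * yv * (sign if x > t else -sign))
         for x, yv, wi in zip(xs, ys, w)]
    total = sum(w)
    w = [wi / total for wi in w]       # re-normalize the weights

def boosted_predict(x):
    """Weighted vote of the sequence of stumps."""
    score = sum(a * (s if x > t else -s) for a, t, s in ensemble)
    return 1 if score > 0 else -1

print(boosted_predict(1), boosted_predict(6))
```

Unlike bagging, the models are not independent: each round's weights depend on the previous round's mistakes.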
A table model
memorizes the training data and performs no generalization
Useless in practice! Previously unseen customers would all end up with
“0% likelihood of churning”
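A table model in code is just a lookup; the customer names and the default value below are hypothetical:

```python
# A "table model": memorize (customer, churned) pairs from the training data.
training = {"alice": 1, "bob": 0, "carol": 1}

def table_model(customer):
    # No generalization: previously unseen customers fall through to the
    # default, i.e. 0% likelihood of churning.
    return training.get(customer, 0)

print(table_model("alice"))  # memorized from the training data -> 1
print(table_model("dave"))   # previously unseen -> 0
```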
Generalization
is the property of a model or modeling process whereby
the model applies to data that were not used to build the model
If models do not generalize at all, they fit perfectly to the training data!
→ they overfit
Overfitting
is the tendency to tailor models to the training data, at the expense of generalization to previously unseen data points.
Holdout Validation
evaluates the model on data held out from training, to estimate how well it generalizes
As a model gets more complex, it is allowed to pick up harmful spurious correlations
This phenomenon is not particular to decision trees
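A sketch of holdout validation on hypothetical noisy data, comparing an overfit model (1-nearest-neighbour, which memorizes the training set) against a simple fixed threshold:

```python
import random

random.seed(0)

# Hypothetical data: the true rule is x > 0.5, but 20% of the labels are flipped.
def make_data(n):
    return [(x, int((x > 0.5) != (random.random() < 0.2)))
            for x in (random.random() for _ in range(n))]

train, holdout = make_data(150), make_data(50)   # the holdout set is never fit

def nn_predict(x):
    """Overfit model: 1-nearest-neighbour memorizes every training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def stump_predict(x):
    """Simple model: a single fixed threshold."""
    return int(x > 0.5)

def accuracy(model, sample):
    return sum(model(x) == y for x, y in sample) / len(sample)

for model in (nn_predict, stump_predict):
    print(accuracy(model, train), accuracy(model, holdout))
```

The memorizing model is perfect on the training data but much weaker on the holdout set; that gap between training and holdout accuracy is the signature of overfitting.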
Simplest method to limit tree size:
specify a minimum number of instances that must be present in a leaf
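A sketch of this rule on a hypothetical toy set with one noisy point: the tree may only split where both children keep at least `min_leaf` instances:

```python
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def grow(points, min_leaf):
    """points: list of (x, label). Greedily grown one-feature binary tree;
    a split is only accepted if BOTH children keep >= min_leaf instances."""
    labels = [y for _, y in points]
    if len(set(labels)) == 1:
        return ("leaf", labels[0])              # pure node
    best = None
    for t in sorted({x for x, _ in points}):
        left = [p for p in points if p[0] <= t]
        right = [p for p in points if p[0] > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue                            # split rejected: leaf too small
        err = sum(y != majority([b for _, b in side])
                  for side in (left, right) for _, y in side)
        if best is None or err < best[0]:
            best = (err, t, left, right)
    if best is None:                            # no admissible split: stop here
        return ("leaf", majority(labels))
    _, t, left, right = best
    return ("node", t, grow(left, min_leaf), grow(right, min_leaf))

def predict(tree, x):
    while tree[0] == "node":
        tree = tree[2] if x <= tree[1] else tree[3]
    return tree[1]

# Hypothetical toy data with a single noisy point at x = 2.5.
data = [(1, 0), (2, 0), (2.5, 1), (3, 0), (3.5, 0), (4, 1), (5, 1), (6, 1)]
tree_overfit = grow(data, min_leaf=1)   # isolates the noisy point
tree_pruned = grow(data, min_leaf=2)    # the isolating split is rejected
print(predict(tree_overfit, 2.5), predict(tree_pruned, 2.5))
```

With `min_leaf=1` the tree carves out the noisy point at x = 2.5 and memorizes its label; with `min_leaf=2` that split is rejected and the point is absorbed into a larger leaf.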
Just as with trees, as you increase the dimensionality,
you can perfectly fit larger and larger sets of arbitrary points
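The same effect can be demonstrated with a polynomial model: a degree-(n-1) polynomial, i.e. an n-parameter model, fits any n points exactly. The Lagrange interpolation below is a standard construction; the random targets are hypothetical:

```python
import random

def lagrange(points):
    """Return the degree-(n-1) polynomial passing through n points (xi distinct)."""
    def f(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= (x - xj) / (xi - xj)   # basis polynomial factor
            total += term
        return total
    return f

random.seed(0)
pts = [(i, random.random()) for i in range(8)]   # 8 arbitrary target values
f = lagrange(pts)
# Maximum fit error over the 8 points: 0 (up to float rounding).
print(max(abs(f(x) - y) for x, y in pts))
```

Enough parameters always buy a perfect fit on the points you train on, which says nothing about the points you have not seen.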
Why is overfitting bad?
A small imbalance in the training data can be 'learned' by the tree and erroneously propagated
Why is the phenomenon of overfitting not particular to decision trees?
- There is no general analytic way to avoid overfitting