Name some properties of good features in images
They should be repeatable (detectable under viewpoint and illumination changes), distinctive, local, and efficient to compute.
How can the pixel values of a patch be calculated using the integral image
Patch sum = II(bottom right) - II(bottom left) - II(top right) + II(top left), where II is the integral image evaluated at the patch corners.
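Assuming the integral image is zero-padded so that II[r, c] holds the sum of all pixels above and to the left of (r, c), the corner rule can be sketched as follows (the `patch_sum` helper name is illustrative):

```python
import numpy as np

def patch_sum(ii, r0, c0, r1, c1):
    """Sum of pixels in the patch with top-left (r0, c0) and
    bottom-right (r1, c1), inclusive, from a zero-padded integral image."""
    # bottom right - bottom left - top right + top left
    return ii[r1 + 1, c1 + 1] - ii[r1 + 1, c0] - ii[r0, c1 + 1] + ii[r0, c0]

img = np.arange(16, dtype=np.int64).reshape(4, 4)
# Build the integral image with a leading row/column of zeros,
# so ii[r, c] = sum of img[:r, :c].
ii = np.zeros((5, 5), dtype=np.int64)
ii[1:, 1:] = img.cumsum(0).cumsum(1)
```

With this padding every patch, including ones touching the image border, needs only the four corner lookups.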
What is the idea behind ensemble learning.
Aggregate the results of several predictors into one prediction
In what case will ensemble learning not lead to improved results over one predictor?
If all predictors are highly correlated, so they give the same output.
What kind of error do sequential and parallel learners reduce?
Sequential learners mostly reduce bias, while parallel learners mostly reduce variance.
Name 4 different ensemble methods, and state which of them are parallel/sequential.
Parallel:
Bagging (bootstrap aggregation)
Voting
Random forests
Sequential:
Boosting
How does Bagging work?
Create N versions of the training set using sampling with replacement, and train a weak learner on each one. Use averaging for regression and majority voting for classification.
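The procedure can be sketched as follows; `MajorityStub` is a toy weak learner assumed purely for illustration, and any learner with fit/predict methods would slot in:

```python
import numpy as np
from collections import Counter

class MajorityStub:
    """Toy weak learner (illustrative): always predicts the class
    it saw most often during fit."""
    def fit(self, X, y):
        self.label = Counter(y.tolist()).most_common(1)[0][0]
        return self
    def predict(self, X):
        return np.full(len(X), self.label)

def fit_bagging(X, y, make_learner, n_models=10, seed=0):
    """Train n_models learners, each on a bootstrap sample of (X, y)."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))  # sampling with replacement
        models.append(make_learner().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Aggregate by majority vote (for regression one would average instead)."""
    preds = np.stack([m.predict(X) for m in models])  # (n_models, n_samples)
    return np.array([Counter(col.tolist()).most_common(1)[0][0]
                     for col in preds.T])

X = np.arange(20).reshape(10, 2)
y = np.array([0] * 7 + [1] * 3)
models = fit_bagging(X, y, MajorityStub, n_models=5)
pred = bagging_predict(models, X)
```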
What is Out-Of-Bag error for bagging?
For each training sample xi, aggregate the predictions of only those learners whose bootstrap sample did not contain xi, and compare against the true label. Averaging these errors over all samples gives a validation-style error estimate without needing a held-out set.
What is boosting?
We compute classifiers sequentially, increasing the weight of misclassified examples after each run.
Describe the AdaBoost algorithm
Initialise uniform sample weights. In each round: train a weak learner on the weighted data, compute its weighted error e, give it the vote weight alpha = 1/2 ln((1 - e)/e), then increase the weights of misclassified samples and renormalise. The final classifier is the alpha-weighted vote of the weak learners.
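A minimal sketch of AdaBoost, assuming depth-1 threshold stumps as the weak learners and labels in {-1, +1} (function names are illustrative, not an optimised implementation):

```python
import numpy as np

def fit_adaboost(X, y, n_rounds=10):
    """AdaBoost with threshold stumps; labels y must be in {-1, +1}."""
    n = len(X)
    w = np.full(n, 1.0 / n)                      # start with uniform weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        best = None
        for j in range(X.shape[1]):              # every feature
            for t in np.unique(X[:, j]):         # every threshold
                for s in (1, -1):                # both polarities
                    pred = s * np.where(X[:, j] >= t, 1, -1)
                    err = w[pred != y].sum()     # weighted error
                    if best is None or err < best[0]:
                        best = (err, j, t, s, pred)
        err, j, t, s, pred = best
        err = np.clip(err, 1e-10, 1 - 1e-10)     # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)    # this learner's vote weight
        w *= np.exp(-alpha * y * pred)           # up-weight misclassified samples
        w /= w.sum()                             # renormalise
        stumps.append((j, t, s))
        alphas.append(alpha)
    return stumps, alphas

def predict_adaboost(stumps, alphas, X):
    """Sign of the alpha-weighted vote of the stumps."""
    score = sum(a * s * np.where(X[:, j] >= t, 1, -1)
                for (j, t, s), a in zip(stumps, alphas))
    return np.sign(score)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])
stumps, alphas = fit_adaboost(X, y, n_rounds=3)
```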
How can we calculate the probability that a decision tree prediction is correct?
Each leaf (prediction) stores how many training samples of each class it contained; the predicted class probability is the fraction of the leaf's samples belonging to that class.
Describe the decision tree optimization for creating a feature space partition
At each node Sj:
for each feature:
for each value of this feature:
evaluate I(Sj, Aj)
choose the best feature and value for splitting
repeat
What are the two choices for categorizing tree optimization cost functions, I(Sj, Aj)?
1. Information Gain. 2. Gini Index.
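The greedy node-splitting search can be sketched as follows, using the Gini index as the cost I(Sj, Aj); `best_split` and `gini` are illustrative names:

```python
import numpy as np

def gini(y):
    """Gini impurity: 1 - sum_k p(y_k)^2."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / len(y)
    return 1.0 - (p ** 2).sum()

def best_split(X, y):
    """Try every feature and every observed value as a threshold; keep
    the split with the lowest weighted child impurity."""
    best = (None, None, np.inf)
    for j in range(X.shape[1]):                  # for each feature
        for t in np.unique(X[:, j]):             # for each value of this feature
            left, right = y[X[:, j] < t], y[X[:, j] >= t]
            if len(left) == 0 or len(right) == 0:
                continue                         # degenerate split
            cost = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if cost < best[2]:
                best = (j, t, cost)              # best feature and value so far
    return best

X = np.array([[1.0], [2.0], [8.0], [9.0]])
y = np.array([0, 0, 1, 1])
j, t, cost = best_split(X, y)
```

A real tree builder would apply this recursively to the two child nodes until a stopping criterion is met.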
Describe the information gain cost function, I(Sj, Aj)
I(Sj, Aj) = entropy of parent - weighted average entropy of children.
Entropy = - sum_j p(x_j) log(p(x_j))
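A minimal numeric check of the formula, with `entropy` and `information_gain` as illustrative helper names (log base 2, so gain is measured in bits):

```python
import numpy as np

def entropy(y):
    """Entropy = -sum_j p(x_j) log2 p(x_j) over the class frequencies."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / len(y)
    return float(-(p * np.log2(p)).sum())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted average entropy of the children."""
    n = len(parent)
    return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

# A perfectly separating split removes all uncertainty: gain = 1 bit.
parent = np.array([0, 0, 1, 1])
gain = information_gain(parent, [np.array([0, 0]), np.array([1, 1])])
```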
Describe the Gini Index
Indicates how mixed the classes are: perfect separation gives score 0, a 50/50 split gives score 0.5.
Gini = 1 - sum_k (p(y_k))^2
Final score = weighted average of the children's Ginis.
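A quick check of the two reference values (pure node gives 0, 50/50 node gives 0.5); the `gini` helper name is illustrative:

```python
import numpy as np

def gini(y):
    """Gini = 1 - sum_k p(y_k)^2; a split's score is the
    size-weighted average over the child nodes."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / len(y)
    return 1.0 - (p ** 2).sum()

pure = gini(np.array([1, 1, 1, 1]))    # perfect separation
mixed = gini(np.array([0, 0, 1, 1]))   # 50/50 split
```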
What loss do we normally use for regression trees?
Weighted average of the children's MSE.
What can we do to prevent overfitting in decision trees?
Combine them into an ensemble (forests).
For random forests, how do the following affect bias and variance? 1. Number of features selected per node 2. Number of trees 3. Max depth
1. Fewer features per node decorrelates the trees, lowering variance at the cost of slightly higher bias. 2. More trees lower variance (bias is unchanged). 3. Greater depth lowers bias but increases variance.
What is the difference between a random forest and a boosting algorithm with decision trees as weak learners?
In a random forest the trees are trained independently (in parallel) on bootstrap samples and the splitting is randomized; in boosting the trees are trained sequentially, with each tree focusing on the samples the previous trees misclassified.
Name some advantages and disadvantages of decision trees
Advantages:
Interpretable, fast to train and evaluate, handle both categorical and continuous features, need little data preprocessing.
Disadvantages:
A single tree overfits easily (high variance), is unstable to small changes in the data, and only produces axis-aligned splits.