Chapter 5.2 - Classification (Part II) - Kernel Methods Flashcards

(24 cards)

1
Q

What is SVM, and how does it relate to SVC?

A

2
Q

How do we use basis expansion in SVC optimisations to enlarge the space of features by including squared terms?

A
3
Q

If we use a basis expansion of cubic polynomials on (x^(1), x^(2)), how does the SVC optimisation problem change?

A
4
Q

What is the problem with specifying a large basis?

What method do we use to efficiently enlarge our feature space to accommodate a nonlinear boundary between classes?

What is the key idea about SVC we will be using?

A
5
Q

How can both the MMC and the SVC be represented in terms of inner products?

A
  • Complexity of one inner product in the original p-dimensional space = O(p) –> cost saving over computing in an explicitly expanded basis
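
As a sketch of the representation this card asks about, the usual textbook form is written out here. The symbols α_i, β_0 and the support-vector index set S are assumed notation, not taken from these notes:

```latex
\langle x, x' \rangle = \sum_{j=1}^{p} x^{(j)} x'^{(j)},
\qquad
f(x) = \beta_0 + \sum_{i=1}^{n} \alpha_i \langle x, x_i \rangle
     = \beta_0 + \sum_{i \in S} \alpha_i \langle x, x_i \rangle
```

The second equality holds because α_i is non-zero only for the support vectors, so the classifier needs only the inner products between the new point and those training points.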
6
Q

What is the informal idea behind using the kernel and its association with the inner product?

A
7
Q

How do we conduct SVM using a kernel?

A
  • Benefit of the kernel method: we don't need to know which high-dimensional feature space we are working in, or even compute the inner products there, as long as we have a function k(x,x’) that returns the inner product of the two points after they are mapped into the higher-dimensional space.
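
A minimal sketch of this point, assuming a degree-2 polynomial kernel on 2-dimensional inputs. The helper names phi and poly_kernel are hypothetical, chosen only for illustration. The kernel value computed in the original space should match the inner product of the explicitly expanded features:

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-d input (illustrative helper)."""
    x1, x2 = x
    s = np.sqrt(2.0)
    # The sqrt(2) factors make <phi(x), phi(z)> equal to (1 + <x, z>)^2.
    return np.array([1.0, s * x1, s * x2, x1**2, x2**2, s * x1 * x2])

def poly_kernel(x, z, degree=2):
    """Polynomial kernel evaluated directly in the original space."""
    return (1.0 + np.dot(x, z)) ** degree

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

explicit = float(np.dot(phi(x), phi(z)))  # inner product in the 6-d feature space
implicit = float(poly_kernel(x, z))       # same quantity, using only <x, z>

print(explicit, implicit)
```

The kernel route never builds the 6-dimensional vectors. For higher degrees or larger p the explicit map grows quickly, while the kernel still needs only one inner product in the original space.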
8
Q

Can we take the original ridge regression optimisation problem and represent it in the form of an inner product?

Thus, can we apply the kernel method under the ridge regression?

A
  • Feb 19th (2 ) March –> 37mins –> WORK through algebra
  • Matrix XX^T: every element is a pairwise inner product, (XX^T)_ij = <x_i, x_j>
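
The algebra can be checked numerically. This is a sketch under stated assumptions: ridge regression without an intercept, penalty lam, and simulated data (all variable names are illustrative). The primal solution uses X^T X, the dual solution uses only the Gram matrix XX^T, and both should give the same predictions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 20, 3, 0.5           # assumed sizes and ridge penalty
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
X_new = rng.normal(size=(5, p))

# Primal form: beta = (X^T X + lam I_p)^{-1} X^T y
beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
pred_primal = X_new @ beta

# Dual form: alpha = (X X^T + lam I_n)^{-1} y, with beta = X^T alpha
K = X @ X.T                      # Gram matrix of pairwise inner products
alpha = np.linalg.solve(K + lam * np.eye(n), y)
pred_dual = (X_new @ X.T) @ alpha  # needs only inner products with training points

print(np.max(np.abs(pred_primal - pred_dual)))
```

Because the dual form touches the data only through inner products, replacing X X^T and X_new X^T with kernel evaluations gives kernel ridge regression.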
9
Q

What is the summary of the intuition behind using the kernel?

So what question do we have when using the kernel?

A
10
Q

What is the mathematical setup, features and variables used in the kernel method?

(6 bullet points)

A
11
Q

How do we define the (real) vector space?

What is an inner product space? What properties does it have to satisfy?

What is the notion of “length” given by in this space?

What is a Hilbert space?

A
12
Q

What is the formal definition of a Kernel?

A
13
Q

What is the formal definition of a positive semi-definite kernel?

A
14
Q

What is the definition of a positive semi-definite matrix?

A

kernel <–> positive semi-definite kernel
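
A small numerical illustration, assuming a Gaussian kernel with an arbitrary bandwidth parameter gamma and random sample points (both are assumptions made for the sketch). A symmetric matrix K is positive semi-definite when c^T K c >= 0 for every vector c, which is equivalent to all eigenvalues being non-negative:

```python
import numpy as np

def gaussian_gram(X, gamma=0.5):
    """Gram matrix K_ij = exp(-gamma * ||x_i - x_j||^2) (illustrative helper)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negative rounding errors

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 2))
K = gaussian_gram(X)

eigvals = np.linalg.eigvalsh(K)          # eigenvalues of a symmetric matrix
is_symmetric = bool(np.allclose(K, K.T))
is_psd = bool(eigvals.min() >= -1e-10)   # allow for floating-point noise

c = rng.normal(size=8)
quad_form = float(c @ K @ c)             # should be non-negative for any c

print(is_symmetric, is_psd, quad_form)
```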

15
Q

Given k1 and k2 are kernels, what else is a kernel?

A
  • The third construction is the pointwise limit of a sequence of kernels
  • if k_i = a non-negative constant –> this is always a kernel
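
The closure properties can be spot-checked on Gram matrices. This sketch assumes the constructions in question include the sum, the product and non-negative scaling of kernels; the original answer is not preserved in these notes, so that list is an assumption. It checks one random sample and is not a proof:

```python
import numpy as np

def min_eig(K):
    """Smallest eigenvalue of the symmetrised matrix (illustrative helper)."""
    return float(np.linalg.eigvalsh((K + K.T) / 2.0).min())

rng = np.random.default_rng(2)
X = rng.normal(size=(10, 3))

K1 = X @ X.T                      # linear kernel Gram matrix
K2 = (1.0 + X @ X.T) ** 2         # degree-2 polynomial kernel Gram matrix

K_sum = K1 + K2                   # k1 + k2
K_prod = K1 * K2                  # k1 * k2 is the elementwise product of Gram matrices
K_scaled = 3.0 * K1               # c * k1 with c >= 0
K_const = np.full((10, 10), 2.0)  # constant kernel with a non-negative constant

for name, K in [("sum", K_sum), ("prod", K_prod),
                ("scaled", K_scaled), ("const", K_const)]:
    print(name, min_eig(K) >= -1e-8)
```

Note that the product of kernels corresponds to the elementwise product of the Gram matrices, not the matrix product.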
16
Q

What is the:

  • Linear kernel
  • polynomial kernel
  • Gaussian kernel
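
For reference, one common way to write these three kernels is given below. The parameter names c, d and σ are assumptions, and the course may use a different parameterisation (for example γ in place of 1/(2σ²)):

```latex
k_{\text{lin}}(x, x') = \langle x, x' \rangle,
\qquad
k_{\text{poly}}(x, x') = \bigl(c + \langle x, x' \rangle\bigr)^{d}, \quad c \ge 0,\ d \in \mathbb{N},
\qquad
k_{\text{Gauss}}(x, x') = \exp\!\left(-\frac{\lVert x - x' \rVert^{2}}{2\sigma^{2}}\right), \quad \sigma > 0
```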
17
Q

What are the three equivalent definitions of a Kernel: Moore-Aronszajn Theorem?

18
Q

How do we build towards the reproducing kernel Hilbert space (RKHS)?

A
  • We can construct the Hilbert spaces from any linear combinations of the φj
  • z = predictor
19
Q

What is the formal definition of a Reproducing Kernel Hilbert Space (RKHS)?

A
  • The reproducing kernel of any RKHS is a kernel
  • For any positive semi-definite kernel k, there exists a (unique) associated RKHS with k being its reproducing kernel (Moore–Aronszajn theorem).
20
Q

What is the Representer Theorem?

A
  • You are trying to find the f that minimises the empirical risk + a penalty term, where the penalty is a strictly increasing function of the squared norm of f –> this is done while working in an RKHS
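
Written out, a standard statement of the theorem looks as follows. The loss L, the penalty function Ω and the RKHS symbol H_k are assumed notation, not taken from these notes:

```latex
f^{*} \in \arg\min_{f \in \mathcal{H}_k}
\; \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr)
+ \Omega\bigl(\lVert f \rVert_{\mathcal{H}_k}^{2}\bigr),
\quad \Omega \text{ strictly increasing}
\;\;\Longrightarrow\;\;
f^{*}(\cdot) = \sum_{i=1}^{n} \alpha_i \, k(\cdot, x_i)
\ \text{ for some } \alpha_1, \dots, \alpha_n \in \mathbb{R}
```

The practical consequence is that the search over a possibly infinite-dimensional function space reduces to finding n coefficients.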
21
Q

With a kernel, what does the SVM classifier for a new observation depend on?

22
Q

What are four popular kernels?

23
Q

How does an SVM work with a radial kernel?

24
Q

What is the advantage of kernel methods?