Chapter 4.5: Regularisation – Extensions and Related Problems Flashcards

(8 cards)

1
Q

Can we use an lq penalty in place of the Lasso's l1 penalty?

A
  • To see the nature of the shrinkage, look at the contours of the lq penalty: for q > 1 the contours have no corners, so coefficients are shrunk but never set exactly to zero; for q < 1 the penalty does induce sparsity, but it is non-convex, so the optimisation becomes hard. q = 1 (the Lasso) is the only value that is both convex and gives exact zeros.
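A small numerical sketch of this contrast (illustrative only, not from the card): minimise the one-dimensional penalised objective 0.5*(b - z)^2 + lam*|b|^q on a grid and compare q = 1 with q = 2.

```python
import numpy as np

def penalised_minimiser(z, lam, q):
    """Grid-search minimiser of 0.5*(b - z)**2 + lam*|b|**q
    (a brute-force illustration of how q changes the shrinkage)."""
    grid = np.linspace(-5.0, 5.0, 100001)
    obj = 0.5 * (grid - z) ** 2 + lam * np.abs(grid) ** q
    return grid[np.argmin(obj)]

z = 0.4
b1 = penalised_minimiser(z, lam=1.0, q=1)  # l1: thresholded to (numerically) zero
b2 = penalised_minimiser(z, lam=1.0, q=2)  # l2: shrunk towards zero but non-zero
print(b1, b2)
```

The l1 penalty sets the small coefficient exactly to zero (sparsity), while the l2 penalty only shrinks it proportionally.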
2
Q

How do Lasso and Ridge compare when it comes to variable selection?

Is there any way to get the “best of both worlds”?

A
  • The Lasso can be unstable: from a pool of, say, 10 highly correlated variables it may select and shrink only two, when we might want to account for all 10 of them.

We can't use an lq penalty –> so instead use the Elastic net or the Group Lasso
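A hedged illustration of the contrast (data and parameter values are made up, not from the card): with two near-duplicate predictors, the Lasso tends to put its weight on one, while the elastic net's ridge part spreads the weight across both.

```python
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

rng = np.random.default_rng(0)
z = rng.normal(size=200)
# Two almost identical (highly correlated) predictors
X = np.column_stack([z + 0.01 * rng.normal(size=200),
                     z + 0.01 * rng.normal(size=200)])
y = z + 0.1 * rng.normal(size=200)

lasso = Lasso(alpha=0.1).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print(lasso.coef_)  # typically concentrated on one predictor
print(enet.coef_)   # weight shared much more evenly
```

The `alpha` and `l1_ratio` values here are arbitrary; in practice they would be chosen by cross-validation.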

3
Q

What is the elastic net penalty?

A

alpha controls how much weight you put on the l1 penalty versus the l2 penalty: the penalty is alpha * ||beta||_1 + (1 - alpha) * ||beta||_2^2, scaled by lambda (conventions differ on a factor of 1/2 for the ridge part)

  • So you are doing proportional shrinkage + soft thresholding: the Ridge part contributes the proportional shrinkage, while the Lasso part contributes the soft thresholding.
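A minimal numpy sketch of this, assuming an orthonormal design (where the elastic-net estimate has a closed form) and the convention penalty = lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||_2^2):

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def enet_orthonormal(beta_ols, lam, alpha):
    """Elastic-net estimate for an orthonormal design:
    soft-threshold (Lasso part), then shrink proportionally (Ridge part)."""
    return soft_threshold(beta_ols, lam * alpha) / (1.0 + lam * (1.0 - alpha))

beta_ols = np.array([3.0, -0.5, 1.2])
est = enet_orthonormal(beta_ols, lam=1.0, alpha=0.5)
print(est)  # [ 1.6667  0.      0.4667] approximately
```

Note how the middle coefficient is set exactly to zero (soft thresholding) while the others are also divided by 1.5 (proportional shrinkage).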
4
Q

What is the Group Lasso penalty?

A
  • Partition the variables into groups (e.g. groups of highly correlated variables); we want all variables in a group to be selected together (all non-zero estimates) or dropped together (all zero estimates)
  • So you select sparse groups (an l1, Lasso-style penalty across groups), but within each group the shrinkage behaves like Ridge regression (an l2 norm on the group's coefficients).
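The key computational step can be sketched as block soft-thresholding (an illustration of the group-wise update, assuming an orthonormal design within the group): the whole group is shrunk towards zero, and set exactly to zero when its l2 norm falls below the threshold.

```python
import numpy as np

def group_soft_threshold(z, t):
    """Block soft-thresholding: shrink a whole coefficient group towards
    zero, zeroing it entirely when its l2 norm is below the threshold t."""
    norm = np.linalg.norm(z)
    if norm <= t:
        return np.zeros_like(z)
    return (1.0 - t / norm) * z

big = group_soft_threshold(np.array([3.0, 4.0]), t=1.0)    # norm 5: kept, shrunk
small = group_soft_threshold(np.array([0.3, 0.4]), t=1.0)  # norm 0.5: zeroed
print(big, small)
```

This is exactly the "all in or all out" behaviour the card describes: the group survives or vanishes as a unit.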
5
Q

What is the Fused Lasso?

A
  • Encourages neighbouring variables to have similar, or exactly equal, coefficients
  • This is particularly useful when there is a natural ordering, or some partial ordering, of the variables
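As a hedged sketch (the function name and weights are illustrative, not from the card), the fused lasso penalty combines sparsity on the coefficients themselves with sparsity on the differences of neighbouring coefficients, which encourages piecewise-constant coefficient profiles:

```python
import numpy as np

def fused_lasso_penalty(beta, lam1, lam2):
    """Fused Lasso penalty: lam1 * sum |b_j|  (sparsity)
    + lam2 * sum |b_j - b_{j-1}|  (neighbouring coefficients fused)."""
    return lam1 * np.sum(np.abs(beta)) + lam2 * np.sum(np.abs(np.diff(beta)))

# A piecewise-constant profile pays little in the difference term:
flat = np.array([0.0, 2.0, 2.0, 2.0, 0.0])
pen = fused_lasso_penalty(flat, lam1=1.0, lam2=1.0)
print(pen)  # 6 + 4 = 10
```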
6
Q

What happens when you use an l2 penalty with logistic regression?

A
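The card's answer is blank; one standard fact (stated here as a fill-in, not taken from the card) is that when the classes are perfectly separable, the unpenalised logistic-regression coefficients diverge to infinity, while an l2 penalty keeps them finite. A minimal scikit-learn sketch:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Perfectly separable 1-D data: the unpenalised MLE does not exist
# (coefficients diverge), but an l2 penalty keeps them finite.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0, 0, 1, 1])

strong = LogisticRegression(penalty="l2", C=0.1).fit(X, y)    # strong penalty
weak = LogisticRegression(penalty="l2", C=1000.0).fit(X, y)   # weak penalty
print(strong.coef_, weak.coef_)  # weaker penalty -> larger (but finite) coefficient
```

In scikit-learn, `C` is the inverse of the penalty strength, so smaller `C` means more shrinkage.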
7
Q

How do we perform a regression without specifying a model order or parametric form?

A
  • Non-parametric regression: we do not need to commit to a hypothesis class up front
  • Allows very flexible function classes –> we still need some smoothness condition, e.g. requiring the functions to be continuous or differentiable
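One concrete non-parametric regressor (used here purely as an illustration, not necessarily the one the course has in mind) is the Nadaraya–Watson kernel smoother: predict with a locally weighted average of the training responses.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Nadaraya-Watson kernel regression with Gaussian weights:
    each prediction is a weighted average of nearby training responses."""
    # (n_query, n_train) matrix of kernel weights
    w = np.exp(-0.5 * ((x_query[:, None] - x_train[None, :]) / bandwidth) ** 2)
    return (w @ y_train) / w.sum(axis=1)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 2 * np.pi, 100))
y = np.sin(x) + 0.1 * rng.normal(size=100)

x_new = np.array([np.pi / 2, np.pi, 3 * np.pi / 2])
preds = nadaraya_watson(x, y, x_new, bandwidth=0.3)
print(preds)  # roughly [1, 0, -1], i.e. close to sin(x_new)
```

No parametric form for the regression function is ever specified; the bandwidth plays the role of the smoothness control.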
8
Q

What is one way of implementing non-parametric regression?

What are the special cases?

What do the smoothing splines minimise? What is the form of the unique minimiser?

A
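The card's answer is blank; the standard smoothing-spline result (stated here as a fill-in, not taken from the card) is that smoothing splines minimise a penalised residual sum of squares over twice-differentiable functions:

```latex
\min_{f} \; \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2
  \;+\; \lambda \int \bigl(f''(t)\bigr)^2 \, dt
```

The unique minimiser is a natural cubic spline with knots at the observed x_i. The special cases are the limits of lambda: lambda -> 0 gives an interpolating spline, and lambda -> infinity forces f'' = 0, giving the least-squares straight line.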