Chapter 4.5: Regularisation – Extensions and Related Problems Flashcards

(8 cards)

1
Q

Can we use an lq penalty in place of the Lasso's l1 penalty?

A
  • To see the nature of the shrinkage, look at the contours of the lq penalty: for q > 1 the contours have no corners, so coefficients are shrunk but never set exactly to zero; for q < 1 the penalty does induce sparsity, but it is non-convex, so the optimisation becomes hard. q = 1 (the Lasso) is the only value that is both convex and gives exact zeros.
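A small numerical sketch of this contrast (illustrative only, not from the card): minimise the one-dimensional penalised objective 0.5*(b - z)^2 + lam*|b|^q on a grid and compare q = 1 with q = 2.

```python
import numpy as np

def penalised_minimiser(z, lam, q):
    """Grid-search minimiser of 0.5*(b - z)**2 + lam*|b|**q
    (a brute-force illustration of how q changes the shrinkage)."""
    grid = np.linspace(-5.0, 5.0, 100001)
    obj = 0.5 * (grid - z) ** 2 + lam * np.abs(grid) ** q
    return grid[np.argmin(obj)]

z = 0.4
b1 = penalised_minimiser(z, lam=1.0, q=1)  # l1: thresholded to (numerically) zero
b2 = penalised_minimiser(z, lam=1.0, q=2)  # l2: shrunk towards zero but non-zero
print(b1, b2)
```

The l1 penalty sets the small coefficient exactly to zero (sparsity), while the l2 penalty only shrinks it proportionally.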
2
Q

How do Lasso and Ridge compare when it comes to variable selection?

Is there any way to get the “best of both worlds”?

A
  • The Lasso can be unstable: from a pool of, say, 10 highly correlated variables it may select and shrink only two, when we might want to account for all 10 of them.

We can't use an lq penalty –> so instead use the Elastic net or the Group Lasso
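A hedged illustration of the contrast (data and parameter values are made up, not from the card): with two near-duplicate predictors, the Lasso tends to put its weight on one, while the elastic net's ridge part spreads the weight across both.

```python
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

rng = np.random.default_rng(0)
z = rng.normal(size=200)
# Two almost identical (highly correlated) predictors
X = np.column_stack([z + 0.01 * rng.normal(size=200),
                     z + 0.01 * rng.normal(size=200)])
y = z + 0.1 * rng.normal(size=200)

lasso = Lasso(alpha=0.1).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print(lasso.coef_)  # typically concentrated on one predictor
print(enet.coef_)   # weight shared much more evenly
```

The `alpha` and `l1_ratio` values here are arbitrary; in practice they would be chosen by cross-validation.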

3
Q

What is the elastic net penalty?

A

alpha controls how much weight you put on the l1 penalty versus the l2 penalty: the penalty is alpha * ||beta||_1 + (1 - alpha) * ||beta||_2^2, scaled by lambda (conventions differ on a factor of 1/2 for the ridge part)

  • So you are doing proportional shrinkage + soft thresholding: the Ridge part contributes the proportional shrinkage, while the Lasso part contributes the soft thresholding.
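A minimal numpy sketch of this, assuming an orthonormal design (where the elastic-net estimate has a closed form) and the convention penalty = lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||_2^2):

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def enet_orthonormal(beta_ols, lam, alpha):
    """Elastic-net estimate for an orthonormal design:
    soft-threshold (Lasso part), then shrink proportionally (Ridge part)."""
    return soft_threshold(beta_ols, lam * alpha) / (1.0 + lam * (1.0 - alpha))

beta_ols = np.array([3.0, -0.5, 1.2])
est = enet_orthonormal(beta_ols, lam=1.0, alpha=0.5)
print(est)  # [ 1.6667  0.      0.4667] approximately
```

Note how the middle coefficient is set exactly to zero (soft thresholding) while the others are also divided by 1.5 (proportional shrinkage).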
4
Q

What is the Group Lasso penalty?

A
  • Partition the variables into groups (e.g. groups of highly correlated variables); we want all variables in a group to be selected together (all non-zero estimates) or dropped together (all zero estimates)
  • So you select sparse groups (an l1, Lasso-style penalty across groups), but within each group the shrinkage behaves like Ridge regression (an l2 norm on the group's coefficients).
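The key computational step can be sketched as block soft-thresholding (an illustration of the group-wise update, assuming an orthonormal design within the group): the whole group is shrunk towards zero, and set exactly to zero when its l2 norm falls below the threshold.

```python
import numpy as np

def group_soft_threshold(z, t):
    """Block soft-thresholding: shrink a whole coefficient group towards
    zero, zeroing it entirely when its l2 norm is below the threshold t."""
    norm = np.linalg.norm(z)
    if norm <= t:
        return np.zeros_like(z)
    return (1.0 - t / norm) * z

big = group_soft_threshold(np.array([3.0, 4.0]), t=1.0)    # norm 5: kept, shrunk
small = group_soft_threshold(np.array([0.3, 0.4]), t=1.0)  # norm 0.5: zeroed
print(big, small)
```

This is exactly the "all in or all out" behaviour the card describes: the group survives or vanishes as a unit.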
5
Q

What is the Fused Lasso?

A
  • Encourages neighbouring variables to have similar, or exactly equal, coefficients
  • This is particularly useful when there is a natural ordering, or some partial ordering, of the variables
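As a hedged sketch (the function name and weights are illustrative, not from the card), the fused lasso penalty combines sparsity on the coefficients themselves with sparsity on the differences of neighbouring coefficients, which encourages piecewise-constant coefficient profiles:

```python
import numpy as np

def fused_lasso_penalty(beta, lam1, lam2):
    """Fused Lasso penalty: lam1 * sum |b_j|  (sparsity)
    + lam2 * sum |b_j - b_{j-1}|  (neighbouring coefficients fused)."""
    return lam1 * np.sum(np.abs(beta)) + lam2 * np.sum(np.abs(np.diff(beta)))

# A piecewise-constant profile pays little in the difference term:
flat = np.array([0.0, 2.0, 2.0, 2.0, 0.0])
pen = fused_lasso_penalty(flat, lam1=1.0, lam2=1.0)
print(pen)  # 6 + 4 = 10
```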
6
Q

What happens when you use an l2 penalty with logistic regression?

A
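The card's answer is blank; one standard fact (stated here as a fill-in, not taken from the card) is that when the classes are perfectly separable, the unpenalised logistic-regression coefficients diverge to infinity, while an l2 penalty keeps them finite. A minimal scikit-learn sketch:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Perfectly separable 1-D data: the unpenalised MLE does not exist
# (coefficients diverge), but an l2 penalty keeps them finite.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0, 0, 1, 1])

strong = LogisticRegression(penalty="l2", C=0.1).fit(X, y)    # strong penalty
weak = LogisticRegression(penalty="l2", C=1000.0).fit(X, y)   # weak penalty
print(strong.coef_, weak.coef_)  # weaker penalty -> larger (but finite) coefficient
```

In scikit-learn, `C` is the inverse of the penalty strength, so smaller `C` means more shrinkage.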
7
Q

How do we perform a regression without specifying a model order or parametric form?

A
  • Non-parametric regression: we do not need to commit to a hypothesis class up front
  • Allows very flexible function classes –> we still need some smoothness condition, e.g. requiring the functions to be continuous or differentiable
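One concrete non-parametric regressor (used here purely as an illustration, not necessarily the one the course has in mind) is the Nadaraya–Watson kernel smoother: predict with a locally weighted average of the training responses.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Nadaraya-Watson kernel regression with Gaussian weights:
    each prediction is a weighted average of nearby training responses."""
    # (n_query, n_train) matrix of kernel weights
    w = np.exp(-0.5 * ((x_query[:, None] - x_train[None, :]) / bandwidth) ** 2)
    return (w @ y_train) / w.sum(axis=1)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 2 * np.pi, 100))
y = np.sin(x) + 0.1 * rng.normal(size=100)

x_new = np.array([np.pi / 2, np.pi, 3 * np.pi / 2])
preds = nadaraya_watson(x, y, x_new, bandwidth=0.3)
print(preds)  # roughly [1, 0, -1], i.e. close to sin(x_new)
```

No parametric form for the regression function is ever specified; the bandwidth plays the role of the smoothness control.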
8
Q

What is one way of implementing non-parametric regression?

What are the special cases?

What do the smoothing splines minimise? What is the form of the unique minimiser?

A
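The card's answer is blank; the standard smoothing-spline result (stated here as a fill-in, not taken from the card) is that smoothing splines minimise a penalised residual sum of squares over twice-differentiable functions:

```latex
\min_{f} \; \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2
  \;+\; \lambda \int \bigl(f''(t)\bigr)^2 \, dt
```

The unique minimiser is a natural cubic spline with knots at the observed x_i. The special cases are the limits of lambda: lambda -> 0 gives an interpolating spline, and lambda -> infinity forces f'' = 0, giving the least-squares straight line.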