Activation Functions Flashcards

(7 cards)

1
Q

Softmax

A
  • activation function
  • common in multi-class classification
  • transforms a vector of real numbers into a probability distribution, where each element represents the probability of the input belonging to a specific class.
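The transformation above can be sketched in a few lines of NumPy (the function name and sample vector are illustrative):

```python
import numpy as np

def softmax(x):
    # Subtract the max before exponentiating for numerical stability;
    # this does not change the result but avoids overflow on large inputs.
    e = np.exp(x - np.max(x))
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
# probs is non-negative and sums to 1; the largest logit gets the largest probability
```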
2
Q

ReLU

A
  • activation function
  • Rectified Linear Unit
  • Outputs input directly if it’s positive. Outputs zero otherwise.
  • ReLU(x) = max(0, x)
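The definition maps directly to NumPy (a minimal sketch):

```python
import numpy as np

def relu(x):
    # Elementwise max(0, x): positives pass through unchanged, negatives become 0.
    return np.maximum(0, x)

out = relu(np.array([-2.0, -0.5, 0.0, 3.0]))
# → array([0., 0., 0., 3.])
```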
3
Q

What are the benefits of ReLU?

A
  • Adds non-linearity to the network, enabling it to learn complex patterns in the data
  • Computationally inexpensive (just a threshold at zero)
  • Alleviates the vanishing gradient problem that can occur with sigmoid and tanh functions during backpropagation in deep networks
4
Q

Sigmoid

A
  • activation function
  • transforms any real input into an output between 0 and 1, which can be interpreted as a probability
  • often used as the output activation in binary classification
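A one-line sketch of the squashing behaviour (names are illustrative):

```python
import numpy as np

def sigmoid(x):
    # 1 / (1 + e^(-x)): maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

sigmoid(0.0)    # → 0.5 (the midpoint)
```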
5
Q

tanh

A
  • hyperbolic tangent
  • activation function
  • squashing function
  • maps a wide range of values to -1 to 1
6
Q

Which is faster to train: sigmoid or tanh?

A
  • tanh, because it is zero-centered, so its outputs don't push the next layer's inputs toward one sign
  • tanh also has a steeper gradient around the center than sigmoid, which can mitigate the vanishing gradient problem during backpropagation
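The steeper-gradient claim can be checked numerically, using the standard identities σ'(x) = σ(x)(1 − σ(x)) and tanh'(x) = 1 − tanh²(x):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Slopes at the center (x = 0), where both functions are steepest:
sig_grad = sigmoid(0.0) * (1.0 - sigmoid(0.0))  # = 0.25
tanh_grad = 1.0 - np.tanh(0.0) ** 2             # = 1.0, four times steeper
```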
7
Q

Vanishing gradients

A

occurs when the gradients used to update neural network weights during training become extremely small as they propagate backward through the network, leaving the earliest layers with little or no learning signal
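Why the gradients shrink: the chain rule multiplies one local derivative per layer, and sigmoid's derivative never exceeds 0.25, so even in the best case the product decays geometrically with depth (a sketch):

```python
# Best case for sigmoid: a derivative of 0.25 at every layer.
max_sigmoid_grad = 0.25
for depth in (5, 10, 20):
    g = max_sigmoid_grad ** depth
    print(f"depth {depth:2d}: gradient factor {g:.1e}")
```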
