Hinge loss
Hinge loss is a loss function used primarily with Support Vector Machine (SVM) models. It penalizes instances that are misclassified or that fall inside the margin, and minimizing it helps maximize the margin between the decision boundary and the closest instances from different classes in the training dataset.
Hinge loss is used for “maximum-margin” classification, most notably for support vector machines (SVMs). It encourages the model to classify instances correctly and, at the same time, to keep the decision boundary at a distance from the training instances.
For a true label y in {-1, +1} and a raw model output f(x), hinge loss is mathematically defined as max(0, 1 - t), where t = y * f(x) is the signed margin. If the instance is on the correct side and outside the margin (t >= 1), the loss is zero. If the instance is on the correct side but inside the margin (0 <= t < 1), or on the wrong side (t < 0), the loss is 1 - t, which grows linearly as the instance moves further past the margin boundary.
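As a minimal sketch of this definition, the function below evaluates the loss for a single instance. The name hinge_loss and the example scores are illustrative assumptions, not part of any library.

```python
def hinge_loss(y: int, score: float) -> float:
    """Hinge loss for one instance.

    y is the true label (-1 or +1) and score is the raw model output f(x).
    """
    margin = y * score            # t = y * f(x)
    return max(0.0, 1.0 - margin)


print(hinge_loss(+1, 2.0))   # correct side, outside the margin -> 0.0
print(hinge_loss(+1, 0.5))   # correct side, inside the margin  -> 0.5
print(hinge_loss(+1, -1.0))  # wrong side                       -> 2.0
```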
In a linear SVM, the model output is f(x) = w · x + b: the dot product between the instance vector x and the model's weight vector w, offset by the bias term b. The SVM learning algorithm adjusts the weights and bias to minimize a combination of the hinge loss over the training set and a regularizing term, typically proportional to the squared norm of w, that encourages small weights.
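The objective described here can be sketched as follows for a linear model. This is one common form (mean hinge loss plus an L2 penalty); the function names and the regularization strength lam are assumptions chosen for illustration, and other scalings of the two terms are also in use.

```python
def decision_function(w, b, x):
    """Raw model output f(x) = w . x + b for a linear model."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def svm_objective(w, b, X, y, lam=0.01):
    """Mean hinge loss plus an L2 penalty on the weights.

    lam is a hypothetical regularization strength; labels in y are -1 or +1.
    """
    data_term = sum(
        max(0.0, 1.0 - yi * decision_function(w, b, xi))
        for xi, yi in zip(X, y)
    ) / len(X)
    reg_term = 0.5 * lam * sum(wi * wi for wi in w)
    return data_term + reg_term
```

Larger values of lam favor smaller weights (a wider margin) at the cost of tolerating more margin violations.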
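Hinge loss allows for efficient computation and optimization, and it promotes sparsity in the solution: every instance that lies outside the margin on the correct side has exactly zero loss and does not influence the learned model. The decision boundary is determined only by the remaining instances, the support vectors. This sparsity is in the set of training instances the model depends on, not in the weight vector itself, whose entries are generally nonzero. For kernel SVMs in particular, it can make the resulting model compact and efficient for prediction.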
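Hinge loss is not differentiable at t = 1, where its slope changes abruptly from -1 to 0, which can pose problems for optimization algorithms that require a gradient everywhere. However, the loss is convex, so this issue is typically addressed with sub-gradient descent, which uses any valid sub-gradient at the kink, or by solving the equivalent constrained quadratic program.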
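A minimal sketch of batch sub-gradient descent on a linear model with mean hinge loss and an L2 penalty is shown below. The function name, the hyperparameters lam, lr and epochs, and the choice of 0 as the sub-gradient at the kink t = 1 are all assumptions made for illustration; practical solvers use more refined step-size schedules.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Batch sub-gradient descent for a linear model with hinge loss.

    Labels in y must be -1 or +1. lam, lr and epochs are hypothetical
    hyperparameters, not tuned values.
    """
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]   # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            # Sub-gradient of max(0, 1 - t): -y * x when t < 1, else 0.
            # At the kink t == 1 we pick 0, which is a valid sub-gradient.
            if yi * score < 1.0:
                for j in range(n_features):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Toy, linearly separable data (made up for this example).
X = [[2.0, 1.0], [3.0, 2.0], [-2.0, -1.0], [-3.0, -2.0]]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)
```

Instances with t >= 1 contribute nothing to the update, so only margin violators move the weights.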
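Unlike mean squared error or cross-entropy loss, which assign a nonzero loss even to instances that are classified correctly and confidently, hinge loss is exactly zero for any instance classified correctly with a margin of at least 1. Once an instance is safely on the correct side, it no longer affects training. This property allows SVMs to focus on the hardest instances: those near the decision boundary or on the wrong side of it.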
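Hinge loss can also be used for multi-class classification tasks. In one common formulation, the loss is defined with respect to the score of the correct class and the maximum score among the incorrect classes: it is zero when the correct class outscores every other class by at least 1, and otherwise grows linearly with the shortfall.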
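The multi-class variant described above can be sketched as max(0, 1 + s_wrong - s_correct), where s_correct is the score of the true class and s_wrong is the highest score among the other classes. The function name and the example scores are illustrative assumptions; other multi-class variants, such as summing over all incorrect classes, also exist.

```python
def multiclass_hinge_loss(scores, correct):
    """Multi-class hinge loss for one instance.

    scores holds one raw score per class; correct is the index of the
    true class. At least two classes are required.
    """
    best_wrong = max(s for j, s in enumerate(scores) if j != correct)
    return max(0.0, 1.0 + best_wrong - scores[correct])


print(multiclass_hinge_loss([3.0, 1.0, 0.5], correct=0))  # 0.0
print(multiclass_hinge_loss([2.0, 1.5, 0.0], correct=0))  # 0.5
print(multiclass_hinge_loss([1.0, 2.0, 0.5], correct=0))  # 2.0
```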