What is the difference between the output of a linear regression model and a softmax regression model?
Linear regression - a single scalar value: the predicted value of the dependent variable given the input features.
Softmax regression - a probability distribution over all possible classes; the predicted class is usually the one with the highest probability.
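To make the contrast concrete, here is a minimal sketch in plain Python. The function names, weights, biases, and feature values are all made up for illustration and are not taken from any particular library or dataset. It assumes the usual setup in which softmax regression computes one linear score per class and then normalizes those scores with softmax.

```python
import math

def linear_regression_output(weights, bias, features):
    """Return a single scalar: the weighted sum of the features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def softmax(scores):
    """Turn raw per-class scores into non-negative values that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_regression_output(weight_rows, biases, features):
    """Return one probability per class: softmax over per-class linear scores."""
    scores = [
        linear_regression_output(w, b, features)
        for w, b in zip(weight_rows, biases)
    ]
    return softmax(scores)

# Hypothetical inputs and parameters, chosen only for illustration.
features = [1.0, 2.0]

# Linear regression: one number comes out.
y_hat = linear_regression_output([0.5, -0.25], 0.1, features)

# Softmax regression with three classes: one probability per class comes out.
probs = softmax_regression_output(
    [[0.5, -0.25], [0.1, 0.3], [-0.2, 0.0]],
    [0.1, 0.0, 0.2],
    features,
)

# The predicted class is the index of the largest probability.
predicted_class = max(range(len(probs)), key=probs.__getitem__)

print(y_hat)
print(probs, predicted_class)
```

Here y_hat is a single float, while probs is a list with one entry per class whose entries sum to 1.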
Why must logits be passed to the softmax function?