MAE RMSE RSE RAE R2
The five Metrics for evaluating Regression model Performance for a completed job
Two of these can be applied to models with different units
Relative Squared Error and Relative Absolute Error, since they are relative (unitless) measures
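A minimal sketch of computing all five metrics with numpy (the array names y_true and y_pred are hypothetical; definitions follow the standard formulas):

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, RSE, RAE, and R^2 from actual vs predicted values.

    y_true and y_pred are hypothetical 1-D numpy arrays."""
    errors = y_true - y_pred
    mean_true = y_true.mean()

    mae = np.abs(errors).mean()                                      # Mean Absolute Error
    rmse = np.sqrt((errors ** 2).mean())                             # Root Mean Squared Error
    rse = (errors ** 2).sum() / ((y_true - mean_true) ** 2).sum()    # Relative Squared Error
    rae = np.abs(errors).sum() / np.abs(y_true - mean_true).sum()    # Relative Absolute Error
    r2 = 1 - rse                                                     # Coefficient of determination
    return {"MAE": mae, "RMSE": rmse, "RSE": rse, "RAE": rae, "R2": r2}
```

Note that with these definitions R2 is just 1 - RSE, and RSE/RAE stay comparable across targets with different units because the errors are divided by the spread of the actual values.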
TP TN FP FN
The Confusion Matrix found in Evaluation Results for a completed Classification job
The Confusion Matrix shows (for binary results):
- True Positives - Predicted and actual values were both 1. Top Left
- True Negatives - Predicted and actual values were both 0. Bottom Right
- False Positives - The prediction is 1, but the actual value is 0. Top Right
- False Negatives - The prediction is 0, but the actual value is 1. Bottom Left
For multi-class results with N possible classes, there is an N x N matrix counting the results for each predicted/actual combination.
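A minimal sketch with scikit-learn (the label arrays are made up for illustration); passing labels=[1, 0] orders the matrix so TP lands top-left and TN bottom-right, matching the card:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical binary labels: 1 = positive, 0 = negative
y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]
y_predicted = [1, 0, 0, 1, 1, 0, 1, 0]

cm = confusion_matrix(y_actual, y_predicted, labels=[1, 0])
tp, fn = cm[0]          # row for actual = 1
fp, tn = cm[1]          # row for actual = 0
print(cm)               # [[3 1]
                        #  [1 3]]
print(tp, tn, fp, fn)   # 3 3 1 1
```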
A P R F1
The four metrics derived from the results of a Confusion Matrix for Classification Models
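A quick sketch of the four metrics computed straight from confusion-matrix counts (the count arguments are hypothetical):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, Precision, Recall, and F1 from confusion-matrix counts."""
    accuracy  = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)            # of everything predicted 1, how much was right
    recall    = tp / (tp + fn)            # of everything actually 1, how much was found
    f1        = 2 * precision * recall / (precision + recall)   # harmonic mean of P and R
    return accuracy, precision, recall, f1
```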
For metrics related to Classification Models, the one that is the most intuitive, but is potentially misleading
The most intuitive is Accuracy, though care must be taken when using it to judge how well the model actually works. Ex. 3% of the population has cold sores. Your model could ALWAYS predict 0 and it would be 97% accurate. Notice that this model isn't at all helpful in predicting cold sores in people?
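A small sketch of that cold-sore example (the data is made up): an always-zero "model" scores 97% accuracy while never finding a single positive case.

```python
import numpy as np

# 1,000 people, 3% of whom actually have cold sores
y_actual = np.zeros(1000, dtype=int)
y_actual[:30] = 1

# A useless model that ALWAYS predicts 0
y_pred = np.zeros(1000, dtype=int)

accuracy = (y_pred == y_actual).mean()
recall = (y_pred[y_actual == 1] == 1).mean()
print(accuracy)   # 0.97 -- looks great
print(recall)     # 0.0  -- catches zero actual positives
```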
The ROC Curve and AUC Metric for evaluating Classification Model results.
The ROC Curve - Receiver Operating Characteristic Curve. This is the plot of the False Positive Rate (x-axis) vs Recall/the True Positive Rate (y-axis) for every possible threshold value between 0 and 1. Ideally it goes all the way up the left side and then curves across the top.
The AUC Metric - Area Under the ROC Curve. Measures the quality of the model's predictions irrespective of what classification threshold is chosen. The larger this area is, the better the model is performing. Imagine a pure coin flip; 50% right, 50% wrong. The graph for this would be a straight diagonal line from the origin, y = x. The area UNDER it is 0.5. The more this area grows (the more the curve bows upward), the better.
ELI5: the model's ability to distinguish between the Positive and Negative classes.
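A minimal sketch with scikit-learn (the labels and scores are hypothetical) that produces the ROC curve points and the AUC:

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical actual labels and predicted probabilities for class 1
y_actual = [0, 0, 1, 1, 0, 1, 0, 1]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]

# FPR/TPR pairs for every threshold implied by the scores
fpr, tpr, thresholds = roc_curve(y_actual, y_scores)
auc = roc_auc_score(y_actual, y_scores)

print(auc)   # 0.5 would be a coin flip; the closer to 1.0, the better
```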
For evaluating Classification Model results:
The True Positive Rate formula
The False Positive Rate formula
True Positive Rate - TPR = TP / (TP + FN)
FN meaning predicted negative (0) but actually positive (1)
False Positive Rate - FPR = FP / (FP + TN)
TN meaning predicted negative (0) and actually negative (0)
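A quick sketch of both formulas (the counts are hypothetical):

```python
def tpr_fpr(tp, fn, fp, tn):
    """True Positive Rate (Recall) and False Positive Rate from counts."""
    tpr = tp / (tp + fn)   # fraction of actual positives the model caught
    fpr = fp / (fp + tn)   # fraction of actual negatives wrongly flagged as positive
    return tpr, fpr

print(tpr_fpr(tp=80, fn=20, fp=10, tn=90))   # (0.8, 0.1)
```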
The Residual Histogram for evaluating Regression/Forecasting
Remember, residuals are the prediction errors (actual minus predicted), and the histogram plots the error values against the frequency with which they occur. A "good" chart shows that most errors happen near zero, meaning the majority of error values are very close to or at zero, while larger error values (either negative or positive) appear with low to no frequency at either end of the chart. If you see the opposite, then your model is shit.
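A minimal sketch with matplotlib (y_true and y_pred are hypothetical arrays) of what that histogram looks like for a reasonable model:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical actual and predicted values; errors are small and centered on 0
rng = np.random.default_rng(0)
y_true = rng.normal(100, 20, size=500)
y_pred = y_true + rng.normal(0, 5, size=500)

residuals = y_true - y_pred
plt.hist(residuals, bins=30)
plt.xlabel("Residual (actual - predicted)")
plt.ylabel("Frequency")
plt.title("Residual Histogram - most errors should cluster near zero")
plt.show()
```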
Predicted vs True Chart for evaluating Regression/Forecasting
Chart with a dotted line representing "ideal" predictions, compared to a line showing your model's average predicted values against the true values. A "good" chart shows these two lines as close together as possible.
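A minimal sketch with matplotlib (the data is made up) of a predicted-vs-true chart: the model's average predictions per bin of true values plotted against the dotted "ideal" line.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
y_true = rng.uniform(0, 100, size=300)           # hypothetical actual values
y_pred = y_true + rng.normal(0, 8, size=300)     # hypothetical model predictions

# Average predicted value within bins of true values (the model's line on the chart)
bins = np.linspace(y_true.min(), y_true.max(), 11)
centers, avg_pred = [], []
for lo, hi in zip(bins[:-1], bins[1:]):
    mask = (y_true >= lo) & (y_true < hi)
    if mask.any():
        centers.append((lo + hi) / 2)
        avg_pred.append(y_pred[mask].mean())
plt.plot(centers, avg_pred, label="average predicted")

# Dotted "ideal" line: predicted == true
lims = [y_true.min(), y_true.max()]
plt.plot(lims, lims, linestyle="--", color="black", label="ideal")
plt.xlabel("True value")
plt.ylabel("Predicted value")
plt.legend()
plt.show()
```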
The Forecast Horizon Chart for evaluating Time Series Forecasting.
The x and y axis represent…
Each part of the chart left of and right of the y-axis represents…
Plots the relationship between predicted values and actual values mapped over time, per cross-validation fold (up to 5 folds).
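A minimal sketch with matplotlib (the time series and folds are entirely made up) of plotting predicted vs actual values over time for a few cross-validation folds:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
time = np.arange(60)                                           # hypothetical time index
actual = 50 + 10 * np.sin(time / 6) + rng.normal(0, 1, size=60)

plt.plot(time, actual, color="black", label="actual")

# Hypothetical folds: each forecasts a later 10-step horizon
folds = [(30, 40), (40, 50), (50, 60)]
for i, (start, end) in enumerate(folds, 1):
    horizon = time[start:end]
    predicted = actual[start:end] + rng.normal(0, 2, size=end - start)
    plt.plot(horizon, predicted, label=f"fold {i} predicted")

plt.xlabel("Time")
plt.ylabel("Value")
plt.legend()
plt.show()
```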