What are the two main types of problems?
Regression and Classification
Explain regression type problems
Predicting a quantity: when an independent variable changes, by how much does the dependent variable change?
Explain classification type problems
Predicting a label: into which category does ‘x’ fall?
Categories of data
Training Data
Test Data
Validation Data
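How do you decide which data to use for training vs testing, and what’s the process called?
Cross-validation
Split the data into sections and let each section take a turn as the test data while the rest is used for training; 10 sections is a common choice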
For regression problems, how do you know if you should use a straight line through the graph or follow the points?
For each model, take the differences between the predicted and actual values, total them up (usually squared, so positive and negative differences don’t cancel), and pick the model with the smallest total
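A minimal sketch of that comparison, using made-up numbers. The "follow the points" model here is a nearest-point predictor, which is only one assumed way to follow the points, and the errors are totalled as squared differences on separate test data (both are choices made for this example, not something the notes specify).

```python
def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_point_model(xs, ys):
    """'Follow the points': predict the y of the closest training x."""
    def predict(x):
        i = min(range(len(xs)), key=lambda j: abs(xs[j] - x))
        return ys[i]
    return predict


def sum_squared_error(predict, xs, ys):
    """Total of squared differences between predicted and actual values."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented for illustration only.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [1.2, 1.9, 3.4, 3.8, 5.1]
test_x = [1.5, 2.5, 3.5, 4.5]
test_y = [1.5, 2.5, 3.5, 4.5]

slope, intercept = fit_line(train_x, train_y)


def line(x):
    return slope * x + intercept


wiggly = nearest_point_model(train_x, train_y)

line_error = sum_squared_error(line, test_x, test_y)
wiggly_error = sum_squared_error(wiggly, test_x, test_y)

# Pick the model with the smallest total difference on the test data.
best = "straight line" if line_error <= wiggly_error else "follow the points"
print(best, line_error, wiggly_error)
```

With these numbers the point-following model has zero error on the training data but a larger total on the test data, so the straight line is picked.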
What is a trend line?
The line that goes through the center of the training data
What is ‘Bias-Variance Tradeoff’?
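The balance between a model too simple to capture the real relationship (high bias) and a model that fits the Training Data well but makes poor predictions on new data (high variance, also called overfitting)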
What is cross validation?
Allows us to compare many different ML methods by training and testing each one on different portions of the data
What’s another word for ‘estimating the parameters’?
Training the algorithm
What do you call it if you split your data up into 10 chunks?
Ten-Fold Cross Validation
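A rough sketch of Ten-Fold Cross Validation in plain Python. The helper names (k_fold_indices, cross_validate, fit_mean) and the data are hypothetical, the "model" just predicts the training mean, and the chunks are taken in order; in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k nearly equal consecutive chunks."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, fit, error, k=10):
    """Average test error over k chunks; each chunk is the test data once."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)  # "train the algorithm" on 9 chunks
        fold_error = sum(error(model(xs[i]), ys[i]) for i in test_idx)
        scores.append(fold_error / len(test_idx))
    return sum(scores) / len(scores)


def fit_mean(train_x, train_y):
    """Hypothetical stand-in model: always predict the mean training y."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y


def squared_error(predicted, actual):
    return (predicted - actual) ** 2


# Made-up data: 20 samples, so each of the 10 chunks holds 2 samples.
xs = list(range(20))
ys = [2.0 * x for x in xs]
score = cross_validate(xs, ys, fit_mean, squared_error, k=10)
print(score)
```

Running cross_validate once per ML method, on the same chunks, gives one score per method to compare.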
What is ‘Leave One Out Cross Validation’?
Each individual sample takes a turn as the test data: train on all the other samples, test on the one left out, and repeat for every sample
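A small sketch of Leave One Out Cross Validation. The function names, the mean-predicting stand-in model and the four data points are all invented for illustration.

```python
def leave_one_out(xs, ys, fit, error):
    """Test on each sample in turn, training on all the others."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        errors.append(error(model(xs[i]), ys[i]))
    return sum(errors) / len(errors)


def fit_mean(train_x, train_y):
    """Hypothetical stand-in model: always predict the mean training y."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y


def squared_error(predicted, actual):
    return (predicted - actual) ** 2


# Made-up data: 4 samples means 4 rounds of train-and-test.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
loo_error = leave_one_out(xs, ys, fit_mean, squared_error)
print(loo_error)
```

This is the same idea as splitting into chunks, except the number of chunks equals the number of samples, so the model is trained once per sample.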
Confusion Matrix
Splits predictions into false positives, false negatives, true positives and true negatives
‘Negative’ means the model predicted 0 and ‘Positive’ means it predicted 1
‘True’ means the model got it right and ‘False’ means it got it wrong
The cells where the predicted and actual categories match (the diagonal) hold the number of correct predictions
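A short sketch of counting the four cells by hand. The labels are made up and confusion_counts is a hypothetical helper name, not a library function.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for labels where 1 is positive and 0 is negative."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            # Model said positive: right if actual is 1, wrong otherwise.
            counts["TP" if a == 1 else "FP"] += 1
        else:
            # Model said negative: right if actual is 0, wrong otherwise.
            counts["TN" if a == 0 else "FN"] += 1
    return counts


# Made-up labels for illustration.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(actual, predicted)
correct = counts["TP"] + counts["TN"]  # the cells where predicted matches actual
print(counts, correct)
```

Here 6 of the 8 predictions are correct (3 true positives and 3 true negatives), with 1 false positive and 1 false negative.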