Famous datasets
Types of datasets
Common issues with datasets
Augmenting datasets
Baseline
= simple, easily implemented and understood model that illustrates the pb and the ‘worst case scenario’ for a model that learns nothing
- nearest neighbor (1 or k) is usually a good baseline