TIDY data
-Filter
-Transform
-Aggregate
-Sort
-Join/Merge (Inner/Outer)
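A minimal sketch of the operations above, using only plain Python lists and dicts. The sample orders, the customer lookup, and the threshold of 30 are all made up for illustration; in practice a library such as pandas would usually do this work.

```python
from collections import defaultdict

# Hypothetical sample data, invented for this example
orders = [
    {"order_id": 1, "customer_id": "A", "amount": 120.0},
    {"order_id": 2, "customer_id": "B", "amount": 35.5},
    {"order_id": 3, "customer_id": "A", "amount": 60.0},
    {"order_id": 4, "customer_id": "C", "amount": 15.0},
]
customers = {"A": "Alice", "B": "Bob"}  # customer C is deliberately absent

# Filter: keep orders of at least 30 (assumed threshold)
filtered = [o for o in orders if o["amount"] >= 30]

# Transform: add a derived field (amount in cents)
transformed = [{**o, "amount_cents": round(o["amount"] * 100)} for o in filtered]

# Aggregate: total amount per customer
totals = defaultdict(float)
for o in transformed:
    totals[o["customer_id"]] += o["amount"]

# Sort: customers by total, largest first
ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Inner join: keep only orders whose customer exists in the lookup
inner = [
    {**o, "name": customers[o["customer_id"]]}
    for o in orders
    if o["customer_id"] in customers
]

# Left outer join: keep every order, with None where there is no match
outer = [{**o, "name": customers.get(o["customer_id"])} for o in orders]
```

The inner join drops the order for customer C, while the outer join keeps it with a missing name. That difference is the main thing to watch when merging.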
Common Data Problems
-Data sits in different source systems
-Messy: missing values, invalid values, errors, different levels
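One possible way to flag missing and invalid values before analysis, sketched for a single column. The function name `check_age`, the sample values, and the plausible range of 0 to 120 are assumptions for illustration only.

```python
def check_age(value):
    """Classify a raw age value as 'missing', 'invalid', or 'ok'."""
    # Missing: nothing was recorded at all
    if value is None or str(value).strip() == "":
        return "missing"
    # Invalid: something was recorded but it is not a whole number
    try:
        age = int(str(value).strip())
    except ValueError:
        return "invalid"
    # Invalid: a number outside the assumed plausible range
    return "ok" if 0 <= age <= 120 else "invalid"


# Hypothetical raw values as they might arrive from a source system
raw_ages = ["34", "", None, "abc", "-5", " 61 "]
report = [check_age(v) for v in raw_ages]
```

Counting how many values fall into each bucket gives a quick picture of how messy a column is before deciding whether to fix, drop, or fill the bad rows.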
Common Data Types
Flat File
-Not used as much now
-Each field is placed in a fixed position (e.g. the first 5 bytes of each record)
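A small sketch of reading one fixed-position record by slicing. The layout (field names and offsets) is hypothetical, since a real file comes with its own layout specification. It also assumes plain ASCII text, so one character equals one byte.

```python
# Hypothetical layout: (field name, start offset, end offset), end exclusive
LAYOUT = [("id", 0, 5), ("name", 5, 15), ("amount", 15, 22)]


def parse_fixed_width(line):
    """Cut one record into fields by position and trim the padding."""
    return {name: line[start:end].strip() for name, start, end in LAYOUT}


# Build a sample record: 5-char id, name padded to 10 chars, 7-char amount
line = "00042" + "Jane Doe".ljust(10) + "0012.50"
record = parse_fixed_width(line)
```

There are no separators in the record, so the positions alone decide where each field starts and ends. Values come back as strings and still need converting to numbers or dates.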
CSV
-Values are separated by commas (very common, as many systems can export a CSV)
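A minimal sketch of reading CSV text with Python's standard `csv` module. The sample rows are made up. Splitting on commas by hand breaks when a value itself contains a comma, which is why a CSV parser is used here.

```python
import csv
import io

# Hypothetical CSV content; the first name contains a comma, so it is quoted
text = 'id,name,city\n1,"Doe, Jane",Boston\n2,John Smith,Austin\n'

# DictReader uses the header row as the keys for each record
rows = list(csv.DictReader(io.StringIO(text)))

# Naive splitting gets the quoted field wrong (4 pieces instead of 3)
naive = text.splitlines()[1].split(",")
```

For a file on disk, the same reader can be given a file opened with `open(path, newline="")` in place of the `io.StringIO` object.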
Delimited File
-Pipe (example is Airbnb)
-Tab, etc.
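Other delimiters can be read with the same `csv` module by changing the `delimiter` argument. The helper name `read_delimited` and the sample listing rows are hypothetical and do not reflect any real export format.

```python
import csv
import io


def read_delimited(text, delimiter):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# The same made-up record, once pipe-delimited and once tab-delimited
pipe_text = "listing_id|city|price\n101|Lisbon|85\n"
tab_text = "listing_id\tcity\tprice\n101\tLisbon\t85\n"

pipe_rows = read_delimited(pipe_text, "|")
tab_rows = read_delimited(tab_text, "\t")
```

Both calls return the same records, so the delimiter is only a detail of how the file is written, not of the data itself.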
Proprietary (e.g. SAS, SPSS, Workday, etc.)