Why might some data be outliers?
They could be accurate but extreme values
They could have been recorded incorrectly
Things to look out for
Outliers
Missing data
Written in the wrong format or order
Different units or symbols used
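A minimal Python sketch of checking a list of entries for the problems above. The function name find_problems, the sample heights and the sensible range of 100 to 220 are all made up for illustration, and the range check is only a rough way to flag a possible outlier, not a formal test.

```python
import re

def find_problems(raw_values, low, high):
    """Return (position, value, problem) for every entry that needs attention."""
    problems = []
    for i, value in enumerate(raw_values):
        text = str(value).strip()
        if text == "" or text.lower() in {"none", "n/a", "-"}:
            problems.append((i, value, "missing"))
        elif not re.fullmatch(r"-?\d+(\.\d+)?", text):
            # Anything that is not a plain number: units, symbols, wrong format
            problems.append((i, value, "units, symbols or wrong format"))
        elif not (low <= float(text) <= high):
            # Outside the range we assumed was sensible: check it, do not delete it
            problems.append((i, value, "possible outlier"))
    return problems

# Hypothetical raw heights, meant to be in cm
heights = ["152", "1.49m", "", "148", "1520", "150cm"]
issues = find_problems(heights, low=100, high=220)
for position, value, problem in issues:
    print(position, repr(value), problem)
```

A flagged value is only a prompt to look again: "1520" here could be a genuine mistake (152.0 typed without the decimal point) and needs checking against the original record.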
What is data called when it still has problems in it?
Raw data
What does cleaning the data mean?
Fixing problems in the raw data: removing or correcting inaccuracies and missing data, dealing with genuine outliers, putting all the data in the same format, and removing units and symbols
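A rough Python sketch of what cleaning could look like for a list of heights. It assumes that a plain number is already in cm and that "m" means metres; the function name clean and the sample values are invented for illustration.

```python
import re

def clean(raw_values):
    """Return every value as a number in cm; None marks an entry that could not be fixed."""
    cleaned = []
    for value in raw_values:
        text = str(value).strip().lower()
        # A number, optionally followed by a unit (assumption: only cm or m appear)
        match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(cm|m)?", text)
        if match is None:
            cleaned.append(None)  # missing or unreadable entry
            continue
        number = float(match.group(1))
        if match.group(2) == "m":
            number = round(number * 100, 1)  # convert metres to cm
        cleaned.append(number)
    return cleaned

raw = ["152", "1.49m", "", "148 cm", "150cm"]  # hypothetical raw data
cleaned = clean(raw)
usable = [v for v in cleaned if v is not None]  # leave out the missing entry
print(cleaned)
print(usable)
```

After this every usable value is a plain number in the same unit, so the values can be compared and used in calculations.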
What might removing the outliers do?
It could lead to inaccurate results, so it is better to try to record the data again, although this may be time consuming
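A small Python illustration of why removing an outlier can mislead. The salaries are made-up values in thousands, and 95 is assumed to be accurate but extreme.

```python
from statistics import mean

# Made-up salaries in thousands; 95 is a genuine value, not a recording error
salaries = [21, 23, 24, 25, 26, 95]

with_outlier = mean(salaries)          # uses all six values
without_outlier = mean(salaries[:-1])  # drops the 95

print(with_outlier, without_outlier)
```

The mean falls from about 35.7 to 23.8 once the 95 is dropped, so the result no longer describes the whole group. If the value had been recorded incorrectly instead, recording it again would be the better fix.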