Define discrete data
Data that can only take certain set values within a range
Define continuous data
Data that can take any value in a range
Define outlier
An extreme value that doesn’t fit the same pattern as the rest of the data set. Outliers should not be removed unless there is a genuine reason to do so.
Define anomalies
An outlier where there is a genuine reason why it shouldn’t be in the data set. Anomalies can be removed from the data; this is called cleaning the data.
How can you use the mean and standard deviation to identify outliers?
Any value which is more than 3 standard deviations from the mean is considered an outlier
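The 3 standard deviation rule can be sketched in Python as below. This is a minimal illustration, not a definitive method: the function name find_outliers_sd and the sample data are made up for the example, and it assumes the population standard deviation (statistics.pstdev) is the one intended; some courses use the sample standard deviation instead, which can change the result for small data sets.

```python
import statistics

def find_outliers_sd(data, k=3.0):
    """Return the values lying more than k standard deviations from the mean.

    Assumption: the population standard deviation is used.
    """
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)
    # A value is flagged when its distance from the mean exceeds k * sd.
    return [x for x in data if abs(x - mean) > k * sd]

# Hypothetical data: nineteen values close to 10 and one extreme value.
values = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11,
          10, 9, 12, 10, 11, 10, 9, 11, 10, 50]
print(find_outliers_sd(values))
```

Note that in a very small data set no value can be 3 standard deviations from the mean, so the rule only becomes useful once there are enough data points.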
How can you use Tukey fences to identify outliers?
The lower Tukey fence is Q1 - 1.5 × IQR
The upper Tukey fence is Q3 + 1.5 × IQR
Any value outside the Tukey fences is considered an outlier
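As a rough sketch, the fences can be computed in Python like this. The function names tukey_fences and find_outliers_tukey and the sample data are hypothetical, and the quartiles come from statistics.quantiles with method="inclusive", which is an assumption: different textbooks and calculators use slightly different quartile rules, so Q1 and Q3 (and therefore the fences) may differ a little from a hand calculation.

```python
import statistics

def tukey_fences(data):
    """Return (lower fence, upper fence) using Q1 - 1.5*IQR and Q3 + 1.5*IQR.

    Assumption: quartiles are found with the "inclusive" method.
    """
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers_tukey(data):
    """Return the values that lie outside the Tukey fences."""
    lower, upper = tukey_fences(data)
    return [x for x in data if x < lower or x > upper]

# Hypothetical data with one extreme value.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(tukey_fences(values))
print(find_outliers_tukey(values))
```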