Use median when
When the data set is skewed
Measures of central tendency
Mean, median, and mode
When to use mean
When the data is not skewed (unless the question asks for the mode)
Data set that is not skewed
Symmetric
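A minimal Python sketch of why skew matters when choosing between mean and median. Both data sets are made up for illustration, and Python's built-in `statistics` module is used here only as a convenient stand-in for a spreadsheet.

```python
import statistics

# Hypothetical data: one symmetric set, one set skewed by a single large value
symmetric = [2, 3, 4, 5, 6]
skewed = [2, 3, 4, 5, 60]

sym_mean = statistics.mean(symmetric)      # 20 / 5 = 4
sym_median = statistics.median(symmetric)  # middle value = 4
skew_mean = statistics.mean(skewed)        # 74 / 5 = 14.8
skew_median = statistics.median(skewed)    # middle value is still 4

print("symmetric:", sym_mean, sym_median)
print("skewed:   ", skew_mean, skew_median)
```

In the symmetric set the mean and median agree. In the skewed set the mean is pulled toward the extreme value while the median stays put.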
Measure of variability
What is it and why is it useful
A measure of variability (aka measure of variation, dispersion, or spread) is a measure of the amount of spread of the data set (sample or population) on a number line
Useful because it gives us an idea of the “give and take” we can expect around a measure of central tendency
Range
Range is the difference between the maximum and minimum value of a data set
Ex. X = {1, 2, 3, 8}
Range = 8-1=7
Range in notation
Range = max(x) - min(x)
Range in Excel
=max()-min()
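The same calculation sketched in Python, using the example data set X = {1, 2, 3, 8} from these notes. The variable names are just for illustration.

```python
# Example data set from the notes
x = [1, 2, 3, 8]

# Range = max(x) - min(x)
data_range = max(x) - min(x)
print(data_range)  # 8 - 1 = 7
```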
Mean absolute deviation
Average of the absolute differences between each observation and the mean
Mean absolute deviation in Excel
=avedev()
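A short Python sketch of the mean absolute deviation, worked on the data set X = {1, 2, 3, 8} used earlier in these notes. It is written out by hand to show each step rather than relying on a library function.

```python
x = [1, 2, 3, 8]

# Step 1: the mean of the data
mean = sum(x) / len(x)                       # 14 / 4 = 3.5

# Step 2: absolute deviation of each observation from the mean
abs_devs = [abs(v - mean) for v in x]        # [2.5, 1.5, 0.5, 4.5]

# Step 3: average those absolute deviations
mad = sum(abs_devs) / len(abs_devs)          # 9 / 4 = 2.25
print(mad)
```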
Variance
Variance of a sample or of a population is (approximately) the average squared deviation of each observation from the mean
Population variance in Excel
=var.p()
Sample variance in Excel
=var.s()
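A Python sketch of both variances on the data set X = {1, 2, 3, 8}. It also shows where the "(approximately)" comes from: the population variance divides the sum of squared deviations by n, while the sample variance divides by n - 1. The comparison with `statistics.pvariance` and `statistics.variance` from Python's standard library is only a cross-check.

```python
import statistics

x = [1, 2, 3, 8]
n = len(x)
mean = sum(x) / n                                  # 3.5

# Sum of squared deviations from the mean
ss = sum((v - mean) ** 2 for v in x)               # 6.25 + 2.25 + 0.25 + 20.25 = 29

pop_var = ss / n                                   # divide by n      (like VAR.P)
samp_var = ss / (n - 1)                            # divide by n - 1  (like VAR.S)

# Cross-check against the standard library
lib_pop_var = statistics.pvariance(x)
lib_samp_var = statistics.variance(x)

print(pop_var, samp_var)
```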
Standard deviation
Standard deviation s of a sample is the square root of the sample variance. The population standard deviation is the square root of the population variance
Standard deviation s of a sample
Square root of the sample variance
Population Standard deviation
The square root of the population variance
Advantage of standard deviation
Variance is in squared units; standard deviation is in the same units as the original data
Ex. If you’re measuring weights in kg and want to measure the spread of the weights, the variance would give you a measure in kg² and the standard deviation would give you a measure in kg
Sample standard deviation in Excel
=stdev.s()
Population standard deviation in Excel
=stdev.p()
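A Python sketch of both standard deviations on the data set X = {1, 2, 3, 8}, using `statistics.stdev` (sample) and `statistics.pstdev` (population) from the standard library. The check at the end simply confirms that each one is the square root of the matching variance.

```python
import math
import statistics

x = [1, 2, 3, 8]

samp_sd = statistics.stdev(x)    # sample standard deviation (like STDEV.S)
pop_sd = statistics.pstdev(x)    # population standard deviation (like STDEV.P)

# Each standard deviation is the square root of the matching variance
samp_sd_check = math.sqrt(statistics.variance(x))
pop_sd_check = math.sqrt(statistics.pvariance(x))

print(round(samp_sd, 4), round(pop_sd, 4))
```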
When should you use the range
If you want a quick and dirty understanding of how spread out your data set is. Be careful of extreme values, since a single one can change the range a lot
When should you use mean absolute deviation MAD
If you want a measure of spread that’s not as sensitive to extreme values and is relatively straightforward to explain
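A rough Python sketch of how sensitive each measure is to one extreme value. The data (the numbers 1 to 10, then the same set with 10 replaced by 100) and the helper names `mean_abs_dev` and `spread_summary` are made up for illustration, and the result holds for this example rather than being a general proof.

```python
import statistics

def mean_abs_dev(values):
    """Average absolute deviation from the mean (what Excel's AVEDEV computes)."""
    m = sum(values) / len(values)
    return sum(abs(v - m) for v in values) / len(values)

def spread_summary(values):
    """Range, mean absolute deviation, and population standard deviation."""
    return {
        "range": max(values) - min(values),
        "mad": mean_abs_dev(values),
        "sd": statistics.pstdev(values),
    }

base = list(range(1, 11))           # hypothetical data: 1, 2, ..., 10
with_outlier = base[:-1] + [100]    # same data with 10 replaced by 100

before = spread_summary(base)
after = spread_summary(with_outlier)

# How many times larger each measure becomes once the extreme value is added
ratios = {name: after[name] / before[name] for name in before}
for name, ratio in ratios.items():
    print(f"{name}: grows by a factor of {ratio:.2f}")
```

On this data the range grows the most, the standard deviation next, and the mean absolute deviation the least.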
When should you use variance
If you need it for mathematical proofs, etc.
When should you use standard deviation
Use it by default. It’s the most commonly used measure, and it’s used in many subsequent parts of statistical analysis, like what we’re about to do
Properties of variation