Descriptive statistics
Used to numerically summarise the characteristics of a data set
Inferential statistics
Used to draw inferences from sample data and extend them to the target population
Distribution
The way in which scores are spread across the range of possible values
How can we show distribution
A list
A table
A graph
Normal distribution
The default assumption
Scores are distributed symmetrically around the middle (mean/central score)
As you move away from the mean, equally on either side, fewer individuals achieve those more extreme scores
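The symmetry and thinning-out described above can be checked numerically. This is only a sketch: the mean of 100 and standard deviation of 15 are assumed values chosen for illustration, and it uses `NormalDist` from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Hypothetical test scores: mean 100, standard deviation 15 (assumed values).
scores = NormalDist(mu=100, sigma=15)

# Symmetry: the density is the same at equal distances either side of the mean.
below = scores.pdf(85)
above = scores.pdf(115)

# Extreme scores are rarer: the density falls as we move away from the mean.
near = scores.pdf(100)
far = scores.pdf(145)

# Proportion of individuals expected within one standard deviation of the mean.
within_one_sd = scores.cdf(115) - scores.cdf(85)

print(abs(below - above) < 1e-12)   # symmetric around the mean
print(near > below > far)           # fewer individuals at the extremes
print(round(within_one_sd, 3))      # roughly 0.683
```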
Measures of central tendency
Tell us the average, i.e. the most typical score in the distribution
Mode
The most frequent value in the distribution
Median
The middle value when all scores are ordered from lowest to highest
Mean
Statistical average: sum of all values / number of values
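As a quick worked example of the three measures defined above, the sketch below uses a small set of made-up scores (the numbers are hypothetical) and the standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical quiz scores, invented purely for illustration.
scores = [2, 3, 3, 5, 7]

most_frequent = mode(scores)     # 3 occurs twice, more than any other value
middle_value = median(scores)    # third of the five ordered scores
average = mean(scores)           # sum of all values / number of values

# The mean can also be computed directly from its definition.
by_hand = sum(scores) / len(scores)

print(most_frequent, middle_value, average, by_hand)
```

Here the mode and median are both 3, while the mean is 20 / 5 = 4.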
Bimodal distribution
When there are 2 modal values
Multimodal distribution
When there are 3 or more modal values
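One way to tell these cases apart is to count the modal values. The sketch below is an illustration only: `describe_modes` is a hypothetical helper name, and it labels distributions using the definitions in these notes (2 modes = bimodal, 3 or more = multimodal).

```python
from statistics import multimode

def describe_modes(scores):
    """Return the modal values and a label based on how many there are.

    Hypothetical helper for illustration. Note that if every score occurs
    equally often, all of them count as modes.
    """
    modes = multimode(scores)  # all values tied for the highest frequency
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label

print(describe_modes([1, 2, 2, 3]))        # one modal value
print(describe_modes([1, 1, 2, 3, 3]))     # two modal values
print(describe_modes([1, 1, 2, 2, 3, 3]))  # three modal values
```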
Pros of the mode
Represents the score obtained by the largest number of people (the most common score)
The score will actually be in the sample
Can be used with nominal data
Unaffected by outliers
Cons of the mode
Ignores every value other than the most frequent one
May neglect important outliers
Not useful when the frequencies of the scores differ very little
Median pros
Not influenced by outliers
Can be used with ordinal data as well as interval/ratio data (not nominal data, because the scores must be put in order)
Median cons
May not actually appear in the sample, e.g. when it is the midpoint between the two middle values
Not a sensitive estimate: it is selected by rank position only, so the actual size of the other scores is ignored
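Both points can be seen with a small even-sized sample. The scores below are invented for illustration; the sketch uses `median` from the standard `statistics` module.

```python
from statistics import median

# Hypothetical sample with an even number of scores.
sample = [4, 6, 8, 10]

# With an even count, the median is the midpoint of the two middle scores.
mid = median(sample)                # (6 + 8) / 2
appears_in_sample = mid in sample   # the midpoint is not one of the scores

# The median depends only on rank position: changing the size of the
# lowest and highest scores leaves it exactly where it was.
stretched = [1, 6, 8, 50]
same_median = median(stretched) == mid

print(mid, appears_in_sample, same_median)
```

The midpoint is 7, which nobody in the sample actually scored, and it does not change even though the lowest and highest scores were altered.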
Population mean symbol
mu (μ)
Individual scores in sample symbol
X
Sample mean symbol
x̄ (x-bar)
N
Sample size
n
Size of a subgroup within the sample
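Putting these symbols together, the sample mean can be written as below. This assumes, as in these notes, that X stands for the individual scores and N for the number of scores in the sample; μ is the corresponding value for the whole population.

```latex
\bar{x} = \frac{\sum X}{N}
```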
Pros of mean
All scores contribute to the final value
Sensitive and precise, because it reflects the exact value of every score
Powerful way to estimate population mean
Cons of mean
The value may not make sense in context, e.g. a mean of 1.3 people
Influenced/skewed by outliers
Only appropriate for interval/ratio data
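The first two drawbacks can be illustrated with made-up numbers. Both data sets below are hypothetical, and the sketch uses the standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical household sizes: the mean is not a possible real-world value.
household_sizes = [1, 1, 1, 1, 1, 1, 1, 2, 2, 2]
mean_household = mean(household_sizes)    # 13 / 10, i.e. "1.3 people"

# Hypothetical scores, then the same scores with one extreme outlier.
typical = [20, 22, 25, 27, 30]
with_outlier = [20, 22, 25, 27, 300]

mean_before = mean(typical)               # 124 / 5
mean_after = mean(with_outlier)           # 394 / 5, pulled up by the outlier
median_before = median(typical)
median_after = median(with_outlier)       # unchanged by the outlier

print(mean_household)
print(mean_before, mean_after)
print(median_before, median_after)
```

A single extreme score moves the mean from 24.8 to 78.8, while the median stays at 25.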