Variability defined
degree to which scores in a distribution
cluster together
spread apart
Lower variability
scores close together
small differences between scores
High Variability
scores spread out
- large differences between scores
Purpose of variability
describe distribution
expected distance between one score and another
expected distance between one score and the mean
Two measures of variability
range
standard deviation
Range
distance between largest and smallest score
- range = xmax - xmin
- if you have scores from 1 to 5
- you will have a range of 5 - 1 = 4
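A minimal sketch of the range calculation in Python. The score list is a made-up example matching the 1-5 case above, not data from the notes.

```python
# Hypothetical scores running from 1 to 5
scores = [1, 2, 3, 4, 5]

# range = xmax - xmin
score_range = max(scores) - min(scores)
print(score_range)  # 4
```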
Trouble with the Range
range does not consider all scores
only highest and lowest
ignores scores in the middle
can be distorted by extreme scores
one extreme score can stretch the range
crude/unreliable measure of variability
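The points above can be illustrated with a short sketch. The helper name value_range and both score lists are assumptions made up for this example.

```python
def value_range(scores):
    """Range as xmax - xmin; only the two extreme scores are used."""
    return max(scores) - min(scores)

typical = [1, 2, 3, 4, 5]         # hypothetical scores
with_outlier = [1, 2, 3, 4, 50]   # same scores, but one extreme value
middle_changed = [1, 3, 3, 3, 5]  # middle scores changed, extremes kept

print(value_range(typical))         # 4
print(value_range(with_outlier))    # 49: one extreme score stretches the range
print(value_range(middle_changed))  # 4: the middle scores are ignored
```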
Upper and Lower real limits and range
using the previous definition of range you may leave out part of the data's spread
for a continuous variable you should use an upper real limit (URL) and a lower real limit (LRL)
add 0.5 to the highest value and subtract 0.5 from the lowest; range = URL of xmax - LRL of xmin
URL and LRL when scores are whole numbers
you can define the range as the number of categories, e.g. scores of 1-5
5 possibilities
range = (xmax - xmin) + 1
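A small sketch, assuming whole-number scores, showing that the real-limits range and the (xmax - xmin) + 1 shortcut agree. The function name range_real_limits and its unit parameter are hypothetical, introduced only for illustration.

```python
def range_real_limits(scores, unit=1):
    """Range as URL of xmax minus LRL of xmin.

    `unit` is the measurement step (1 for whole numbers), so the
    real limits sit half a unit above and below each score.
    """
    half = unit / 2
    upper_real_limit = max(scores) + half
    lower_real_limit = min(scores) - half
    return upper_real_limit - lower_real_limit

scores = [1, 2, 3, 4, 5]  # hypothetical whole-number scores
real_limit_range = range_real_limits(scores)    # 5.5 - 0.5
shortcut_range = max(scores) - min(scores) + 1  # (xmax - xmin) + 1

print(real_limit_range, shortcut_range)
```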
Deviation
distance from the mean
x - μ
Easter Egg
standard deviation valentines serenade
Variance formula
Σ(x-μ)^2/N
x = a score
μ = mean for population
Σ = add it up
N = number of scores in the population
Sum of squares
Σ(x-μ)^2
- given on exam
Finding sum of squares
basic concepts - remember
- we are talking about population
- we are talking about parameters
- we’ll get to sample/statistics soon
- START with deviations
deviation = difference between one score (x) and the mean
- without squaring, the deviations add up to zero
- because the mean is the balance point (picture a bell curve), deviations fall on both the positive and negative side and cancel out
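A quick sketch of why the deviations are squared. The population below is a made-up example with a mean of 6.

```python
# Hypothetical population of N = 5 scores
scores = [1, 9, 5, 8, 7]

mu = sum(scores) / len(scores)         # population mean
deviations = [x - mu for x in scores]  # x - mu for each score

sum_of_deviations = sum(deviations)               # positives and negatives cancel
sum_of_squares = sum(d ** 2 for d in deviations)  # SS = sum of (x - mu)^2

print(deviations)         # [-5.0, 3.0, -1.0, 2.0, 1.0]
print(sum_of_deviations)  # 0.0
print(sum_of_squares)     # 40.0
```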
Sum of Squares Alternate Formula
Σx^2 - (Σx)^2/N
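As a check, the sketch below computes SS both ways on a made-up population. The function names ss_definitional and ss_computational are hypothetical labels for the two formulas.

```python
def ss_definitional(scores):
    """SS = sum of (x - mu)^2."""
    mu = sum(scores) / len(scores)
    return sum((x - mu) ** 2 for x in scores)

def ss_computational(scores):
    """SS = sum of x^2 minus (sum of x)^2 / N."""
    n_scores = len(scores)
    return sum(x ** 2 for x in scores) - sum(scores) ** 2 / n_scores

scores = [1, 9, 5, 8, 7]  # hypothetical population

print(ss_definitional(scores))   # 40.0
print(ss_computational(scores))  # 220 - 900/5 = 40.0
```

The alternate formula avoids computing each deviation, which helps when the mean is not a whole number.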
population variance symbol (Mean Squared Deviation)
σ^2
Population Standard Deviation
standard deviation (SD): approximate average distance from the mean
mean of distribution is a reference point
considers distance between each individual score and the mean
a standard interval, representing average distance from the mean
Population variance vs Population Standard deviation
population variance = average of all squared deviations from the mean
- squared being the key word
Population standard deviation
- approx average distance from mean
- take square root of variance
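A minimal sketch of the two steps, variance first and then its square root. The data and function names are assumptions for illustration.

```python
import math

def population_variance(scores):
    """sigma^2 = SS / N, the mean of the squared deviations."""
    n_scores = len(scores)
    mu = sum(scores) / n_scores
    ss = sum((x - mu) ** 2 for x in scores)
    return ss / n_scores

def population_sd(scores):
    """sigma = square root of the population variance."""
    return math.sqrt(population_variance(scores))

scores = [1, 9, 5, 8, 7]  # hypothetical population, SS = 40

print(population_variance(scores))      # 8.0 (in squared units)
print(round(population_sd(scores), 2))  # 2.83 (back in the original units)
```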
The variability problem of samples
samples are smaller than populations
less variable than populations
extreme population scores may not be represented
sample variability is biased
underestimates population variability
needs to be corrected
Understanding standard Deviation
how spread out the scores are
- scores in a distribution can be
close to the mean
far from the mean
standard deviation tells us the typical/standard distance from mean
Correcting the sample variability bias
first, fix notation in computing SS
- use M (sample mean) instead of μ (population mean)
- use n (sample size) instead of N (population size)
second
- rename parameters as equivalent “statistics”
- use s^2 (sample variance) instead of σ^2
- use s (sample SD) instead of σ
third
- adjust variance to give an accurate/unbiased representation of the population
- divide by n - 1 instead of n when computing s^2 and s; this corrects the bias
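The three steps can be sketched as follows, treating a made-up set of five scores as a sample. The function name sample_variance is hypothetical.

```python
import math

def sample_variance(sample):
    """s^2 = SS / (n - 1), with SS computed around the sample mean M."""
    n = len(sample)
    m = sum(sample) / n                     # M, the sample mean
    ss = sum((x - m) ** 2 for x in sample)  # SS around M
    return ss / (n - 1)                     # divide by n - 1, not n

sample = [1, 9, 5, 8, 7]  # hypothetical sample, n = 5, SS = 40

s_squared = sample_variance(sample)  # 40 / 4 = 10.0
s = math.sqrt(s_squared)             # sample SD

print(s_squared)
print(round(s, 2))
```

Python's standard library follows the same convention: statistics.variance and statistics.stdev divide by n - 1, while statistics.pvariance and statistics.pstdev divide by N.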
Degrees of Freedom
sample variability underestimates population variability
samples are smaller
samples may miss extremes
gives us a biased statistic
using n-1 gives us unbiased estimate of population variance/SD
- makes your answers more conservative
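One way to see the bias is to list every possible sample from a tiny population and average the sample variances. The sketch below assumes a made-up population of three equally likely scores and samples of n = 2 drawn with replacement. The helper name variance_with and its divisor argument are hypothetical.

```python
from itertools import product

def variance_with(scores, divisor):
    """SS around the mean of `scores`, divided by the given divisor."""
    mean = sum(scores) / len(scores)
    ss = sum((x - mean) ** 2 for x in scores)
    return ss / divisor

population = [0, 3, 9]  # hypothetical population
n = 2                   # sample size

# Population variance: divide SS by N
pop_variance = variance_with(population, len(population))

# Every possible sample of size n, drawn with replacement (9 samples)
samples = list(product(population, repeat=n))

# Average sample variance when dividing by n (biased) vs n - 1 (corrected)
mean_biased = sum(variance_with(s, n) for s in samples) / len(samples)
mean_unbiased = sum(variance_with(s, n - 1) for s in samples) / len(samples)

print(pop_variance)   # 14.0
print(mean_biased)    # 7.0, underestimates the population variance
print(mean_unbiased)  # 14.0, matches the population variance on average
```

Any single sample can still land above or below the population value. The n - 1 correction only guarantees that the estimates are right on average.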
Low variability
easier to see a pattern
High variability
harder to see a pattern