Grouped vs Ungrouped Frequency Distributions
Ungrouped: Each distinct value of the variable is listed on its own with its frequency (e.g. each value is recorded individually)
Grouped: Uses intervals (e.g. instead of each value being shown, values are grouped into classes)
Examples of Ungrouped charts
Bar graph, pie chart
Examples of Grouped charts
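Histogram (a Pareto chart is a bar graph sorted by frequency for categorical data, so it fits better with the ungrouped charts)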
How do we ensure there are no gaps in a histogram?
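We use class boundaries: subtract half a unit (0.5 for whole-number data) from each lower limit and add it to each upper limit, so each boundary falls midway between one class's upper limit and the next class's lower limit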
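A minimal sketch of that adjustment, assuming whole-number class limits and a half-unit shift of 0.5. The function name `class_boundaries` and the `unit` parameter are made up for illustration.

```python
def class_boundaries(limits, unit=1):
    """Turn (lower limit, upper limit) pairs into gap-free class boundaries.

    Assumption: data is recorded to the nearest `unit`, so the shift is unit / 2.
    """
    half = unit / 2
    return [(low - half, high + half) for low, high in limits]


# Two adjacent classes, 12-21 and 22-31, now share the boundary 21.5.
print(class_boundaries([(12, 21), (22, 31)]))
# [(11.5, 21.5), (21.5, 31.5)]
```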
What is a range within a frequency distribution called?
A class
What is a commonly used formula to find the # of classes?
1 + 3.3 log(n), with log in base 10 (Sturges' rule)
(where n is the number of data points; round the result to a whole number)
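A quick sketch of the formula in Python. Rounding up is an assumption here (some courses round to the nearest whole number instead), and the function name `sturges_classes` is made up.

```python
import math


def sturges_classes(n):
    """Suggested number of classes for n data points: 1 + 3.3 * log10(n).

    Assumption: the result is rounded UP to the next whole number.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.ceil(1 + 3.3 * math.log10(n))


print(sturges_classes(50))   # 1 + 3.3 * 1.699 = 6.61, so 7 classes
print(sturges_classes(100))  # 1 + 3.3 * 2 = 7.6, so 8 classes
```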
How to determine the width of each bar in a histogram?
(Largest value - smallest value) / number of classes, rounded up to a convenient number
How to find the lower limits of a histogram?
Use the smallest value or one slightly smaller than it
Add the class width repeatedly until the desired number of classes is reached
How to find the upper limits of a histogram?
Subtract 1 (or another appropriate unit, e.g. 0.1 for data recorded to one decimal place) from the next class's lower limit, and repeat until the desired number of classes is reached
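The width and limit steps can be sketched together as below. This assumes whole-number data, starts at the smallest value, and widens the classes by 1 when the range divides evenly (so the largest value still fits); that last step is an added assumption, and `class_limits` is a made-up name.

```python
import math


def class_limits(data, num_classes):
    """Return (lower limit, upper limit) for each class of whole-number data."""
    smallest, largest = min(data), max(data)
    # Width = range / number of classes, rounded up.
    width = math.ceil((largest - smallest) / num_classes)
    # Assumption: widen by 1 if the largest value would otherwise be left out.
    if smallest + width * num_classes <= largest:
        width += 1
    # Lower limits: start at the smallest value and keep adding the width.
    lowers = [smallest + i * width for i in range(num_classes)]
    # Upper limits: one unit below the next class's lower limit.
    uppers = [low + width - 1 for low in lowers]
    return list(zip(lowers, uppers))


data = [12, 15, 21, 22, 27, 30, 34, 38, 41, 49]
print(class_limits(data, 4))
# [(12, 21), (22, 31), (32, 41), (42, 51)]
```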
What is the relative frequency?
(hint: relative to the rest of the data)
The frequency divided by n. Represents the proportion (or percentage) of the data that falls inside the given class
What is the cumulative frequency?
The running total of the frequencies, class by class. Represents the amount of data that falls in the given class or any class before it
What is the relative cumulative frequency?
The cumulative frequency divided by n. Represents the proportion (or percentage) of data that falls in the given class or any class before it
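A small sketch that builds all four columns from a list of class frequencies. The frequencies are invented for the example and `frequency_table` is a made-up name.

```python
from itertools import accumulate


def frequency_table(frequencies):
    """Return one row per class with the four kinds of frequency."""
    n = sum(frequencies)                        # total number of data points
    cumulative = list(accumulate(frequencies))  # running totals
    return [
        {
            "frequency": f,
            "relative": f / n,
            "cumulative": cf,
            "relative_cumulative": cf / n,
        }
        for f, cf in zip(frequencies, cumulative)
    ]


for row in frequency_table([2, 5, 8, 5]):
    print(row)
# The last row always has cumulative == n and relative_cumulative == 1.0.
```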
What are examples of central tendencies? (4)
Mean
Median
Mode
Midrange
Which central tendencies are more affected by outliers? Why?
Mean and midrange tend to be more influenced by outliers since they deal with the extremes of the data.
Median and mode aren’t usually affected by outliers since they deal with the order and frequency of the data.
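To see the effect, here is a small sketch comparing the four measures before and after adding one extreme value. The data is invented, and `midrange` and `summarize` are helper names made up for this example.

```python
import statistics


def midrange(values):
    """Halfway point between the smallest and largest value."""
    return (min(values) + max(values)) / 2


def summarize(values):
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "midrange": midrange(values),
    }


data = [4, 5, 5, 6, 7]
with_outlier = data + [60]

print(summarize(data))          # mean 5.4, median 5, mode 5, midrange 5.5
print(summarize(with_outlier))  # mean 14.5, median 5.5, mode 5, midrange 32.0
```

The mean and midrange jump because they use the extreme value directly; the median only shifts by half a step and the mode does not move.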
What is an Outlier?
Data that does not fit in with the rest. Abnormally different data.
What are the two main measures of position? (in regards to data in a data set)
Percentile
Quartile
How to find the percentile VALUE of a data set?
How to find a given quartile of a data set?
Quartiles: Q1 = P25, Q2 = P50 (the median), Q3 = P75
Follow the same steps used to find a percentile.
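These notes do not spell out the percentile steps, so the sketch below assumes one common textbook convention: sort the data, compute the position p/100 x n, average the values at that position and the next one if it is a whole number, otherwise round the position up. Other courses and software use different conventions and may give slightly different answers. The name `percentile_value` is made up, and p is assumed to be a whole number from 1 to 99.

```python
def percentile_value(data, p):
    """Value at the p-th percentile under ONE common textbook convention."""
    if not 0 < p < 100:
        raise ValueError("p must be between 1 and 99")
    ordered = sorted(data)
    n = len(ordered)
    # Position = p/100 * n, kept as a whole part and a remainder to avoid
    # floating-point surprises.
    whole, remainder = divmod(p * n, 100)
    if remainder == 0:
        # Whole-number position: average that value and the next one.
        return (ordered[whole - 1] + ordered[whole]) / 2
    # Otherwise round the position up (index `whole` is the next position).
    return ordered[whole]


data = [2, 3, 5, 6, 8, 10, 12, 15, 18, 20]
print(percentile_value(data, 25))  # Q1 = 5
print(percentile_value(data, 50))  # Q2 = 9.0 (the median)
print(percentile_value(data, 75))  # Q3 = 15
```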
Do we always indicate the upper / lower limits on a box and whisker plot?
No. If no outliers are present, the whiskers simply end at the smallest and largest values, so the lower/upper limits are not marked.
What is the IQR value of a box plot?
Q3 - Q1
How to find the upper / lower limits of a box plot?
Lower: Q1 - (IQR x 1.5)
Upper: Q3 + (IQR x 1.5)
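A short sketch of the two limits and how they flag outliers. Q1 and Q3 are passed in directly so that the quartile method does not matter here; the data and the names `box_plot_limits` and `find_outliers` are made up.

```python
def box_plot_limits(q1, q3):
    """Return (lower limit, upper limit) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(data, q1, q3):
    """Values that fall outside the lower/upper limits."""
    lower, upper = box_plot_limits(q1, q3)
    return [x for x in data if x < lower or x > upper]


print(box_plot_limits(5, 15))                        # (-10.0, 30.0)
print(find_outliers([2, 5, 9, 15, 20, 45], 5, 15))   # [45]
```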
How to find weighted mean?
Sum of each value times its frequency (weight), divided by the sum of the frequencies (weights)
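As a quick sketch, with invented values and frequencies (`weighted_mean` is a made-up name):

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Values 1, 2, 3 occurring 2, 3, and 5 times: (2 + 6 + 15) / 10
print(weighted_mean([1, 2, 3], [2, 3, 5]))  # 2.3
```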
What are the properties of a normal distribution? (3)
1) Symmetric about the mean
2) Mean, median, and mode are all equal and lie at the same position (the center)
3) The curve’s position and spread are determined by the mean and standard deviation
What is the empirical rule?
For a normal distribution, counting BOTH sides of the mean together:
- About 68% of data falls within one SD of the mean
- About 95% falls within two SD of the mean
- About 99.7% falls within three SD of the mean
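These percentages can be checked with the standard library's `statistics.NormalDist`. A small sketch; the helper name `share_within` is made up:

```python
from statistics import NormalDist


def share_within(k, mean=0.0, sd=1.0):
    """Proportion of a normal distribution within k SDs of the mean (both sides)."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)


for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# Roughly 0.6827, 0.9545, and 0.9973
```

The result does not depend on which mean and SD are used, which is why the rule applies to every normal distribution.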