What is a variable?
Any characteristic of an individual that can be measured or reported, such as age, sex or BMI
Variables can be classified as either numerical or categorical. Describe numerical variables
Variables can be classified as either numerical or categorical. Describe categorical variables
Draw a flow diagram to explain the difference between the types of variables

What are the methods of summarising each data type?
For numerical:
- Measures of central tendency (mean, median) and measures of spread (standard deviation, range). Use the mean and standard deviation if the data is normally distributed; use the median and range if it is not.
For categorical:
- Frequencies
- Proportions
- Percentages
- Use tables & charts to do this
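A minimal sketch of the categorical summaries listed above, using only the Python standard library. The function name summarise_categorical and the smoking-status values are made up for illustration.

```python
from collections import Counter


def summarise_categorical(values):
    """Return frequency, proportion and percentage for each category."""
    counts = Counter(values)   # frequency of each category
    total = len(values)        # number of individuals
    return {
        category: {
            "frequency": n,
            "proportion": n / total,
            "percentage": 100 * n / total,
        }
        for category, n in counts.items()
    }


# Hypothetical smoking status for 8 individuals
smoking = ["never", "former", "never", "current",
           "never", "former", "never", "never"]
summary = summarise_categorical(smoking)

# Print a simple frequency table
for category, stats in summary.items():
    print(f"{category:8s} {stats['frequency']:3d} {stats['percentage']:6.1f}%")
```

The proportions across all categories sum to 1 and the percentages to 100, which is a quick check when building a table by hand.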
What is the difference between mean and median?
Mean is simply the average of all the values. Sum up all individual values & divide by the number of individuals.
Median is the value such that 50% of data points lie at or above the median & 50% at or below it
Order the data from low to high & take the middle value. If there is an even number of values, take the average of the central 2 values
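The two definitions above can be sketched in Python as follows. The ages are invented example data, and the function names are only for illustration (the standard library's statistics module provides equivalent functions).

```python
def mean(values):
    """Sum all values and divide by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Order the values and take the middle one.

    With an even number of values, average the central two.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical ages
ages_odd = [23, 35, 41, 29, 52]        # 5 values
ages_even = [23, 35, 41, 29, 52, 38]   # 6 values

mean_odd = mean(ages_odd)        # 180 / 5 = 36.0
median_odd = median(ages_odd)    # ordered: 23, 29, 35, 41, 52 -> 35
median_even = median(ages_even)  # ordered: 23, 29, 35, 38, 41, 52 -> (35 + 38) / 2 = 36.5
```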
When should we use mean vs median?
MEAN is a good measure of the centre of a symmetrical distribution
– Much more useful in practice
– But overly influenced by extreme values
MEDIAN is better for skewed distributions because it is only slightly affected by extreme values (no matter how big they are)
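A small illustration of the point above, assuming made-up data: replacing one value with an extreme one shifts the mean a lot but leaves the median where it was.

```python
from statistics import mean, median

# Hypothetical lengths of hospital stay (days)
stays = [4, 5, 5, 6, 7]
stays_with_outlier = [4, 5, 5, 6, 70]  # same data, one extreme value

mean_before = mean(stays)                  # 27 / 5 = 5.4
mean_after = mean(stays_with_outlier)      # 90 / 5 = 18
median_before = median(stays)              # 5
median_after = median(stays_with_outlier)  # still 5

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```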

Describe what distribution curves show

How do you estimate a 95% reference range?
We are interested in the range of values from (apparently) healthy individuals for a particular measurement. The range may vary by sub-group (e.g. age, gender)
Mean ± 1.96 × SD
→ 95% of the data lies between these limits IF data are normally distributed
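A minimal sketch of the calculation above, assuming the measurements are roughly normally distributed. The function name reference_range_95 and the sample values are hypothetical, and the sample standard deviation (statistics.stdev) is used as the SD.

```python
from statistics import mean, stdev


def reference_range_95(values):
    """Return (lower, upper) limits as mean -/+ 1.96 x SD.

    Only meaningful if the data are approximately normally distributed.
    """
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    return m - 1.96 * sd, m + 1.96 * sd


# Hypothetical measurements from apparently healthy individuals
measurements = [4.1, 4.5, 4.8, 5.0, 5.2, 5.5, 5.9]
lower, upper = reference_range_95(measurements)
print(f"95% reference range: {lower:.2f} to {upper:.2f}")
```

If the range differs by sub-group, the same function can be applied to each sub-group's values separately.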