What is the defining feature of categorical data?
What are the three sub-types of categorical data? Give examples
Defining feature of categorical data = no units
What are the two types of continuous data? Give examples
How do we summarise categorical data?
- Express these as proportions/percentages of the total no. of individuals
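As a rough illustration of this summary step, the sketch below counts the individuals in each category and expresses each count as a percentage of the total. The smoking-status sample and the variable names are made up for the example.

```python
from collections import Counter

# Hypothetical sample: smoking status of 8 individuals (made-up data)
statuses = ["smoker", "non-smoker", "non-smoker", "ex-smoker",
            "smoker", "non-smoker", "non-smoker", "ex-smoker"]

counts = Counter(statuses)      # frequency in each category
total = sum(counts.values())    # total number of individuals

# Percentage of the total falling in each category
percentages = {category: 100 * n / total for category, n in counts.items()}

for category, n in counts.items():
    print(f"{category}: {n} ({percentages[category]:.1f}%)")
```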
What are the two ways in which we can display categorical data?
2. Graphically (e.g. bar chart, pie chart)
What are the differences between a bar chart and a pie chart?
A bar chart consists of one bar per category, with the length of each bar proportional to the frequency in that category. The bars do not touch because the data are not continuous but fall into distinct categories
In a pie chart the area (and so the angle) of each segment is proportional to the frequency in that category (e.g. if 50% are smokers then the angle would be 360°/2 = 180°)
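The pie chart arithmetic can be sketched as below: each category gets its share of 360 degrees. The function name `segment_angles` and the counts are hypothetical, chosen to match the 50% smokers example.

```python
def segment_angles(frequencies):
    """Return the pie-chart angle (in degrees) for each category."""
    total = sum(frequencies.values())
    # Each segment's angle is its proportion of the total times 360 degrees
    return {category: 360 * n / total for category, n in frequencies.items()}

# Hypothetical counts: half of the sample are smokers
angles = segment_angles({"smoker": 50, "non-smoker": 50})

for category, angle in angles.items():
    print(f"{category}: {angle:.0f} degrees")
```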
What are the three ways in which we can summarise continuous data?
Describe the production of a histogram
The bars touch each other to indicate that the data are continuous. The area of each bar is proportional to the number of people in that range
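A minimal sketch of the counting behind a histogram, assuming equal-width ranges (so bar height, like bar area, is proportional to the count). The heights, the helper name `histogram_counts` and the choice of 10 cm ranges are all made up for illustration.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many values fall in each equal-width range.

    Range i covers start + i*width up to (but not including)
    start + (i+1)*width.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:   # ignore values outside the plotted ranges
            counts[index] += 1
    return counts

# Hypothetical heights in cm for 10 people
heights_cm = [152, 158, 161, 163, 165, 167, 168, 171, 174, 183]
counts = histogram_counts(heights_cm, start=150, width=10, n_bins=4)

# Crude text histogram: one '#' per person in each range
for i, n in enumerate(counts):
    low = 150 + 10 * i
    print(f"{low}-{low + 10} cm: {'#' * n}")
```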
A histogram can demonstrate normal (Gaussian) distribution. Describe the characteristics of this
If data is not normally distributed, what is it?
Skewed distribution
What are the two main elements used to describe data?
2. Spread: how much variation is there in the data?
When should you use mean and when should you use median?
If data is symmetrical (e.g. normally distributed), then the mean is fine.
If data is skewed, median should be used as this is more resistant to outliers
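To see why the median is more resistant to outliers, the sketch below adds one extreme value to a small made-up sample (hypothetical lengths of hospital stay in days) and compares how far the mean and the median move.

```python
from statistics import mean, median

# Hypothetical lengths of stay in days (made-up data)
stays = [2, 3, 3, 4, 5]
with_outlier = stays + [60]   # one very long stay added

print(f"without outlier: mean = {mean(stays):.1f}, median = {median(stays)}")
print(f"with outlier:    mean = {mean(with_outlier):.1f}, "
      f"median = {median(with_outlier)}")

# How much each measure shifts when the outlier is added
mean_shift = mean(with_outlier) - mean(stays)
median_shift = median(with_outlier) - median(stays)
```

The single outlier drags the mean up by several days, while the median barely changes.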
How can skewed data affect the mean and median?
If data is symmetrical, mean = median
If data is left (negative) skew, mean < median
If data is right (positive) skew, mean > median
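The effect of skew on the mean and median can be checked on small samples. The three lists below are made up purely to show the pattern: a long tail pulls the mean towards it, while the median stays near the bulk of the data.

```python
from statistics import mean, median

# Made-up samples chosen to show each shape
symmetrical = [1, 2, 3, 4, 5]
left_skew = [1, 7, 8, 8, 9, 9, 10]    # long tail of low values
right_skew = [1, 2, 2, 3, 3, 4, 10]   # long tail of high values

for name, data in [("symmetrical", symmetrical),
                   ("left skew", left_skew),
                   ("right skew", right_skew)]:
    print(f"{name}: mean = {mean(data):.2f}, median = {median(data)}")
```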
What are the three main measures of spread with regards to descriptive stats?
How do you calculate standard deviation?
How do you calculate IQR?
IQR = QU - QL (upper quartile minus lower quartile)
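A short sketch of the IQR calculation using Python's standard `statistics.quantiles`. The data are made up, and note that there are several conventions for computing quartiles, so other software or a hand calculation may give slightly different values for the same data.

```python
from statistics import quantiles

# Made-up data for illustration
data = [1, 2, 3, 4, 5, 6, 7]

# n=4 splits the data into quarters and returns the three cut points:
# lower quartile, median, upper quartile (default 'exclusive' method)
q_lower, q_median, q_upper = quantiles(data, n=4)

iqr = q_upper - q_lower   # IQR = QU - QL
print(f"QL = {q_lower}, QU = {q_upper}, IQR = {iqr}")
```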
The SD tells us about the spread of the data because, if the data are normally distributed, ...?
- approx 95% of the readings lie within 2SD of the mean
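The 2 SD rule can be checked roughly by simulation. The sketch below draws made-up readings from a normal distribution (the mean of 120 and SD of 15 are arbitrary assumptions), computes the sample mean and SD, and counts the proportion of readings within 2 SD of the mean.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical readings drawn from a normal distribution (mean 120, SD 15)
readings = [random.gauss(120, 15) for _ in range(10_000)]

m = mean(readings)
sd = stdev(readings)   # sample standard deviation

# Keep only the readings lying within 2 SD either side of the mean
within_2sd = [x for x in readings if m - 2 * sd <= x <= m + 2 * sd]
fraction = len(within_2sd) / len(readings)

print(f"mean = {m:.1f}, SD = {sd:.1f}")
print(f"proportion within 2 SD of the mean = {fraction:.3f}")
```

The proportion should come out close to 0.95, though the exact figure varies with the random sample.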
What are the five ways in which data normality can be assessed?
What are the three main measures of location with regards to descriptive statistics?