Define statistics
Statistics refers to the scientific method of collecting, organising, analysing, and presenting data in meaningful ways and using it to draw conclusions or make decisions.
Purpose of statistics
Explain descriptive statistics
Descriptive statistics are methods used to summarise, organise, and present the main features of a dataset in a meaningful way. They help you understand the basic characteristics of the data without making inferences or generalisations about a larger population.
The goal is to transform raw data into a form that’s easy to interpret, often using simple numerical summaries and graphical representations.
Explain frequency distribution
A frequency distribution is a table or graph that shows how often each value or range of values occurs in a dataset.
It organises raw data into a clearer form so that patterns and trends are easier to see.
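As a minimal sketch of the idea, a frequency table can be built by counting how often each value occurs. The `scores` list below is made-up illustrative data, not taken from any real study.

```python
from collections import Counter

# Hypothetical raw data: quiz scores for ten students
scores = [3, 5, 4, 3, 5, 5, 2, 4, 5, 3]

# Count how often each distinct value occurs
frequency = Counter(scores)

# Print the frequency table in ascending order of score
for value in sorted(frequency):
    print(value, frequency[value])
# 2 1
# 3 3
# 4 2
# 5 4
```

For continuous data, the same approach would usually be applied to ranges of values (class intervals) rather than to individual values.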
Explain measure of central tendency
Measures of central tendency describe the centre or typical value of a dataset, the point around which most of the data tend to cluster.
They give a single value that best represents the whole dataset.
Characteristics of a representative average
3 ways a statistical series may differ from each other
What is arithmetic mean?
It may be defined as the sum of the separate scores or other measures divided by their number.
What is a median?
The median is the middle value in a set of numbers when they are arranged in order. If there’s an odd number of values, the median is the exact middle number. If there’s an even number of values, the median is the average of the two middle numbers.
What is a mode?
The mode is the value that appears most frequently in a set of data.
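The three averages can be compared on one small dataset. This is a sketch using Python's standard `statistics` module; the `data` list is invented for illustration and has an even number of values, so the median is the average of the two middle numbers.

```python
import statistics

# Hypothetical dataset, already in ascending order
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # sum 30 divided by 6 values -> 5
median = statistics.median(data)  # average of middle values 3 and 5 -> 4.0
mode = statistics.mode(data)      # 3 occurs twice, more than any other value

print(mean, median, mode)
```

Here the single large value (10) pulls the mean above the median, which is one reason the median is often preferred for data with extreme values.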
What is variance?
Variance is a measure of the dispersion of data points around the mean. It is calculated as the average of the squared deviations from the mean.
What is standard deviation?
Standard deviation is defined as the square root of the variance, so it is expressed in the same units as the data.
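A short worked example may help here. The sketch below computes the variance as the mean of squared deviations and then takes its square root. The dataset is invented, and dividing by n (population form) rather than n - 1 (sample form) is an assumption, since the notes do not say which is intended; both are shown.

```python
import math
import statistics

# Hypothetical dataset
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)  # 40 / 8 = 5.0
squared_deviations = [(x - mean) ** 2 for x in data]  # sums to 32.0

# Population variance: divide the sum of squared deviations by n
pop_variance = sum(squared_deviations) / len(data)
pop_sd = math.sqrt(pop_variance)

# Sample variance: divide by n - 1 instead
sample_variance = sum(squared_deviations) / (len(data) - 1)

print(pop_variance, pop_sd)  # 4.0 2.0

# Cross-check against the standard library
assert statistics.pvariance(data) == pop_variance
```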
Properties of a normal distribution
What is a normal distribution?
A normal distribution is a continuous probability distribution that describes data which are symmetrically distributed around a central value, forming a bell-shaped curve.
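The symmetry and bell shape can be checked numerically. This is a sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the mean of 100 and standard deviation of 15 are arbitrary illustrative values.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 100, standard deviation 15
dist = NormalDist(mu=100, sigma=15)

# Symmetry: the curve has the same height 10 units either side of the mean
left_height = dist.pdf(100 - 10)
right_height = dist.pdf(100 + 10)

# Bell shape: the curve is highest at the mean
peak_height = dist.pdf(100)

def share_within(k):
    """Proportion of the distribution within k standard deviations of the mean."""
    return dist.cdf(100 + k * 15) - dist.cdf(100 - k * 15)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The printed proportions are roughly 68%, 95%, and 99.7%, which is the pattern usually quoted for normally distributed data.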
What are parametric tests?
Parametric tests are statistical tests that assume the data follow a specific distribution (usually normal). They include the t-test and ANOVA.
What are non-parametric tests?
Non-parametric tests are statistical tests that do not assume any specific data distribution.
Define inferential statistics
Inferential statistics refers to techniques that use data from a sample to draw conclusions and make decisions about the population from which the sample was drawn. They build on the values obtained from descriptive techniques.
Explain parametric tests
Parametric tests are statistical tests that make assumptions about the population from which the sample is drawn. The main assumption is that the data follow a normal distribution. They also usually assume equal variances across groups and that the data are measured on an interval or ratio scale.
Because these assumptions are made, parametric tests can test hypotheses about population parameters (like the mean and standard deviation) using estimates calculated from the sample. When the assumptions are met, parametric tests are more powerful and precise than non-parametric tests, meaning they are better at detecting true differences or relationships.
Common examples: t-test (compares the means of two groups), ANOVA (compares means across three or more groups)
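To show how a parametric test uses means and variances, here is a minimal sketch of the test statistic for an independent samples t-test. It assumes equal variances (the pooled-variance form), and both groups are invented data. It stops at the t statistic and degrees of freedom; obtaining a p-value needs the t distribution, for which a statistics library such as SciPy would normally be used.

```python
import math
import statistics

# Hypothetical scores for two independent groups
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

n_a, n_b = len(group_a), len(group_b)
mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)

# Sample variances (n - 1 in the denominator)
var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

# Pooled variance, assuming both groups share the same population variance
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)

# Standard error of the difference between the two means
standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

t_statistic = (mean_a - mean_b) / standard_error
degrees_of_freedom = n_a + n_b - 2

print(round(t_statistic, 4), degrees_of_freedom)
```

With these numbers the group means are 7 and 3, both sample variances are 2.5, and the t statistic comes out at about 4 on 8 degrees of freedom.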
Explain non-parametric tests
Non-parametric tests do not assume a specific distribution for the data. They are used when the data is not normally distributed, when sample sizes are small, or when the data is measured on ordinal or nominal scales.
Instead of using raw scores and parameters like the mean, non-parametric tests often work with ranks, frequencies, or medians. This makes them more flexible but usually less powerful than parametric tests.
Common examples: Mann–Whitney U test, Wilcoxon signed-rank test
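The way such tests work with ranks instead of raw scores can be sketched with the Mann–Whitney U statistic. The helper `average_ranks` and both groups below are hypothetical, written only for this illustration; the sketch computes the U statistic and does not go on to a p-value.

```python
def average_ranks(values):
    """Map each value to its 1-based rank; tied values share their average rank."""
    ordered = sorted(values)
    ranks = {}
    for v in set(ordered):
        first = ordered.index(v) + 1   # position of the first occurrence
        count = ordered.count(v)       # number of tied occurrences
        ranks[v] = first + (count - 1) / 2
    return ranks

# Hypothetical scores for two independent groups
group_a = [12, 15, 18, 20]
group_b = [8, 9, 11, 14]

# Rank all observations together, ignoring group membership
ranks = average_ranks(group_a + group_b)

# Sum the ranks that belong to group A
rank_sum_a = sum(ranks[x] for x in group_a)

n_a, n_b = len(group_a), len(group_b)
u_a = rank_sum_a - n_a * (n_a + 1) / 2
u_b = n_a * n_b - u_a

# The test statistic is conventionally the smaller of the two U values
u_statistic = min(u_a, u_b)
print(rank_sum_a, u_a, u_b, u_statistic)  # 25.0 15.0 1.0 1.0
```

Only the ordering of the observations matters here: replacing 20 with 2000 would leave every rank, and therefore U, unchanged.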
Explain skewness
Skewness is a measure of how asymmetrical a distribution is. When a distribution is symmetrical about the mean, the skewness is equal to 0. If the distribution has a longer tail to the right than to the left, the skewness is positive and the distribution is said to be positively skewed, or skewed to the right. If the longer tail is on the left, the skewness is negative and the distribution is negatively skewed, or skewed to the left.
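The sign of skewness can be illustrated with a small calculation. The sketch below uses the moment coefficient of skewness (third central moment divided by the second central moment raised to the power 1.5), which is one common definition and an assumption here, since the notes do not name a formula. The function name and datasets are invented.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]        # mirror image around the mean
right_skewed = [1, 2, 2, 3, 10]    # one large value stretches the right tail
left_skewed = [-x for x in right_skewed]  # same shape, flipped

print(skewness(symmetric))     # 0.0
print(skewness(right_skewed))  # positive
print(skewness(left_skewed))   # negative
```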
Explain kurtosis
Kurtosis is the degree of peakedness of a distribution, and the heaviness of its tails, relative to a normal distribution.
Differences between parametric and non-parametric tests
Parametric vs non-parametric
- Independent samples t-test vs Mann–Whitney U test
- Paired samples t-test vs Wilcoxon signed-rank test
- One-way ANOVA vs Kruskal–Wallis H test
- One-way repeated measures ANOVA vs Friedman test
Explain platykurtic distribution
A platykurtic distribution has a flat, wide peak and light tails. This means data are more spread out and there are fewer extreme values.
Explain leptokurtic distribution
A leptokurtic distribution has a tall, sharp peak and heavy tails. This means data are closely clustered around the mean but there are more extreme values (outliers).
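Platykurtic and leptokurtic shapes can be told apart numerically with excess kurtosis, which is 0 for a normal distribution, negative for platykurtic data, and positive for leptokurtic data. The sketch below assumes the moment-based definition (fourth central moment divided by the squared second central moment, minus 3); the function name and both datasets are invented for illustration.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Evenly spread values: flat shape, no extreme values (platykurtic)
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Most values at the centre plus two extreme values (leptokurtic)
peaked = [0, 5, 5, 5, 5, 5, 5, 5, 5, 10]

print(excess_kurtosis(flat))    # negative
print(excess_kurtosis(peaked))  # 2.0
```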