Experiment example
A researcher was interested in whether animals could be trained to dance. He took 200 cats and tried to train them to dance by giving them either food or affection as a reward for dance-like behaviour. At the end of the week he counted how many animals could dance and how many could not. There are two categorical variables here: training (the animal was trained using either food or affection, not both) and dance (the animal either learned to dance or it did not). By combining categories, we end up with four different categories. All we then need to do is to count how many cats fall into each category.
Look at picture 1 to see the contingency table of this data
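The counting step can be sketched in Python. The per-cat counts below are made up for illustration (the real table is in picture 1); the point is just cross-tabulating two categorical variables:

```python
from collections import Counter

# Hypothetical per-cat records: (training, dance) pairs - illustrative counts only.
cats = ([("food", "yes")] * 28 + [("food", "no")] * 10 +
        [("affection", "yes")] * 48 + [("affection", "no")] * 114)

# Cross-tabulate: one count per combination of the two categorical variables.
table = Counter(cats)
print(table[("food", "yes")], table[("affection", "no")])
```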
What is the χ2 test?
A chi-squared test is any statistical hypothesis test in which the sampling distribution of the test statistic is a chi-squared distribution when the null hypothesis is true
- often used as shorthand for Pearson’s chi-squared test
What does χ2 test measure?
The association between two categorical variables
What is the central idea of Pearson’s χ2 test?
It is based on comparing the frequencies we observe in certain categories to the frequencies we would expect to get in those categories by chance
What is the formula we use to calculate χ2? Explain each part of the formula
Picture 2
- We divide by the model (expected) scores - analogous to dividing by the degrees of freedom to get the mean squares (it standardizes the deviation of each observation)
- i - rows in the contingency table; j - columns
- Observed data - the frequencies
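The formula from picture 2 can be sketched directly: for every cell (i, j), square the deviation of the observed frequency from the model frequency and divide by the model frequency, then sum. Both sets of numbers here are illustrative:

```python
# Observed and model (expected) frequencies per cell; row i = training type,
# column j = dance outcome. Values are illustrative, not the study's real data.
observed = [[28, 10], [48, 114]]
model = [[14.44, 23.56], [61.56, 100.44]]

# chi2 = sum over all cells of (observed - model)^2 / model
chi2 = sum((observed[i][j] - model[i][j]) ** 2 / model[i][j]
           for i in range(2) for j in range(2))
print(chi2)
```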
What is the model in the formula of χ2?
We calculate the expected frequencies for each cell in the table using the column and row totals for that cell
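That calculation is just (row total × column total) / grand total for each cell. A minimal sketch, again with illustrative observed counts:

```python
# Observed 2x2 table (illustrative counts): rows = training, cols = dance.
observed = [[28, 10], [48, 114]]

n = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]
col_totals = [sum(observed[i][j] for i in range(2)) for j in range(2)]

# Expected (model) frequency for each cell = row total * column total / n
expected = [[row_totals[i] * col_totals[j] / n for j in range(2)]
            for i in range(2)]
print(expected)
```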
What do we use to display data and calculate χ2?
Contingency tables
Picture 5
What is the χ2 distribution?
It describes the test statistic χ2 under the assumption of the null hypothesis and is used to obtain the p-value corresponding to the value of the χ2-statistic
How is its shape determined and how do we obtain a p-value from it?
Its shape depends only on the degrees of freedom, df = (r - 1)(c - 1) for a table with r rows and c columns; the p-value is the area under the distribution to the right of the observed χ2 value
What happens to the χ2 statistic’s approximation as the sample size increases? How is that different with small samples?
The chi-square statistic has a sampling distribution that approximates a chi-square distribution, and this approximation improves as the sample size increases
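For a 2x2 table (df = 1) the upper-tail p-value of the χ2 distribution has a closed form via the complementary error function, so a sketch needs only the standard library (for general df you would use something like scipy.stats.chi2.sf, which is an assumption about your environment):

```python
import math

def chi2_pvalue_df1(chi2: float) -> float:
    """Upper-tail probability of a chi-squared distribution with df = 1.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)), valid only for df = 1.
    """
    return math.erfc(math.sqrt(chi2 / 2))

# Illustrative statistic from a 2x2 table:
p = chi2_pvalue_df1(25.36)
print(p)  # far below .05, so we would reject independence
```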
What happens if the expected frequencies in a χ2 test are too low (small sample)?
The sampling distribution of the test statistic deviates too much from the chi-square distribution, making the test inaccurate
What is Fisher’s exact test?
It computes an exact p-value for small samples (when at least one cell’s expected frequency is less than 5), rather than relying on the χ2 approximation
- normally used with 2x2 contingency tables (i.e., two categorical variables, each with two categories)
Can Fisher’s test be used for larger samples or tables?
Yes, but it’s unnecessary and can be computationally intensive
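The idea behind the exact test can be sketched from scratch for a 2x2 table: enumerate every table with the same margins and sum the hypergeometric probabilities of those at least as unlikely as the observed one. This is a bare-bones illustration; in practice scipy.stats.fisher_exact would be the usual choice:

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    row/column totals whose probability is <= that of the observed table.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # probability of the table whose top-left cell is x
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    return sum(p for x in range(lo, hi + 1)
               if (p := prob(x)) <= p_obs * (1 + 1e-9))

# Example table with small expected counts:
print(fisher_exact_2x2(1, 9, 11, 3))
```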
What is an alternative to Pearson’s χ2?
Likelihood ratio statistic which is based on maximum likelihood theory
How do we compute the likelihood ratio statistic?
Johnny had this in his slides but skipped it. The book talked very little about it
Formula is in picture 8, i and j are the rows and columns of the contingency table and ln is the natural logarithm
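That formula, Lχ2 = 2 Σ observed · ln(observed / model), can be sketched with the same illustrative observed and expected frequencies as before:

```python
import math

observed = [[28, 10], [48, 114]]           # illustrative counts
model = [[14.44, 23.56], [61.56, 100.44]]  # expected frequencies

# L-chi2 = 2 * sum over cells of observed * ln(observed / model)
lr = 2 * sum(observed[i][j] * math.log(observed[i][j] / model[i][j])
             for i in range(2) for j in range(2))
print(lr)
```

Note that this only works when every observed count is above zero (ln(0) is undefined).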
What is the Yates’s correction?
For 2 x 2 contingency tables, Yates’s continuity correction is used to prevent overestimation of statistical significance with small data (when at least one cell of the table has an expected count smaller than 5)
What is the problem with Yates’s correction?
It tends to overcorrect: it lowers the value of the χ2 statistic, making the test more conservative (less significant) than it should be
The book said rather ignore it, Johnny was a bit wary about it as well but didn’t make such a definite statement
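Yates’s adjustment subtracts 0.5 from each absolute deviation before squaring; a sketch with the same illustrative counts as before shows how it shrinks the statistic:

```python
observed = [[28, 10], [48, 114]]           # illustrative counts
model = [[14.44, 23.56], [61.56, 100.44]]  # expected frequencies

cells = [(observed[i][j], model[i][j]) for i in range(2) for j in range(2)]

plain = sum((o - m) ** 2 / m for o, m in cells)
# Yates: subtract 0.5 from each |observed - model| before squaring.
yates = sum((abs(o - m) - 0.5) ** 2 / m for o, m in cells)

print(plain, yates)  # the corrected statistic is always smaller
```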
What are three measures of the strength of the association between two categorical variables?
Phi (φ), the contingency coefficient, and Cramér’s V. These measures modify the chi-square statistic to account for sample size and degrees of freedom, aiming to restrict the range of the test statistic from 0 to 1, similar to a correlation coefficient
Johnny didn’t mention this in the lecture
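The usual chi-square-based measures are phi, the contingency coefficient, and Cramér’s V; a sketch of their standard formulas (the input values are illustrative):

```python
import math

def association_measures(chi2, n, rows, cols):
    """Chi-square-based measures of association, each scaled toward [0, 1]."""
    phi = math.sqrt(chi2 / n)                    # mainly for 2x2 tables
    contingency = math.sqrt(chi2 / (chi2 + n))   # never quite reaches 1
    cramers_v = math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))
    return phi, contingency, cramers_v

# Illustrative values: chi2 = 25.36 from a 2x2 table of 200 observations.
print(association_measures(25.36, 200, 2, 2))
```

For a 2x2 table, phi and Cramér’s V coincide, since min(r - 1, c - 1) = 1.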
How can we represent the χ2 in a linear model? Apply it to the experiment example
It’s the same as with factorial design since we have two predictors and an interaction between them (training x dance)
picture 10.1
picture 10.2 - the outcome variable is categorical so the assumption of linearity is broken → the outcome variable gets transformed to log values (which also affects the error term)
picture 10.3 - the predicted values of the outcome (error = 0)
We replace the training and dance variables with 0 or 1 depending on the category whose frequency we are trying to calculate
How do we calculate each variable in the observed linear model?
How do we calculate the expected frequencies model?
The χ2 test looks at whether 2 variables are independent (interaction = 0)
Remove the interaction term and we can get two scenarios:
1. The model is still a good fit to the data; the interaction effect isn’t contributing to the fit → the variables are independent
2. The model is a poor fit to the data; the interaction term is contributing a lot to the model, which implies that the variables are dependent
Picture 10.9 - formula for the predicted number of cats in each category (we took the b-variable for interaction effect out)
- Now, we use the expected values that we already computed from picture 4 and calculate the b-values: picture 10.10, 10.11 (main effect of the type of training), 10.12 (the main effect of whether the cat danced), 10.13 (just to double check whether it fits)
- Putting all this together we get the predicted values from the model - picture 10.14
- We can rearrange the formula of the model (picture 10.15) to get the residuals (picture 10.16) and eventually the χ2
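The no-interaction model can be sketched numerically. Assuming dummy coding with (affection, no dance) as the baseline cell (an arbitrary choice for illustration, and illustrative expected counts): b0 is the log expected count of the baseline cell, b1 and b2 are log differences for training and dance, and because expected frequencies are built by multiplying margins, the fourth cell is reproduced exactly from b0 + b1 + b2 - the "double check" step:

```python
import math

# Expected frequencies (illustrative), computed as row_total * col_total / n.
E = {("affection", "no"): 100.44, ("affection", "yes"): 61.56,
     ("food", "no"): 23.56, ("food", "yes"): 14.44}

# ln(model) = b0 + b1*training + b2*dance, baseline cell = (affection, no)
b0 = math.log(E[("affection", "no")])
b1 = math.log(E[("food", "no")]) - b0        # main effect of training
b2 = math.log(E[("affection", "yes")]) - b0  # main effect of dance

# Double check: without an interaction term the model must reproduce the
# remaining cell exactly, because expected counts are multiplicative.
predicted = math.exp(b0 + b1 + b2)
print(predicted, E[("food", "yes")])
```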
What are the assumptions when analysing categorical data?
1. Independence of observations - each entity contributes to only one cell of the contingency table
2. Expected frequencies should not be too low - in 2x2 tables no expected count below 5; in larger tables no more than 20% of cells below 5 and none below 1
How do we calculate effect size of the χ2 test?
Using odds ratio based on the observed values
- not useful for larger tables than 2x2
Formula on how to calculate it in picture 11
Example in picture 12
What does the odds ratio represent?
It takes into account both levels of both of our variables. So we are not just saying how many more odd numbers than even numbers are perceived as female; we are making it relative to the male responses as well.
= How many times more likely is it to be female/male, relative to the other variable (odd/even)
- It’s generally easier to talk about odds ratios that exceed 1, so we can flip the variables if we need to (1/x = z; x = 1/z)
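Applied to the cat experiment, the odds ratio is the odds of dancing after food divided by the odds of dancing after affection (the counts below are illustrative):

```python
# Illustrative 2x2 counts: (danced, did not dance) per training type.
food_yes, food_no = 28, 10
affection_yes, affection_no = 48, 114

odds_food = food_yes / food_no                  # odds of dancing after food
odds_affection = affection_yes / affection_no   # odds of dancing after affection

odds_ratio = odds_food / odds_affection
print(odds_ratio)      # how many times more likely dancing is after food

# Flip it (1/x) if a ratio above 1 is easier to talk about the other way.
print(1 / odds_ratio)
```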