Inferential statistics
Attempting to see if we can:
- Infer an alternative explanation: the difference observed is NOT due to chance, e.g. it is due to the IV
- Or whether it is a result of random variance caused by sampling error, i.e. the IV has no effect
Null hypothesis symbol
H₀
Null hypothesis meaning
In the population we’re sampling, there is no relationship between the variables being tested (IV effect on DV)
What does it mean to assume null hypothesis is true
We say…
Under H₀ there will be no statistically significant differences between groups (experimental conditions)
When can we reject the null hypothesis?
If we were to assume this is true…
But it turns out that the probability of obtaining data like ours through chance/sampling error alone is actually very low
Then we can reject it
Symbol for the alternative hypothesis
H₁
What is the alternative hypothesis?
In the population sampled, there is a significant difference between the scores of the experimental conditions, i.e. the IV had an effect on the DV
If the null hypothesis is rejected, is the alternative hypothesis therefore true?
Maybe but maybe not
Because the tests tell us the probability of obtaining the data by chance, assuming the null hypothesis is true
They do not tell us the probability of either hypothesis being true given the data we obtained
How can we argue for the alternative hypothesis being true?
Conditions are identical and controlled in every way - all confounding variables controlled for
Why might we not be able to conclude the alternative hypothesis is true?
The assumption that we have controlled everything may be wrong
E.g. experimenter bias or small errors we are not aware of
There may be other plausible mechanisms and explanations for the results
NHST
Null hypothesis significance testing
What are NHSTs?
Allow us to test for statistically significant differences between groups = how probable this observation is if it is purely due to chance
And so help rule out sampling error as the explanation
What do NHSTs use?
Z scores and p values
Using z scores recap
Transform scores into z scores = how many SDs this score is away from the mean
Subtract the mean from the score
Divide by the standard deviation
Use a table to obtain the probability of obtaining a score with this z value, i.e. the percentage of scores below it, above it, etc.
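The recap above can be sketched in a few lines of Python. This is only an illustration: the score, mean and SD are made-up values, and the standard library's `NormalDist` stands in for the printed z table.

```python
from statistics import NormalDist

def z_score(score, mean, sd):
    """How many standard deviations a score lies from the mean."""
    return (score - mean) / sd

# Hypothetical values, chosen only for illustration
z = z_score(score=130, mean=100, sd=15)

# The standard normal distribution plays the role of the z table
standard_normal = NormalDist(mu=0, sigma=1)
proportion_below = standard_normal.cdf(z)     # larger portion when z > 0
proportion_above = 1 - proportion_below       # smaller portion when z > 0
mean_to_z = abs(proportion_below - 0.5)       # portion between the mean and z

print(z, round(proportion_below, 4), round(proportion_above, 4), round(mean_to_z, 4))
```

With these values the score sits 2 SDs above the mean, so roughly 97.7% of scores fall below it and roughly 2.3% fall above it.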
p value
The probability of obtaining a score at least as extreme as a given score
This is the value looked up in z tables, which show the probability of a score falling in the smaller portion, the larger portion, or the mean-to-z portion
How to use z values to determine significance
Based on standardised distribution…
Scores beyond z = ±1.96 make up 5% of the population in total (2.5% in each tail)
Given the sample size, we can find the number of participants (Ps) we would expect to have a z score this extreme (5% of the sample)
If we were to randomly select participants from this population
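As a rough check on the 5% figure, the sketch below computes the proportion of a standard normal distribution lying beyond z = ±1.96 and turns it into an expected head count. The sample size of 200 is a hypothetical value used only for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, SD 1
critical_z = 1.96

# Proportion of scores beyond +1.96, doubled to include those below -1.96
upper_tail = 1 - standard_normal.cdf(critical_z)
two_tailed = 2 * upper_tail

# Hypothetical sample size, for illustration only
sample_size = 200
expected_extreme = two_tailed * sample_size

print(round(two_tailed, 3), round(expected_extreme, 1))
```

So in a randomly selected sample of 200 we would expect about 10 participants to have a z score beyond ±1.96.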
How to use z values to determine p value
Use same principle as determining outliers:
Think of 2 conditions as having their own standardised distributions that overlap
Alpha (α)
Threshold of significance
NHST on paired distributions
How many scores would we expect above the threshold (alpha) if null hypothesis is true
E.g. what percentage of scores would need to be present in the overlap of both distributions for each condition
(If null hypothesis is true then the distributions should be very similar and have a lot of overlap)
p value for null hypothesis significance test
Assuming H₀ is true, the p value is the proportion of times we would expect to observe results at least as extreme as ours
E.g. if p = 0.001, then 0.1% of the time
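A minimal sketch of turning a test statistic into a p value and comparing it with alpha. It assumes the group difference has already been expressed as a z statistic and that the test is two-tailed; the observed z of 3.29 and the alpha of 0.05 are hypothetical values, not taken from any study.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Probability, under H0, of a z statistic at least this far from 0 in either direction."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05          # threshold of significance (assumed conventional value)
observed_z = 3.29     # hypothetical test statistic

p = two_tailed_p(observed_z)
reject_null = p < alpha

print(round(p, 4), reject_null)
```

Here p comes out at about 0.001, so if H₀ were true we would expect a difference this extreme roughly 0.1% of the time, and we would reject H₀ at α = 0.05.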
If the null hypothesis is true, then group differences that are extreme are…
unlikely but not impossible
Equation to show means are identical aka null hypothesis is true
μ₁ - μ₂ = 0
aka the population means of both conditions are the same
Sampling distribution under the null hypothesis (plotting the means of a study that has been run again and again)
For each condition, the sample means form their own curve
But the curves are identical because both conditions have the same population mean
And as we take more samples, the mean of the sample means converges closer and closer on the population mean
What happens if we assume the null hypothesis is true but we obtain large difference in means?
Even a quite large difference in means has a place on the sampling distribution plotted under the null hypothesis
It is still possible to obtain a large difference in means through chance (sampling error) alone, even though it is unlikely
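The simulation below is a sketch of this idea, not a prescribed method. It repeats a two-condition study many times with both conditions drawn from the same population (so H₀ is true by construction) and counts how often the difference in means is still large. The population mean and SD, the group size and the number of repeats are all hypothetical choices.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population shared by both conditions (H0 is true by construction)
POP_MEAN, POP_SD = 100, 15
N_PER_GROUP = 25
N_STUDIES = 5000

# Standard error of the difference between two independent sample means
se_diff = POP_SD * (2 / N_PER_GROUP) ** 0.5

differences = []
for _ in range(N_STUDIES):
    group_a = [random.gauss(POP_MEAN, POP_SD) for _ in range(N_PER_GROUP)]
    group_b = [random.gauss(POP_MEAN, POP_SD) for _ in range(N_PER_GROUP)]
    differences.append(mean(group_a) - mean(group_b))

# Centre of the sampling distribution of the difference in means
mean_difference = mean(differences)

# Share of studies whose difference lies beyond 1.96 standard errors
proportion_extreme = sum(abs(d) > 1.96 * se_diff for d in differences) / N_STUDIES

print(round(mean_difference, 2), round(proportion_extreme, 3))
```

The differences cluster around 0, as μ₁ - μ₂ = 0 predicts, yet close to 5% of the simulated studies still show a difference beyond 1.96 standard errors purely through sampling error.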