Significance Testing, Sampling Distributions & Z-Scores Flashcards

(47 cards)

1
Q

What are descriptive statistics?

A
  • They describe our data through measures of central tendency and measures of dispersion.
  • They only tell us about our own data; we can’t generalise to the population.
2
Q

What do descriptive statistics not tell us?

A

Whether the difference between groups can be inferred beyond our sample to the population.

3
Q

What do inferential statistics generate, and what can they help with?

A

A p-value, which helps us understand whether there is a difference in the general population.

4
Q

What is a p-value / inferential statistic dependent on?

A

The level of data that is used.

5
Q

What is an inferential statistic?

A
  • It uses a random sample of data from the population to help make inferences about the population.
    - It helps us make inferences that go beyond our data.
6
Q

What are the two types of inferential statistics?

A

Frequentist (focused on in this lecture)

Bayesian

7
Q

What is a null hypothesis?

What is an alternative hypothesis?

(BASIC UNDERSTANDING)

A

The null hypothesis states that there is no difference between the groups we are looking at.

The alternative hypothesis states that there is a difference between the groups.

8
Q

What do we compare the null hypothesis with?

A

With the alternative hypothesis, to see the contrast.

9
Q

What is the purpose of null hypothesis testing?

A

It estimates the probability (the p-value) of obtaining our result, or one more extreme, by chance, assuming the null hypothesis of no difference or association is true.
If the result is unlikely or extreme under this assumption, we reject the null and conclude there is evidence of a real difference or association.

10
Q

For p-values, what value is known as statistically significant?

A

p < 0.05 (5%)

11
Q

What is the alpha level?

A

The level at which we accept a result as significant (0.05).

12
Q

What is the true definition of a p-value in frequentist statistics?

A

The probability that the result we found (or one more extreme) would occur by chance, assuming the null hypothesis is true.

13
Q

What is the common misinterpretation of p-values?

A

That p-values tell us the likelihood that our (alternative) hypothesis is real / true. They cannot. The p-value is specific to an experiment and its null hypothesis.

14
Q

If p is lower than 5%, what does this indicate?

A

We should reject the null, as the result is surprising enough under the null to suggest a real effect.

15
Q

What does the normal distribution tell us in hypothesis testing?

A

Normal Distribution:

A bell curve that shows the range of possible outcomes.

Used to determine how likely your sample result is if the null hypothesis is true.

P-value:

The probability of obtaining a result as extreme (or more extreme) than your sample result, if the null hypothesis is true.

Low p-value (e.g., < 0.05) → Reject the null hypothesis (your result is unlikely under the null).

High p-value → Fail to reject the null hypothesis (your result is not unusual).

Null hypothesis (H₀): The average height is 160 cm.

Sample result: Average height = 165 cm.

P-value: 0.02 → Reject the null hypothesis (165 cm is unlikely if the true average is 160 cm).
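
A minimal sketch of how a p-value like the one in the height example could be computed with a z-test. The card does not give the standard deviation or the sample size, so `sd=10.0` and `n=20` below are assumptions chosen only for illustration, and the helper names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_tailed_p(sample_mean: float, null_mean: float, sd: float, n: int) -> float:
    """Two-tailed p-value for a sample mean, assuming the null hypothesis is true."""
    standard_error = sd / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    return 2.0 * (1.0 - normal_cdf(abs(z)))


# Assumed inputs: SD = 10 cm and n = 20 are NOT from the card.
p = two_tailed_p(sample_mean=165.0, null_mean=160.0, sd=10.0, n=20)
print(f"p = {p:.3f}")
print("Reject the null" if p < 0.05 else "Fail to reject the null")
```

With these assumed inputs the p-value falls below 0.05, so the null hypothesis of a 160 cm average would be rejected.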

16
Q

What percentages of observations fall within standard deviations of the mean in a normal distribution?

A

~68% within ±1 SD

~95% within ±2 SDs

In a normal distribution, ±1.96 standard deviations (SDs) from the mean cover 95% of the data, leaving only 5% outside this range, split equally between the two tails of the distribution.

So in hypothesis testing, we often use 1.96 to determine the boundaries for a 95% confidence interval or the critical region for rejecting the null hypothesis at a 5% significance level (α = 0.05).
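
The percentages above can be checked numerically. This is a small sketch using the error function from Python's standard library; the function name `coverage_within` is made up for this example.

```python
import math


def coverage_within(k: float) -> float:
    """Proportion of a normal distribution lying within ±k SDs of the mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1.0, 1.96, 2.0):
    print(f"within ±{k} SD: {coverage_within(k):.4f}")
```

The value for ±2 SDs is slightly above 95%, which is why 1.96 rather than 2 is used as the exact 95% boundary.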

17
Q

What are the regions of rejection?

A

They are the extreme ends (tails) of a distribution where results are unlikely if the null is true.
If a test statistic falls in this region (e.g., p < 0.05, beyond ±1.96 SDs), we reject the null hypothesis.

18
Q

What if a sample has an extreme (<5%) probability under the null hypothesis?

A

It’s very unlikely to happen by chance, so it might come from a different population — suggesting a real difference or effect.

19
Q

How do effect sizes (Cohen’s d) relate to p-values?

What is the definition of Cohen’s d?

A

When Cohen’s d = 0 → no real effect → p-values are usually not significant (p > .05).

When Cohen’s d = 1–2 → large real effect → p-values are usually significant (p < .05).

Example: Height difference between men and women (d = 1.72) shows a strong, real difference that’s almost always statistically significant.

Cohen’s d is a measure of effect size that tells you how large the difference is between two groups (or conditions), relative to the variability within those groups.
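
As an illustration of the definition, here is a sketch of one common way to compute Cohen's d: the difference between group means divided by the pooled standard deviation. The pooled-SD form is an assumption (the card does not say which variant the lecture uses), and the height values are made up.

```python
import math
import statistics


def cohens_d(group_a: list[float], group_b: list[float]) -> float:
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd


# Made-up heights in cm, for illustration only.
men = [178.0, 182.0, 175.0, 180.0, 185.0]
women = [165.0, 170.0, 162.0, 168.0, 160.0]
print(f"d = {cohens_d(men, women):.2f}")
```

Because d is expressed in SD units, it does not depend on the measurement scale or on how many people were sampled.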

20
Q

What is a 1-tailed test?

A

Used for a directional hypothesis. The 5 percent significance level is concentrated in one tail.

21
Q

What is a 2-tailed test?

A

Used for a non-directional hypothesis.
The 5 percent significance level is split across both tails. We only use 2-tailed tests.
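
A short sketch of how the same test statistic gives different p-values under the two kinds of test. It assumes a z statistic and a directional hypothesis that predicts a positive effect; the value `z = 1.8` and the function names are made up for the example.

```python
import math


def upper_tail(z: float) -> float:
    """Area of the standard normal distribution above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def one_tailed_p(z: float) -> float:
    """p-value when all 5% sits in the upper tail (directional hypothesis)."""
    return upper_tail(z)


def two_tailed_p(z: float) -> float:
    """p-value when the 5% is split across both tails (non-directional)."""
    return 2.0 * upper_tail(abs(z))


z = 1.8  # hypothetical test statistic
print(f"one-tailed p = {one_tailed_p(z):.3f}")
print(f"two-tailed p = {two_tailed_p(z):.3f}")
```

For this z, the one-tailed p-value is below 0.05 while the two-tailed p-value is not, which shows why the 2-tailed test is the more conservative choice.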

22
Q

What is the 5 percent significance zone?

A

The part of the distribution where results are too extreme to be put down to random chance.

23
Q

What are Cohen’s d effect size bins?

A

Cohen provided guidelines to help interpret the magnitude of the effect size, or what we often call effect size bins.

0.2 = small

0.5 = medium

0.8 = large

24
Q

How do effect size and p-value differ?

A

Effect size (d): how big the difference is.

p-value: how surprising the result is (depends on sample size).
✅ Big effects can be non-significant, and small effects can be significant.
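
To make the last point concrete, here is a rough sketch that turns an effect size and a sample size into an approximate p-value. It assumes two equal-sized groups and a z-test approximation, z = d × √(n / 2); that formula and the function name are assumptions for illustration, not something stated on the card.

```python
import math


def approx_two_tailed_p(d: float, n_per_group: int) -> float:
    """Approximate two-tailed p-value for effect size d with n people per group.

    Assumes equal group sizes and a normal (z-test) approximation.
    """
    z = abs(d) * math.sqrt(n_per_group / 2.0)
    return math.erfc(z / math.sqrt(2.0))  # same as 2 * (1 - Phi(z))


# Large effect, tiny sample: not significant.
print(f"d = 0.8, n = 5:   p = {approx_two_tailed_p(0.8, 5):.3f}")
# Small effect, large sample: significant.
print(f"d = 0.2, n = 500: p = {approx_two_tailed_p(0.2, 500):.4f}")
```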

25
Q

What are p-values dependent on?

A

Sample size. With a small sample, power is low because we don’t have enough data points.

26
Q

Describe the aim of sampling.

A

To learn more about the population. The larger the sample, the more we learn about the population.

27
Q

What is sample variability?

A

Each time we draw a sample from the same population, we make different observations. This means that if we calculate a statistic for each sample, the statistic will also differ from sample to sample.

28
Q

What happens if we take an infinite number of sample means?

A

We get a normal distribution of sample means. Because each mean averages over a random sample, the extreme values tend to cancel out, leaving a normal distribution.

29
Q

What is the sampling distribution of the mean?

A

It’s the distribution of means from many samples of the same size drawn from a population. The mean of this distribution is the expected value of the sample mean, which equals the population mean. The spread is the standard error (SE).

30
Q

What does the standard error tell us?

A

SE measures how much sample means vary from one sample to another due to random chance.
  • Larger SE → more spread-out sample means
  • Smaller SE → sample means cluster closer to the population mean

31
Q

How does sample size affect the sampling distribution?

A
  • Small sample (e.g., 3): wider distribution, larger SE, noisier estimate
  • Large sample (e.g., 15): narrower distribution, smaller SE, more accurate estimate

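
The effect of sample size can be seen in a small simulation. This sketch draws many samples of size 3 and size 15 from a normal population and measures how spread out the sample means are. The population mean of 100 and SD of 10 are assumptions for illustration, as is the function name.

```python
import random
import statistics


def simulated_standard_error(sample_size: int, n_samples: int = 5000, seed: int = 0) -> float:
    """SD of many simulated sample means, i.e. an empirical standard error."""
    rng = random.Random(seed)
    sample_means = [
        # Hypothetical population: mean 100, SD 10.
        statistics.fmean(rng.gauss(100.0, 10.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]
    return statistics.stdev(sample_means)


se_small = simulated_standard_error(sample_size=3)
se_large = simulated_standard_error(sample_size=15)
print(f"n = 3:  SE is about {se_small:.2f}")
print(f"n = 15: SE is about {se_large:.2f}")
```

The simulated values should land near 10 / √3 and 10 / √15 respectively, so the larger sample gives a clearly narrower sampling distribution.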
32
Q

What does the Central Limit Theorem (CLT) state?

A

The means of many sufficiently large samples will form an approximately normal distribution, centred on the population mean with a spread equal to the standard error, even if the population isn’t normal.

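
A simulation sketch of the theorem, assuming a strongly skewed (exponential) population with mean 1 and SD 1. If the sample means are approximately normal, roughly 95% of them should fall within ±1.96 standard errors of the population mean. The sample size of 50 and the number of samples are arbitrary choices for the example.

```python
import random
import statistics

rng = random.Random(42)
sample_size = 50
n_samples = 5000

# Exponential population with rate 1: mean 1, SD 1, clearly not normal.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
    for _ in range(n_samples)
]

standard_error = 1.0 / sample_size ** 0.5  # population SD / sqrt(n)
share_inside = sum(
    abs(m - 1.0) <= 1.96 * standard_error for m in sample_means
) / n_samples

print(f"mean of sample means: {statistics.fmean(sample_means):.3f}")
print(f"share within ±1.96 SE: {share_inside:.3f}")
```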
33
Q

What are the implications of the Central Limit Theorem?

A

The sampling distribution of the mean is approximately normal even if the underlying distribution is not, and this lets us connect our sample scores to the population of values.

34
Q

What is a thought experiment (calculating standard error)?

A

Imagining infinite samples to understand the sampling distribution.

35
Q

What does a thought experiment help with?

A

Connecting the sample with the population. Every population has a sampling distribution of the mean for a specific sample size.

36
Q

For calculating the standard error, what have mathematicians found?

A

The mean of the sampling distribution is the mean of the population. The standard deviation of the sampling distribution (the standard error) is:

SE = σ / √n

σ = the SD of values in the population
n = the size of the sample

37
Q

How do we estimate the standard error if we don’t know the population standard deviation?

A

We use the sample SD as an estimate. If the sample is large enough, this gives a good approximation for the standard error.

38
Q

What would the equation be?

A

The standard deviation of the values in the sample divided by the square root of the sample size.

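
The equation described above can be sketched in a few lines. The scores are made up, and the function name is hypothetical.

```python
import math
import statistics


def estimated_standard_error(sample: list[float]) -> float:
    """Estimate the SE of the mean as sample SD / sqrt(sample size)."""
    sample_sd = statistics.stdev(sample)  # sample SD (n - 1 denominator)
    return sample_sd / math.sqrt(len(sample))


scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data
print(f"estimated SE = {estimated_standard_error(scores):.3f}")
```

A sample this small is used only to keep the arithmetic easy to follow; the approximation is better with larger samples.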
39
Q

What is a 95% confidence interval?

A

An interval around the sample mean formed by adding and subtracting 1.96 standard errors. Across repeated samples, about 95% of intervals built this way would contain the population mean.

Footnote: the range where you believe the true mean lives.

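
A minimal sketch of the calculation, using the sample SD to estimate the standard error. The data are made up and the function name is hypothetical. The fixed multiplier 1.96 relies on the normal approximation, so it is best suited to reasonably large samples; the tiny sample here is only for illustration.

```python
import math
import statistics


def confidence_interval_95(sample: list[float]) -> tuple[float, float]:
    """95% CI for the mean: sample mean ± 1.96 estimated standard errors."""
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    margin = 1.96 * standard_error
    return mean - margin, mean + margin


scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data
low, high = confidence_interval_95(scores)
print(f"95% CI: [{low:.2f}, {high:.2f}]")
```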
40
Q

The approximation for the standard error can be used to calculate the confidence interval for our sample. What does the interval tell us?

A

It gives a range of plausible values for the population mean based on our sample. It shows how close the sample mean is likely to be to the population mean.

41
Q

What affects the width of a confidence interval and what does it show?

A
  • Wider CI if the sample size is smaller (SE is larger)
  • Wider CI if the population SD is larger
  • Shows how close the sample mean is likely to be to the population mean

42
Q

What is a standard normal distribution (z distribution)?

A

A normal distribution that has been rescaled so that it has mean = 0 and standard deviation = 1.

43
Q

In a standard normal distribution, what interval contains 95% of values?

A

Approximately [-1.96, +1.96] standard deviations from the mean.

44
Q

Why is the standard normal distribution called a z distribution?

A

Because any normal distribution can be converted into a standard normal using a z-transformation, which rescales it to mean = 0 and SD = 1.

45
Q

How do we calculate a z-transformation and why is it useful?

A

Subtract the mean and divide by the SD for each value: z = (value - mean) / SD. The resulting z-scores tell us how many SDs a value is from the mean. This allows comparison across different scales by normalizing the data.

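
The transformation can be sketched as follows. The scores are made up, and this version uses the population SD (`statistics.pstdev`); using the sample SD instead would be an equally reasonable choice that the card does not specify.

```python
import statistics


def z_scores(values: list[float]) -> list[float]:
    """Convert raw values to z-scores: (value - mean) / SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD; an assumption, see above
    return [(value - mean) / sd for value in values]


scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data
standardized = z_scores(scores)
print([round(z, 2) for z in standardized])
```

After the transformation the values have mean 0 and SD 1, whatever scale the raw scores were on.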
46
Q

Why is the standard normal distribution useful?

A

By converting raw scores into standardized z-scores, we can use the standard normal distribution to make predictions and calculate probabilities for any normal dataset.

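
As a sketch of that idea, the code below converts a raw score to a z-score and then looks up the probability of scoring at or below it in the standard normal distribution. The scale with mean 100 and SD 15 is a hypothetical example, and the function names are made up.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_below(value: float, mean: float, sd: float) -> float:
    """P(score <= value) for a normal distribution, via the z-score."""
    z = (value - mean) / sd
    return normal_cdf(z)


# Hypothetical normally distributed scale: mean 100, SD 15.
print(f"P(score <= 130) = {probability_below(130.0, 100.0, 15.0):.4f}")
```

The same `normal_cdf` function works for any normal dataset, because the z-score step removes the original units.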
47
Q

What is another use of z-transformations?

A

Making scores from different scales comparable, so that we can infer things about behaviour.