Data distributions Flashcards

(10 cards)

1
Q

Define distributions of data

A

The way the values of a particular variable are spread over its range. Commonly visualised with a histogram.

2
Q

What is the problem with skewed distributions?

A

The mean is pulled toward the long tail, so it no longer represents a typical value.
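
A minimal sketch of this effect, using made-up values (the list below is hypothetical, not data from the deck): one extreme value in the right tail drags the mean well above the median.

```python
import statistics

# Hypothetical right-skewed data: five similar values and one extreme value.
values = [20, 22, 25, 27, 30, 250]

mean_value = statistics.mean(values)      # pulled upward by the 250
median_value = statistics.median(values)  # stays near the bulk of the data

print(f"mean = {mean_value:.2f}, median = {median_value}")
```

The median is one common alternative when a distribution is skewed, since it is not affected by how extreme the tail values are.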

3
Q

What is the problem with bimodal distributions?

A

The mean is not representative: the two peaks suggest two distinct populations, and the mean can fall between them, where few values lie.

4
Q

What properties specify the shape of a normal distribution?

A

Mean (centre peak) and standard deviation (spread)

5
Q

What is the formula to calculate a z-score?

A

z = (x - population mean) / population SD
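
The z-score calculation can be sketched in Python as below. The function name `z_score` and the example numbers are illustrative assumptions, not part of the deck. Note that the subtraction happens before the division.

```python
def z_score(x: float, population_mean: float, population_sd: float) -> float:
    """Return how many standard deviations x lies from the population mean."""
    if population_sd <= 0:
        raise ValueError("population_sd must be positive")
    # Subtract first, then divide: (x - mean) / sd
    return (x - population_mean) / population_sd


# Hypothetical example: a value of 130 in a population with mean 100 and SD 15.
print(z_score(130, 100, 15))  # 2.0
```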

6
Q

What do z-scores tell us?

A

How many SDs a datapoint lies above (positive z) or below (negative z) the mean

7
Q

How do you calculate probability of selecting someone above/below a specific datapoint in a normal distribution?

A

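1) Calculate the z-score for the datapoint
2) Look up the z-score in a standard normal (z) table to get the area under the curve. With a cumulative table, that area is the probability of a value below the datapoint; subtract it from 1 for the probability above.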
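
The two steps above can be sketched in Python, with the standard library's `statistics.NormalDist` standing in for the printed z-table. The function names `prob_below` and `prob_above` and the example numbers are assumptions for illustration.

```python
from statistics import NormalDist


def prob_below(x: float, mean: float, sd: float) -> float:
    """Probability that a value from Normal(mean, sd) falls below x."""
    z = (x - mean) / sd            # step 1: z-score for the datapoint
    return NormalDist().cdf(z)     # step 2: area to the left of z


def prob_above(x: float, mean: float, sd: float) -> float:
    """Probability of a value above x: the remaining area under the curve."""
    return 1.0 - prob_below(x, mean, sd)


# Hypothetical example: mean 100, SD 15, datapoint 130 (z = 2).
print(round(prob_below(130, 100, 15), 4))  # 0.9772
print(round(prob_above(130, 100, 15), 4))  # 0.0228
```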

8
Q

How do you calculate the standard error (SD of the sampling distribution of the mean)?

A

SE = SD of parent population / sqrt(sample size)
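
A short sketch of the standard error calculation. The function name `standard_error` and the example values are hypothetical.

```python
import math


def standard_error(population_sd: float, sample_size: int) -> float:
    """SD of the sampling distribution of the mean: sd / sqrt(n)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return population_sd / math.sqrt(sample_size)


# Hypothetical example: population SD of 15 and samples of 25 people.
print(standard_error(15, 25))  # 3.0
```

Because the sample size sits under a square root, quadrupling the sample size only halves the standard error.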

9
Q

What does the central limit theorem state?

A

Given a population with mean μ and SD σ, the sampling distribution of the mean approaches a normal distribution with mean μ and SD σ/√n (the standard error) as the sample size n increases.
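
A small simulation can illustrate the theorem. This is only a sketch under stated assumptions: the parent population is taken to be exponential with rate 1 (a skewed distribution with mean 1 and SD 1), and the function name `sample_means`, the sample size, and the seed are arbitrary choices.

```python
import math
import random
import statistics


def sample_means(sample_size: int, num_samples: int, seed: int = 0) -> list[float]:
    """Draw num_samples samples from a skewed population; return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        # Assumed parent population: exponential with rate 1 (mean 1, SD 1).
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


n = 50
means = sample_means(sample_size=n, num_samples=5000)

# The mean of the sample means should sit near the population mean (1),
# and their SD should sit near the standard error, 1 / sqrt(n).
print("mean of sample means:", statistics.fmean(means))
print("SD of sample means:  ", statistics.stdev(means))
print("predicted SE:        ", 1 / math.sqrt(n))
```

Plotting `means` as a histogram should show a roughly bell-shaped curve even though the parent population is strongly skewed.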

10
Q

How do you calculate a z-score for a sample mean?

A

z = (sample mean - population mean) / standard error
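
The same calculation as a Python sketch, with the standard error computed from the population SD and sample size. The function name `z_score_sample_mean` and the example values are assumptions for illustration.

```python
import math


def z_score_sample_mean(
    sample_mean: float, population_mean: float, population_sd: float, sample_size: int
) -> float:
    """z-score of a sample mean, measured in standard errors rather than SDs."""
    if sample_size <= 0 or population_sd <= 0:
        raise ValueError("sample_size and population_sd must be positive")
    standard_error = population_sd / math.sqrt(sample_size)
    return (sample_mean - population_mean) / standard_error


# Hypothetical example: 25 people with a sample mean of 106,
# drawn from a population with mean 100 and SD 15 (standard error = 3).
print(z_score_sample_mean(106, 100, 15, 25))  # 2.0
```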
