Lecture 6: Supplemental Material Flashcards

(56 cards)

1
Q

What will most studies do before data collection begins?

A

Attempt to determine the sample size needed to achieve a certain degree of accuracy in estimation

2
Q

Two reasons why estimating minimum sample size is commonly done with population proportions?

A
  1. With population proportions, you do not need to make separate guesses about the population mean and standard deviation
  2. With population proportions, it is easy to identify a conservative value for the proportion (π = 0.5), and the bias does not vary much
3
Q

Why is estimating minimum sample size less commonly done with population means?

A
  1. With population means, you need to make separate guesses about the population mean and standard deviation
  2. We generally have a hard time making a good guess about a population standard deviation without measuring it
4
Q

Three things that must be set to calculate the desired sample size

A
  1. Precision: decide on the maximum error B around the proportion (i.e., how wide the desired confidence interval is)
  2. Confidence: set the probability with which the specified precision/interval is achieved (i.e., what the desired confidence coefficient is)
  3. Proportion: select a value of the population proportion pi (either using the best available data or a conservative estimate)
5
Q

Why must we select a value of the population proportion π (the most conservative is 0.5)?

A
  1. This is because the spread of the sampling distribution depends on the value of π
6
Q

Why is the most conservative guess of pi 0.5?

A

This value produces the largest standard error (most spread/variance), so the sample size you calculate with it will be large enough whatever the true value of pi turns out to be
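A quick way to check this is to evaluate π(1 − π), the part of the standard error that depends on π, over a grid of values. The grid below is an arbitrary choice made for illustration, not something taken from the lecture.

```python
# Evaluate pi * (1 - pi) for a hypothetical grid of candidate proportions.
candidates = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
spread = {pi: pi * (1 - pi) for pi in candidates}

# The candidate with the largest pi * (1 - pi) gives the largest standard error.
most_conservative = max(spread, key=spread.get)

for pi, value in spread.items():
    print(f"pi = {pi:.1f}   pi(1 - pi) = {value:.2f}")
print("largest spread at pi =", most_conservative)
```

The product is symmetric around 0.5 and falls off on both sides, which is why 0.5 is the safe choice when nothing is known about π.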

7
Q

What if you guessed that pi is 0.5, but it turns out that it was actually 0.8?

A

Using the true value, the formula would have called for a smaller sample size (fewer observations than you originally thought), so with the sample you planned your result will be even more precise
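A rough sketch of this situation, assuming z = 1.96 (95% confidence) and a maximum error of B = 0.03; neither number comes from the lecture. It plans the sample with π = 0.5 and then checks the error actually achieved if π is 0.8.

```python
import math

z = 1.96   # assumed z-score (95% confidence)
B = 0.03   # assumed maximum error

def required_n(pi):
    """Minimum sample size from n = pi(1 - pi)(z/B)^2, rounded up."""
    return math.ceil(pi * (1 - pi) * (z / B) ** 2)

def achieved_error(pi, n):
    """Maximum error obtained with n observations when the proportion is pi."""
    return z * math.sqrt(pi * (1 - pi) / n)

n_planned = required_n(0.5)   # sample size chosen with the conservative guess
n_needed = required_n(0.8)    # sample size the true value would have called for
error_at_true_pi = achieved_error(0.8, n_planned)

print(n_planned, n_needed, round(error_at_true_pi, 4))
```

Under these assumptions the planned sample is larger than the one needed at π = 0.8, and the achieved error comes out below B.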

8
Q

What equation do you use to answer the question “for a given level of precision, how many observations do I need to ensure that election night +/- value that I want?”

A

n = π(1 − π)(z/B)^2
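The formula can be sketched as a small function. The function name and the example inputs (z = 2.576 for 99% confidence, B = 0.05, and a prior estimate of 0.3) are illustrative assumptions; the result is rounded up because a fraction of an observation cannot be sampled.

```python
import math

def sample_size_for_proportion(pi, z, B):
    """Return the minimum n from n = pi(1 - pi)(z/B)^2, rounded up."""
    if not 0 < pi < 1:
        raise ValueError("pi must be strictly between 0 and 1")
    if B <= 0:
        raise ValueError("B must be positive")
    return math.ceil(pi * (1 - pi) * (z / B) ** 2)

# Conservative guess (pi = 0.5) versus a hypothetical prior estimate (pi = 0.3).
n_conservative = sample_size_for_proportion(0.5, z=2.576, B=0.05)
n_prior = sample_size_for_proportion(0.3, z=2.576, B=0.05)
print(n_conservative, n_prior)
```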

9
Q

In the equation n = π(1 − π)(z/B)^2, what does B denote?

A

The maximum error around the proportion

10
Q

In the equation n = π(1 − π)(z/B)^2, what does n denote?

A

The sample size ensuring that, with fixed probability, the error of estimation of π by the sample proportion is no greater than B

11
Q

In the equation n = π(1 − π)(z/B)^2, what does z denote?

A

The corresponding z-score for a confidence interval with a confidence coefficient equal to the fixed probability

12
Q

In the equation n = π(1 − π)(z/B)^2, can you throw in a value for pi?

A

NO - that is the true population proportion (you don’t know it, and you can’t estimate it by throwing in a sample value)

13
Q

Two solutions to not knowing pi in the equation n = π(1 − π)(z/B)^2

A
  1. If another researcher has collected data before, you could use that value as a best guess.
  2. Most of the time, researchers use a specific value (0.5) to figure out how many people they need to sample
14
Q

What equation do you use to calculate the necessary sample size for estimating means?

A

n = σ^2(z/B)^2
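A minimal sketch of the same calculation for means. The function name and the inputs (σ guesses of 15 and 30, z = 1.96, B = 2) are hypothetical values chosen only to show how the required n responds to the guessed standard deviation.

```python
import math

def sample_size_for_mean(sigma, z, B):
    """Return the minimum n from n = sigma^2 (z/B)^2, rounded up."""
    if sigma <= 0 or B <= 0:
        raise ValueError("sigma and B must be positive")
    return math.ceil(sigma ** 2 * (z / B) ** 2)

# Doubling the guessed standard deviation roughly quadruples the required n.
n_small_spread = sample_size_for_mean(sigma=15, z=1.96, B=2)
n_large_spread = sample_size_for_mean(sigma=30, z=1.96, B=2)
print(n_small_spread, n_large_spread)
```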

15
Q

What equation do you use to calculate the necessary sample size for estimating proportions?

A

n = π(1 − π)(z/B)^2

16
Q

What two things must you specify when calculating sample size for estimating means with n = σ^2(z/B)^2?

A

Desired confidence coefficient
population standard deviation

17
Q

When calculating sample size for estimating means with n = σ^2(z/B)^2, how might you specify the standard deviation?

A

Use the standard deviation from past research, or from related population data if possible

18
Q

When calculating sample size for estimating means: the __ the spread of the population distribution, as measured by the __, the __ the sample size needed to achieve a certain accuracy

A

greater
standard deviation
larger

19
Q

Two types of estimators?

A

Biased
Unbiased

20
Q

Unbiased estimator

A

The average sample statistic (from an indefinitely large number of samples) equals the population parameter in the long run (the mean of the statistic's sampling distribution equals the parameter)

21
Q

T/F: When you have an unbiased estimator, for some samples a statistic may underestimate the parameter of interest and for others it may overestimate the parameter

A

True - but in the long run the estimates will “average” themselves out
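This "averaging out" can be checked exactly on a toy case by listing every possible sample instead of simulating. The five-value population and the sample size of 2 (drawn with replacement) are made up for illustration.

```python
from itertools import product
from statistics import mean

population = [1, 2, 3, 4, 10]   # hypothetical small population
population_mean = mean(population)

# Every ordered sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))
sample_means = [mean(s) for s in samples]

under = sum(m < population_mean for m in sample_means)   # samples that underestimate
over = sum(m > population_mean for m in sample_means)    # samples that overestimate
average_of_sample_means = mean(sample_means)

print(under, over, average_of_sample_means, population_mean)
```

Individual samples miss in both directions, but the average of all the sample means matches the population mean.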

22
Q

Biased estimator/statistic

A

In the long run, the statistic consistently over or underestimates the parameter it is estimating (the average of all possible statistics is not equal to the parameter)

23
Q

Why is range a biased estimator?

A

When you find all possible sample ranges and take their average, it doesn’t equal the population range. Almost all of the sample ranges might be something like 4 or 7; only the combinations that include an outlier such as 10 million (along with the minimum) capture the true range
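A small enumeration makes the point concrete. The population and the sample size of 2 are hypothetical, with 10 standing in for the outlier.

```python
from itertools import combinations
from statistics import mean

population = [1, 2, 3, 4, 10]   # hypothetical population; 10 plays the outlier
population_range = max(population) - min(population)

# Every possible sample of size 2 (without replacement) and its range.
samples = list(combinations(population, 2))
sample_ranges = [max(s) - min(s) for s in samples]

average_sample_range = mean(sample_ranges)
samples_capturing_true_range = sum(r == population_range for r in sample_ranges)

print(population_range, average_sample_range, samples_capturing_true_range)
```

Only the sample containing both the minimum and the outlier reproduces the population range, so the average sample range falls well short of it.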

24
Q

Why is standard deviation a biased estimator?

A

It tends to underestimate, since most samples don’t contain the max and min values; the true population parameter will tend to be a bit larger. The sample version also uses a different equation
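The underestimate can be seen by enumerating every sample from a toy population. This sketch assumes samples of size 2 drawn with replacement and uses the n − 1 sample formulas (`statistics.stdev` and `statistics.variance`); both choices are assumptions made here, not details stated on the card.

```python
from itertools import product
from statistics import mean, pstdev, pvariance, stdev, variance

population = [1, 2, 3, 4, 10]        # hypothetical small population
sigma = pstdev(population)           # population standard deviation
sigma_squared = pvariance(population)

# Every ordered sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))
average_sample_sd = mean(stdev(s) for s in samples)
average_sample_variance = mean(variance(s) for s in samples)

print(round(sigma, 4), round(average_sample_sd, 4))
print(sigma_squared, average_sample_variance)
```

In this setup the average sample variance matches the population variance, while the average sample standard deviation comes out smaller than σ.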

25
Does CLT still apply when you have biased estimators? and what's the caveat?
Yes - the sample statistics will still cluster, and the mode/median/mean of the sampling distribution will be equal; it just won't be centered at the population parameter
26
Examples (2) of biased estimators
Range; standard deviation
27
Why do we care if estimator is biased or unbiased?
If the estimator is biased, that has implications for how the interval is calculated; the logic of an interval estimate requires that the estimate be centered on the parameter, i.e. that the estimator be unbiased
28
What makes a statistic POSITIVELY biased?
If it tends to overestimate the parameter
29
What makes a statistic NEGATIVELY biased?
If it tends to underestimate the parameter
30
T/F: An unbiased statistic is always an accurate statistic
FALSE - If a statistic is sometimes much too high and sometimes much too low, it can still be unbiased. It would be very imprecise, however.
31
What is one possible saving situation for biased statistics?
A statistic can be slightly biased but still be efficient if it systematically results in only a very small over- or underestimate of a parameter
32
T/F: If the population distribution is positively skewed (skewed to the right), then the expected value of the sampling distribution of the sample mean might not be equal to the population mean
FALSE - it still will be!
33
T/F: If the population distribution is positively skewed (skewed to the right), then the expected value of the sampling distribution of the sample median might not be equal to the population mean
TRUE - the sample median is, on average, less than the true population mean, so it's a biased estimator of that parameter when the population distribution is positively skewed
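Both T/F cards on skewed populations can be checked on a toy right-skewed population by listing every sample. The population values and the sample size of 3 (with replacement) are hypothetical.

```python
from itertools import product
from statistics import mean, median

population = [1, 2, 3, 4, 10]   # hypothetical right-skewed population
population_mean = mean(population)

# Every ordered sample of size 3, drawn with replacement.
samples = list(product(population, repeat=3))
expected_sample_mean = mean(mean(s) for s in samples)
expected_sample_median = mean(median(s) for s in samples)

print(population_mean, expected_sample_mean, expected_sample_median)
```

The expected sample mean still equals the population mean, while the expected sample median sits below it.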
34
What is efficiency/precision? (2 definitions)
A statistic is more efficient/precise if its standard error is smaller relative to others. (1) The idea that a statistic obtained from any single sample from a population is close to the value of the parameter being estimated (not much error). (2) How stable a statistic is from sample to sample
35
The less subject to sampling fluctuation a statistic is, is it more or less efficient?
more efficient!
36
Why is the efficiency of statistics often called the relative efficiency?
It's measured relative to the efficiency of other statistics
37
What might cause us to say that statistic A is more efficient than statistic B?
If statistic A has a smaller standard error than statistic B
38
T/F: If an estimator is biased, it is automatically inconsistent and inefficient
FALSE - an estimator can be highly efficient and still be biased
39
How do you decide which equation is the more efficient/precise estimator?
Whichever one produces the smaller standard error
40
Most important thing about efficiency/precision?
They are RELATIVE concepts - of two equations, which one has the smaller standard error?
41
Area where efficiency/precision might be applied (and you can see its relativity shine)
For a given parameter, you might have multiple equations for how to get the interval (and some of those equations might have less error and give you a tighter interval)
42
T/F: The mean is always more efficient than the median
FALSE - relative efficiency of the two statistics may depend on distribution involved. For instance, the mean is more efficient than the median for normal distributions but not for some extremely skewed distributions
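The normal-distribution half of this claim can be illustrated with a small simulation. Everything here is an assumption made for the sketch: a standard normal population, samples of size 25, 2000 repetitions, and a fixed seed so the run is repeatable. It does not cover the skewed case.

```python
import random
from statistics import mean, median, stdev

random.seed(0)            # fixed seed so the sketch is repeatable
n, reps = 25, 2000        # assumed sample size and number of repeated samples

sample_means, sample_medians = [], []
for _ in range(reps):
    sample = [random.gauss(0, 1) for _ in range(n)]   # standard normal population
    sample_means.append(mean(sample))
    sample_medians.append(median(sample))

# The spread of each statistic across samples approximates its standard error.
se_mean = stdev(sample_means)
se_median = stdev(sample_medians)
print(round(se_mean, 3), round(se_median, 3))
```

The statistic with the smaller spread across repeated samples is the more efficient one for that population.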
43
What key factor does the relative efficiency of two statistics depend on?
the distribution involved (normal or skewed)
44
The more __ the statistic, the more __ the statistic is as an estimator of the parameter
efficient; precise
45
Three things to keep in mind when selecting a point estimate
Unbiased vs. biased; efficiency/precision; consistency
46
Consistency of an estimator
An estimator is consistent if the estimator tends to get closer to the parameter it is estimating as the sample size increases/error goes down
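For the sample mean, a consistent estimator, the shrinking error can be read off its standard error σ/√n. The value σ = 10 and the sample sizes below are arbitrary choices for the sketch.

```python
import math

sigma = 10                          # assumed population standard deviation
sample_sizes = [25, 100, 400, 1600]

# Standard error of the sample mean at each sample size.
standard_errors = [sigma / math.sqrt(n) for n in sample_sizes]

for n, se in zip(sample_sizes, standard_errors):
    print(f"n = {n:5d}   standard error = {se}")
```

Each fourfold increase in n halves the standard error, so the estimate is expected to land closer to the parameter as the sample grows.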
47
Difference between consistent and unbiased estimators?
consistent estimators get better as the size of the sample increases, whereas unbiased estimators get better as you take the average of the statistic from a large number of samples
48
What might cause you to label an estimator as inconsistent?
If, as you increased sample size, the error went up
49
Why does consistency matter?
In general we want the biggest possible sample, but with an inconsistent estimator a bigger sample can sometimes hurt
50
What important concept does consistency (as sample size gets larger, a statistic gets closer and better at estimating the parameter) connect to?
The law of large numbers (LLN)
51
What does B entail
“Beta” term, aka the maximum error around the proportion
52
If a confidence interval is NOT about where the parameter falls, what IS it about?
It's about having a good/reasonable guess of what the parameter is. Then it's the classic interpretation of "we are x% confident our parameter falls within the (x,y) c.i." There is a trade-off between precision and confidence level.
53
What are the tradeoffs between precision and confidence levels (for intervals?) - careful, different type of precision
as precision increases, confidence level decreases
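The trade-off shows up when the sample is held fixed and only the confidence level changes. This sketch assumes a sample proportion of 0.5 and n = 601, both hypothetical, and gets z-scores from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

p_hat, n = 0.5, 601                       # assumed sample proportion and sample size
confidence_levels = [0.90, 0.95, 0.99]

margins = []
for level in confidence_levels:
    # Two-sided z-score for this confidence level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margins.append(z * math.sqrt(p_hat * (1 - p_hat) / n))

for level, margin in zip(confidence_levels, margins):
    print(f"{level:.0%} confidence: +/- {margin:.4f}")
```

With the data fixed, asking for more confidence widens the interval, which means less precision.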
54
How do you select the value of the population proportion pi when estimating B? (2 options) Footnote: what do I mean here by estimating B?
1. using best available data 2. conservative estimate
55
Example of interpretation for estimating sample size of proportion, where n = 600.25?
n = 600.25, so we should select a minimum of 601 houses for our sample
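The rounding step can be sketched as follows. The card does not say which inputs produced 600.25; π = 0.5, z = 1.96, and B = 0.04 are one assumed combination that reproduces it.

```python
import math

n_exact = 600.25                     # value given on the card
n_minimum = math.ceil(n_exact)       # always round up to a whole observation
n_rounded = round(n_exact)           # ordinary rounding would fall short

# One assumed set of inputs that reproduces the card's value.
n_from_formula = 0.5 * (1 - 0.5) * (1.96 / 0.04) ** 2

print(n_minimum, n_rounded, n_from_formula)
```

Rounding to the nearest whole number would give a sample slightly too small to guarantee the desired error, so the result is always rounded up.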
56