Lecture 6: Statistical Inference/Estimation Flashcards by Max O

Given n = 250 and ȳ = 27, what can we say according to the LLN?

With a sample size of 250, the sample mean of 27 is likely to be closer to the population mean than it would be with n = 200 or n = 20
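
The LLN claim can be checked with a small simulation. This is only a sketch: the population mean (27) and standard deviation (10) below are hypothetical values chosen for illustration, since the card gives only n and the sample mean.

```python
import random
import statistics

def mean_abs_error(n, trials=1000, mu=27.0, sigma=10.0, seed=0):
    """Average |sample mean - mu| over repeated samples of size n."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        # Draw one sample of size n from an assumed normal population.
        sample_mean = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        errors.append(abs(sample_mean - mu))
    return statistics.fmean(errors)

# mu and sigma are hypothetical; they are not given on the card.
errors_by_n = {n: mean_abs_error(n) for n in (20, 200, 250)}
for n, err in errors_by_n.items():
    print(f"n={n:>3}: typical distance from mu = {err:.3f}")
```

The typical distance shrinks as n grows, which is the sense in which a sample of 250 is "closer" than a sample of 20.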

Given n = 250 and ȳ = 27, what can we say according to the CLT?

The sampling distribution of the sample mean should be approximately normal in shape (functional form); its mode, median, and mean coincide in the middle and equal the population mean; its standard deviation is σ/√n

Why is CLT cool?

As n gets larger, the sampling distribution approaches normality, even if the pdf of X in the population is not normal
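
A rough way to see this is to draw sample means from a clearly skewed population. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (a choice made for illustration only) and compares the skewness of sample means at n = 2 and n = 100.

```python
import random
import statistics

def sample_means(n, trials=4000, seed=1):
    """Sample means of size-n samples from an Exponential(1) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]

def skewness(values):
    """Standardized third moment; 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

means_small = sample_means(2)
means_large = sample_means(100)

print("skewness of sample means, n=2:  ", round(skewness(means_small), 2))
print("skewness of sample means, n=100:", round(skewness(means_large), 2))
# The spread of the n=100 means should be near sigma / sqrt(n) = 1 / 10.
print("sd of sample means, n=100:      ", round(statistics.pstdev(means_large), 3))
```

The skewness drops toward 0 as n grows, and the spread of the sample means lands near σ/√n, even though the population itself is far from normal.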

Point estimate

a single number, calculated from a set of data, that is the best single guess for the population parameter

Point estimator

a sample statistic that predicts the value of a population parameter; can think of this as the equation used to produce the point estimate

Problems with a point estimate?

There may be a high probability that it is far from the true population parameter

What problem do you still have even with CLT and LLN?

Although we know what happens as sample size increases, it is still not clear how good a guess our sample statistic is relative to the population parameter

Interval estimate

consists of a range of numbers around the point estimate, within which the parameter is believed to fall

What is another word for the interval estimate?

the confidence interval

What does interval estimate allow us to do?

Gauge accuracy of a point estimate using probability

Why are we able to gauge the accuracy of a point estimate using probability?

We know the probability that an interval constructed this way contains the population parameter

What is confidence interval based on? (2)

A point estimator and the spread of the sampling distribution of that estimator

What assumption must be met for you to use confidence intervals?

that the sampling distribution is approximately normal

How do you construct a confidence interval?

1. Make sure the sampling distribution is approximately normal
2. Add to and subtract from the point estimate some multiple (a z-score) of its standard error

Why is knowing “If we know the parameters of a population, specifically the mean and standard deviation, then we can predict the chance that a given sample of size n will have a sample mean within a certain distance of the population” useful?

We can reverse the logic to find a confidence interval

What is the implication of a 95% confidence?

There is a 5% chance that the sample mean does not fall within 1.96 standard errors of the population mean, in which case the interval you build around that sample mean does not include the true population parameter

Is it true that there’s a 95% chance that the mean will fall within the interval?

No, the population mean either is in the interval or it is not. Rather, 95% of the intervals built around all possible sample means will contain the population mean
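
The "95% of intervals" reading can be illustrated by repeating the sampling many times and counting how often the interval captures the population mean. This is a sketch under assumed values: a normal population with µ = 100 and σ = 15, samples of n = 50, and σ treated as known.

```python
import math
import random
import statistics

def coverage(n=50, trials=2000, mu=100.0, sigma=15.0, z=1.96, seed=2):
    """Share of intervals (sample mean +/- z * standard error) that contain mu."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)  # standard error, sigma assumed known here
    hits = 0
    for _ in range(trials):
        ybar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if ybar - z * se <= mu <= ybar + z * se:
            hits += 1
    return hits / trials

share = coverage()
print(f"share of intervals containing mu: {share:.3f}")
```

Each individual interval either contains µ or misses it; the 95% describes the long-run share of intervals that succeed.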

How do we reverse the logic?

given a sample of size n and a sample standard deviation and mean, we predict the chance that the unknown population mean is within a certain distance of the sample mean

Once a sample mean is calculated, if the sample mean does fall within the interval from µ − 1.96σ_ȳ to µ + 1.96σ_ȳ, then…

the interval from ȳ − 1.96σ_ȳ to ȳ + 1.96σ_ȳ contains µ

What is the equation for confidence interval?

CI = ȳ ± z·σ̂_ȳ
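
A minimal sketch of this formula in code, assuming the estimated standard error is s/√n. The values n = 250 and ȳ = 27 come from the first card; s = 8 is a hypothetical sample standard deviation added only so the example runs.

```python
import math

def mean_confidence_interval(ybar, s, n, z=1.96):
    """Return (lower, upper) for ybar +/- z * (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)  # estimated standard error of the mean
    margin = z * standard_error
    return ybar - margin, ybar + margin

# s=8.0 is an assumed value, not from the deck.
lower, upper = mean_confidence_interval(ybar=27.0, s=8.0, n=250)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

The default z = 1.96 corresponds to a confidence coefficient of .95; other coefficients need a different z.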

For confidence interval equation - how do we get z?

You are usually not given z. Instead, start with the desired confidence level, then select the corresponding z-score.

Confidence coefficient

The probability that the interval estimate contains the parameter. Typical confidence coefficients are .99 and .95

What drives the z-score that you use for confidence interval calculations?

the confidence coefficient

If you decide on confidence coefficient of .95, what z-score would you use?

1.96 (0.025 of the distribution falls beyond this z-score in each tail, 0.05 in total)
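
The z-score for any confidence coefficient can be looked up from the standard normal distribution. A short sketch using Python's standard library; the function name `z_for_confidence` is made up for this example.

```python
from statistics import NormalDist

def z_for_confidence(confidence):
    """Two-sided z-score: half of (1 - confidence) sits in each tail."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(f"confidence {level:.2f} -> z = {z_for_confidence(level):.3f}")
```

For a coefficient of .95 this gives roughly 1.96, and for .99 roughly 2.576.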

When calculating a confidence interval for a mean, we do not have σ (the standard deviation of the population). What to do?

If the sample has 30 observations or more, you can substitute s (the standard deviation of the sample) for σ

When should you be skeptical if a paper estimates the standard error? (2)

1. If they do it with fewer than 30 observations 2. If they have 30 or more observations but there is a lot of skew

If you have 30 observations but a lot of skew, can you still estimate the standard error?

Not reliably; with a lot of skew you need 100 or more observations

What is the interpretation of an estimated standard error of 5?

This is the standard deviation of the sampling distribution, so a sample mean derived from a sample of 900 is, on average, about $5 away from the true population mean

Interpretation of 95% confidence interval and interval ($290.2, $309.8)?

We are 95% confident that the true population parameter (average monthly food expenditure) falls in that interval
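
The interval on this card can be unpacked back into its parts. The sketch below assumes this card and the standard-error card describe the same example (n = 900, estimated standard error of 5) and that z = 1.96 was used; the deck does not state either explicitly.

```python
import math

lower, upper = 290.2, 309.8  # interval from the card
n = 900                      # sample size mentioned on the standard-error card
z = 1.96                     # assumed z-score for 95% confidence

point_estimate = (lower + upper) / 2        # the interval is centered on the sample mean
margin = (upper - lower) / 2                # half-width = z * standard error
standard_error = margin / z
implied_s = standard_error * math.sqrt(n)   # because standard error = s / sqrt(n)

print(f"sample mean    = {point_estimate:.1f}")
print(f"standard error = {standard_error:.2f}")
print(f"implied s      = {implied_s:.1f}")
```

Under those assumptions the interval is centered on a sample mean of about $300, with a standard error of about 5 and an implied sample standard deviation of about 150.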

What happened if you didn't observe the true population parameter in the interval? (2)

1. Maybe used inappropriate sampling techniques (e.g., went to a daycare to get age) 2. Maybe got unlucky and randomly picked 100 people who are really different from the population

What three components drive the precision of the interval (contained in the formula)?

1. The number of observations (n) 2. The z-score/confidence coefficient 3. The sample standard deviation (s)
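
How each component moves the interval can be seen by computing its width, 2·z·s/√n, while changing one input at a time. This is a sketch with hypothetical baseline values (n = 100, s = 15, confidence .95).

```python
import math
from statistics import NormalDist

def interval_width(n, s, confidence):
    """Full width of the interval: 2 * z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * s / math.sqrt(n)

# Baseline values are hypothetical, chosen only for illustration.
baseline = interval_width(n=100, s=15.0, confidence=0.95)
more_data = interval_width(n=400, s=15.0, confidence=0.95)
more_confidence = interval_width(n=100, s=15.0, confidence=0.99)
more_spread = interval_width(n=100, s=30.0, confidence=0.95)

print(f"baseline width:        {baseline:.2f}")
print(f"n 100 -> 400:          {more_data:.2f}")
print(f"confidence .95 -> .99: {more_confidence:.2f}")
print(f"s 15 -> 30:            {more_spread:.2f}")
```

A larger n narrows the interval (more precision), while a higher confidence coefficient or a larger s widens it (less precision).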

What is the cost of having more confidence?

You get a wider interval and less precision

Can you control the s (one of the components that drives the precision of the interval)?

As the probability of the population parameter falling within the interval increases...

the precision of the interval decreases

Error probability

the probability that a confidence interval does not contain the population parameter

T/F: If the confidence interval is (49%, 54%), you can say the true value will be closer to 54 than 49 and that the candidate is likely to win, since the majority of values are 50% or greater

FALSE - you can never say where within a confidence interval the parameter falls

Why can't you say where within a confidence interval the parameter falls?

Saying that would mean you know where the sample mean falls on the x-axis (but the fact that you don't know where the sample mean is relative to the population mean is the whole problem!). Maybe if you had multiple samples you could triangulate, but not if you only have one

Interpretation of 52+/-3

This is a confidence interval (a point estimate of 52 with a margin of 3 on either side), but it does not specify the confidence LEVEL

What is the equation to calculate confidence interval for proportions/qualitative data?

point estimate ± z (standard error)

When working with proportions, what do we use as the point estimate of the population proportion?

The sample proportion

Why is the sample proportion good?

It is an unbiased and efficient point estimator of the population proportion

How is the proportion treated?

As a special mean

What does π represent

The population proportion

What does π /the population proportion mean?

It is the mean of the probability distribution having probability π for 1 and (1- π) for 0

Equation for the standard deviation of the probability distribution for π?

σ = √(π(1 − π))

When is the sampling distribution of the sample proportion approximately normal about the parameter pi?

When n is equal to or greater than 30

What is σ̂_π̂?

The estimated standard error of the distribution of all sample proportions
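
The two spreads used for proportions can be sketched side by side: the standard deviation of individual 1/0 outcomes, √(π(1 − π)), and the estimated standard error of the sample proportion, which this sketch takes to be √(π̂(1 − π̂)/n), the usual large-sample formula. The poll values (π̂ = 0.52, n = 1000) are hypothetical.

```python
import math

def outcome_sd(pi):
    """Standard deviation of a single 1/0 outcome with success probability pi."""
    return math.sqrt(pi * (1 - pi))

def proportion_standard_error(pi_hat, n):
    """Estimated standard error of the sample proportion."""
    return math.sqrt(pi_hat * (1 - pi_hat) / n)

def proportion_confidence_interval(pi_hat, n, z=1.96):
    """Point estimate +/- z * standard error, for a proportion."""
    margin = z * proportion_standard_error(pi_hat, n)
    return pi_hat - margin, pi_hat + margin

# Hypothetical poll: 52% support in a sample of 1000.
low, high = proportion_confidence_interval(0.52, 1000)
print(f"sd of individual outcomes: {outcome_sd(0.52):.3f}")
print(f"standard error:            {proportion_standard_error(0.52, 1000):.4f}")
print(f"95% CI:                    ({low:.3f}, {high:.3f})")
```

The individual outcomes vary a lot (a standard deviation near 0.5), but the sample proportion varies far less from sample to sample because the standard error divides by √n.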

What does the "estimated standard error of the distribution of all sample proportions" measure?

It measures how much the sample proportion π̂ varies from sample to sample, around the true proportion π

What does "the standard deviation of the probability distribution" measure?

It measures how much individual outcomes (successes/failures) vary around the mean proportion π

Do we feel better about estimating the standard error for quantitative data or for binary data?

Binary

Why do we feel better about estimating the standard error for binary data than for quantitative data? (a few reasons)

1. No extreme outliers: don't have to worry about a Jeff Bezos when values are only 1/0 2. Limited number of possibilities 3. Discrete 4. Within a bounded range

As sample size increases, standard error gets smaller, and the sample proportion...

tends to fall closer to the population proportion

As the sample size increases, the precision of the confidence interval

will increase

As standard error and standard deviation increase, the precision of the confidence interval...

will decrease

If you have a 99% confidence interval of (.52, .54), the interpretation is

We are 99% confident that the true population parameter falls in the interval (the proportion of individuals in the population who voted in the last election is somewhere between .52 and .54)

First step when calculating confidence interval?

Determining whether data is qualitative or quantitative

When you calculate the sample mean from a sample, what do you not know? (2)

1. Whether the sample mean is right or not 2. Whether the sample you took is representative of the population

What is the confidence interval for a mean really predicting?

the chance that the unknown population mean is within a certain distance of the sample mean

Dark implication of "Thus, with probability .95 a sample mean occurs such that the interval..."?

there is a probability of .05 that the sample mean does not fall within the interval

(60 cards)