10.2: Confidence Intervals and t-distribution Flashcards

Question 1

Q

What is a point estimate? How is it computed? Provide an example.

Answer

A

Point estimates are single sample values used to estimate population parameters.

Computation:
sample mean = sum of all sample values / sample size

The value generated is called the point estimate of the population mean.
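As a small illustration of the computation above, the sketch below takes a made-up sample of five returns (the numbers are hypothetical, not from the deck) and computes the sample mean as the point estimate of the population mean.

```python
from statistics import mean

# Hypothetical sample of five annual returns (illustrative values only).
sample = [0.04, 0.07, 0.05, 0.06, 0.03]

# Point estimate of the population mean: sum of observations / sample size.
point_estimate = sum(sample) / len(sample)

# statistics.mean computes the same quantity.
assert abs(point_estimate - mean(sample)) < 1e-12

print(f"Point estimate of the mean: {point_estimate:.4f}")
```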

Question 2

Q

What is Student’s t-distribution and when is it used? How does it compare to the normal distribution?

Answer

A

Student’s t-distribution is a bell-shaped probability distribution that is symmetrical about its mean. It is used when constructing confidence intervals based on small samples (n < 30) from normally distributed populations with unknown variance.

Compared to the normal distribution, the t-distribution is flatter, with fatter tails.

Question 3

Q

What are the properties of Student’s t-distribution?

Answer

A

Symmetrical.
Defined by a single parameter, the degrees of freedom, which equals n - 1.
Has fatter tails than the normal distribution.
As the degrees of freedom (and hence the sample size) get larger, the shape of the t-distribution approaches that of a normal distribution.

Question 4

Q

What happens to the t-distribution when the degrees of freedom increase? What happens when the degrees of freedom increase without bound?

Answer

A

When the degrees of freedom increase, the centre of the distribution becomes more peaked and its tails become thinner.

When the degrees of freedom increase without bound, the t-distribution converges to the standard normal distribution (z-distribution).
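The behaviour described above can be checked numerically. The sketch below evaluates the standard density formula for Student's t-distribution at the centre (x = 0) and in the tail (x = 3) for a few degrees of freedom, and compares them with the standard normal density. The helper name `t_pdf` and the chosen evaluation points are illustrative assumptions, not part of the deck.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    # Work in logs (lgamma) so large df values do not overflow the gamma function.
    log_coef = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


z = NormalDist()  # standard normal: mean 0, standard deviation 1

for df in (2, 5, 30, 1000):
    print(f"df={df:>4}: peak={t_pdf(0, df):.4f}  density at 3={t_pdf(3, df):.5f}")
print(f"normal : peak={z.pdf(0):.4f}  density at 3={z.pdf(3):.5f}")
```

As the degrees of freedom grow, the printed peak rises toward the normal peak and the tail density falls toward the normal tail density.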

Question 5

Q

What are degrees of freedom?

Answer

A

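Degrees of freedom are the number of observations that are free to vary when a statistic is estimated. For a sample of size n, degrees of freedom = n - 1, because one degree of freedom is used up in estimating the sample mean.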

Question 6

Q

What are fat tails an indication of?

Answer

A

Fat tails mean that extreme observations (outliers far from the centre of the distribution) are more likely than they would be under a normal distribution.

Question 7

Q

How are confidence intervals for a random variable that follows a t-distribution related to degrees of freedom?

Answer

A

For a given significance level, confidence intervals for a random variable that follows a t-distribution must be wider when there are fewer degrees of freedom (fatter tails) and narrower when there are more degrees of freedom (thinner tails).
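To make the relationship concrete, the sketch below uses a few two-tailed 95% critical values taken from a standard t-table and multiplies each by the same standard error. The standard error of 2.0 is a hypothetical value, held fixed so that only the degrees of freedom change.

```python
# Two-tailed 95% critical values from a standard t-table (2.5% in each tail).
T_CRIT_95 = {5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}
Z_CRIT_95 = 1.960  # limiting value as the degrees of freedom grow without bound

standard_error = 2.0  # hypothetical, held fixed to isolate the effect of df

# Half-width of the interval = reliability factor x standard error.
half_widths = {df: t * standard_error for df, t in T_CRIT_95.items()}

for df, hw in half_widths.items():
    print(f"df={df:>3}: interval = point estimate +/- {hw:.3f}")
print(f"z     : interval = point estimate +/- {Z_CRIT_95 * standard_error:.3f}")
```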

Question 8

Q

What is a confidence interval?

Answer

A

A confidence interval is a range of values within which the actual value of a parameter is expected to lie with probability 1 - alpha. The probability 1 - alpha is referred to as the degree of confidence.

Question 9

Q

What is alpha?

Answer

A

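Alpha is the level of significance for the confidence interval: the probability that the interval does not contain the true parameter value. For example, a 95% confidence interval has alpha = 5%.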

Question 10

Q

How are confidence intervals constructed?

Answer

A

CIs are constructed by adding an appropriate value to, and subtracting it from, the point estimate:

Point estimate ± (reliability factor × standard error)

Question 11

Q

How is the confidence interval for the population mean calculated, given that the population has a normal distribution with a known variance?

Answer

A

With a known variance and a normal distribution, the CI is calculated as:

Point estimate of the population mean ± z-reliability factor × (population standard deviation / square root of the sample size)
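A minimal sketch of this calculation follows, using the standard library's `NormalDist` to obtain the z-reliability factor. The function name `z_confidence_interval` and the input numbers (sample mean 80, population standard deviation 15, n = 36) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """CI for the population mean when the population variance is known."""
    alpha = 1 - confidence
    # Two-tailed reliability factor: alpha / 2 in each tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the sample mean = sigma / sqrt(n).
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs: sample mean 80, known sigma 15, sample size 36.
low, high = z_confidence_interval(sample_mean=80.0, sigma=15.0, n=36)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```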

Question 12

Q

What is the reliability factor for 90% CI?
What is the reliability factor for 95% CI?
What is the reliability factor for 99% CI?

Answer

A

Reliability factor for 90% CI = 1.645 (significance level is 10%, 5% in each tail)
Reliability factor for 95% CI = 1.960 (significance level is 5%, 2.5% in each tail)
Reliability factor for 99% CI = 2.575 (significance level is 1%, 0.5% in each tail)
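These reliability factors can be reproduced from the inverse of the standard normal cumulative distribution function. The sketch below uses Python's `statistics.NormalDist`; the helper name `z_reliability_factor` is an illustrative assumption. Note that the exact 99% value is closer to 2.576, so the 2.575 on the card appears to be a common table rounding.

```python
from statistics import NormalDist


def z_reliability_factor(confidence: float) -> float:
    """Two-tailed z-value that leaves alpha / 2 in each tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CI: z = {z_reliability_factor(level):.3f}")
```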

Question 13

Q

How is the confidence interval for the population mean calculated, given that the population has a normal distribution with an unknown variance?

Answer

A

With an unknown variance and a normal distribution, the CI is calculated as:

Point estimate of the population mean ± t-reliability factor (with n - 1 degrees of freedom) × (sample standard deviation / square root of the sample size)
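The sketch below works through this formula for a small sample, using n - 1 degrees of freedom. Because Python's standard library has no t-distribution, the 95% critical values are hard-coded from a standard t-table for a handful of degrees of freedom. The function name, the table subset, and the sample data are all hypothetical.

```python
import math
from statistics import mean, stdev

# Two-tailed 95% critical values from a standard t-table, keyed by df = n - 1.
T_CRIT_95 = {4: 2.776, 9: 2.262, 24: 2.064, 29: 2.045}


def t_confidence_interval_95(sample):
    """95% CI for the population mean when the population variance is unknown."""
    n = len(sample)
    df = n - 1
    t = T_CRIT_95[df]  # raises KeyError if df is not in this small table
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    margin = t * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample of five observations (df = 4).
sample = [12.0, 15.0, 11.0, 14.0, 13.0]
low, high = t_confidence_interval_95(sample)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```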

Question 14

Q

How is the confidence interval created for a non-normal distribution?

Answer

A

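If the sample size is less than 30 (n < 30), a reliable confidence interval cannot be constructed, because neither the z-statistic nor the t-statistic is appropriate.

If the sample size is 30 or more (n ≥ 30), the central limit theorem makes the sampling distribution of the mean approximately normal:

if the variance is known, use the z-statistic
if the variance is unknown, use the t-statistic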
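The selection rules from this card and the two normal-population cards before it can be gathered into one small decision function. This is only a sketch: the function name and return strings are hypothetical, and treating n = 30 itself as a large sample is an assumption about where the cut-off falls.

```python
def choose_reliability_factor(population_normal: bool,
                              variance_known: bool,
                              n: int) -> str:
    """Pick the statistic for a confidence interval for the population mean."""
    large_sample = n >= 30  # assumed cut-off for a "large" sample

    # Small sample from a non-normal population: no reliable interval.
    if not population_normal and not large_sample:
        return "not available"

    # Otherwise the choice depends only on whether the variance is known.
    return "z-statistic" if variance_known else "t-statistic"


print(choose_reliability_factor(population_normal=False, variance_known=True, n=20))
print(choose_reliability_factor(population_normal=False, variance_known=False, n=50))
print(choose_reliability_factor(population_normal=True, variance_known=False, n=12))
```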

Question 15

Q

What are the two limitations to using a larger sample size?

Answer

A

Larger sample sizes may contain observations from a different population, which can reduce the precision of population parameter estimates.
Cost of using a larger sample should be weighed against the value of the increase in precision from the increase in sample size.

Question 16

Q

What is data mining? What are the warning signs of data mining?

Answer

A

Data mining occurs when analysts repeatedly use the same database to search for patterns or trading rules until one that works is discovered.

Warning signs:

Evidence that many different variables were tested, most of which are unreported, until significant ones were found.
The lack of any economic theory that is consistent with the empirical results.

Question 17

Q

What is data-mining bias?

Answer

A

Data mining bias refers to results where the statistical significance of the pattern is overestimated because the results were found through data mining.

Question 18

Q

What is the best way to avoid data mining?

Answer

A

The best way to avoid data mining is to test a potentially profitable trading rule on a data set different from the one used to develop the rule.

Question 19

Q

What is sample selection bias?

Answer

A

Sample selection bias occurs when some data is systematically excluded from the analysis, because of the lack of availability. This results in a non-random observed sample and any conclusions drawn from this sample cannot be applied to the population.

Question 20

Q

What is survivorship bias? What is an example of survivorship bias? What is the solution?

Answer

A

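Survivorship bias results from excluding from the sample entities that no longer exist (for example, because they failed or were closed), so that measured performance is overestimated.

Example: a mutual fund database that includes only funds currently in existence and leaves out funds that were closed or merged.

Solution: use a sample of funds that all started at the same time, and do not drop funds that have since been removed.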
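The overestimation can be seen with a toy data set. In the sketch below, the five funds and their returns are entirely made up. Averaging only the surviving funds gives a higher figure than averaging every fund that started the period.

```python
from statistics import mean

# Hypothetical funds that all started at the same time (illustrative values).
funds = [
    {"name": "Fund A", "annual_return": 0.09, "survived": True},
    {"name": "Fund B", "annual_return": 0.07, "survived": True},
    {"name": "Fund C", "annual_return": 0.08, "survived": True},
    {"name": "Fund D", "annual_return": -0.04, "survived": False},  # closed
    {"name": "Fund E", "annual_return": -0.02, "survived": False},  # merged away
]

# Biased estimate: only funds that still exist.
survivors_only = mean(f["annual_return"] for f in funds if f["survived"])

# Unbiased estimate: every fund that started the period.
full_sample = mean(f["annual_return"] for f in funds)

print(f"Survivors only: {survivors_only:.3f}")
print(f"Full sample:    {full_sample:.3f}")
```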

Question 21

Q

What is look-ahead bias?

Answer

A

Look-ahead bias occurs when a study tests a relationship using sample data that was not available on the test date.

Question 22

Q

What is time-period bias?

Answer

A

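Time-period bias can result if the time period over which the data is gathered is either too short or too long. Too short a period may capture results that are specific to that period, while too long a period may span changes in the underlying economic relationships.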

(22 cards)