Ch.6 Flashcards

Question 1

Q

probability mass

Answer

A

In the probability distribution of a discrete random variable, each value of the random variable holds probability mass, such that the sum of all probabilities is equal to 1.

Question 2

Q

continuous random variable

Answer

A

A continuous random variable, unlike the discrete case, is defined over an uncountable set of values.

For these distributions it doesn’t make much sense to talk about P(X = x). After all, nobody is exactly 180 cm tall! Instead, when working with continuous distributions we only find probabilities for ranges of values, not individual values.

For continuous random variables we must use a formula, and we call that the density function. The density function defines a curve such that the total area under the curve is equal to 1.

Question 3

Q

discrete random variables

Answer

A

For discrete random variables we could list the values and probabilities in a table or use a formula to define them. The binomial distribution uses a formula to define the probabilities.
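
As a small sketch of "a formula that defines the probabilities", the snippet below builds the probability table of a binomial random variable and checks that the probability mass sums to 1. The values n = 4 and p = 0.5 are illustrative assumptions, not taken from the flashcards.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative parameters (assumed for this example only).
n, p = 4, 0.5

# The same distribution written out as a table: value -> probability.
table = {k: binomial_pmf(k, n, p) for k in range(n + 1)}
total = sum(table.values())

for k, prob in table.items():
    print(k, prob)
print("total probability mass:", total)
```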

Question 4

Q

density curve

Answer

A

A density curve graphically displays a continuous probability distribution in the exact same way that a probability histogram graphically displays a discrete probability distribution. That is,

• The density curve of a random variable X is drawn above the horizontal axis.
• All possible values of X are listed on the horizontal axis, while the corresponding densities are listed on the vertical axis.
• The probability that X is contained in any interval is equal to the area contained within this interval under the density curve.
• The total area under a density curve is equal to 1
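
A rough numerical check of the "probability = area" idea, using the standard normal density as the example curve. The helper name `area_under_density` and the trapezoid rule are choices made for this sketch; `statistics.NormalDist` is from the Python standard library.

```python
from statistics import NormalDist


def area_under_density(pdf, a, b, steps=10_000):
    """Approximate the area under pdf between a and b with the trapezoid rule."""
    width = (b - a) / steps
    total = 0.5 * (pdf(a) + pdf(b))
    for i in range(1, steps):
        total += pdf(a + i * width)
    return total * width


Z = NormalDist(mu=0, sigma=1)

# Area over the interval [-1, 1] should match P(-1 < Z < 1).
approx = area_under_density(Z.pdf, -1, 1)
exact = Z.cdf(1) - Z.cdf(-1)

# Area over a very wide interval should be close to the total area, 1.
whole = area_under_density(Z.pdf, -10, 10)

print(approx, exact, whole)
```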

Question 5

Q

Regardless of whether we are working with a discrete or continuous random variable (_______)

Answer

A

the probability of an event is simply the total area, in the probability distribution, corresponding to the event.

Question 6

Q

normal distribution

Answer

A

• continuous distribution

• It is visually represented by the normal density curve, which is the familiar “bell curve”.

• The normal distribution is characterized by two parameters, the mean μ and the standard deviation σ, together with the density function.

Notation: X~N(μ,σ) is used to indicate that X follows a normal distribution with mean μ and standard deviation σ

• Normal probabilities are approximated using a Z-table.

Question 7

Q

Basic properties of the normal distribution:

Answer

A

The total area under the curve is 1.0.
The curve is bell-shaped and symmetric about the mean
The two tails of the curve extend indefinitely.

• there is not just one normal curve, but a family of normal curves.
— Each different set of μ and σ gives a different curve.

• μ determines the center of the distribution and σ gives the spread of the curve.
— as we increase μ, the distribution shifts to the right, and as we decrease μ, the distribution shifts to the left.
— as we increase σ, the distribution becomes more disperse (wider and flatter), and as we decrease σ, the distribution becomes less disperse (narrower and taller).
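
A short sketch of how μ and σ change the curve, comparing the height of the density at its center for three members of the normal family. The specific parameter values are assumptions chosen for illustration.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)   # reference curve
wide = NormalDist(mu=0, sigma=2)     # same center, larger spread
shifted = NormalDist(mu=3, sigma=1)  # same spread, center moved right

# The peak of each curve sits at its mean.
peak_narrow = narrow.pdf(0)
peak_wide = wide.pdf(0)
peak_shifted = shifted.pdf(3)

# Doubling sigma halves the peak height (wider and flatter);
# changing mu moves the peak without changing its height.
print(peak_narrow, peak_wide, peak_shifted)
```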

Question 8

Q

standard normal distribution

Answer

A

• has mean of 0 and standard deviation of 1.

• is particularly relevant since any normal random variable can be converted to a standard normal random variable through standardization.

• The standardized version of a normal random variable is itself a normal random variable, which is said to follow the standard normal distribution.

Notation: if X follows a normal distribution with mean μ and standard deviation σ, then the standardized version of X is Z = (X − μ)/σ, and Z ~ N(0, 1).

• If X has mean μ, then, on average, X is equal to μ.
⇒ if Y = X − μ, then Y is equal to 0, on average.

• If X has standard deviation σ, then it is typical for X to be within σ units of μ.
⇒ X − μ is typically within σ units of 0.

• If σ is the typical distance between X − μ and 0,
⇒ then 1 is the typical distance between (X − μ)/σ and 0.
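
The same reasoning can be seen on a small data set: subtracting the mean centers the values at 0, and dividing by the standard deviation makes the typical distance from 0 equal to 1. The heights below are made-up numbers, and the population formulas (`fmean`, `pstdev`) are used as an assumption for the sketch.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Return (x - mean) / sd for each value, using the population sd."""
    mu = fmean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]


heights = [160, 165, 170, 175, 180]  # hypothetical data, in cm
z_scores = standardize(heights)

print(z_scores)
print("mean of z:", fmean(z_scores))  # approximately 0
print("sd of z:", pstdev(z_scores))   # approximately 1
```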

Question 9

Q

area preserving

Answer

A

The process of standardizing a normal random variable is of particular relevance to us, because the process is area preserving.

• Standardization is a form of shifting and rescaling data.

• Data are shifted when a constant is added to or subtracted from each value.
— Measures of position (mean, median, percentiles, etc) are all increased/decreased by the value of said constant.
— Measures of spread (range, IQR, standard deviation) remain unchanged.

• Data are rescaled when each value is multiplied or divided by a constant value.
— Measures of position are also multiplied/divided by that constant.
— Measures of spread are also multiplied/divided by that constant.

• Note: The standard deviation is multiplied/divided by that constant, which means that the variance is multiplied/divided by that constant squared
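
A quick check of the shifting and rescaling rules on a made-up data set. The constants 10 and 3 are arbitrary choices for the example.

```python
from statistics import mean, median, pstdev, pvariance

data = [2, 4, 4, 6, 9]             # hypothetical observations
shifted = [x + 10 for x in data]   # add a constant to each value
rescaled = [x * 3 for x in data]   # multiply each value by a constant

# Shifting: position moves by the constant, spread is unchanged.
print(mean(data), mean(shifted))
print(pstdev(data), pstdev(shifted))

# Rescaling: position and standard deviation are multiplied by 3,
# while the variance is multiplied by 3 squared.
print(median(data), median(rescaled))
print(pstdev(data), pstdev(rescaled))
print(pvariance(data), pvariance(rescaled))
```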

Question 10

Q

z-tables

Answer

A

• contain many values and their corresponding probabilities from the standard normal distribution.

• The probability that a standard normal random variable is less than a certain value, P(Z < a), is obtained by

— finding the first two digits of that value in the column at the edge,
— then finding the third digit in the row at the top of the table.
— the corresponding probability is found in the cell where the row and column intersect

Notation: z_α denotes the z-score with area α to its right under the standard normal curve. Because the z-table typically gives left-tail probabilities, the table entry for z_α is P(Z < z_α) = 1 − α.

That is, z_α is the value in the Z-distribution (normal with mean 0 and standard deviation 1) such that P(Z > z_α) = α.
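
In software, the table lookup can be imitated with the standard normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library; the function names `table_lookup` and `z_alpha` are invented for this example, and the results are not rounded the way a printed table is.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def table_lookup(a):
    """Left-tail probability P(Z < a), the quantity a z-table reports."""
    return Z.cdf(a)


def z_alpha(alpha):
    """z-score with area alpha to its right, so P(Z < z_alpha) = 1 - alpha."""
    return Z.inv_cdf(1 - alpha)


print(table_lookup(1.96))  # about 0.975
print(z_alpha(0.05))       # about 1.645
```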

Question 11

Q

To find the probability that a standard normal random variable is larger than a value, or contained in an interval, we need to rearrange and re-express the probability so the table can be used.

Answer

A

Tips & tricks:
• Diagrams are a very helpful first step

• Rule 1: Complement rule: P(Z > z) = 1 – P(Z < z)

• Rule 2: Symmetry rule: P(Z > z) = P(Z < –z)
- property applies to all distributions that are symmetric about 0

• Rule 3: P(a < Z < b) = P(Z < b) – P(Z < a)
- a combination of the complement rule, the symmetry rule, and the special addition rule

• Rule 4: Since Z follows a continuous distribution, it follows that P(Z = z) = 0 for any value of z. For this reason, we no longer need to distinguish between “≤” and “<”, or between “≥” and “>”.
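
The first three rules can be checked numerically. This sketch uses the standard normal CDF from Python's `statistics.NormalDist` in place of the printed table; the values of z, a, and b are arbitrary.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

z = 1.25
a, b = -0.5, 1.25

# Rule 1 (complement) and Rule 2 (symmetry) give the same right-tail area.
right_tail_complement = 1 - Z.cdf(z)
right_tail_symmetry = Z.cdf(-z)

# Rule 3: area between a and b as a difference of two left-tail areas.
between = Z.cdf(b) - Z.cdf(a)

print(right_tail_complement, right_tail_symmetry)
print(between)
```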

Question 12

Q

How to determine a probability for a normally distributed random variable

Answer

A

i. Sketch the normal curve for the random variable

ii. Shade the region of interest and mark its delimiting x-value(s)

iii. Find the corresponding z-score(s) through standardization, i.e. z = (x − μ)/σ

iv. Use the z-tables to find the area under the standard normal curve delimited by the z-score(s)
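
A worked sketch of steps iii and iv, assuming a hypothetical variable X ~ N(170, 10) and the question P(X < 180). The numbers are invented for illustration, and the standard normal CDF stands in for the z-table.

```python
from statistics import NormalDist

mu, sigma = 170, 10  # assumed mean and standard deviation
x = 180              # delimiting x-value for the shaded region

# Step iii: standardize the x-value.
z = (x - mu) / sigma

# Step iv: area to the left of z under the standard normal curve.
p = NormalDist().cdf(z)

# Cross-check: the same area computed directly on the original scale.
direct = NormalDist(mu, sigma).cdf(x)

print(z, p, direct)
```

Both routes give the same area, which is the "area preserving" property of standardization.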

Question 13

Q

How to determine the value corresponding to a given probability for a normally distributed random variable

Answer

A

i. Sketch the normal curve for the random variable

ii. Shade the region of interest

iii. Use the z-tables to find the z-score(s) delimiting this region

iv. Find the x-value(s) using x=μ + zσ (the opposite of standardizing)
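
A worked sketch of steps iii and iv in the reverse direction, again assuming a hypothetical X ~ N(170, 10) and asking for the value with 90% of the area to its left. The inverse CDF plays the role of reading the z-table backwards.

```python
from statistics import NormalDist

mu, sigma = 170, 10  # assumed mean and standard deviation
p = 0.90             # given area to the left of the unknown value

# Step iii: z-score with area p to its left.
z = NormalDist().inv_cdf(p)

# Step iv: undo the standardization.
x = mu + z * sigma

print(z, x)  # z is about 1.28, x is about 182.8
```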

Question 14

Q

Probability for a normally distributed random variable
and
Value corresponding to a given probability for a normally distributed random variable

Answer

A

• are essentially inverses of one another.

• in the first case, find an unknown probability corresponding to an area bounded by some given value(s).

• in the second case, compute unknown value(s) corresponding to a particular region with some given probability.

• i.e. find p for given x or find x for given p.

Question 15

Q

Assessing Normality

Answer

A

• It is sometimes required to assume that a population is normally distributed. This assumption can be checked in several ways, all of which look at the shape of the sample.

• If the sample looks somewhat normal in shape, it’s reasonable to assume the population is approximately normal.

• If the sample looks skewed, it is not reasonable to assume the population is approximately normal.

• One way to check normality is to create a histogram from the data: if the histogram looks reasonably bell shaped, then the normality assumption seems reasonable.
— This method is not ideal, since it requires binning the data, and the binning may hide normal features even if the data are somewhat normal.

• A preferred and more effective means of assessing normality is the normal probability plot.

Question 16

Q

normal probability plot

Answer

A

• plots the observed values on the horizontal axis and the corresponding normal values on the vertical axis.

• There are two variants of the normal probability plot:
- Q-Q plot
- P-P plot

Question 17

Q

Q-Q plot

Answer

A

A Q-Q plot plots the expected normal quantiles against the observed quantiles (quantile vs quantile).

• if the data are approximately normal, then the observed values will fall close to the expected normal scores, so that the points in the plot appear to form a diagonal line.

• if the data are non-normal, the plotted points typically make a curved shape, or possibly an S-shape

Question 18

Q

P-P plot

Answer

A

• A P-P plot plots normal cumulative probabilities against the observed cumulative probabilities (more specifically cumulative relative frequencies).

• if the data are approximately normal, then the observed cumulative probabilities will fall close to the normal cumulative probabilities, so that the points in the plot appear to form a diagonal line.

• if the data are non-normal, the plotted points typically make a curved shape, or possibly an S-shape
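
One possible way to compute P-P plot coordinates, sketched under assumptions the flashcards do not specify: the observed cumulative relative frequency of the i-th ordered value is taken as (i − 0.5)/n, and the normal cumulative probability comes from a normal distribution fitted with the sample mean and sample standard deviation. The sample is made up.

```python
from statistics import NormalDist, fmean, stdev

sample = [4.1, 5.6, 3.8, 6.9, 5.0, 4.7]  # hypothetical data
ordered = sorted(sample)
n = len(ordered)

# Normal distribution with the sample's mean and standard deviation.
fitted = NormalDist(fmean(ordered), stdev(ordered))

# Observed cumulative relative frequencies (assumed convention: (i - 0.5) / n).
observed_cum = [(i - 0.5) / n for i in range(1, n + 1)]

# Cumulative probabilities the fitted normal assigns to each ordered value.
normal_cum = [fitted.cdf(x) for x in ordered]

pp_points = list(zip(observed_cum, normal_cum))
for point in pp_points:
    print(point)
```

Points that stay near the diagonal from (0, 0) to (1, 1) are consistent with approximate normality.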

Question 19

Q

Construction of the probability-probability plot can be outlined as follows

Answer

A

i. Determine n, the number of observations in the sample.

ii. Order the observations from smallest to largest.

iii. Obtain the expected z-scores from Table III in Textbook Formula/TableCard on meskanas. (This is not on the Stat 151 formula sheet, but if you need it on an exam it will be provided.)

iv. Plot the observed value (on x-axis) vs expected score (on y-axis).
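
The steps above can be sketched in code. Since Table III is not reproduced here, the expected z-scores are approximated with the plotting positions (i − 0.375)/(n + 0.25), which is an assumption of this sketch and may differ slightly from the table values. The sample is made up.

```python
from statistics import NormalDist


def normal_scores(n):
    """Approximate expected z-scores for a sample of size n.

    Uses the plotting positions (i - 0.375) / (n + 0.25); this is an
    assumed stand-in for a printed normal-scores table.
    """
    Z = NormalDist()
    return [Z.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


sample = [4.1, 5.6, 3.8, 6.9, 5.0, 4.7]  # hypothetical data

n = len(sample)             # determine n
ordered = sorted(sample)    # order the observations
scores = normal_scores(n)   # expected z-scores

# Pairs to plot: observed value on the x-axis, expected score on the y-axis.
points = list(zip(ordered, scores))
for observed, expected in points:
    print(observed, round(expected, 2))
```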

Question 20

Q

If a P-P or Q-Q plot is concave upward, the data are _____ skewed; if it is concave downward, the data are _____ skewed. If the points fall along a diagonal line, it’s reasonable to assume the data come from a population which is ______ _____.

Answer

A

If a P-P or Q-Q plot is concave upward, the data are right skewed; if it is concave downward, the data are left skewed. If the points fall along a diagonal line, it’s reasonable to assume the data come from a population which is approximately normal.

(20 cards)