discrete random variable
A discrete random variable is a random variable whose possible values can be listed. That is, the number of distinct values is either finite or “countably infinite”.
There are two required properties for discrete random variables, where P(x) denotes P(X = x):
1) 0 ≤ P(x) ≤ 1 for all x
2) ∑ P(x) = 1, where the sum is taken over all possible values x
Any list of values and probabilities which satisfies these properties defines a discrete random variable. The values and probabilities could be represented in a table, a list, or even a formula.
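As a rough illustration, the two properties can be checked mechanically. The sketch below assumes the distribution is stored as a Python dict mapping each value to its probability; the function name is_valid_pmf and the coin-toss example are made up for this illustration.

```python
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Return True if pmf (a dict mapping value -> probability) satisfies both properties."""
    probs = list(pmf.values())
    # Property 1: every probability lies between 0 and 1 (inclusive).
    each_in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Property 2: the probabilities sum to 1 (up to floating-point rounding).
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one

# Hypothetical example: number of heads in two fair coin tosses.
two_tosses = {0: 0.25, 1: 0.50, 2: 0.25}
print(is_valid_pmf(two_tosses))        # valid distribution
print(is_valid_pmf({0: 0.5, 1: 0.6}))  # probabilities sum to 1.1, so not valid
```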
inclusive and exclusive inequalities
Inclusive inequalities include their boundary point(s), while exclusive do not.
Inclusive language: at least, at most, less than or equal to, greater than or equal to, no less than, no more than, etc.
Exclusive language: more than, greater than, less than, up to but not including, etc.
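The difference matters when computing probabilities for a discrete random variable, because the boundary value carries probability of its own. A small sketch, assuming a made-up distribution (the number of heads in three fair coin tosses) and a hypothetical helper named prob:

```python
# Hypothetical distribution: number of heads in three fair coin tosses.
pmf = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

def prob(pmf, condition):
    """Add up P(x) over every value x that satisfies the condition."""
    return sum(p for x, p in pmf.items() if condition(x))

at_most_2 = prob(pmf, lambda x: x <= 2)    # inclusive: "at most 2" includes x = 2
less_than_2 = prob(pmf, lambda x: x < 2)   # exclusive: "less than 2" excludes x = 2
at_least_1 = prob(pmf, lambda x: x >= 1)   # inclusive: "at least 1" includes x = 1
more_than_1 = prob(pmf, lambda x: x > 1)   # exclusive: "more than 1" excludes x = 1

print(at_most_2, less_than_2, at_least_1, more_than_1)
```

Here "at most 2" and "less than 2" differ by exactly P(X = 2).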
mean of a discrete random variable
The mean can only be calculated when we know the probability distribution of the random variable.
The mean of a random variable is sometimes referred to as the expected value or expectation of X, denoted by µ = E(X).
i. µ is the expected value of X, but that doesn’t mean X can actually attain the value µ.
ii. If a random variable describes a random observation from a population, then the mean of that random variable is equal to the population mean.
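Using the standard definition µ = ∑ x P(x), the mean is a probability-weighted sum of the possible values. The sketch below assumes a fair six-sided die as the example (not taken from these notes) and uses exact fractions; it also illustrates point i, since the mean turns out not to be a value the die can show.

```python
from fractions import Fraction

def mean(pmf):
    """Expected value: sum of x * P(x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())

# Hypothetical example: one roll of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

mu = mean(die)
print(mu)          # 7/2
print(mu in die)   # False: 3.5 is not a face of the die
```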
variance of a discrete random variable
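The variance of a discrete random variable measures how far its values spread around the mean: σ² = Var(X) = ∑ (x − µ)² P(x).
The standard deviation of a discrete random variable is simply the square root of its variance: σ = √Var(X).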
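A minimal sketch of both quantities, assuming the standard definition Var(X) = ∑ (x − µ)² P(x) and, as a made-up example, a fair six-sided die. The function names are chosen for this illustration only.

```python
import math
from fractions import Fraction

def mean(pmf):
    """Expected value: sum of x * P(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Probability-weighted average of squared deviations from the mean."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def std_dev(pmf):
    """Standard deviation: square root of the variance."""
    return math.sqrt(variance(pmf))

# Hypothetical example: one roll of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(variance(die))  # 35/12
print(std_dev(die))   # roughly 1.708
```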
The Binomial Distribution
For the binomial distribution we will find formulas for probabilities. This allows us to quickly find all of the probabilities we need, without needing to write out a complete table.
Bernoulli Trials
Repetition is an important part of experimentation that helps to ensure that observations are typical, and not just due to chance. When repeating an experiment, each repetition is referred to as a trial.
Bernoulli trials are a sequence of trials that satisfy the following conditions:
i. Each trial has 2 possible outcomes, success or failure, denoted S and F.
ii. Each trial is independent of all other trials.
iii. The probability of success is the same for each trial. We denote the probability of success as p, where
p = P(S) = P(successful trial).
By the complement rule, it follows that
1 − p = P(F) = P(failure).
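To make the three conditions concrete, here is a small simulation sketch. It assumes Python's standard random module as the source of randomness; the function name bernoulli_trials, the seed, and the values n = 10 and p = 0.3 are arbitrary choices for illustration.

```python
import random

def bernoulli_trials(n, p, rng):
    """Simulate n Bernoulli trials with success probability p.

    Each trial has two outcomes ('S' or 'F'), uses a fresh random draw
    (so trials are independent), and has the same success probability p.
    """
    return ['S' if rng.random() < p else 'F' for _ in range(n)]

rng = random.Random(0)                    # fixed seed so the run is repeatable
trials = bernoulli_trials(10, 0.3, rng)   # hypothetical: 10 trials, p = 0.3
print(trials)
print(trials.count('S'), "successes out of", len(trials))
```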
Binomial distribution
The Binomial distribution is the probability distribution of the random variable that gives the number of successes in a sequence of n Bernoulli trials. More formally, let X denote the number of successes in a sequence of n Bernoulli trials, each with success probability p. Then X is called a Binomial random variable, and is said to follow the Binomial distribution with parameters n and p.
Notation: X ~ Bin(n,p) means X follows the Binomial distribution with parameters n and p.
The binomial probability formula is:
$P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}, \quad k = 0, 1, \dots, n$
where $\binom{n}{k}$ = nCk = n!/(k!(n−k)!) is the number of ways to choose k successes among n trials, $p^k$ is the probability of k consecutive successes, and $(1 - p)^{n-k}$ is the probability of n−k consecutive failures.
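The formula translates almost directly into code. The sketch below assumes Python 3.8 or later (for math.comb); the function name binomial_pmf and the example X ~ Bin(5, 0.5) are chosen for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): (n over k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical example: X ~ Bin(5, 0.5), e.g. heads in five fair coin tosses.
print(binomial_pmf(2, 5, 0.5))  # 0.3125

# The probabilities for k = 0, 1, ..., n should sum to 1.
total = sum(binomial_pmf(k, 5, 0.5) for k in range(6))
print(total)
```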
Breakdown of the binomial formula
P(k successes followed by n − k fails)
= P(1st = success ∩ 2nd = success ∩ ⋯ ∩ kth = success ∩ (k + 1)th = fail ∩ (k + 2)th = fail ∩ ⋯ ∩ nth = fail)
= P(1st = success) P(2nd = success) ⋯ P(kth = success) P((k + 1)th = fail) ⋯ P(nth = fail), since the trials are independent
= p ⋅ p ⋅ ⋯ ⋅ p ⋅ (1 − p) ⋅ ⋯ ⋅ (1 − p)
= $p^k (1 - p)^{n-k}$
But this is the probability of one particular ordered sequence: k successes followed by n−k fails. Every other ordering of k successes and n−k fails has the same probability, and there are $\binom{n}{k}$ such orderings. The probability of k successes in any order is, therefore,
$P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}$
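The counting argument can be checked by brute force for small n: list every possible sequence of successes and fails, keep those with exactly k successes, and add up their probabilities. This is only a sanity-check sketch; the function names and the values n = 4, k = 2, p = 0.3 are made up for the example.

```python
import itertools
import math

def prob_by_enumeration(n, k, p):
    """Sum the probability of every ordered S/F sequence with exactly k successes."""
    total = 0.0
    for seq in itertools.product('SF', repeat=n):   # all 2^n ordered sequences
        if seq.count('S') == k:
            # Each such sequence has probability p^k * (1 - p)^(n - k).
            total += p**k * (1 - p)**(n - k)
    return total

def prob_by_formula(n, k, p):
    """Binomial probability formula."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, k, p = 4, 2, 0.3   # hypothetical values
print(prob_by_enumeration(n, k, p))
print(prob_by_formula(n, k, p))
```

The two printed values should agree up to floating-point rounding, because exactly (n over k) of the enumerated sequences contain k successes.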
Shape of a Binomial Distribution
A binomial distribution with p = 0.5 is symmetric about its mean, i.e. the distribution of probability mass is identical on either side of the median = mean.
This is true for any binomial random variable with p = 0.5, whatever the value of n, since the probability of success is equal to the probability of failure (i.e. p = 1 − p = 0.5) and thus the probability of k successes is equal to the probability of k failures:
$P(X = k) = \binom{n}{k} (0.5)^k (1 - 0.5)^{n-k} = \binom{n}{k} (0.5)^n$
and
$P(X = n - k) = \binom{n}{n-k} (0.5)^{n-k} (1 - 0.5)^{n-(n-k)} = \binom{n}{n-k} (0.5)^n$
Since
$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \frac{n!}{(n-k)!\,(n-(n-k))!} = \binom{n}{n-k}$
it follows that P(X = k) = P(X = n − k).
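The symmetry P(X = k) = P(X = n − k) is easy to confirm numerically. The sketch below assumes n = 8 as an arbitrary example and recomputes the probabilities from the binomial formula.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n = 8  # hypothetical number of trials
pmf = [binomial_pmf(k, n, 0.5) for k in range(n + 1)]

# Compare each probability with its mirror image around n / 2.
is_symmetric = all(math.isclose(pmf[k], pmf[n - k]) for k in range(n + 1))
print(is_symmetric)
```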
On the other hand, if p < 0.5, the binomial distribution is right skewed: most of its probability mass is concentrated on the smaller values of k, with a longer tail extending to the right. The opposite is true if p > 0.5, where the distribution is left skewed.
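One way to check the direction of skew numerically is the sign of the third central moment, ∑ (k − µ)³ P(X = k): positive suggests a right skew, negative a left skew, and zero symmetry. Using this particular measure is an assumption of the sketch, not something these notes prescribe, and the values n = 10 and p = 0.2, 0.5, 0.8 are arbitrary.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def third_central_moment(n, p):
    """Sum of (k - mu)^3 * P(X = k); its sign indicates the direction of skew."""
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    mu = sum(k * pk for k, pk in enumerate(probs))
    return sum((k - mu) ** 3 * pk for k, pk in enumerate(probs))

for p in (0.2, 0.5, 0.8):   # hypothetical success probabilities
    print(p, third_central_moment(10, p))
```

For p = 0.5 the result may not print as exactly zero because of floating-point rounding, but it should be negligibly small.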
1-2-3 Rule