problem with z-scores
The z-score formula requires more information than is usually available. Specifically, a z-score requires the value of the population standard deviation σ (or variance σ²), which is needed to compute the standard error.
equation for estimated standard error
estimated standard error (s_M) = s / √n = √(s² / n)
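A minimal Python sketch of this formula, assuming the scores are held in a plain list. The function name and the data are hypothetical, chosen only for illustration.

```python
import math
import statistics

def estimated_standard_error(scores):
    """Return s_M = s / sqrt(n) for a list of sample scores."""
    n = len(scores)
    s = statistics.stdev(scores)  # sample standard deviation (n - 1 in the denominator)
    return s / math.sqrt(n)

scores = [9, 9, 11, 13, 13, 13, 15, 17, 17]  # hypothetical sample: n = 9, M = 13, s = 3
s_m = estimated_standard_error(scores)
print(s_m)  # s / sqrt(n) = 3 / 3
```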
estimated standard error
The estimated standard error is used as an estimate of the true standard error when the value of the population standard deviation σ is unknown. It is computed from the sample variance or sample standard deviation and provides an estimate of the standard distance between a sample mean M and the population mean μ.
There are two reasons for making this shift from standard deviation to variance:
t-statistic equation
t = (M − μ) / s_M (sample mean minus population mean, divided by the estimated standard error)
t statistic
The t statistic is used to test hypotheses about an unknown population mean, μ, when the value of the population standard deviation is unknown. The formula for the t statistic has the same structure as the z-score formula, except that the t statistic uses the estimated standard error in the denominator.
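As a rough illustration, the t statistic can be computed from raw scores as sketched below. The function name, the sample, and the hypothesized mean of 10 are assumptions made up for this example.

```python
import math
import statistics

def t_statistic(scores, mu):
    """Return (t, df) for a one-sample t test against the hypothesized mean mu."""
    n = len(scores)
    m = statistics.mean(scores)                    # sample mean M
    s_m = statistics.stdev(scores) / math.sqrt(n)  # estimated standard error
    return (m - mu) / s_m, n - 1

scores = [9, 9, 11, 13, 13, 13, 15, 17, 17]  # hypothetical sample
t, df = t_statistic(scores, mu=10)           # mu = 10 is a made-up null-hypothesis value
print(t, df)
```

The only difference from a z-score calculation is the denominator: the sample standard deviation stands in for the unknown population value.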
degrees of freedom
Degrees of freedom describe the number of scores in a sample that are independent and free to vary. Because the sample mean places a restriction on the value of one score in the sample, there are n − 1 degrees of freedom for a sample with n scores.
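One way to see the restriction, sketched in Python with made-up numbers: once the sample mean and n − 1 of the scores are known, the remaining score is forced. The function name is hypothetical.

```python
def last_score(free_scores, sample_mean):
    """Return the one score that is fixed by the sample mean and the other n - 1 scores."""
    n = len(free_scores) + 1
    return n * sample_mean - sum(free_scores)  # the n scores must sum to n * M

forced = last_score([2, 9], sample_mean=5)  # n = 3, so df = 2 scores were free to vary
print(forced)  # 3 * 5 - (2 + 9) = 4
```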
a t distribution approximates a normal distribution
How well a t distribution approximates a normal distribution is determined by degrees of freedom. In general, the greater the sample size (n) is, the larger the degrees of freedom are, and the better the t distribution approximates the normal distribution.
t distribution
The distribution of t statistics is symmetrical and centered at zero like a normal distribution. A t distribution is flatter and more spread out than the normal distribution, but approaches a normal shape as df increases.
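The shape comparison can be checked numerically. The sketch below uses the standard density formula for Student's t, which is not given in these notes, and compares its height at the center (x = 0) and in the tail (x = 3) with the standard normal curve. Function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Lower peak and heavier tails for small df; both gaps shrink as df grows.
for df in (1, 5, 30, 1000):
    print(df, round(t_pdf(0, df), 4), round(t_pdf(3, df), 4))
print("normal", round(normal_pdf(0), 4), round(normal_pdf(3), 4))
```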
shape of the t distribution
When these values are used in the t formula, the result becomes
t = [sample mean (from the data) − population mean (hypothesised from the null hypothesis)] / estimated standard error (computed from the sample data)
Hypothesis testing steps
assumptions of the t-test
Cohen’s d or estimated d
estimated d = mean difference / sample standard deviation = (M − μ) / s
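A short Python sketch of estimated d, assuming raw scores in a list. The function name, the data, and the hypothesized mean are hypothetical.

```python
import statistics

def estimated_d(scores, mu):
    """Return estimated Cohen's d = (M - mu) / s for a single sample."""
    m = statistics.mean(scores)   # sample mean M
    s = statistics.stdev(scores)  # sample standard deviation s
    return (m - mu) / s

d = estimated_d([9, 9, 11, 13, 13, 13, 15, 17, 17], mu=10)  # made-up data: M = 13, s = 3
print(d)
```

Note that d divides by s rather than by the estimated standard error, so it does not grow with sample size the way t does.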
proportion or percentage of the total variability
variability accounted for / total variability
percentage of variance accounted for by the treatment (r^2)
A measure of effect size that determines what portion of the variability in the scores can be accounted for by the treatment effect.
r^2 = t^2 / (t^2 + df)
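A minimal sketch of this calculation in Python. The function name and the input values (t = 3.0 with df = 8) are made up for illustration.

```python
def r_squared(t, df):
    """Return r^2 = t^2 / (t^2 + df), the proportion of variance accounted for."""
    return t ** 2 / (t ** 2 + df)

r2 = r_squared(3.0, 8)  # 9 / (9 + 8)
print(round(r2, 3))
```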
confidence intervals
A confidence interval is an interval, or range of values, centered around a sample statistic. The logic behind a confidence interval is that a sample statistic, such as a sample mean, should be relatively near to the corresponding population parameter.
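equation to estimate the population mean (confidence interval)
μ = M ± t · s_M, where t is the t-distribution value for the chosen confidence level with df = n − 1 and s_M is the estimated standard error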
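A sketch of the interval calculation in Python. The t value is passed in by the caller, as it would be looked up in a t table; 2.306 is the two-tailed 95% value for df = 8. The function name and the summary statistics are hypothetical.

```python
def confidence_interval(m, s_m, t_crit):
    """Return (lower, upper) limits of M ± t * s_M."""
    margin = t_crit * s_m
    return m - margin, m + margin

# Made-up sample summary: M = 13, estimated standard error = 1.0, n = 9 (df = 8)
low, high = confidence_interval(m=13, s_m=1.0, t_crit=2.306)
print(round(low, 3), round(high, 3))
```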
Two characteristics of the confidence interval should be noted.