correlation
A statistical value that measures and describes the direction and degree of relationship between two variables. The sign (+/−) indicates the direction of the relationship. The numerical value (0.0 to 1.0) indicates the strength or consistency of the relationship. The type (Pearson or Spearman) indicates the form of the relationship. Also known as correlation coefficient.
A correlation is a numerical value that describes and measures three characteristics of the relationship between X and Y.
positive correlation
In a positive correlation, the two variables tend to change in the same direction: as the value of the X variable increases from one individual to another, the Y variable also tends to increase; when the X variable decreases, the Y variable also decreases.
negative correlation
In a negative correlation, the two variables tend to go in opposite directions. As the X variable increases, the Y variable decreases. That is, it is an inverse relationship.
Pearson correlation
The Pearson correlation measures the degree and the direction of the linear relationship between two variables.
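The Pearson correlation for a sample is identified by the letter r.
$r = \dfrac{\text{degree to which } X \text{ and } Y \text{ vary together}}{\text{degree to which } X \text{ and } Y \text{ vary separately}} = \dfrac{\text{covariability of } X \text{ and } Y}{\text{variability of } X \text{ and } Y \text{ separately}}$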
sum of products
A measure of the degree of covariability between two variables; the degree to which they vary together.
definitional formula for the sum of products
$SP = \sum (X - M_X)(Y - M_Y)$
computational formula for the sum of products
$SP = \sum XY - \dfrac{\sum X \sum Y}{n}$
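The definitional and computational formulas for SP should always give the same value. The sketch below checks this on a small made-up data set; the function names and the scores are illustrative assumptions, not part of the original cards.

```python
def sp_definitional(xs, ys):
    """SP as the sum of products of deviations from each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def sp_computational(xs, ys):
    """SP as sum(XY) minus (sum(X) * sum(Y)) / n."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return sum_xy - (sum(xs) * sum(ys)) / n


# Hypothetical scores chosen so the means are whole numbers (M_X = 3, M_Y = 5).
xs = [1, 2, 4, 5]
ys = [3, 6, 4, 7]

print(sp_definitional(xs, ys))   # deviations products: 4 - 1 - 1 + 4
print(sp_computational(xs, ys))  # 66 - (12 * 20) / 4
```

Both calls return 6 for these scores. The computational form avoids computing deviations, which is convenient when the means are not whole numbers.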
the formula for the Pearson correlation becomes
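$r = \dfrac{SP}{\sqrt{SS_X \, SS_Y}}$, where $SS_X$ and $SS_Y$ are the sums of squared deviations for the X scores and the Y scores.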
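As a worked illustration, the Pearson formula can be computed directly from SP and the two sums of squares. This is a minimal sketch on made-up scores; `sum_of_squares` and `pearson_r` are hypothetical helper names, not a library API.

```python
import math


def sum_of_squares(values):
    """SS: sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def pearson_r(xs, ys):
    """Pearson r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sp / math.sqrt(sum_of_squares(xs) * sum_of_squares(ys))


# Hypothetical scores: SP = 6, SS_X = 10, SS_Y = 10.
xs = [1, 2, 4, 5]
ys = [3, 6, 4, 7]
print(round(pearson_r(xs, ys), 2))  # 6 / sqrt(10 * 10) = 0.6
```

The function divides by zero if either variable has no variability (SS = 0), in which case the correlation is undefined.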
sample r
$r = \dfrac{\sum z_X z_Y}{n - 1}$
population r
$\rho = \dfrac{\sum z_X z_Y}{N}$
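The z-score formulas can be checked numerically. The sketch below assumes that sample z-scores use the sample standard deviation (dividing SS by n - 1) and population z-scores use the population standard deviation (dividing SS by N); the `ddof` parameter name is borrowed from common usage and the data are made up.

```python
import math


def z_scores(values, ddof):
    """z-scores using SS / (n - ddof) as the variance (ddof=1 sample, ddof=0 population)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - ddof))
    return [(v - mean) / sd for v in values]


def r_from_z(xs, ys, ddof):
    """Sum of z_X * z_Y divided by (n - 1) for a sample or N for a population."""
    zx = z_scores(xs, ddof)
    zy = z_scores(ys, ddof)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - ddof)


xs = [1, 2, 4, 5]
ys = [3, 6, 4, 7]
print(round(r_from_z(xs, ys, ddof=1), 2))  # sample formula
print(round(r_from_z(xs, ys, ddof=0), 2))  # population formula
```

Both versions print 0.6 for these scores, the same value the SP formula gives, as long as the z-scores and the divisor use the same convention.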
Where and Why Correlations Are Used
When you encounter correlations, there are four additional considerations that you should bear in mind.
coefficient of determination
The value $r^2$ is called the coefficient of determination because it measures the proportion of variability in one variable that can be determined from the relationship with the other variable. A correlation of r = 0.80 (or −0.80), for example, means that $r^2 = 0.64$ (or 64%) of the variability in the Y scores can be predicted from the relationship with X.
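A short sketch of the arithmetic: squaring r discards the sign, so positive and negative correlations of the same size predict the same proportion of variability. The function name is a hypothetical helper.

```python
def coefficient_of_determination(r):
    """Return r squared, the proportion of variability predicted by the relationship."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and +1")
    return r * r


for r in (0.80, -0.80, 0.30):
    print(f"r = {r:+.2f} -> r^2 = {coefficient_of_determination(r):.2f}")
```

Note how quickly the proportion falls: r = 0.30 accounts for only 9% of the variability.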
regression toward the mean
When there is a less-than-perfect correlation between two variables, extreme scores (high or low) on one variable tend to be paired with less extreme scores (closer to the mean) on the second variable. This fact is called regression toward the mean.
partial correlation between X and Y, holding Z constant, is determined by the formula
$r_{xy \cdot z} = \dfrac{r_{xy} - (r_{xz} \, r_{yz})}{\sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}}$
partial correlation
A partial correlation measures the relationship between two variables while controlling the influence of a third variable by holding it constant.
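The partial correlation formula takes the three pairwise Pearson correlations as input. The sketch below is one way to compute it; the function name and the example correlations are hypothetical values chosen for illustration.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y with Z held constant."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("undefined when X or Y is perfectly correlated with Z")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical case: X and Y both correlate 0.60 with Z and 0.50 with each other.
print(round(partial_correlation(r_xy=0.50, r_xz=0.60, r_yz=0.60), 3))
```

With these values the numerator is 0.50 - 0.36 = 0.14 and the denominator is 0.64, so the partial correlation is about 0.219, well below the original 0.50. Part of the X-Y relationship was carried by Z.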
When you obtain a nonzero correlation for a sample, the purpose of the hypothesis test is to decide between the following two interpretations.
t statistic equation
$t = \dfrac{\text{sample statistic} - \text{population parameter}}{\text{standard error}}$
standard error for r
$s_r = \sqrt{\dfrac{1 - r^2}{n - 2}}$
$t = \dfrac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$
degrees of freedom for the t statistic
$df = n - 2$
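Putting the standard error, the t statistic, and the degrees of freedom together, a test of a sample correlation can be sketched as follows. The default of rho = 0 assumes the usual null hypothesis of no correlation in the population; the function name and the example values are hypothetical.

```python
import math


def t_for_correlation(r, n, rho=0.0):
    """Return (t, df) for a sample correlation r based on n pairs of scores."""
    if n < 3:
        raise ValueError("need at least 3 pairs of scores so that df = n - 2 >= 1")
    if abs(r) >= 1:
        raise ValueError("standard error is zero when r is exactly +1 or -1")
    standard_error = math.sqrt((1 - r ** 2) / (n - 2))
    t = (r - rho) / standard_error
    return t, n - 2


# Hypothetical sample: r = 0.60 from n = 27 pairs.
t, df = t_for_correlation(r=0.60, n=27)
print(f"t({df}) = {t:.2f}")
```

Here the standard error is sqrt(0.64 / 25) = 0.16, so t = 0.60 / 0.16 = 3.75 with df = 25. The obtained t is then compared with the critical value from a t distribution table, which this sketch does not look up.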
Spearman correlation
A correlation calculated for ordinal data. Also used to measure the consistency of direction for a relationship.
To summarize, the Spearman correlation measures the relationship between two variables when both are measured on ordinal scales (ranks). There are two general situations in which the Spearman correlation is used.
Whenever two scores have exactly the same value, their ranks should also be the same. This is accomplished by the following procedure.
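The steps of the procedure are not listed in this section, so the sketch below assumes the common convention: list the scores in order, number the positions 1 through n, and give every tied score the mean of the positions it occupies. It then computes the Spearman correlation as the Pearson correlation of the ranks. All function names and scores are illustrative.

```python
import math


def average_ranks(scores):
    """Rank scores from smallest (1) to largest, giving tied scores their mean position."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score in sorted order is tied.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2  # mean of the 1-based positions
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


print(average_ranks([3, 3, 5, 6, 6, 6, 12]))  # [1.5, 1.5, 3.0, 5.0, 5.0, 5.0, 7.0]
```

The two scores of 3 occupy positions 1 and 2 and share rank 1.5; the three scores of 6 occupy positions 4, 5, and 6 and share rank 5. A perfectly consistent but nonlinear relationship, such as X = 1, 2, 3, 4 paired with Y = 1, 4, 9, 16, produces a Spearman correlation of 1 even though the points do not fall on a straight line.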