What relationship do we observe between Price and Mileage?
There is a negative relationship. As mileage increases, price decreases.
What does a negative relationship mean in this context (price and mileage)?
Cars with higher mileage tend to be cheaper, and cars with lower mileage tend to be more expensive.
Why is it important to understand the Price–Mileage relationship?
Because it:
Explains how car price changes with mileage
Helps us predict the average price given a mileage
Allows us to quantify how strong the relationship is
Helps measure the effect of mileage on price
What is the dependent variable in this price-mileage analysis?
Price — it is the value we want to describe and predict.
(Plotted on the y-axis.)
What is the explanatory variable in this price-mileage analysis?
Mileage — it may help explain Price.
(Plotted on the x-axis.)
What is the correlation coefficient used to measure?
The strength and direction of a linear relationship between two variables.
What is the formal name of the correlation coefficient?
The Pearson correlation coefficient.
What is the formula for the Pearson correlation coefficient?
r_{x,y} = cov(x, y) / (σ_x σ_y)
What does covariance measure?
The joint variability of two variables — how they move together.
What is the formula for covariance?
cov(x, y) = (1/n) ∑_{j=1}^{n} (x_j - x̄)(y_j - ȳ)
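As a minimal sketch, the covariance formula can be computed directly in Python. The mileage and price values are made-up numbers for illustration only, and the function name `covariance` is an assumption, not something from the notes.

```python
def covariance(x, y):
    """Covariance with the 1/n convention: (1/n) * sum((x_j - x_bar) * (y_j - y_bar))."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y)) / n

# Hypothetical data: mileage in thousands of miles, price in thousands of dollars.
mileage = [10, 20, 30, 40, 50]
price = [25, 22, 18, 15, 10]

cov_mp = covariance(mileage, price)
print(cov_mp)  # negative, because price falls as mileage rises
```

The sign is what matters here: a negative result means mileage and price tend to deviate from their means in opposite directions.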
Why do we divide covariance by the product of standard deviations in the correlation formula?
To scale the covariance so that correlation:
Is unit-free
Always lies between –1 and +1
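A short sketch can illustrate the unit-free property: changing the units of mileage rescales the covariance but leaves the correlation unchanged. The data values and the helper name `cov_and_r` are assumptions made for this example.

```python
import math

def cov_and_r(x, y):
    """Return (covariance, correlation) using the 1/n convention."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    cov = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y)) / n
    sd_x = math.sqrt(sum((xj - x_bar) ** 2 for xj in x) / n)
    sd_y = math.sqrt(sum((yj - y_bar) ** 2 for yj in y) / n)
    return cov, cov / (sd_x * sd_y)

MILES_TO_KM = 1.609344

# Hypothetical data: mileage in thousands of miles, price in thousands of dollars.
mileage_miles = [10, 20, 30, 40, 50]
price = [25, 22, 18, 15, 10]
mileage_km = [m * MILES_TO_KM for m in mileage_miles]

cov_miles, r_miles = cov_and_r(mileage_miles, price)
cov_km, r_km = cov_and_r(mileage_km, price)

print(cov_miles, cov_km)  # covariance changes with the unit of mileage
print(r_miles, r_km)      # correlation is the same up to rounding error
```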
What does a positive covariance mean?
When x is above its mean, y tends to be above its mean too (move together).
What does a negative covariance mean?
When x is above its mean, y tends to be below its mean (move in opposite directions).
Example: If x_j - x̄ is positive and y_j - ȳ is negative, what can we say?
Their product is negative, so this observation contributes negatively to the covariance. If most observations behave this way, the relationship is negative.
What are the formulas for the variances of x and y?
σ_x^2 = (1/n) ∑_{j=1}^{n} (x_j - x̄)^2
σ_y^2 = (1/n) ∑_{j=1}^{n} (y_j - ȳ)^2
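With the covariance and the two variances in hand, the Pearson correlation can be assembled from its parts. This is a sketch under the 1/n convention used in these notes. The function names and the data are illustrative assumptions.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    # (1/n) * sum of products of deviations from the means
    x_bar, y_bar = mean(x), mean(y)
    return sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y)) / len(x)

def variance(x):
    # The variance is the covariance of a variable with itself.
    return covariance(x, x)

def pearson_r(x, y):
    # r = cov(x, y) / (sigma_x * sigma_y)
    return covariance(x, y) / math.sqrt(variance(x) * variance(y))

# Hypothetical data: mileage in thousands of miles, price in thousands of dollars.
mileage = [10, 20, 30, 40, 50]
price = [25, 22, 18, 15, 10]

r = pearson_r(mileage, price)
print(round(r, 3))  # close to -1: a strong negative linear relationship
```

Because the 1/n factors cancel between numerator and denominator, using 1/(n-1) throughout would give the same r.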
What do r = -1, 0, and +1 mean?
–1 → perfect negative linear relationship
0 → no linear relationship
+1 → perfect positive linear relationship
What is the range of the correlation coefficient r?
r ranges from –1 to +1.
What does a larger value of |r| indicate?
A stronger linear relationship between the variables.
What does r=0 tell us?
There is no linear relationship (but other, non-linear patterns may still exist).
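A small constructed example shows how r can be 0 even though y is completely determined by x. The data (y = x squared on a symmetric range) is chosen purely for illustration, and `pearson_r` is a helper name assumed here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation using the 1/n convention."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    cov = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y)) / n
    sd_x = math.sqrt(sum((xj - x_bar) ** 2 for xj in x) / n)
    sd_y = math.sqrt(sum((yj - y_bar) ** 2 for yj in y) / n)
    return cov / (sd_x * sd_y)

# y depends perfectly on x, but through a curve rather than a straight line.
x = [-2, -1, 0, 1, 2]
y = [xj ** 2 for xj in x]

r = pearson_r(x, y)
print(r)  # essentially zero: the positive and negative products cancel out
```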
What does r=+1 or r=−1 indicate?
A perfect linear relationship (very rare in real data).
What does the sign of r tell us?
Positive sign → positive relationship
Negative sign → negative relationship
What does the value (size) of r NOT tell us?
It tells us nothing about the steepness (slope) of the relationship.
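To see that r carries no information about slope, compare two perfectly linear data sets with very different slopes. The numbers and the helper name `pearson_r` are assumptions for this sketch.

```python
import math

def pearson_r(x, y):
    """Pearson correlation using the 1/n convention."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    cov = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y)) / n
    sd_x = math.sqrt(sum((xj - x_bar) ** 2 for xj in x) / n)
    sd_y = math.sqrt(sum((yj - y_bar) ** 2 for yj in y) / n)
    return cov / (sd_x * sd_y)

x = [1, 2, 3, 4, 5]
y_steep = [10 * xj for xj in x]     # slope 10
y_shallow = [0.1 * xj for xj in x]  # slope 0.1

r_steep = pearson_r(x, y_steep)
r_shallow = pearson_r(x, y_shallow)
print(r_steep, r_shallow)  # both equal 1 up to rounding error
```

Both lines fit their points perfectly, so both give r = +1 even though one slope is a hundred times the other.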
State the hypotheses for testing correlation.
Null hypothesis: H_0 : ρ_{x,y} = 0 (no population correlation)
Alternative hypothesis: H_1 : ρ_{x,y} ≠ 0
What is the difference between ρ_{x,y} and r_{x,y}?
ρ_{x,y} : population correlation coefficient
r_{x,y} : sample correlation coefficient
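The notes do not say which test statistic to use. One common choice, assumed here, is the t-statistic t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom, which relies on the data being roughly bivariate normal. The sketch below uses made-up data and a 5% two-sided critical value read from a standard t-table.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation using the 1/n convention."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    cov = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y)) / n
    sd_x = math.sqrt(sum((xj - x_bar) ** 2 for xj in x) / n)
    sd_y = math.sqrt(sum((yj - y_bar) ** 2 for yj in y) / n)
    return cov / (sd_x * sd_y)

# Hypothetical sample: mileage in thousands of miles, price in thousands of dollars.
mileage = [10, 20, 30, 40, 50]
price = [25, 22, 18, 15, 10]

n = len(mileage)
r = pearson_r(mileage, price)

# Assumed test statistic for H_0: rho = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))

# Two-sided 5% critical value for 3 degrees of freedom, from a t-table.
T_CRITICAL_DF3 = 3.182

reject_h0 = abs(t_stat) > T_CRITICAL_DF3
print(round(t_stat, 2), reject_h0)
```

In practice a library routine such as `scipy.stats.pearsonr` returns both r and a p-value, so the critical value does not have to be looked up by hand.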