What is correlation between two variables X and Y at a high level?
A standardized measure of the strength and direction of a linear relationship between X and Y.
How is the Pearson correlation coefficient defined in terms of covariance?
ρ(X,Y) = Cov(X,Y) / (σ_X σ_Y), where σ_X and σ_Y are the standard deviations of X and Y (both must be finite and nonzero).
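The definition above can be sketched directly in NumPy. This is a minimal illustration, not a reference implementation; the function name `pearson` and the sample arrays are made up for the example.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation as Cov(X, Y) / (sigma_X * sigma_Y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Population (ddof=0) covariance and standard deviations; the ratio is
    # unchanged with ddof=1 as long as both use the same convention.
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
    return cov_xy / (x.std() * y.std())

# Hypothetical sample data, roughly linear in x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])
rho = pearson(x, y)
print(rho)
```

The result should agree with `np.corrcoef(x, y)[0, 1]` up to floating-point error.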
What values can the Pearson correlation take?
Any value between −1 and 1 inclusive: −1 indicates a perfect negative linear relationship, 0 indicates no linear relationship, and 1 indicates a perfect positive linear relationship.
What does ρ(X,Y) = 0 imply about X and Y?
X and Y have no linear association, but they may still be dependent in a nonlinear way.
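A small sketch of zero correlation without independence, using an illustrative choice of data: x is symmetric around 0 and y = x², so y is completely determined by x even though the linear correlation vanishes.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 201)   # symmetric around 0
y = x ** 2                        # fully determined by x

# Cov(x, x^2) is 0 by symmetry, so the Pearson correlation is 0
# up to floating-point error.
rho = np.corrcoef(x, y)[0, 1]
print(rho)
```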
Does high correlation imply that X causes Y?
No; correlation alone does not establish causality, only association.
What is spurious correlation?
An observed correlation between variables that arises from coincidence or common causes rather than a direct causal relationship.
What is a confounder in causal reasoning?
A variable that influences both the potential cause and the outcome, potentially creating a misleading association.
How can confounding lead to incorrect causal conclusions?
If the confounder is not controlled for, changes in the outcome that it drives may be wrongly attributed to the variable of interest.
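A toy simulation can make this concrete. In the sketch below, which assumes a simple linear setup with made-up coefficients, a confounder z drives both x and y, while x has no effect on y at all.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

z = rng.normal(size=n)             # confounder
x = z + 0.5 * rng.normal(size=n)   # driven by z; has no effect on y
y = z + 0.5 * rng.normal(size=n)   # driven by z, not by x

# Strong raw correlation (theoretical value 1 / 1.25 = 0.8).
rho_xy = np.corrcoef(x, y)[0, 1]

# Holding the confounder roughly fixed removes most of the association.
band = np.abs(z) < 0.1
rho_within = np.corrcoef(x[band], y[band])[0, 1]

print(rho_xy, rho_within)
```

Restricting to a narrow band of z is a crude form of control; it is used here only because it needs no extra machinery.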
Why is ‘correlation ≠ causation’ particularly important in ML contexts?
Models can capture patterns that are predictive but unstable or non-causal; such models can fail after deployment when the underlying associations change.
What is rank-based (Spearman) correlation at a high level?
A correlation measure computed on the ranks of the data; it captures monotonic relationships (not just linear ones) and is more robust to outliers than Pearson correlation.
When might Spearman correlation be preferred over Pearson?
When relationships are monotonic but not linear, or when outliers or strong non-normality make Pearson unreliable.
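As an illustrative sketch, Spearman correlation can be computed as the Pearson correlation of the ranks. The helper names `ranks` and `spearman` are hypothetical, and the rank helper assumes no ties for simplicity (library implementations typically assign average ranks to ties).

```python
import numpy as np

def ranks(a):
    """Ranks 1..n of the values in a (assumes no ties, for simplicity)."""
    order = np.argsort(a)
    r = np.empty(len(a), dtype=float)
    r[order] = np.arange(1, len(a) + 1)
    return r

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return np.corrcoef(ranks(x), ranks(y))[0, 1]

x = np.linspace(0.0, 5.0, 50)
y = np.exp(x)                     # monotonic but strongly nonlinear

pearson_r = np.corrcoef(x, y)[0, 1]
spearman_r = spearman(x, y)
print(pearson_r, spearman_r)
```

Because y increases strictly with x, the ranks match exactly and the Spearman value is 1 up to floating-point error, while the Pearson value is noticeably lower.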
What is partial correlation?
The correlation between two variables after removing the linear effects of one or more other variables.
Why can partial correlation be more informative than raw correlation?
It helps isolate associations that are not explained by obvious third variables, though it still does not prove causality.
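One common way to compute a partial correlation is to regress each variable on the control variable and correlate the residuals. The sketch below assumes a single control variable z and simulated data with made-up coefficients; `residuals` and `partial_corr` are illustrative helper names.

```python
import numpy as np

def residuals(a, z):
    """Residuals of a least-squares fit of a on z (with intercept)."""
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, a, rcond=None)
    return a - design @ coef

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    return np.corrcoef(residuals(x, z), residuals(y, z))[0, 1]

rng = np.random.default_rng(1)
n = 5_000
z = rng.normal(size=n)             # shared driver of x and y
x = z + 0.5 * rng.normal(size=n)
y = z + 0.5 * rng.normal(size=n)

raw = np.corrcoef(x, y)[0, 1]
partial = partial_corr(x, y, z)
print(raw, partial)
```

In this setup the raw correlation is high while the partial correlation is near zero, since z accounts for the whole association. Only linear effects of z are removed, so nonlinear confounding would survive this adjustment.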
What is dependence in probability theory?
Any situation where the joint distribution of variables does not factor into the product of their marginals; knowing one gives information about the other.
Why is independence stronger than zero correlation?
Independence implies zero correlation for variables with finite variance, but zero correlation does not rule out nonlinear dependence.
What is mutual information (MI) at a high level?
A nonnegative measure of how much knowing one variable reduces uncertainty about the other; it captures general dependence, not just linear dependence, and is zero exactly when the variables are independent.
Why is mutual information useful in feature selection?
It can detect nonlinear and non-monotonic relationships between features and targets.
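For discrete variables, mutual information can be computed from the joint probability table. The sketch below uses a hypothetical example, X uniform on {−1, 0, 1} with Y = X², where the Pearson correlation is zero but the mutual information is clearly positive. The function name and the tables are assumptions made for illustration.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information in bits from a joint probability table p(x, y)."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)   # marginal of X (rows)
    py = joint.sum(axis=0, keepdims=True)   # marginal of Y (columns)
    expected = px @ py                      # joint table under independence
    nz = joint > 0                          # 0 * log(0) is treated as 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / expected[nz])))

# Rows: x = -1, 0, 1. Columns: y = 0, 1. Here y = x**2.
dependent = np.array([[0.0, 1 / 3],
                      [1 / 3, 0.0],
                      [0.0, 1 / 3]])
# Same marginals, but X and Y independent.
independent = np.outer([1 / 3, 1 / 3, 1 / 3], [1 / 3, 2 / 3])

mi_dep = mutual_information(dependent)
mi_ind = mutual_information(independent)
print(mi_dep, mi_ind)
```

Since Y is a function of X here, the mutual information equals the entropy of Y, about 0.918 bits, whereas the independent table gives 0. Estimating mutual information for continuous features requires binning or a dedicated estimator, which this sketch does not cover.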
What is the danger of selecting features solely based on correlation with the label?
You may pick many redundant or spurious features; when many candidates are screened, some will correlate with the label by chance alone, which inflates false discoveries (a multiple-testing problem).
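The multiple-testing effect is easy to reproduce with pure noise. In the sketch below, whose sizes and seed are arbitrary choices, none of the 1,000 features has any relationship to the label, yet the best-looking one still shows a sizeable sample correlation.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples, n_features = 50, 1_000

features = rng.normal(size=(n_samples, n_features))  # pure noise
label = rng.normal(size=n_samples)                   # unrelated to every feature

# Pearson correlation of each feature column with the label.
fz = (features - features.mean(axis=0)) / features.std(axis=0)
lz = (label - label.mean()) / label.std()
corrs = fz.T @ lz / n_samples

best = np.abs(corrs).max()
print(best)
```

With only 50 samples, each sample correlation has a standard deviation of roughly 0.14, so the maximum over 1,000 candidates lands well above that by chance.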
What is the difference between predictive and causal relationships?
Predictive relationships help forecast outcomes, while causal relationships describe how interventions on one variable would change another.
Why can a non-causal feature still be useful in ML?
Even non-causal features can be strongly predictive if they are stable proxies or capture useful information about the outcome.
When can using non-causal features be dangerous?
When underlying associations can change under new policies, environments, or behaviors, causing models to fail or behave unfairly.
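A toy simulation of this failure mode, under assumptions chosen purely for illustration: the outcome y is caused by c, the model only sees a non-causal proxy s, and the sign of the proxy's association with y flips in the new environment. The helper `make_env` and its `proxy_strength` parameter are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000

def make_env(proxy_strength):
    """y is caused by c; the proxy s tracks y with the given strength."""
    c = rng.normal(size=n)
    y = 2.0 * c + 0.1 * rng.normal(size=n)
    s = proxy_strength * y + 0.1 * rng.normal(size=n)  # non-causal proxy
    return s, y

s_train, y_train = make_env(proxy_strength=1.0)   # proxy tracks y
s_new, y_new = make_env(proxy_strength=-1.0)      # association reversed

# Fit y ~ slope * s + intercept on the training environment only.
slope, intercept = np.polyfit(s_train, y_train, deg=1)

def mse(s, y):
    """Mean squared error of the fitted line on (s, y)."""
    return float(np.mean((slope * s + intercept - y) ** 2))

mse_train = mse(s_train, y_train)
mse_new = mse(s_new, y_new)
print(mse_train, mse_new)
```

The fitted line is nearly perfect in the training environment and badly wrong once the association reverses, even though the causal mechanism for y never changed.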
What is a causal effect at a high level?
The difference in an outcome that would occur under one intervention versus another, holding everything else constant in a conceptual experiment.
What is the role of randomized experiments in discovering causal effects?
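Random assignment makes treatment independent of confounders on average, so differences in outcomes between groups can be interpreted as causal under reasonable assumptions (for example, adequate sample size and compliance with the assigned treatment).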
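The contrast between observational and randomized data can be sketched with a simulation. The setup below is an assumption made for illustration: a binary treatment with a true effect of 2.0 and a confounder z that also raises the outcome.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
true_effect = 2.0

def outcome(t, z):
    """Outcome driven by treatment t (effect 2.0) and confounder z."""
    return true_effect * t + 3.0 * z + rng.normal(size=n)

def diff_in_means(t, y):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    return y[t == 1].mean() - y[t == 0].mean()

z = rng.normal(size=n)

# Observational: units with high z are more likely to be treated.
t_obs = (z + rng.normal(size=n) > 0).astype(int)
naive = diff_in_means(t_obs, outcome(t_obs, z))

# Randomized: a coin flip decides treatment, independent of z.
t_rand = rng.integers(0, 2, size=n)
randomized = diff_in_means(t_rand, outcome(t_rand, z))

print(naive, randomized)
```

The naive observational estimate overstates the effect because treated units also tend to have higher z, while the randomized estimate lands close to the true value of 2.0.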
Why are randomized experiments not always feasible?
They can be expensive, unethical, or logistically impossible in some settings.