Define linear regression model.
A model that expresses the mean of a response as a linear function of predictors: y = Xβ + ε.
Intuition: a straight-line/plane relationship plus noise.
Example: Sales = β0 + β1(YouTube) + β2(Facebook) + ε.
Define response variable (y).
The outcome you want to explain or predict; treated as random before sampling, then observed/fixed in the dataset.
Intuition: what you care about.
Example: sales (thousands of units).
Define predictor / covariate (x).
An input variable used to explain/predict y; columns of X (besides the intercept).
Intuition: knobs you measure.
Example: YouTube budget.
Define parameter (β).
Unknown population quantity describing the relationship between predictors and mean response.
Intuition: the ‘true’ intercept/slopes.
Example: β1 = change in mean sales per $1k YouTube.
Define error term (ε).
Random deviation of an observation from its conditional mean: y = E[y|x] + ε.
Intuition: all unmodeled factors + noise.
Example: random market effects not in ad budgets.
Define residual (e_i or ε̂_i).
Observed deviation from fitted line/surface: e_i = y_i − ŷ_i.
Intuition: ‘leftover’ after fitting.
Example: actual sales minus predicted sales for company i.
Define fitted value (ŷ_i).
Model’s predicted mean response at x_i using estimated coefficients: ŷ_i = x_i^T β̂.
Intuition: point on fitted line/surface.
Example: predicted sales for given budgets.
Define least squares.
Estimation method choosing β̂ to minimize RSS = Σ (y_i − ŷ_i)^2.
Intuition: make residuals small overall.
Example: pick the line with smallest total squared vertical distances.
Define residual sum of squares (RSS).
RSS(β) = (y − Xβ)^T(y − Xβ); at the OLS fit this equals Σ e_i^2.
Intuition: total squared error of fit.
Example: objective function minimized by OLS.
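A minimal numpy sketch of the two cards above (the data are made up for illustration): fit by least squares and compute the RSS of the fit.

```python
import numpy as np

# Toy data (invented): y depends linearly on one predictor plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 1.5 * x + rng.normal(0, 1, size=50)

X = np.column_stack([np.ones_like(x), x])         # design matrix with intercept column
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS: minimizes RSS over beta
rss = np.sum((y - X @ beta_hat) ** 2)             # residual sum of squares at the fit
print(beta_hat, rss)
```

With enough data, β̂ lands near the true (2.0, 1.5) used to simulate y.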
Define design matrix (X).
Matrix whose rows are observations and columns are predictors (first column often all 1s for intercept).
Intuition: table of inputs.
Example: [1, YouTube, Facebook, Newspaper].
Define intercept.
β0; expected response when all predictors equal 0 (if 0 is meaningful).
Intuition: baseline level.
Example: predicted sales when ad budgets are $0k.
Define slope / coefficient.
βj; change in E[y|x] for a one-unit increase in x_j holding other predictors fixed.
Intuition: partial effect.
Example: change in sales per $1k Facebook, holding YouTube/newspaper fixed.
Define simple linear regression (SLR).
Regression with one predictor: y_i = β0 + β1 x_i + ε_i.
Intuition: fit a line in 2D.
Example: turtle_rating vs income.
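For SLR specifically, a one-line fit suffices; a sketch with hypothetical income/rating numbers (a degree-1 polynomial fit is exactly a least-squares line):

```python
import numpy as np

# Hypothetical data, just to illustrate fitting a line in 2D.
rng = np.random.default_rng(5)
income = rng.uniform(20, 100, size=25)
rating = 1.0 + 0.05 * income + rng.normal(0, 0.5, size=25)

# np.polyfit returns coefficients highest power first: [slope, intercept].
slope, intercept = np.polyfit(income, rating, deg=1)
print(intercept, slope)
```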
Define multiple linear regression (MLR).
Regression with multiple predictors: y = Xβ + ε.
Intuition: fit a plane/hyperplane.
Example: sales ~ YouTube + Facebook + newspaper.
Define overdetermined system.
More equations than unknowns (n observations > p+1 parameters), so y = Xβ typically has no exact solution.
Intuition: can’t hit every point exactly.
Example: 200 companies but only 4 parameters.
Define column space of X (Col(X)).
Set of all linear combinations of columns of X.
Intuition: all vectors you can represent as Xβ.
Example: all possible fitted value vectors ŷ.
Define projection.
Mapping a vector to the closest vector in a subspace (in least squares: project y onto Col(X)).
Intuition: ‘shadow’ of y on the model space.
Example: ŷ is the projection of y onto Col(X).
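The "closest vector in the subspace" claim can be checked directly: any other point in Col(X) (here, one built from slightly perturbed coefficients) is farther from y than ŷ is.

```python
import numpy as np

rng = np.random.default_rng(7)
X = np.column_stack([np.ones(20), rng.normal(size=20)])
y = rng.normal(size=20)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat  # projection of y onto Col(X)

# A nearby competitor in Col(X): same columns, perturbed coefficients.
other = X @ (beta_hat + np.array([0.1, -0.2]))
print(np.linalg.norm(y - y_hat) < np.linalg.norm(y - other))
```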
Define orthogonality.
Two vectors are orthogonal if their dot product is 0.
Intuition: 90° angle; no linear association in that geometry.
Example: residual vector is orthogonal to Col(X) at the OLS solution.
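The orthogonality property in the example above is easy to verify numerically: at the OLS solution, X^T e = 0, i.e. the residual vector has zero dot product with every column of X.

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.column_stack([np.ones(40), rng.normal(size=(40, 2))])
y = rng.normal(size=40)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta_hat  # residual vector

# X^T e = 0: residuals are orthogonal to Col(X).
print(np.allclose(X.T @ e, 0))
```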
Define normal equations.
Equations X^T X β̂ = X^T y that characterize the OLS solution; β̂ is unique when X^T X is invertible.
Intuition: set derivative to zero.
Example: solve for β̂ via linear system.
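A sketch of the card above: solve the normal equations as a plain linear system (this assumes X^T X is invertible) and confirm agreement with the least-squares solver.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
y = X @ np.array([1.0, 2.0, -3.0]) + rng.normal(size=100)

# Solve X^T X beta = X^T y directly (assumes X^T X is invertible).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Same answer as the dedicated least-squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_hat, beta_lstsq))
```

In practice lstsq (or a QR factorization) is preferred numerically, but the normal-equations solve makes the algebra concrete.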
Define hat matrix (H).
H = X(X^T X)^{-1}X^T; maps y to fitted values: ŷ = Hy.
Intuition: linear ‘smoother’ putting the ‘hat’ on y.
Example: compute leverage from diag(H).
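A quick check of the hat-matrix card: build H, confirm ŷ = Hy, and verify that H is symmetric and idempotent (H² = H), as a projection matrix must be.

```python
import numpy as np

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
y = rng.normal(size=30)

H = X @ np.linalg.inv(X.T @ X) @ X.T  # hat matrix
y_hat = H @ y                          # fitted values: H puts the 'hat' on y

# Projection properties: symmetric and idempotent.
print(np.allclose(H, H.T), np.allclose(H @ H, H))
```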
Define leverage (h_ii).
Diagonal element of H; measures how extreme x_i is in predictor space.
Intuition: unusual x gives high leverage.
Example: a company with very large YouTube/Facebook spend.
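Leverage can be read straight off diag(H); a sketch with one deliberately extreme x value (the leverages also sum to the number of parameters, here p + 1 = 2):

```python
import numpy as np

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(30), rng.normal(size=30)])
X[0, 1] = 10.0  # make the first observation extreme in predictor space

H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)

# Leverages sum to p + 1; the extreme point has the largest leverage.
print(leverage.sum(), leverage.argmax())
```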
Define identity matrix (I).
Square matrix with 1s on diagonal, 0 elsewhere; acts like 1 in matrix multiplication.
Intuition: ‘do nothing’ operator.
Example: (X^T X)^{-1}(X^T X) = I.
Define transpose.
Operation swapping rows/columns: (A^T)_{ij} = A_{ji}.
Intuition: flip across diagonal.
Example: (X^T X) is symmetric.
Define symmetric matrix.
A matrix A such that A^T = A.
Intuition: mirror across diagonal.
Example: X^T X is always symmetric.
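The symmetry claim follows from the transpose rules: (X^T X)^T = X^T (X^T)^T = X^T X. A one-line numerical check:

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(10, 3))
G = X.T @ X  # Gram matrix

# (X^T X)^T = X^T X, so G equals its own transpose.
print(np.allclose(G, G.T))
```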