MLR & Assumptions Flashcards

(49 cards)

1
Q

What is statistical power?

A

The probability of detecting a true effect when the effect actually exists

2
Q

Why is reporting only a p-value insufficient?

A

Because it does not convey the magnitude of the effect or whether the study was adequately powered

3
Q

True / False: A statistically significant result always implies a practically important effect

A

False

4
Q

What is a Type I error?

A

Rejecting the null hypothesis when it is actually true (false positive)

5
Q

What values of Cohen's d correspond to medium and large effects?

A

Medium: d = 0.5

Large: d = 0.8
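
Worked example (a minimal sketch, not part of the original card): Cohen's d for two independent groups is the difference in means divided by the pooled standard deviation. The function name and the scores below are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores for two groups (illustration data only)
treatment = [12, 14, 15, 17, 18]
control = [10, 12, 13, 15, 16]
d = cohens_d(treatment, control)
print(round(d, 2))
```

With these made-up numbers d lands just above 0.8, which Cohen's benchmarks would label a large effect.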

6
Q

Power analysis depends on sample size, α, and _______

A

Effect size
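
Worked example (a rough sketch under stated assumptions): the relationship between power, sample size, α, and effect size can be illustrated with a normal approximation to a two-sided, two-sample test with equal group sizes. The function name is hypothetical, and an exact calculation would use the noncentral t distribution, so treat the numbers as approximate.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)        # critical value for the chosen alpha
    shift = d * sqrt(n_per_group / 2)        # expected test statistic under the true effect
    # Probability that the statistic falls in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Medium effect (d = 0.5) at two sample sizes
power_64 = approx_power_two_sample(0.5, 64)
power_128 = approx_power_two_sample(0.5, 128)
print(round(power_64, 2), round(power_128, 2))
```

Under this approximation, about 64 participants per group gives roughly 80% power for d = 0.5, and doubling the sample size pushes power higher.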

7
Q

What combination should be reported to strengthen scientific conclusions?

A

p-value

Effect size

Power analysis / sample size justification

8
Q

True / False: Increasing sample size increases statistical power

A

True

9
Q

According to Cohen, what is considered a small effect?

A

d=0.2

10
Q

What is leverage?

A

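A measure of how far an observation's X values lie from the center of the predictor data; a high-leverage point is an observation with extreme X values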
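
Worked example (a sketch for the simple one-predictor case only): with a single predictor and an intercept, the leverage of observation i is h_i = 1/n + (x_i − x̄)² / Σ(x_j − x̄)². The function name and data are made up for illustration.

```python
def leverages(x):
    """Hat values for simple linear regression with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    # Leverage grows with squared distance from the mean of x
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]


# Hypothetical predictor values; the last one is far from the rest
x = [1, 2, 3, 4, 20]
h = leverages(x)
print([round(hi, 3) for hi in h])
```

The leverages sum to 2 here (the number of estimated coefficients), and the point at x = 20 takes almost all of it.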

11
Q

Residual = observed value minus ______ value

A

Predicted (fitted)

12
Q

What is collinearity?

A

High correlation between independent variables

13
Q

What does a curved pattern in a residual plot indicate?

A

Violation of linearity

14
Q

True / False: A good residual plot should show a random horizontal band

A

True

15
Q

When does the intercept have no intrinsic meaning?

A

When a value of 0 for the predictors is impossible or lies outside the observed range of the data

16
Q

What are residuals in regression?

A

The part of each observed value that the fitted model leaves unexplained (the estimated error):

e𝑖 = y𝑖 − ŷ𝑖
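
Worked example (a minimal sketch with one predictor): fit a least-squares line, then subtract each fitted value from the observed value. The helper name and the data are made up for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data (illustration only)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]  # observed minus fitted
print([round(e, 2) for e in residuals])
```

When the model includes an intercept, the residuals sum to zero (up to floating-point error), which is a quick sanity check on a fit.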

17
Q

What does VIF = 1 indicate?

A

No collinearity: that predictor is uncorrelated with the other predictors (Rj² = 0)

18
Q

What is multicollinearity?

A

High correlation among more than two predictors, so that one predictor is close to a linear combination of the others.

19
Q

What is an outlier in regression?

A

An observation with an extreme Y value given its X values, i.e., one with an unusually large residual.

20
Q

Why are bivariate correlations insufficient to detect multicollinearity?

A

They cannot capture combined relationships among multiple predictors

21
Q

Why are regression diagnostics essential?

A

Because valid inference depends on assumptions being met

22
Q

What is the tolerance statistic?

A

The inverse of VIF (tolerance = 1/VIF = 1 − Rj²); the proportion of variance in a predictor not explained by the other predictors.

23
Q

T/F: Multicollinearity violates OLS assumptions

A

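False, as long as the collinearity is not perfect, but it inflates standard errors and makes coefficient estimates unstable. (Perfect collinearity makes the coefficients impossible to estimate uniquely.)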

24
Q

Why should the global F-test be run before individual t-tests?

A

To control the Type I error inflation that comes from running many individual t-tests.

25
T/F: Predictor variables must be completely uncorrelated to run MLR
False — perfect independence is unrealistic, but strong correlation causes problems.
26
What is multiple linear regression?
A method for modeling or predicting a continuous response variable using two or more independent variables with linear relationships
27
T/F: Multiple linear regression always improves model performance.
False — adding predictors can inflate R² without improving predictive value
28
What is the global (omnibus) F-test?
A test of whether any predictor contributes to explaining Y
29
State the hypotheses for the global F-test.
H₀: β₁ = β₂ = … = βₖ = 0
H₁: At least one βⱼ ≠ 0 (for j = 1, …, k)
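
Worked example (a sketch, assuming a model with an intercept, k predictors, and n observations): the global F statistic can be written in terms of R² as F = (R²/k) / ((1 − R²)/(n − k − 1)). The function name and the inputs are hypothetical.

```python
def global_f(r2, n, k):
    """Global (omnibus) F statistic from R², sample size n, and k predictors."""
    df_model = k            # numerator degrees of freedom
    df_error = n - k - 1    # denominator degrees of freedom
    return (r2 / df_model) / ((1 - r2) / df_error)


# Hypothetical model: R² = 0.5 with 2 predictors and 23 observations
f_stat = global_f(0.5, 23, 2)
print(f_stat)
```

The statistic is compared with an F distribution on (k, n − k − 1) degrees of freedom; a large value is evidence against H₀.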
30
What is a partial regression coefficient?
The expected change in Y for a one-unit increase in a predictor, holding all other predictors constant.
31
T/F: The estimate of 𝜎^2 depends on the model specification
True
32
What does adjusted R² correct for?
The number of predictors in the model.
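
Worked example (a minimal sketch): adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the sample size and k the number of predictors. The function name and the numbers are made up for illustration.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R² for a model with an intercept, k predictors, n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical comparison: a fifth predictor nudges R² from 0.500 to 0.505
adj_4 = adjusted_r2(0.500, 21, 4)
adj_5 = adjusted_r2(0.505, 21, 5)
print(round(adj_4, 3), round(adj_5, 3))
```

In this illustration R² rises slightly with the extra predictor, yet adjusted R² falls, because the small gain does not offset the lost degree of freedom.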
33
Why is multiple regression preferred over bivariate regression in practice?
It allows adjustment for confounders, improves precision, and better reflects real-world complexity.
34
T/F: Removing an insignificant predictor does not require refitting the model
False
35
Why can regression coefficients flip signs?
Omitted variables, suppression, or multicollinearity.
36
What does higher precision mean?
Smaller variance or standard error
37
In a curvilinear model, quadratic terms are written as ______.
𝑥^2
38
What test compares nested regression models?
Partial F-test
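
Worked example (a sketch, assuming the reduced model is nested in the full model and both include an intercept): F = ((SSE_reduced − SSE_full)/q) / (SSE_full/(n − k − 1)), where q is the number of predictors dropped and k the number of predictors in the full model. The function name and the sums of squares are hypothetical.

```python
def partial_f(sse_reduced, sse_full, q, n, k_full):
    """Partial F statistic for comparing nested regression models."""
    df_error_full = n - k_full - 1
    extra_ss_per_term = (sse_reduced - sse_full) / q   # gain per dropped predictor
    mse_full = sse_full / df_error_full                # error variance estimate, full model
    return extra_ss_per_term / mse_full


# Hypothetical: dropping 2 of 4 predictors raises SSE from 100 to 120 (n = 55)
f_partial = partial_f(120.0, 100.0, q=2, n=55, k_full=4)
print(f_partial)
```

The result is compared with an F distribution on (q, n − k − 1) degrees of freedom.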
39
Fill in the blank – Partial regression coefficients measure the effect of a predictor ________ controlling for all other predictors
on the dependent variable
40
What is the main advantage of adding predictors in multiple regression?
Improves prediction/explanation by reducing unexplained variance and allowing control of extraneous variables
41
How is precision of a regression coefficient defined?
The inverse of its variance; smaller variance → higher precision.
42
What is suppression in regression?
A situation in which a variable (a suppressor) increases the predictive power of other predictors by removing irrelevant variance from them, even though it is weakly related or unrelated to the dependent variable itself.
43
What does R² represent in multiple regression?
The proportion of variance in Y explained by the predictors.
44
What is the difference between a confidence interval and a prediction interval?
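A CI gives a range for the mean response at given predictor values; a PI gives a range for an individual new observation at those values and is always wider, because it also includes the error variance of a single observation.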
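
Worked example (a sketch for simple linear regression only): at a point x0, the standard error for the mean response is s·sqrt(1/n + (x0 − x̄)²/Sxx), while the standard error for a new observation is s·sqrt(1 + 1/n + (x0 − x̄)²/Sxx). The function name and data are made up for illustration.

```python
from math import sqrt


def interval_standard_errors(x, y, x0):
    """Return (SE of mean response, SE of a new observation) at x0."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))                       # residual standard error
    core = 1 / n + (x0 - x_bar) ** 2 / sxx        # uncertainty in the fitted line
    return s * sqrt(core), s * sqrt(1 + core)     # the extra 1 is the new point's own error


# Hypothetical data (illustration only)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
se_mean, se_pred = interval_standard_errors(x, y, x0=4)
print(round(se_mean, 3), round(se_pred, 3))
```

Each interval is the fitted value plus or minus a t critical value (n − 2 degrees of freedom) times the corresponding standard error, so the prediction interval is the wider of the two.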
45
Give an example of moderation in regression.
An interaction term: X×W→Y, e.g., diet program * BMI → weight loss.
46
T/F: Confounding is operationally present if βcrude ≠ βadjusted
True
47
Give an example of mediation in regression.
A causal pathway: X→M→Y, e.g., physical activity → caloric expenditure → weight loss.
48
How is multicollinearity detected?
Using the Variance Inflation Factor (VIF) or tolerance, where tolerance = 1/VIF. A common rule of thumb: VIF > 10 (tolerance < 0.1) indicates high multicollinearity.
49
What is the formula for the Variance Inflation Factor (VIF) of a predictor Xj in multiple regression?
VIFj = 1 / (1 - Rj²), where Rj² is the R² from regressing Xj on all the other predictors
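
Worked example (a sketch restricted to two predictors): with only two predictors, Rj² is simply the squared Pearson correlation between them, so VIF and tolerance can be computed by hand. With more predictors, Rj² would come from regressing Xj on all the others. The function names and data are made up for illustration.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """Return (VIF, tolerance) when the model has exactly two predictors."""
    r_squared = pearson_r(x1, x2) ** 2   # equals Rj² in the two-predictor case
    tolerance = 1 - r_squared
    return 1 / tolerance, tolerance


# Hypothetical predictor values (illustration only)
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif, tol = vif_two_predictors(x1, x2)
print(round(vif, 2), round(tol, 2))
```

Here the predictors correlate at r = 0.8, giving a tolerance of 0.36 and a VIF of about 2.8, well below the VIF > 10 rule of thumb.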