Model Adequacy and Assumption Checking Flashcards

(133 cards)

1
Q

Why must the order of variables in hierarchical regression be justified?

A

Because the order determines which variables control for variance before others are tested

2
Q

What does commonality analysis measure?

A

The unique and shared contributions of predictors to the variance explained in the dependent variable.

3
Q

What concept determines the order of variables in hierarchical regression?

A

Causal priority

4
Q

Why should demographic variables often be entered first in hierarchical regression?

A

Because they are usually considered control variables that precede psychological or behavioral predictors.

5
Q

What follow-up technique did Petrocelli recommend for understanding predictor contributions?

A

Commonality analysis

6
Q

What is multicollinearity?

A

A condition where predictor variables are highly correlated with each other

7
Q

Why is ΔR² important in hierarchical regression?

A

It shows the unique contribution of a new set of predictors beyond those already in the model

8
Q

What mistake occurs when researchers try to maximize R² in hierarchical regression?

A

They treat hierarchical regression like stepwise regression rather than theory-driven analysis

9
Q

What statistics are commonly used to detect multicollinearity?

A

Variance Inflation Factor (VIF) and tolerance.

11
Q

When does measurement error typically attenuate regression coefficients?

A

When measurement errors are uncorrelated with each other

12
Q

Why is normality less critical in large samples?

A

Because the Central Limit Theorem makes statistical estimates approximately normally distributed.

13
Q

What are the two misconceptions about multiple regression addressed in the article?

A

1. Variables must be normally distributed.
2. Measurement error always attenuates regression coefficients.

14
Q

What happens when homoscedasticity is violated?

A

Heteroscedasticity occurs, meaning residual variance changes across predictor values.

15
Q

Why is normality of residuals important in regression analysis?

A

It helps ensure accurate hypothesis tests and confidence intervals, especially in small samples

16
Q

Does multiple regression require predictor variables to be normally distributed?

A

No, predictors do not need to be normally distributed.

17
Q

What happens when measurement errors are correlated?

A

Regression coefficients may be biased either upward or downward

18
Q

What is effect size?

A

A quantitative measure of the magnitude of a relationship or difference between variables.

19
Q

What effect size measure is commonly used for ANOVA?

A

Cohen's f

20
Q

What is sensitivity analysis in power analysis?

A

Determining the smallest effect size that can be detected with a given sample size, α, and power

21
Q

How does effect size influence statistical power?

A

Larger effect sizes increase statistical power

22
Q

What is criterion power analysis?

A

Determining the significance level (α) needed to achieve a desired level of power with a fixed sample size and effect size

23
Q

What effect size measure is commonly used in multiple regression?

A

Cohen's f²

24
Q

What is the formula for statistical power?

A

Power = 1 − β

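A minimal sketch of the Power = 1 − β relationship, using a two-sided one-sample z-test as a stand-in (the effect size and sample size below are hypothetical):

```python
from statistics import NormalDist

def power_one_sample_z(d, n, alpha=0.05):
    # Power = 1 - beta for a two-sided one-sample z-test with
    # standardized effect size d (Cohen's d) and sample size n.
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d * n ** 0.5  # shift of the test statistic under H1
    beta = z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)
    return 1 - beta

# A medium effect (d = 0.5) with n = 34 clears the conventional 0.80 bar:
print(round(power_one_sample_z(0.5, 34), 2))  # → 0.83
```

The same function also shows the factors from the cards above: power rises with effect size, sample size, and α, and falls as any of them shrink.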
25
What happens if a study is underpowered?
It may fail to detect real effects, increasing the likelihood of Type II errors
26
What level of statistical power is commonly recommended in research?
0.80 (80% probability of detecting a true effect).
27
What does the regression rule 𝑁≥104+𝑚 help researchers evaluate?
The significance of individual predictors
28
What is a common sample size guideline for experimental studies?
Approximately 20–30 participants per group.
29
What are the four main factors that influence statistical power?
Effect size, sample size, significance level (α), and variability in the data
30
What is a “rule of thumb” in sample size determination?
A general guideline used to estimate sample size when formal power analysis is unavailable
31
What is a commonly suggested minimum sample size for correlation studies?
Around 30 participants
32
What rule of thumb did Green (1991) propose for testing individual predictors in regression?
N≥104+m
33
Why should rules of thumb be used cautiously in research?
They do not account for effect size, study design, or variability
34
What is the recommended process for determining sample size?
Estimate effect size, set α, choose desired power, and conduct a power analysis
35
What is a partial correlation?
The relationship between Y and a predictor after removing the effects of other predictors from both variables
36
What are orthogonal predictors?
Independent variables that are not correlated with each other
37
One solution to multicollinearity problems?
Remove redundant predictors or use ridge regression
38
What does DFFITS measure?
The influence of a data point on its own predicted value
39
What does independence of errors mean?
Residuals are not correlated with one another
40
Difference between partial and semi-partial correlation?
Semi-partial removes shared variance from the predictor only, while partial removes shared variance from both the predictor and the outcome
41
What is an influential observation?
A data point that significantly changes the regression coefficients or model fit
42
What does DFBETAS measure?
The influence of an observation on individual regression coefficients
43
What is a semi-partial (part) correlation?
The relationship between Y and a predictor after removing the effect of other predictors from that predictor only.
44
What does Cook’s Distance measure?
The overall influence of an observation on the regression model.
45
What important interpretation comes from squared semi-partial correlation?
It equals the change in R² when that predictor is added to the model.
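For the two-predictor case, this identity can be checked numerically from the pairwise correlations alone (the correlation values below are made up):

```python
# Hypothetical correlations: Y-X1, Y-X2, and X1-X2.
r_y1, r_y2, r_12 = 0.50, 0.40, 0.30

# R^2 of the full two-predictor model (standard two-predictor formula).
r2_full = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

# Change in R^2 when X2 is added to a model already containing X1.
delta_r2 = r2_full - r_y1**2

# Semi-partial correlation of X2 (X1 partialled out of X2 only).
sr2 = (r_y2 - r_y1 * r_12) / (1 - r_12**2) ** 0.5

print(abs(sr2**2 - delta_r2) < 1e-12)  # → True
```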
46
What are studentized residuals (internal)?
Residuals scaled by their estimated standard deviation that accounts for leverage
47
What does R² represent in multiple regression?
The proportion of variance in the dependent variable explained by the entire regression model.
48
Which residual type is best for detecting influential outliers?
R-student (external studentized) residuals
49
Why is R² not sufficient for understanding predictors?
Because it shows total model fit but does not show the unique contribution of each predictor
50
Why are correlated predictors more complicated in regression?
Because predictors share overlapping variance, making it difficult to determine each predictor’s unique contribution
51
Can a model have a large R² but no significant predictors?
Yes, when predictors are highly correlated and share overlapping variance.
52
What is the primary goal of multiple regression?
To determine how much variance in the dependent variable (Y) is explained by a set of independent variables (predictors)
53
What is the QQ plot used for?
To assess normality of residuals, focusing on deviations in the tails
54
Tests for constant variance (homoscedasticity) include:
Breusch-Pagan test, Bartlett’s test, Levene’s test.
55
When plotting residuals vs fitted values, what indicates a problem?
Patterns such as funnels (heteroscedasticity) or curves (nonlinearity); a random scatter indicates assumptions are met
56
Purpose of residual plots?
To check linearity, constant variance (homoscedasticity), normality of residuals, and identify outliers or influential points
57
What complication arises when IVs are correlated?
The individual R²s for IVs do not sum to the overall model R², making it harder to determine unique contributions
58
Define semi-partial (part) correlation
The correlation between Y and an IV after removing the variance explained by other IVs from that IV only
59
Metrics to assess influential observations include:
Cook’s Distance, DFFITS, DFBETAS
60
Name four methods for scaling residuals.
Standardized residuals, studentized (internal) residuals, PRESS residuals, R-student (external) residuals
61
When should you report semi-partial correlations?
When you are interested in the unique contribution of a given independent variable to Y.
62
Why is normality of residuals important?
Needed for t-tests, F-tests, and confidence intervals; estimates of coefficients can still be obtained if normality is violated, but inference may be invalid.
63
What is Weighted Least Squares (WLS)?
A regression method where each observation is weighted inversely to its error variance to handle heteroscedasticity
64
What is the purpose of centering variables in regression?
Improves interpretability (especially intercept), reduces multicollinearity for polynomials and interactions, without changing model fit.
65
Why are transformations used in linear regression?
To achieve linearity, stabilize residual variance (homoscedasticity), and improve normality of residuals.
66
What is the purpose of linearizing transformations?
To make the relationship between X and Y linear, meeting regression assumptions
67
What issues arise with polynomial models?
Overfitting, extrapolation errors, multicollinearity, interpretability of coefficients.
68
What are the two main types of transformations?
1. Linearizing transformations – usually applied to predictors (X)
2. Variance-stabilizing transformations – usually applied to the response (Y)
69
Name some common transformations to linearize data.
Logarithm, square root, power, reciprocal, exponential.
70
When is using Box‑Cox particularly recommended?
When skewness is severe; when traditional transformations do not sufficiently normalize the data; or when the relationship between variance and mean is non-linear.
71
What is the effect of Box‑Cox transformations on correlation and regression coefficients?
By improving normality and stabilizing variance, it can increase the reliability and interpretability of correlations and regression results
72
How do modern statistical packages help with Box‑Cox transformations?
They can automatically estimate the optimal λ and provide visualizations or likelihood plots to guide the transformation
73
Can Box‑Cox be applied to skewed data in both directions?
Yes, it can handle positively or negatively skewed data, often by adjusting the data beforehand
74
Why must predictions be back-transformed?
To interpret results in the original units of Y.
75
What is the Box-Cox transformation when λ = 0?
Y^(0)=log(Y)
76
What is the Box-Cox transformation formula (λ ≠ 0)?
Y^(λ) = (Y^λ − 1) / λ
77
What transformation corresponds to λ = 1?
No transformation (approximately the original data)
78
What is the purpose of the Box-Cox transformation?
To improve linear regression assumptions by reducing non-normality, non-linearity, and heteroscedasticity.
79
What is the inverse (back-transformation) formula for λ ≠ 0?
Y = (λ·Y^(λ) + 1)^(1/λ)
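The transform and back-transform pair from these cards can be written out directly (a sketch only; a real analysis would estimate λ, e.g. by maximum likelihood):

```python
import math

def boxcox(y, lam):
    # Box-Cox transform of a positive value y.
    if lam == 0:
        return math.log(y)          # lambda = 0 case: log transform
    return (y ** lam - 1.0) / lam   # general case: (y^lambda - 1) / lambda

def boxcox_inverse(t, lam):
    # Back-transform to the original scale of Y.
    if lam == 0:
        return math.exp(t)
    return (lam * t + 1.0) ** (1.0 / lam)

# Round-trip check for a few lambdas, including the log case (lambda = 0):
for lam in (0.0, 0.5, 1.0, 2.0):
    y = 4.0
    assert abs(boxcox_inverse(boxcox(y, lam), lam) - y) < 1e-9
```

Note that λ = 1 gives Y − 1, a shift that leaves the shape of the data unchanged, matching the "no transformation" card above.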
80
In a log(Y) ~ X model, how is β₁ interpreted?
As an approximate percentage change in Y for a 1-unit increase in X
81
What does a model Y ~ log(X) represent?
Change in Y for a percentage change in X.
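The "approximate percentage change" reading of β₁ in a log(Y) ~ X model can be verified numerically (the coefficient value is hypothetical):

```python
import math

beta1 = 0.05  # hypothetical slope from a log(Y) ~ X model

# Exact % change in Y per 1-unit increase in X, vs. the usual approximation:
exact_pct = (math.exp(beta1) - 1) * 100
approx_pct = beta1 * 100

print(round(exact_pct, 2), approx_pct)  # the two agree closely for small beta1
```

The approximation degrades as |β₁| grows, which is why exact back-transformation matters for large coefficients.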
82
Why is interpretation in original units important?
It makes results understandable and meaningful in real-world terms.
83
What happens to confidence intervals after transformation?
They apply to the transformed scale and must be interpreted carefully.
84
What should researchers always report when using transformations?
The type of transformation and the reason for using it.
85
What does an S-shaped curve in a Q–Q plot indicate?
Skewness in the data.
86
What does each point in a Q–Q plot represent?
A pair: (theoretical quantile, sample quantile).
87
What are the axes of a P–P plot?
Theoretical probabilities vs. empirical probabilities.
88
Which plot is more sensitive to differences in the center of the distribution?
P–P plot
89
What are the axes of a Q–Q plot?
Theoretical quantiles vs. sample quantiles.
90
What indicates a good fit in a Q–Q plot?
Points lying close to a straight 45° line
91
When is a P–P plot commonly used?
For checking overall distribution fit.
92
What is the rule of thumb for DFFITS?
|DFFITS| > 2√(p/n)
93
What is a leverage point in regression?
A point with an unusual X-value (far from the mean of predictors).
94
What does COVRATIO > 1 indicate?
The point improves precision.
95
What is an alternative to removing influential points?
Use robust regression methods.
96
What does COVRATIO ≈ 1 indicate?
The point has little effect on model precision.
97
What is the main purpose of regression diagnostics?
To identify points that distort model fit and estimates.
98
What two factors typically make a point influential?
High leverage and a large residual.
99
What does Cook’s Distance combine?
Residual size and leverage.
100
What does DFBETAS measure?
The influence of a point on each regression coefficient β̂ⱼ.
101
What does a large Mahalanobis distance indicate?
A potential multivariate outlier.
102
Does high leverage always mean a point is influential?
No, only if it also has a large residual.
103
What is the rule of thumb for high leverage?
hᵢᵢ > 2p/n
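For simple regression the leverages have a closed form, so the 2p/n rule is easy to demonstrate (the data are hypothetical, with one x-value placed far from the mean on purpose):

```python
# Leverage in simple linear regression: h_i = 1/n + (x_i - xbar)^2 / Sxx.
x = [1.0, 2.0, 3.0, 4.0, 20.0]  # hypothetical data; 20 sits far from the mean
n = len(x)
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
h = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]

p = 2               # parameters: intercept + slope
cutoff = 2 * p / n  # the 2p/n rule of thumb
flagged = [xi for xi, hi in zip(x, h) if hi > cutoff]
print(flagged)      # → [20.0]
```

As a sanity check, the leverages always sum to the number of model parameters p.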
104
When should you remove an observation?
If it is an error, invalid, or not part of the population.
105
Why compare models with and without influential points?
To assess their impact on results.
106
What does DFFITS measure?
The influence of a point on its own predicted value.
107
What is the rule of thumb for DFBETAS?
|DFBETAS| > 2/√n
108
What are standardized residuals?
Residuals converted into z-scores using a common variance estimate.
109
What are common causes of outliers?
Data entry errors, measurement errors, or true extreme values.
110
Does high leverage guarantee influence?
No
111
What distribution do studentized residuals follow?
Approximately a t-distribution.
112
What should you do after identifying influential points?
Re-run the model with and without them.
113
Why are studentized residuals preferred over standardized residuals?
They are more accurate because they account for unequal variance.
114
What is the key question diagnostics help answer?
Is the model being overly influenced by a few unusual observations?
115
What is a common cutoff for Cook’s Distance?
> 4/n or > 1
116
What is an outlier in regression?
A case with an extreme value in the response variable (Y)
117
What is the range of leverage values?
0 to 1
118
What is an influential observation?
A point that significantly changes regression results when removed.
119
What does VIF > 5 indicate?
High collinearity (investigate)
120
What does p ≤ 0.05 mean for normality tests?
Residuals deviate from normality.
121
What does VIF > 10 indicate?
Very high collinearity (problematic).
122
What does VIF between 2–5 indicate?
Moderate collinearity (monitor).
123
What kurtosis range is acceptable?
±3
124
What does Pr(>F) represent?
p-value for significance of predictors.
125
What does Sum of Squares (SS) represent?
Variation explained by predictors.
126
What is semi-partial (part) correlation?
Unique contribution of a predictor to the outcome.
127
When is leverage considered high?
When hᵢᵢ > 2p/n
128
What does part correlation indicate?
Unique variance explained by that predictor.
129
What does zero-order correlation show?
Raw, uncontrolled relationships.
130
What is mesokurtic distribution?
Normal-shaped (baseline).
131
What does the F-value represent?
Ratio of model variance to residual variance.
132
What does VIF between 1–2 indicate?
Low collinearity (acceptable)
133
What is platykurtic distribution?
Flatter with light tails (fewer outliers).