Refreshers Flashcards

(49 cards)

1
Q

What is the purpose of studying the relationship between two quantitative variables?

A

To determine whether the variables are related, the strength and type of the relationship, and whether predictions can be made.

2
Q

A __________ is used to visually display the relationship between two quantitative variables.

A

Scatterplot

3
Q

True / False: A scatterplot can show direction, form, strength, and outliers in a relationship.

A

True

4
Q

What does the correlation coefficient (r) measure?

A

The strength and direction of a linear relationship between two quantitative variables.

5
Q

What is the range of the correlation coefficient r?
A) 0 to 1
B) –∞ to +∞
C) –1 to +1
D) –100 to +100

A

C

6
Q

True / False: A correlation of r = 0 implies no relationship of any kind between two variables

A

False (it means no linear relationship)

7
Q

A positive correlation means that as one variable increases, the other variable __________.

A

Also increases

8
Q

What are the null and alternative hypotheses for a correlation test?

A

H₀: ρ = 0
H₁: ρ ≠ 0
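A common test statistic for these hypotheses (not stated on the card) is t = r·√(n − 2) / √(1 − r²), compared against a t distribution with n − 2 degrees of freedom; a sketch with invented values:

```python
# t statistic for testing H0: rho = 0 (r and n below are hypothetical).
import math

r = 0.75  # sample correlation (invented)
n = 20    # sample size (invented)

t = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)
print(round(t, 2))  # compared against the t critical value with n - 2 df
```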

9
Q

When is the null hypothesis rejected in a correlation test?

A

When |r| is greater than the critical value.

10
Q

True / False: Correlation implies causation

A

False

11
Q

Why can combining different subgroups in a dataset produce misleading correlations?

A

Because relationships within groups can be masked or reversed when data are combined

12
Q

What is linear regression used for?

A

To model the relationship between variables, describe trends, and make predictions.

13
Q

The regression line is also called the line of __________

A

Best fit

14
Q

True / False: Regression should be performed even if the correlation is not significant

A

False

15
Q

What is an independent variable (IV)?

A

The variable that causes or predicts changes in another variable.

16
Q

What is a dependent variable (DV)?

A

The variable that is measured as the outcome.

17
Q

In a study on sleep and test scores, which is the IV and which is the DV?

A

IV: Hours of sleep
DV: Test score

18
Q

One way to identify the DV is to ask which variable occurs __________ in time.

A

Later

19
Q

What is the strongest evidence for establishing causation?

A

Experimental evidence with replication.

20
Q

What is the main purpose of linear regression?

A

To model a linear relationship and make predictions about one quantitative variable from another

21
Q

What method is used to find the regression line?

A

Least squares

22
Q

What is the “line of best fit”?

A

The line that best represents the overall trend of the data.

23
Q

Why are regression assumptions important?

A

Violated assumptions can invalidate the model and its conclusions

24
Q

What is a residual?

A

The vertical distance between an observed value and its predicted value

25

Q

Name the five key assumptions of linear regression

A

1. Linearity
2. Normality of residuals
3. Equal variance of residuals (homoscedasticity)
4. Independence of residuals
5. No multicollinearity

26

Q

What type of variable must the dependent variable (DV) be in linear regression?

A

Continuous

27

Q

True / False: Multicollinearity affects the dependent variable directly

A

False

28

Q

True / False: Linearity refers to the distribution of the residuals.

A

False

29

Q

What are residuals in regression?

A

The differences between observed values and predicted values.

30

Q

Unequal variance of residuals is called __________

A

Heteroscedasticity

31

Q

What does the normality of residuals assumption require?

A

Residuals should be approximately normally distributed

32

Q

What is an observed value in regression?

A

The actual value of the dependent variable collected from data (yᵢ)

33

Q

Can residuals be negative?

A

Yes, if the predicted value is larger than the observed value (yᵢ – ŷᵢ < 0)

34

Q

What is a predicted value in regression?

A

The value estimated by the regression model for a given independent variable (ŷᵢ)

35

Q

What is the general form of a multiple linear regression equation?

A

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + eᵢ

where Y = DV, the Xs = IVs, the βs = coefficients, and e = residual

36

Q

What is leverage?

A

Extreme values in independent variable(s) (X) that can disproportionately affect regression results

37

Q

What VIF value indicates no collinearity?

A

VIF = 1

38

Q

What VIF value often indicates problematic collinearity?

A

VIF > 10 (though problems can occur with lower values)

39

Q

What is repetitiveness in MLR?

A

Multiple measures of the same construct included as IVs.

40

Q

True / False: Standard errors can still be computed when IVs are perfectly correlated (r = 1)

A

False; the regression fails because the matrix cannot be inverted

41

Q

Name three ways to detect collinearity.

A

1. Large changes in regression coefficients when adding/removing IVs
2. Unexpected signs of coefficients or large standard errors
3. Variance Inflation Factor (VIF)

42

Q

Name two effects of high collinearity on regression results

A

1. Misleading significance of IVs (some may appear non-significant)
2. Inflated standard errors of coefficients

43

Q

What is the Variance Inflation Factor (VIF)?

A

A measure of how much the variance of a regression coefficient is inflated due to collinearity among IVs

44

Q

How are outliers detected?

A

By examining studentized residuals

45

Q

What is effect size?

A

A standardized measure of the magnitude of a difference or relationship, independent of sample size

46

Q

How are Type I and Type II errors expressed probabilistically?

A

Type I: P(Reject H₀ | H₀ true) = α
Type II: P(Fail to reject H₀ | H₀ false) = β

47

Q

How do you calculate Cohen's d for independent samples?

A

Cohen's d = (M₂ − M₁) / SD_pooled

where SD_pooled = √((SD₁² + SD₂²) / 2)

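The card's formula can be applied directly; the means and SDs below are hypothetical:

```python
# Cohen's d for two independent groups, using the pooled-SD formula above
# (which assumes equal group sizes).
import math

m1, sd1 = 100.0, 15.0  # group 1 mean and SD (invented)
m2, sd2 = 106.0, 13.0  # group 2 mean and SD (invented)

sd_pooled = math.sqrt((sd1**2 + sd2**2) / 2)
d = (m2 - m1) / sd_pooled
print(round(d, 2))  # ~0.43, a small-to-medium effect by Cohen's benchmarks
```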
48

Q

Can a “small” effect size still be important?

A

Yes — even small effects can have practical or cumulative significance, especially in large populations.

49

Q

What is an a priori power analysis?

A

Estimating, before data collection, the minimum sample size needed to detect a true effect, given the desired power, alpha level, and expected effect size
