Unit 3 Flashcards

(31 cards)

1
Q

Univariate Data

A

data that consists of observations of only one characteristic

2
Q

Bivariate Data

A

Data that includes observations of two characteristics

3
Q

Explanatory variable

A

The “x variable”: a variable that may help predict or explain changes in the response variable

4
Q

Response variable

A

The “y variable”: a variable that measures the outcome of a study

5
Q

Scatterplot

A

shows the relationship between 2 QUANTITATIVE variables measured on the same group of individuals

6
Q

What to use to describe a scatterplot

A

Form
Outliers/unusual features
Direction
Strength

SEE PAGE 3

7
Q

Describing scatterplots

Positive Association

A

Large values of the explanatory variable tend to be associated with large values of the response variable (and small values with small values)

8
Q

Describing scatterplots

Negative association

A

Large values of the explanatory variable tend to be associated with small values of the response variable (and small values with large values)

9
Q

Describing scatterplots

No association

A

knowing the value of one variable does NOT help us predict the other

10
Q

Describing scatterplots

Direction

A

positive
negative
none

11
Q

Describing scatterplots

Form

A

Shape
Do the data points follow a linear or a curved pattern?

12
Q

Describing scatterplots

Strength

A

weak
moderate
strong

13
Q

Describing scatterplots

Unusual Features

A

outliers, gaps, clusters

14
Q

Correlation (definition and symbol)

A

Measures the direction and strength of a linear relationship between 2 quantitative variables

correlation = r
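
A minimal sketch of how r could be computed by hand, assuming the usual z-score formula (multiply the z-scores of x and y for each point, add them up, divide by n - 1). The data and the names `correlation`, `hours`, and `score` are made up for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Return r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Hypothetical data: hours studied (x) and quiz score (y).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = correlation(hours, score)
print(round(r, 3))  # 0.775 (positive direction, fairly strong)
```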

15
Q

Important properties of the correlation r

A

Always between -1 and 1 (inclusive)
Indicates direction by its sign (r less than 0 = negative association, r greater than 0 = positive association)
r = -1 or r = 1 happens only when there is a perfect linear relationship (all points fall exactly on a line)
Stronger linear relationships have r closer to 1 or -1; weaker ones have r closer to 0
Can ONLY be used for LINEAR relationships
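
These properties can be checked on small made-up data sets. The sketch below assumes the standard sample formula for r; the helper name `correlation` and all numbers are hypothetical.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r (standard formula, divides by n - 1)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

xs = [1, 2, 3, 4, 5]
r_up = correlation(xs, [3, 5, 7, 9, 11])      # points exactly on y = 1 + 2x
r_down = correlation(xs, [10, 8, 6, 4, 2])    # points exactly on y = 12 - 2x
r_scatter = correlation(xs, [2, 4, 5, 4, 5])  # positive, but not perfect

print(round(r_up, 3))       # 1.0
print(round(r_down, 3))     # -1.0
print(round(r_scatter, 3))  # 0.775
```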

16
Q

Caution about Correlation

A

Correlation does NOT imply CAUSATION
Correlation does NOT measure FORM
Correlation only describes LINEAR relationships
Correlation is NOT RESISTANT to outliers
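
Two of these cautions can be seen numerically. This is only an illustration with made-up numbers, assuming the standard sample formula for r: a perfectly curved pattern can give r = 0, and one outlier can flip the sign of r.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r (standard formula, divides by n - 1)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

# Caution: r says nothing about FORM. Here y = x^2 exactly, yet r is 0.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [x ** 2 for x in curve_x]
r_curved = correlation(curve_x, curve_y)

# Caution: r is not resistant. Adding the single point (10, 0) changes it a lot.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_clean = correlation(xs, ys)
r_outlier = correlation(xs + [10], ys + [0])

print(round(r_curved, 3))                     # 0.0
print(round(r_clean, 3), round(r_outlier, 3))  # positive before, negative after
```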

17
Q

Regression Line (definition and equation)

A

A line that describes how a response variable changes as the explanatory variable changes

Equation

ŷ = a + bx

ŷ = predicted response
a = y-intercept
b = slope
x = explanatory variable
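
A small sketch of using the equation to make a prediction. The values a = 2.2 and b = 0.6 are hypothetical, chosen only to have numbers to plug in.

```python
# Hypothetical fitted values, for illustration only.
A = 2.2  # y-intercept: predicted y when x = 0
B = 0.6  # slope: change in predicted y for each 1-unit increase in x

def predict(x, a=A, b=B):
    """Predicted response: y-hat = a + b*x."""
    return a + b * x

y_hat = predict(3)
print(round(y_hat, 2))  # 4.0

# The slope is the difference between predictions one x-unit apart.
step = predict(4) - predict(3)
print(round(step, 2))  # 0.6
```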

18
Q

Extrapolation

A

Use of the regression line to make predictions outside the interval of x-values used to make the scatterplot (and fit the line)

These predictions are often NOT ACCURATE
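
One way to guard against extrapolation is to compare a new x-value with the smallest and largest observed x. The function name `is_extrapolation` and the data are assumptions for this sketch.

```python
def is_extrapolation(x_new, observed_xs):
    """True when x_new lies outside the range of x-values used to fit the line."""
    return x_new < min(observed_xs) or x_new > max(observed_xs)

observed_xs = [1, 2, 3, 4, 5]  # hypothetical x-values from the scatterplot

print(is_extrapolation(3, observed_xs))   # False: inside the interval 1 to 5
print(is_extrapolation(10, observed_xs))  # True: a prediction here is risky
```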

19
Q

Residual

A

The difference between the actual value of y and the value of y predicted by the regression line

Equation

residual = y - ŷ

(Actual - Predicted, AP!)
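
A quick worked example of Actual - Predicted, using a hypothetical line ŷ = 2.2 + 0.6x and made-up points.

```python
def residual(x, y, a, b):
    """Residual = actual y minus predicted y-hat, where y-hat = a + b*x."""
    y_hat = a + b * x
    return y - y_hat

a, b = 2.2, 0.6  # hypothetical intercept and slope

res_above = residual(2, 4, a, b)  # actual 4, predicted 3.4
res_below = residual(1, 2, a, b)  # actual 2, predicted 2.8

print(round(res_above, 2))  # 0.6  (positive: point is above the line)
print(round(res_below, 2))  # -0.8 (negative: point is below the line)
```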

20
Q

Best line to use

A

the Least Squares Regression Line (LSRL)

Because it is the line that minimizes the sum of the squared residual values
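
To see what "minimizes the sum of the squared residuals" means, the sketch below compares the LSRL with a few other candidate lines on made-up data. It assumes the standard least-squares formulas for slope and intercept; the candidate lines are arbitrary.

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]

def sum_squared_residuals(a, b):
    """Sum of (actual - predicted)^2 for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Least-squares slope and intercept from the standard formulas.
x_bar, y_bar = mean(xs), mean(ys)
b_ls = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs))
a_ls = y_bar - b_ls * x_bar

best = sum_squared_residuals(a_ls, b_ls)
other_lines = [(2.0, 0.7), (1.0, 1.0), (3.0, 0.4)]  # arbitrary competitors

print("LSRL:", round(best, 2))
for a, b in other_lines:
    # Every other line has a larger sum of squared residuals.
    print((a, b), round(sum_squared_residuals(a, b), 2))
```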

21
Q

A GOOD regression line

A

makes residuals as small as possible

22
Q

Least Squares Regression Line

A

the line that minimizes the sum of the squared residual values

23
Q

Residual Plot

A

A scatterplot that displays residuals on the y-axis and the explanatory variable on the x-axis

24
Q

If a regression model is appropriate

A

there should be NO curved pattern in the residual plot
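
A sketch of what a curved pattern looks like in the residuals. The data are made up so that y = x² exactly, and a line is fitted with the standard least-squares formulas; the helper name `lsrl` is an assumption.

```python
from statistics import mean

def lsrl(xs, ys):
    """Least-squares intercept a and slope b (standard formulas)."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b

xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]  # truly curved relationship

a, b = lsrl(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
signs = ["+" if e > 0 else "-" for e in residuals]

print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
print(signs)      # ['+', '-', '-', '-', '+']
```

The residuals go positive, negative, positive in a U shape, which suggests a linear model is not appropriate for these data.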

SEE PAGE 13

25
Q

Standard deviation of residuals

A

The typical prediction error in y when using the LSRL

SD of residuals = s
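
A sketch of computing s by hand, assuming the usual formula s = sqrt(sum of squared residuals / (n - 2)). The data and the line ŷ = 2.2 + 0.6x are hypothetical.

```python
from math import sqrt

xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6       # LSRL for these data (intercept, slope)

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Divide by n - 2, then take the square root.
s = sqrt(sum(e ** 2 for e in residuals) / (len(xs) - 2))
print(round(s, 3))  # 0.894: predictions are typically off by about 0.9 units of y
```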
26
Q

R squared tells us

A

the % of the variation in y that is explained by the LSRL
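
One way to see this is to compare the variation left over after using the line with the total variation in y. The sketch assumes the identity r² = 1 - (sum of squared residuals) / (sum of squared deviations of y from its mean); data and line are hypothetical.

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6       # LSRL for these data (intercept, slope)

y_bar = mean(ys)
ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
ss_total = sum((y - y_bar) ** 2 for y in ys)                       # total variation in y

r_squared = 1 - ss_residual / ss_total
print(round(r_squared, 2))  # 0.6: about 60% of the variation in y is explained
```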
27
Q

Interpreting Computer Output

A

Constant Coefficient box = y-intercept
Coefficient box beneath = slope
S = standard deviation of the residuals

SEE PAGE 15
28
Q

LSRL additional use/info

A

b = r(Sy/Sx)
mean of y = a + b(mean of x)
LSRL always passes through (mean of x, mean of y)

SEE PAGE 16
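
These formulas can be used to build the LSRL from summary statistics. A sketch on made-up data, taking Sy and Sx to be the sample standard deviations of y and x; the helper name `correlation` is an assumption.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r (standard formula, divides by n - 1)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
b = r * stdev(ys) / stdev(xs)  # slope: b = r(Sy/Sx)
a = mean(ys) - b * mean(xs)    # intercept, from mean of y = a + b(mean of x)

print(round(b, 2), round(a, 2))  # 0.6 2.2

# The line passes through (mean of x, mean of y).
y_hat_at_mean = a + b * mean(xs)
print(round(y_hat_at_mean, 2), mean(ys))  # 4.0 4
```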
29
Q

Correlation & Regression Wisdom

A

Be aware of limitations:
Only describes LINEAR relationships
Strong correlation DOES NOT imply CAUSATION
Correlation and LSRL are NOT RESISTANT to outliers
30
Q

See page 17

A

LOOK AT PAGE 17
31
Q

Influential point

A

Any point that, when removed from a scatterplot, drastically changes the LSRL or the correlation
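
A rough way to look for an influential point is to remove each point in turn and see how much the LSRL slope changes. This is only a sketch on made-up data; the helper name `slope` and the choice to compare slopes (rather than correlation) are assumptions.

```python
from statistics import mean

def slope(xs, ys):
    """Least-squares slope b (standard formula)."""
    x_bar, y_bar = mean(xs), mean(ys)
    return (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
            / sum((x - x_bar) ** 2 for x in xs))

xs = [1, 2, 3, 4, 5, 10]  # hypothetical data; (10, 0) sits far from the rest
ys = [2, 4, 5, 4, 5, 0]

full_slope = slope(xs, ys)

# Leave each point out once and record how far the slope moves.
changes = {}
for i in range(len(xs)):
    rest_x = xs[:i] + xs[i + 1:]
    rest_y = ys[:i] + ys[i + 1:]
    changes[(xs[i], ys[i])] = abs(slope(rest_x, rest_y) - full_slope)

most_influential = max(changes, key=changes.get)
slope_without = slope(xs[:-1], ys[:-1])

print(most_influential)  # (10, 0)
print(round(full_slope, 2), round(slope_without, 2))  # negative with it, positive without
```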