Module 2 Flashcards

(26 cards)

1
Q

Distribution

A

A summarized listing of the distinct values in a data set, together with how often each one occurs

2
Q

3 types of graphs for analyzing the distribution of values of a numerical variable

A
  1. The Dot Plot - discrete data
  2. The Stem and Leaf Plot - discrete data
  3. The Histogram - discrete or continuous
3
Q

Dot Plots and how to construct

A

A graph where each observation is plotted as a dot at the appropriate place above a horizontal axis. Good for a small number of observations.

Step 1: Draw a horizontal axis that displays the possible values of the quantitative data. Label the axis with the variable name.
Step 2: Record each observation by placing a dot over the appropriate value on the horizontal axis.
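
The two steps above can be sketched as a text-only dot plot. This is a minimal illustration, assuming integer-valued data; the function name `dot_plot` and the sample values are made up for the example.

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot for integer data, one dot per observation."""
    counts = Counter(values)
    axis = list(range(min(values), max(values) + 1))   # Step 1: the horizontal axis
    width = max(len(str(v)) for v in axis)             # pad so columns line up
    rows = []
    for level in range(max(counts.values()), 0, -1):   # Step 2: stack the dots
        cells = ("." if counts[v] >= level else " " for v in axis)
        rows.append(" ".join(cell.rjust(width) for cell in cells))
    rows.append(" ".join(str(v).rjust(width) for v in axis))
    return "\n".join(rows)

print(dot_plot([1, 2, 2, 3, 3, 3, 5]))
```

Each column holds as many dots as the value has observations, so the tallest column marks the most frequent value.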

4
Q

Stem Plots and how to build

A

A graph where each observation is separated into two parts. The leaf is the rightmost digit. The stem is everything else.

  1. Write the stems from smallest to largest in a vertical column
  2. In another column, write the leaf in the row of the corresponding stem.
  3. Order the leaves in each row from smallest to largest
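
The steps above can be sketched in a few lines. This is an illustration only, assuming non-negative integers; the function name `stem_and_leaf` and the data are hypothetical, and showing empty stems between the smallest and largest stem is a choice made here, not something the card specifies.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem to its sorted leaves (non-negative integers only)."""
    rows = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)    # leaf = rightmost digit, stem = everything else
        rows[stem].append(leaf)
    # Stems from smallest to largest (empty stems kept), leaves ordered in each row
    return {stem: sorted(rows.get(stem, []))
            for stem in range(min(rows), max(rows) + 1)}

data = [23, 25, 31, 12, 38, 25, 40]   # made-up observations
for stem, leaves in stem_and_leaf(data).items():
    leaf_text = " ".join(str(leaf) for leaf in leaves)
    print(f"{stem} | {leaf_text}")
```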
5
Q

Histogram

A

Uses bars to represent the frequency (counts) or relative frequency of the observations falling into particular intervals (bins). Used for continuous data and large datasets.

6
Q

Method of Left Inclusion

A

For each interval in a histogram, a square bracket on the left end point means that number IS included in the interval, and a round bracket on the right end point means that number is NOT included in the interval.
Ex) Interval A: [25.5, 27.0)
Interval B: [27.0, 28.5)
Only interval B includes the number 27
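
The interval example above can be checked with a small counting routine. This is a sketch, assuming the bin edges are given as a sorted list; `histogram_counts` is a hypothetical helper name and the observations are made up.

```python
def histogram_counts(values, edges):
    """Count observations per bin, treating each bin as [left, right)."""
    counts = [0] * (len(edges) - 1)
    for x in values:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:   # left end included, right end excluded
                counts[i] += 1
                break
    return counts

edges = [25.5, 27.0, 28.5]                            # intervals A and B from the example
print(histogram_counts([26.0, 27.0, 28.4], edges))    # [1, 2]
```

The value 27.0 is counted in interval B only, and a value equal to 28.5 would fall outside both intervals.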

7
Q

Skewness

A

The distribution is asymmetric. It can be right-skewed (longer tail to the right) or left-skewed (longer tail to the left).

8
Q

Modality

A

The number of peaks in a distribution. A peak is a value or interval whose frequency is higher than that of its neighbours. A distribution can be unimodal (one peak), bimodal (two peaks), or multimodal (more than two peaks).

9
Q

Uniform Distribution

A

A special, symmetric distribution where all possible values are observed equally often, so there is no peak.

10
Q

Normal Distribution

A

Also known as the bell curve. It is unimodal and symmetric.

11
Q

The Centre of Distribution

A

A typical or central value of the distribution, measured by the mean, median, or mode

12
Q

Frequency

A

The number of times that a particular distinct value of a variable occurred in a sample
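
Counting frequencies can be sketched with the standard library. The sample below is made up for illustration, and the relative frequency (count divided by sample size) is included because the histogram card mentions it.

```python
from collections import Counter

sample = ["red", "blue", "red", "green", "red", "blue"]   # made-up sample
frequency = Counter(sample)                               # count per distinct value
relative = {value: count / len(sample) for value, count in frequency.items()}

print(frequency["red"])   # 3
print(relative["red"])    # 0.5
```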

13
Q

Contingency Table

A

Tables that summarize the information of two categorical variables (bivariate data) and help answer questions about the relationship between the two variables.
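
A contingency table can be built by counting pairs of categories. This is a minimal sketch; the helper `contingency_table`, the variable names, and the observations are all hypothetical.

```python
from collections import Counter

def contingency_table(pairs):
    """Cross-tabulate (row_category, column_category) pairs into nested counts."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

# Hypothetical survey: (owns a pet?, preferred animal)
observations = [("yes", "dog"), ("yes", "cat"), ("no", "dog"), ("yes", "dog")]
table = contingency_table(observations)
for row, cells in table.items():
    print(row, cells)
```

Each cell holds the number of observations that share both categories, which is what makes it possible to compare one variable across the levels of the other.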

14
Q

When do you use mean?

A

When the histogram has no skewness or the data has no outliers

15
Q

When do you use median?

A

When the histogram has skewness or the data has outliers
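
A small made-up data set shows why the median is preferred when there are outliers: a single extreme value pulls the mean far more than the median.

```python
from statistics import mean, median

values = [30, 32, 35, 36, 40]        # made-up data, roughly symmetric
with_outlier = values + [400]        # same data plus one extreme value

print(mean(values), median(values))              # 34.6 35
print(mean(with_outlier), median(with_outlier))  # 95.5 35.5
```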

16
Q

When do you use the mode? And what is the mode?

A

The mode is the distinct value with the highest frequency.
Use it when the data is categorical. If no value occurs more than once, the data set has no mode. Two modes are possible if two values occur equally often and all other values occur less often.
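
The rules on this card can be sketched as a short function. The name `modes` is hypothetical, and returning an empty list for "no mode" is a choice made for this example.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []                     # no value occurs more than once: no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes(["a", "b", "a"]))    # ['a']
print(modes([1, 1, 2, 2, 3]))    # [1, 2]
print(modes([1, 2, 3]))          # []
```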

17
Q

“The Spread”

A

The variability of a distribution

18
Q

How do you calculate range?

A

Range = max(x1, ..., xn) - min(x1, ..., xn)

aka range = biggest value - smallest value
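
As a quick check of the formula, here is a one-line sketch on made-up numbers (`value_range` is a hypothetical name, chosen to avoid shadowing Python's built-in `range`).

```python
def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

print(value_range([4, 9, 1, 7]))   # 8
```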

19
Q

Lower Quartile

A

Q1 is the 25th percentile that separates the bottom 25% of the data from the top 75%

20
Q

Middle Quartile

A

Q2 is the 50th percentile that splits the data in half (median)

21
Q

Upper Quartile

A

Q3 is the 75th percentile that separates the bottom 75% of the data from the top 25%

22
Q

Inter-Quartile Range (IQR)

A

IQR is the difference between the upper quartile and the lower quartile

IQR = Q3 - Q1
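
The quartiles and the IQR can be sketched as follows. Several conventions exist for computing quartiles and the cards do not say which one the course uses; this example assumes the "median of each half" convention, and the function name and data are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Needs at least two values; the middle value is left out of both
    halves when the number of observations is odd.
    """
    xs = sorted(values)
    half = len(xs) // 2
    lower = xs[:half]                # values below the median position
    upper = xs[len(xs) - half:]      # values above the median position
    return median(lower), median(xs), median(upper)

q1, q2, q3 = quartiles([1, 3, 4, 6, 8, 9, 12])   # made-up data
iqr = q3 - q1
print(q1, q2, q3, iqr)   # 3 6 9 6
```

Other conventions (for example, interpolation-based percentiles) can give slightly different quartiles for the same data.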

23
Q

How do you know if a value is an outlier?

A

If the value falls outside the interval [lower limit, upper limit]

Lower limit = Q1 - 1.5 × IQR

Upper limit = Q3 + 1.5 × IQR
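
The two limits can be turned into a small check. This sketch takes Q1 and Q3 as inputs so that it does not depend on any particular quartile convention; the function names and numbers are made up for illustration.

```python
def outlier_limits(q1, q3):
    """Return (lower limit, upper limit) from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(x, q1, q3):
    """True if x falls outside [lower limit, upper limit]."""
    lower, upper = outlier_limits(q1, q3)
    return x < lower or x > upper

print(outlier_limits(3, 9))    # (-6.0, 18.0)
print(is_outlier(20, 3, 9))    # True
print(is_outlier(18, 3, 9))    # False
```

A value exactly equal to a limit is inside the interval, so it is not flagged.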

24
Q

Random Variables

A

A measurable characteristic that varies from one member of a population to another and whose observed value depends on chance

25
Q

Continuous Random Variables

A

A random variable whose possible values cannot be listed individually; they are instead grouped into classes or intervals.

26
Q

3 Properties of a Continuous Probability Distribution

A
  1. The probability that a randomly selected member of the population falls into a specific interval is given by the area under the density curve over that interval.
  2. The total area under the density curve always equals 1.
  3. There is no area under the density curve above a single point, so the probability of any single exact value is 0.
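
The three properties can be checked on the simplest density curve, a uniform distribution on an interval from a to b, whose density is a flat line of height 1 / (b - a). This is a sketch under that assumption; `uniform_area` and the numbers are hypothetical.

```python
def uniform_area(a, b, lo, hi):
    """Area under the Uniform(a, b) density curve between lo and hi."""
    left, right = max(a, lo), min(b, hi)       # keep only the part inside [a, b]
    return max(0.0, right - left) / (b - a)    # width times height 1 / (b - a)

print(uniform_area(0, 10, 2, 5))     # property 1: probability of the interval, 0.3
print(uniform_area(0, 10, 0, 10))    # property 2: total area, 1.0
print(uniform_area(0, 10, 4, 4))     # property 3: a single point has area 0.0
```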