Module 2 Flashcards by Teren Hazzard

Distribution

Summarized listing of distinct data values

3 types of graphs to analyze the distribution of values for numerical variables

The Dot Plot - discrete data
The Stem and Leaf Plot - discrete data
The Histogram - discrete or continuous

Dot Plots and how to construct

A graph where each observation is plotted as a dot at the appropriate place above a horizontal axis. Good for a small number of observations.

Step 1: Draw a horizontal axis that displays the possible values of the quantitative data. Label the axis with the variable name.
Step 2: Record each observation by placing a dot over the appropriate value on the horizontal axis.
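
The two steps above can be sketched as a small text-based dot plot. This is only an illustration: the function name `dot_plot`, the made-up `scores` sample, and the assumption that the data are integers are not part of the original card.

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot: one '*' stacked above the axis per observation.

    Assumes integer-valued data (an assumption made for this sketch).
    """
    counts = Counter(values)
    # Step 1: the horizontal axis lists every possible integer value in range.
    axis = list(range(min(values), max(values) + 1))
    width = max(len(str(v)) for v in axis) + 1
    rows = []
    # Step 2: place one dot per observation, stacking repeats upward.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("*" if counts[v] >= level else " ").rjust(width) for v in axis)
        rows.append(row.rstrip())
    rows.append("".join(str(v).rjust(width) for v in axis))
    return "\n".join(rows)

scores = [2, 3, 3, 4, 4, 4, 6]  # made-up sample
print(dot_plot(scores))
```

The tallest stack of dots sits above the value that occurs most often, which is what makes the shape of a small data set easy to read.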

Stem Plots and how to build

A graph where each observation is separated into two parts. The leaf is the rightmost digit. The stem is everything else.

Step 1: Write the stems from smallest to largest in a vertical column.
Step 2: In another column, write each leaf in the row of its corresponding stem.
Step 3: Order the leaves in each row from smallest to largest.
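
A minimal sketch of these steps, assuming non-negative integer data. The function name `stem_plot`, the sample values, and the choice to list stems that have no leaves are assumptions made for illustration.

```python
from collections import defaultdict

def stem_plot(values):
    """Return stem-and-leaf rows for non-negative integers."""
    leaves = defaultdict(list)
    for v in values:
        # The leaf is the rightmost digit; the stem is everything else.
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    rows = []
    # Stems run from smallest to largest, including stems with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        ordered = " ".join(str(d) for d in sorted(leaves.get(stem, [])))
        rows.append(f"{stem} | {ordered}".rstrip())
    return rows

for row in stem_plot([27, 12, 30, 15, 27, 21, 48]):  # made-up sample
    print(row)
```

Each printed row shows one stem followed by its sorted leaves, so 27 appears as the leaf 7 (twice) in the row for stem 2.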

Histogram

A graph that uses bars to represent the frequency (count) or relative frequency of the observations falling into particular intervals (bins). Well suited to continuous data and large datasets.

Method of Left Inclusion

For each interval in a histogram, a square bracket on the left endpoint means that number IS included in the interval, and a round bracket on the right endpoint means that number is NOT included in the interval.
Ex) Interval A: [25.5, 27.0)
Interval B: [27.0, 28.5)
Only interval B includes the number 27
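
The rule can be sketched in code using the two intervals from the example. The function `bin_counts`, its parameters, and the sample data are hypothetical; the sketch also assumes equal-width bins.

```python
def bin_counts(values, start, width, n_bins):
    """Count observations per left-inclusive interval [left, right)."""
    counts = [0] * n_bins
    for x in values:
        # Floor division puts a value equal to a left endpoint into that bin,
        # so 27.0 lands in [27.0, 28.5) and not in [25.5, 27.0).
        index = int((x - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

data = [25.5, 26.0, 27.0, 27.0, 28.4]  # made-up sample
counts = bin_counts(data, start=25.5, width=1.5, n_bins=2)
print(counts)  # counts for Interval A and Interval B
```

Both observations equal to 27.0 are counted in Interval B, matching the card. With values that are not exactly representable as floating-point numbers, an endpoint comparison like this may need extra care.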

Skewness

The distribution is asymmetric. It can be right-skewed (longer tail on the right) or left-skewed (longer tail on the left).

Modality

The number of peaks in a distribution. A peak is a value or interval whose frequency is higher than that of its neighbours. A distribution can be unimodal (one peak), bimodal (two peaks), or multimodal (more than two peaks).

Uniform Distribution

A special, symmetric distribution where all possible values are observed equally with no peak.

Normal Distribution

Also known as the bell curve: a distribution that is unimodal and symmetric.

The Centre of Distribution

A typical or central value of the distribution, measured by the mean, the median, or the mode.

Frequency

The number of times that a particular distinct value of a variable occurred in a sample
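
As a small illustration, frequencies and relative frequencies can be tallied with Python's `collections.Counter`. The sample of blood types is made up.

```python
from collections import Counter

sample = ["A", "B", "A", "O", "A", "B"]  # made-up sample of blood types

# Frequency: how many times each distinct value occurred in the sample.
frequency = Counter(sample)

# Relative frequency: frequency divided by the sample size.
n = len(sample)
relative_frequency = {value: count / n for value, count in frequency.items()}

for value in sorted(frequency):
    print(value, frequency[value], relative_frequency[value])
```

Listing every distinct value next to its frequency gives the summarized listing that the Distribution card describes.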

Contingency Table

Tables that summarize two categorical variables (bivariate data) and help answer questions about the relationship between the two variables.
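
A minimal sketch of building such a table by counting pairs of categories. The two variables (smoker status and exercise level) and all observations are made up for illustration.

```python
from collections import Counter

# Made-up observations: (smoker status, exercise level) for each person.
observations = [
    ("yes", "low"), ("no", "high"), ("no", "low"),
    ("yes", "low"), ("no", "high"),
]

# Each cell holds the number of observations with that pair of categories.
cells = Counter(observations)
row_labels = sorted({r for r, _ in observations})
col_labels = sorted({c for _, c in observations})
table = {r: {c: cells[(r, c)] for c in col_labels} for r in row_labels}

for r in row_labels:
    print(r, table[r])
```

Comparing counts across a row or down a column is what lets the table speak to the relationship between the two variables.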

When do you use mean?

When the histogram shows no skewness and the data has no outliers

When do you use median?

When the histogram has skewness or the data has outliers
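
A short illustration of why the median is preferred here, using Python's `statistics` module. The salary figures are made up.

```python
from statistics import mean, median

salaries = [40, 42, 45, 47, 50]   # made-up sample, roughly symmetric
with_outlier = salaries + [400]   # the same sample plus one extreme value

# The outlier pulls the mean far from the bulk of the data,
# while the median barely moves.
print(mean(salaries), median(salaries))
print(mean(with_outlier), median(with_outlier))
```

In this sample a single outlier moves the mean from 44.8 to 104, while the median only moves from 45 to 46.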

When do you use the mode? And what is the mode?

The mode is the distinct value with the highest frequency.
Use it when the data is categorical. If no value occurs more than once, the data set has no mode. Two modes are possible if two values occur equally often and every other value occurs less often.
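
A small sketch that follows the conventions on this card, including the "no mode" case. The function name `modes` and the sample data are assumptions made for illustration.

```python
from collections import Counter

def modes(values):
    """Return the sorted list of modes, or [] when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # No value occurs more than once, so there is no mode.
        return []
    # Every value tied for the highest frequency is a mode.
    return sorted(v for v, c in counts.items() if c == top)

print(modes(["red", "blue", "red", "green", "blue"]))  # two modes
print(modes([1, 2, 3]))                                # no mode
```

The standard library's `statistics.multimode` is similar, but it returns every value when nothing repeats, which differs from the convention stated on the card.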

“The Spread”

variability of a distribution

How do you calculate range?

Range = max(x1, ..., xn) - min(x1, ..., xn)

aka range = biggest value - smallest value
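
For example, with a made-up sample:

```python
data = [7, 3, 12, 5, 9]  # made-up sample

# Range = biggest value - smallest value
data_range = max(data) - min(data)
print(data_range)  # 12 - 3 = 9
```

Because the range uses only the two most extreme observations, a single outlier can change it a great deal.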

Lower Quartile

Q1 is the 25th percentile that separates the bottom 25% of the data from the top 75%

Middle Quartile

Q2 is the 50th percentile that splits the data in half (median)

Upper Quartile

Q3 is the 75th percentile that separates the bottom 75% of the data from the top 25%

Inter-Quartile Range (IQR)

IQR is the difference between the upper quartile and the lower quartile

IQR = Q3 - Q1
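
A minimal sketch of the calculation using `statistics.quantiles` from the Python standard library. Textbooks and software use several slightly different quartile conventions, so the `method="inclusive"` choice here is an assumption and may not match the course's hand method on every data set. The sample is made up.

```python
from statistics import quantiles

data = [1, 3, 4, 6, 7, 9, 11, 12, 15]  # made-up sample, already sorted

# n=4 splits the data into four parts and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q2, q3)  # lower quartile, median, upper quartile
print(iqr)         # IQR = Q3 - Q1
```

For this sample the quartiles are 4, 7, and 11, so the IQR is 7: the middle half of the data spans 7 units.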

How do you know if a value is an outlier?

If the value falls outside the interval [lower limit, upper limit]

Upper limit = Q3 + 1.5 × IQR

Lower limit = Q1 - 1.5 × IQR
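
The rule can be sketched as follows, taking the quartiles as given inputs. The function name `find_outliers`, the sample data, and the quartile values passed in are assumptions made for illustration.

```python
def find_outliers(values, q1, q3):
    """Return (outliers, lower limit, upper limit) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    # A value is an outlier if it falls outside [lower limit, upper limit].
    outliers = [x for x in values if x < lower or x > upper]
    return outliers, lower, upper

data = [1, 3, 4, 6, 7, 9, 11, 12, 40]  # made-up sample
outliers, lower, upper = find_outliers(data, q1=4, q3=11)
print(lower, upper)  # the two limits
print(outliers)      # values outside the limits
```

Here the IQR is 7, so the limits are -6.5 and 21.5, and 40 is the only value flagged as an outlier.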

Random Variables

A measurable characteristic that varies from one member of the population to another and whose observed value depends on chance

Continuous Random Variables

Random variables whose possible values cannot be listed individually. The values are instead described using classes or intervals.

3 Properties for Continuous Probability distribution

1. The probability that a randomly selected member of the population falls into a specific interval is given by the area under the density curve over that interval.
2. The total area under the density curve is always equal to 1.
3. There is no area under the density curve above a single point, so the probability of any single exact value is 0.
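
The three properties can be checked numerically for one particular density curve. The standard normal distribution is used here purely as an example; the helper `prob_between` and the chosen intervals are assumptions made for illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=0, sigma=1)  # standard normal, chosen only as an example

def prob_between(a, b):
    """P(a <= X <= b): the area under the density curve between a and b."""
    return X.cdf(b) - X.cdf(a)

p_interval = prob_between(-1, 1)    # property 1: area over an interval
p_total = prob_between(-1e9, 1e9)   # property 2: essentially the whole curve
p_point = prob_between(0.5, 0.5)    # property 3: a single point has no area

print(round(p_interval, 4), p_total, p_point)
```

The interval from -1 to 1 has an area of about 0.68, the whole curve has an area of 1, and the single point 0.5 has an area of 0.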

Module 2 Flashcards

(26 cards)