Descriptive Statistics Flashcards by Franchesca Reese Uy

Branch of statistics that describes or summarizes data.

Descriptive Statistics

Differentiate Exploratory Data Analysis (EDA) from Descriptive Statistics

Exploratory Data Analysis (EDA) helps you understand your data.

Descriptive statistics help you explain your data to others.

Ways of Describing Data

Frequency Distribution → shows values and how often they occur.

Bar Graph → for nominal/ordinal data.

Histogram → for interval/ratio data.

Frequency Polygon → plots points at class midpoints instead of bars.

A way of describing data that presents the score values and their frequency of occurrence.

Frequency Distribution

How frequency distributions of Nominal or Ordinal Data are customarily plotted

Bar Graph

used to represent frequency distributions composed of interval or ratio data using bars

Histogram

Used to represent interval or ratio data using a point that is plotted over the midpoint of each interval at a height corresponding to the frequency of the interval

Frequency Polygon
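
A minimal sketch of how the points of a frequency polygon could be derived, assuming a made-up list of exam scores, a class-interval width of 10, and a first interval starting at 50 (all illustrative choices, none taken from the cards). Each point pairs a class midpoint with that interval's frequency; the printed `#` bars stand in for histogram bars.

```python
from collections import Counter

# Hypothetical exam scores (illustrative data, not from the flashcards).
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 79, 82, 85, 91]

width = 10    # assumed class-interval width
lowest = 50   # assumed lower limit of the first interval

# Count how many scores fall in each interval [lower, lower + width).
counts = Counter((s - lowest) // width for s in scores)

# One (midpoint, frequency) pair per interval: the points of a frequency polygon.
polygon_points = []
for k in range(max(counts) + 1):
    lower = lowest + k * width
    midpoint = lower + width / 2
    polygon_points.append((midpoint, counts.get(k, 0)))

for midpoint, freq in polygon_points:
    # The row of '#' characters is a crude text histogram bar.
    print(f"{midpoint:5.1f}  {freq:2d}  {'#' * freq}")
```

Joining the `(midpoint, frequency)` pairs with straight lines would give the polygon; drawing a bar over each interval instead would give the histogram.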

indicates the proportion of the total number of scores in each interval.

Relative Frequency Distribution

indicates the number of scores that fall below the upper limit of each interval.

Cumulative Frequency Distribution

indicates the percentage of scores that fall below the upper limit of each interval.

Cumulative Percentage Distribution

f/N

Relative Frequency

frequency of interval + frequencies of all class intervals below it.

Cumulative Frequency

cumulative f / N × 100

Cumulative Percentage
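
The three formulas on these cards (f/N, running total of f, and cumulative f / N × 100) can be sketched on one small table. The intervals and frequencies below are invented for illustration, and the variable names are assumptions rather than anything from the deck.

```python
# Hypothetical grouped frequency distribution (illustrative numbers only).
intervals = ["50-59", "60-69", "70-79", "80-89", "90-99"]
freqs = [2, 4, 6, 5, 3]

N = sum(freqs)  # total number of scores

rows = []
cum_f = 0
for label, f in zip(intervals, freqs):
    rel_f = f / N               # relative frequency = f / N
    cum_f += f                  # cumulative frequency = f + all frequencies below it
    cum_pct = cum_f / N * 100   # cumulative percentage = cumulative f / N x 100
    rows.append((label, f, rel_f, cum_f, cum_pct))

print("interval   f  rel f  cum f  cum %")
for label, f, rel_f, cum_f_row, cum_pct in rows:
    print(f"{label:8} {f:3d}  {rel_f:5.2f}  {cum_f_row:5d}  {cum_pct:5.1f}")
```

Two quick checks follow from the definitions: the relative frequencies sum to 1, and the last interval's cumulative percentage is 100.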

also known as the Gaussian Distribution

Normal Distribution

symmetrical and bell shaped

curves outwards at the top and then inwards nearer the bottom, the tails getting thinner and thinner

Normal Distribution

Note: Real data are rarely perfectly normal. As long as the distribution is close to a normal distribution, the departure will not matter too much.

a non-symmetrical distribution

skewed distribution

the curve rises rapidly and then drops off slowly

Positive Skew

📌 In simpler terms:

The tail of the distribution is stretched out to the right (higher values).

Most of the scores are low, but a few very high scores pull the mean upward.

the curve rises slowly and then decreases rapidly

Negative Skew

📌 In simpler terms:

The tail of the distribution is stretched out to the left (lower values).

Most of the scores are high, but a few very low scores pull the mean downward.
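
A rough sketch of how the direction of skew could be checked numerically. It assumes one common moment-based definition of skewness (the mean of cubed z-scores) and two invented data sets; the `skewness` helper is hypothetical, written for this example. A tail to the right should give a positive value with the mean above the median, and a tail to the left the reverse.

```python
from statistics import mean, median, pstdev

def skewness(data):
    """Moment-based skewness: mean of cubed z-scores (one common definition)."""
    m, sd = mean(data), pstdev(data)
    return sum(((x - m) / sd) ** 3 for x in data) / len(data)

# Hypothetical scores (illustrative only).
right_tailed = [1, 2, 2, 3, 3, 3, 4, 4, 15]    # a few very high scores
left_tailed = [16 - x for x in right_tailed]   # mirror image: a few very low scores

for name, data in [("right-tailed", right_tailed), ("left-tailed", left_tailed)]:
    print(name,
          "skew =", round(skewness(data), 2),
          "mean =", round(mean(data), 2),
          "median =", median(data))
```

Because the second data set is an exact mirror image of the first, its skewness has the same size and the opposite sign.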

occurs when there are either too many people at the extremes of the scale, or not enough people at the extremes.

Kurtosis

when there are insufficient people in the tails (ends) of the scores to make the distribution normal.

Negative Kurtosis

when there are too many people, too far away, in the tails of the distribution.

Positive Kurtosis

a small number of data points that lie far from the rest of the scores when the distribution is otherwise approximately normal. Usually easy to spot in a histogram.

Outliers

the most central value of a data set; the mean, median, and mode each interpret “central” in a different sense.

Central Tendency

Measures of central tendency

Mean (x̄): sum of scores ÷ number of scores.

Median: middle score when ordered.

Mode: most frequent score.
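
As a small illustration, the three measures can be computed with Python's standard `statistics` module. The scores are invented, and include one extreme value to show how the mean and median can differ.

```python
from statistics import mean, median, multimode

# Hypothetical scores with one extreme value (illustrative only).
scores = [2, 3, 3, 4, 5, 5, 5, 6, 30]

print("mean   =", mean(scores))       # sum of scores / number of scores
print("median =", median(scores))     # middle score when ordered
print("mode   =", multimode(scores))  # most frequent score(s), returned as a list
```

`multimode` is used here rather than `mode` because it returns every tied value when a data set has more than one most frequent score.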

Measures of Dispersion

Range = highest – lowest.

Interquartile Range (IQR) = Q3 – Q1.

Variance = average of squared deviations.

Standard Deviation (SD) = square root of variance.

∑x / N

Mean (arithmetic mean)

the middle score in a set of scores. Used when the mean is not appropriate, for example because the data are not symmetrically or normally distributed, or because the data are measured at an ordinal level.

Median

the most frequent score in the distribution or the most common observation among a group of scores.

Mode

the simplest measure of dispersion. It is the distance between the highest score and the lowest score. Simple but distorted by outliers.

Range

measure of dispersion used with ordinal data or with non-normal distributions. Resistant to outliers, often reported with the median.

Inter-Quartile Range (IQR)

the distance between the upper and lower quartiles.

Inter-Quartile Range (IQR)

square root of variance.

Standard Deviation (SD)

average of squared deviations.

Variance

Standard Deviation (SD)

Tells how spread out data are relative to the mean.
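
The four measures of dispersion can be sketched on one invented data set. This assumes the population forms of variance and SD (dividing by N, matching "average of squared deviations") and the default quartile method of Python's `statistics.quantiles`; other quartile conventions give slightly different IQR values.

```python
from statistics import quantiles, pvariance, pstdev

# Hypothetical scores (illustrative only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)   # highest - lowest

# quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1                             # distance between upper and lower quartiles

variance = pvariance(scores)              # average of squared deviations from the mean
sd = pstdev(scores)                       # square root of the variance

print("range    =", score_range)
print("IQR      =", iqr)
print("variance =", variance)
print("SD       =", sd)
```

Replacing the 9 with a much larger score would change the range directly, while the IQR depends only on the quartiles and is far less affected.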

Show the median, quartiles, and outliers visually. Whiskers usually extend to the most extreme score within 1.5 × IQR of the box; scores beyond that are plotted individually as outliers.

Boxplots
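
A minimal sketch of the numbers behind a boxplot, assuming the common 1.5 × IQR rule and the default quartile method of `statistics.quantiles`. The data and variable names are invented for illustration, and plotting libraries may compute quartiles slightly differently.

```python
from statistics import quantiles

# Hypothetical scores with one extreme value (illustrative only).
scores = [12, 14, 15, 15, 16, 17, 18, 19, 20, 45]

q1, med, q3 = quantiles(scores, n=4)   # box edges (Q1, Q3) and the median line
iqr = q3 - q1

# Fences sit 1.5 x IQR beyond each edge of the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

inside = [x for x in scores if lower_fence <= x <= upper_fence]
outliers = [x for x in scores if x < lower_fence or x > upper_fence]

# Whiskers stop at the most extreme scores that are still inside the fences.
whisker_low, whisker_high = min(inside), max(inside)

print("box:", q1, med, q3)
print("whiskers:", whisker_low, whisker_high)
print("outliers:", outliers)
```

Here the score of 45 falls beyond the upper fence, so it would be drawn as a separate point rather than stretching the whisker.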

describe/summarize the data a researcher has

Descriptive Statistics

helps researchers understand the data they have, while descriptive statistics help them explain to other people what is happening in the data

Exploratory data analysis

Different ways of Describing the Distribution

- Frequency Table
- Charts (e.g., histograms, bar charts, etc.)

Highest frequency

Mode

used to present the pattern in the data

Charts

Frequency distributions of nominal or ordinal data are customarily plotted using what?

Bar Graph

used to represent frequency distributions composed of interval or ratio data

Histogram

used to represent interval or ratio data.

Frequency polygon

presents the score values and their frequency of occurrence

Frequency distribution

Why does it matter if a distribution is normal or not?

There are mathematical equations that can be used to draw a normal distribution. These equations can be used in statistical tests. A lot of tests depend on the data being from a normal distribution. Many variables are approximately normally distributed. Normality makes it easier to do inferential statistics accurately and without bias.
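
As one illustration of "equations that draw a normal distribution", Python's standard `statistics.NormalDist` exposes the curve's height (`pdf`) and the area below a value (`cdf`). The mean of 100 and SD of 15 are assumed values chosen only for the example.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 100, SD 15 (illustrative values).
dist = NormalDist(mu=100, sigma=15)

# Height of the bell curve at the mean and one SD either side (symmetric).
print(dist.pdf(100), dist.pdf(85), dist.pdf(115))

# Proportion of scores expected within 1, 2, and 3 SDs of the mean.
for k in (1, 2, 3):
    within = dist.cdf(100 + k * 15) - dist.cdf(100 - k * 15)
    print(f"within {k} SD: {within:.3f}")
```

The three proportions come out at roughly 68%, 95%, and 99.7%, which is the kind of known property that tests assuming normality rely on.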

occurs when most scores pile up at the lowest possible value, so that only a few of the subjects are "strong enough to get off the floor." Causes positive skew.

Floor Effect

occurs when most scores pile up at the highest possible value. Causes negative skew and is much less common in psychology.

Ceiling Effect

Prerequisites for mean

1. Bell-shaped distribution
2. Continuous variable (ratio/interval)

the square of the Standard Deviation.

Variance

√( ∑(x − x̄)² / N )

Population standard deviation

√( ∑(x − x̄)² / (n − 1) )

Sample standard deviation
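
A short check of the two standard deviation formulas against Python's standard library, where `pstdev` divides by the number of scores and `stdev` divides by one less than that. The data are invented and the variable names are assumptions for the example.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

# Hypothetical scores (illustrative only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
m = mean(scores)
ss = sum((x - m) ** 2 for x in scores)   # sum of squared deviations from the mean

population_sd = sqrt(ss / n)             # divide by n
sample_sd = sqrt(ss / (n - 1))           # divide by n - 1

print(population_sd, pstdev(scores))     # hand calculation vs. library
print(sample_sd, stdev(scores))
```

Dividing by n − 1 always gives the slightly larger value, and the gap shrinks as the number of scores grows.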

Descriptive Statistics Flashcards

(60 cards)