What is designing in the language of statistics?
Setting up a hypothesis or question and deciding how to collect the data.
What is describing in the language of statistics? (descriptive statistics)
Summarizing data with numbers and graphs.
What is inference in the language of statistics? (inferential statistics)
Decisions and predictions based on the data.
The typical statistical model assumes what? (model = assumptions)
(but this is not always the case)
What is the significance level? (rule of thumb)
5 %.
What is random sampling and why is it important?
Making sure that each subject in the population has the same chance of being included in the sample. This matters because it makes the sample a good reflection of the population.
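A minimal sketch of drawing a simple random sample in Python. The population of subject IDs, the sample size, and the seed are all made up for illustration:

```python
import random

# Hypothetical population: subject IDs 1..1000 (invented for this example).
population = list(range(1, 1001))

# A fixed seed is used only so the draw can be reproduced.
rng = random.Random(42)

# sample() draws without replacement, and every subject
# has the same chance of ending up in the sample.
sample = rng.sample(population, k=50)

print(len(sample), "subjects drawn, all distinct:", len(set(sample)) == len(sample))
```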
What does inferential statistics refer to?
Methods of making decisions or predictions about a population, based on data obtained from a sample of that population.
What is the difference between a parameter and a statistic?
Parameter: a numerical summary of the population
Statistic: a numerical summary of the sample taken from the population
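The distinction can be illustrated with simulated data. The "population" below is generated, not real, and the names `parameter` and `statistic` are chosen only for this sketch:

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population of 10,000 incomes (simulated values, not real data).
population = [rng.gauss(50_000, 10_000) for _ in range(10_000)]

# Parameter: numerical summary of the whole population (usually unknown in practice).
parameter = statistics.mean(population)

# Statistic: the same summary computed on a random sample from that population.
sample = rng.sample(population, k=100)
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.0f}")
print(f"sample mean (statistic):     {statistic:.0f}")
```

The statistic will usually be close to the parameter but not equal to it, and it changes from sample to sample.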
When is a variable categorical and when is it quantitative?
What is unordered (nominal) data, and what type of variable is it?
Categorical, e.g. male/female, type of business, ZIP code.
What is ordered (ordinal) data, and what type of variable is it?
Categorical, e.g. grades, Likert scales.
What is a good graph?
Check colors: www.colorbrewer2.org
Remember to use different line types, colors, and plotting symbols.
Remember that it might be printed in black and white.
What is a discrete variable, and what type of variable is it?
Numerical: takes values in a set of separate numbers (typically non-negative integers).
E.g.: 0, 1, 2, 3, … (number of employees, number of companies, etc.)
What is a continuous variable?
Numerical: May take any value in an interval
E.g.: income, sales etc.
When is a variable discrete and when is it continuous?
What is the median?
The middle observation
E.g.: 1,1,1,2,2,2,3,3,4,5,6,7,7,8,8
Median = 3
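The example above can be checked in Python. With an odd number of observations (15 here), the median is the single middle value of the ordered data:

```python
import statistics

data = [1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 7, 7, 8, 8]

ordered = sorted(data)
n = len(ordered)             # 15 observations
middle_index = n // 2        # index 7, i.e. the 8th observation
median = ordered[middle_index]

# The standard library gives the same result.
assert median == statistics.median(data)
print(median)  # 3
```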
When is it called the modal category and when is it called the mode?
Both refer to the most frequent outcome in a data set.
Modal category ⇒ the category with the highest frequency
Mode ⇒ the numerical value (quantitative) that occurs most frequently
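A short sketch of both cases, using small made-up data sets (the business types and employee counts are invented for illustration):

```python
from collections import Counter
import statistics

# Categorical variable: the modal category is the most frequent category.
business_types = ["retail", "service", "retail", "manufacturing", "retail", "service"]
modal_category, frequency = Counter(business_types).most_common(1)[0]
print(modal_category, frequency)  # retail 3

# Quantitative variable: the mode is the most frequent numerical value.
employees = [2, 3, 3, 5, 3, 8, 5]
mode = statistics.mode(employees)
print(mode)  # 3
```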
What are the primary graphical displays for summarizing a categorical variable?

What are the primary graphical displays for summarizing quantitative variables?

What is the “mode” in a frequency table or histogram?
The value (or interval) at the highest point, i.e. the one with the highest frequency.
What do unimodal and bimodal refer to?
Whether the histogram or frequency table has a single mound or two distinct mounds.

What do symmetric and skewed shapes refer to?

What is the “mean” of a distribution of a quantitative variable?
The sum of the observations divided by the number of observations.
(The average / The balance point of the distribution)
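A small worked example with made-up observations. It also shows the "balance point" idea: the deviations from the mean sum to zero.

```python
# Hypothetical observations, chosen so the arithmetic is easy to follow.
observations = [2, 4, 4, 5, 10]

# Mean = sum of the observations / number of observations = 25 / 5
mean = sum(observations) / len(observations)
print(mean)  # 5.0

# Balance point: values below and above the mean cancel out exactly.
deviations = [x - mean for x in observations]
print(sum(deviations))  # 0.0
```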

What is the median?
The median is the middle value of the observations when they are ordered from the smallest to the largest.
(With an even number of observations there is no single middle value. E.g. with 20 observations, the median is the average of the 10th and 11th ordered observations.)
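A sketch of the even case with 20 made-up observations. Note that it is the values at positions 10 and 11 that are averaged, not the position numbers themselves:

```python
import statistics

# 20 hypothetical observations: 2, 4, 6, ..., 40
data = list(range(2, 41, 2))

ordered = sorted(data)
n = len(ordered)               # 20, an even number

tenth = ordered[n // 2 - 1]    # 10th ordered observation (index 9)
eleventh = ordered[n // 2]     # 11th ordered observation (index 10)
median = (tenth + eleventh) / 2

# The standard library applies the same rule for even-sized data.
assert median == statistics.median(data)
print(tenth, eleventh, median)  # 20 22 21.0
```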