STAT500 Terms Flashcards

(66 cards)

1
Q

α (‘Alpha’)

A

The probability of committing a Type I error. Also known as the significance level.

2
Q

Analysis of Variance (ANOVA)

A

A statistical method that analyzes variances to test whether the means of two or more populations (typically more than two) are equal.

3
Q

Bar Chart

A

A graph where the height of the bar for each category is equal to the frequency (number of observations) in the category.

4
Q

Bayes’ Theorem

A

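A theorem stating that if (A_1, \dots, A_k) are (k) mutually exclusive and exhaustive events and (B) is an event with (P(B) > 0), then (P(A_i \mid B) = \dfrac{P(B \mid A_i)\,P(A_i)}{\sum_{j=1}^{k} P(B \mid A_j)\,P(A_j)}).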
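
As a rough illustration, Bayes' theorem can be applied to a diagnostic test with two events: "has the disease" and "does not". The function name `posterior` and all numbers below (1% prevalence, 90% sensitivity, 5% false positive rate) are made up for this sketch and are not from the course.

```python
def posterior(priors, likelihoods):
    """Return P(A_i | B) for each i, given P(A_i) and P(B | A_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # P(B), by the law of total probability
    return [j / total for j in joint]

# Hypothetical numbers: A_1 = has the disease, A_2 = does not; B = positive test.
priors = [0.01, 0.99]        # P(A_1), P(A_2)
likelihoods = [0.90, 0.05]   # P(B | A_1), P(B | A_2)
post = posterior(priors, likelihoods)
print(round(post[0], 4))     # 0.1538
```

Under these assumed numbers, a positive result raises the probability of disease from 1% to about 15%, because false positives from the much larger healthy group still dominate.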

5
Q

β (‘Beta’)

A

The probability of committing a Type II error.

6
Q

Binary Categorical Variable

A

A variable that has two possible outcomes.

7
Q

Binomial Distribution

A

A special discrete distribution where there are two distinct complementary outcomes, a “success” and a “failure.” It applies when an experiment consists of a fixed number of independent, identical trials, each with the same probability of success.
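
A minimal sketch of the binomial probability of exactly k successes in n trials, assuming the standard formula C(n, k) p^k (1 - p)^(n - k). The function name `binomial_pmf` and the coin-flip numbers are illustrative assumptions.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical example: 10 fair coin flips, probability of exactly 3 heads.
prob = binomial_pmf(3, 10, 0.5)
print(prob)  # 0.1171875
```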

8
Q

Bootstrapping

A

A method that repeatedly resamples, with replacement, from the observed sample to approximate the sampling distribution of a statistic.
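
One common way to carry this out is to resample the observed data with replacement many times and recompute the statistic each time. The sketch below does this for the mean; the data, the function name `bootstrap_means`, the number of resamples, and the seed are all assumptions chosen for illustration.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Approximate the sampling distribution of the mean by resampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=len(sample))  # draw with replacement
        means.append(statistics.mean(resample))
    return means

data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # made-up observations
means = bootstrap_means(data)
print(statistics.stdev(means))  # bootstrap estimate of the standard error of the mean
```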

9
Q

Classical Interpretation of Probability

A

The probability that event E occurs is denoted by P(E). When all outcomes are equally likely, then: (P(E) = \frac{\text{number of outcomes in E}}{\text{number of possible outcomes}}).

10
Q

Coefficient of Variation (CV)

A

A unit-free statistic used to compare dispersion of data from two or more distinct populations, calculated as (CV = \dfrac{\text{Standard Deviation}}{\text{Mean}}).
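
To see why the CV is unit-free, the sketch below computes it for the same made-up heights in centimetres and in metres. It assumes the sample standard deviation is used; the function name `coefficient_of_variation` is hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

heights_cm = [160, 170, 180]     # made-up data
heights_m = [1.60, 1.70, 1.80]   # the same heights in metres
cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(cv_cm, cv_m)  # the two values agree up to floating-point rounding
```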

11
Q

Conditional Probability

A

The probability of one event occurring given that it is known that a second event has occurred.

12
Q

Confounding variable

A

A variable that is in the study and is related to the other study variables, thus having an effect on the relationship between these variables.

13
Q

Control Group

A

The group that did not receive the study treatment(s) and is used as a benchmark.

14
Q

Critical values

A

The values that separate the rejection and non-rejection regions in a hypothesis test.

15
Q

Dependent Events

A

Two events are dependent (not independent) if knowledge of the outcome of one changes the probability of the other.

16
Q

Descriptive statistics

A

Techniques for describing data in ways that capture the essence of the information in the data.

17
Q

Empirical Rule

A

In any normal or bell-shaped distribution, roughly 68% of observations lie within one standard deviation of the mean, 95% within two, and 99.7% within three.
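
The three percentages can be checked against the standard normal distribution. This is a small sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1
# Probability of falling within k standard deviations of the mean.
coverages = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}
for k, coverage in coverages.items():
    print(k, round(coverage, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```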

18
Q

Experimental (study)

A

A study that involves some random assignment of a treatment; researchers can draw cause and effect (or causal) conclusions.

19
Q

Explanatory Variable

A

Variables that serve to explain changes in the response. They may also be called the predictor or independent variables.

20
Q

Factors and Measurements

A

The factors are the controlled categorical predictors in the study. The response, which is recorded but not controlled by the researcher, is sometimes called the measurements.

21
Q

False Negatives

A

When test results come back negative for someone who is actually positive.

22
Q

False Positives

A

When test results come back positive for someone who is actually negative.

23
Q

Independent Events

A

Two events, A and B, are considered independent if the probability of A occurring is not changed based on any knowledge of the outcome of B.

24
Q

Inferential statistics

A

The process of drawing conclusions about the population from sample data.

25
Interquartile Range (IQR)
The difference between the upper and lower quartiles, (Q_3 - Q_1).
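
A short sketch of computing the IQR with Python's standard library. Quartile conventions differ between textbooks and software; this example assumes the default "exclusive" method of `statistics.quantiles`, and the data are made up.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # made-up data
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartiles, default "exclusive" method
iqr = q3 - q1
print(q1, q3, iqr)
```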
26
Lurking variable
A variable that is neither the explanatory variable nor the response variable but has a relationship with both. It is not considered in the study but could influence the relationship.
27
Marginal Probability
The probability of an event without reference to any other event or events occurring.
28
Mean
The arithmetic average of the data: the sum of the values divided by the number of values.
29
Median
The middle value of the ordered data.
30
Mode
The value that occurs most often in the data.
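
The three measures of center (mean, median, mode) can give different answers on the same data. A minimal sketch using Python's `statistics` module on a made-up data set:

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # made-up data, already ordered

mean = statistics.mean(data)      # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 5
median = statistics.median(data)  # average of the two middle values: (3 + 5) / 2 = 4.0
mode = statistics.mode(data)      # most frequent value: 3
print(mean, median, mode)
```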
31
Non-probability Methods
Methods for collecting data that might include convenience sampling or gathering volunteers.
32
Non-Response Bias
Bias that occurs when a large percentage of those sampled do not respond or participate.
33
Nonparametric Methods
Methods that require very few assumptions about the underlying distribution and can be used when the underlying distribution is unspecified.
34
Observational (study)
A study where a researcher records observations or measurements without manipulating any variables.
35
P-value
The probability that the test statistic equals the observed value or a more extreme value under the assumption that the null hypothesis is true.
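
As one concrete case, the sketch below computes a two-sided p-value for a z-test, assuming the test statistic is standard normal when the null hypothesis is true. The function name `two_sided_p_value` and the observed value 1.96 are illustrative assumptions.

```python
from statistics import NormalDist

def two_sided_p_value(z_obs):
    """P(|Z| >= |z_obs|) when Z is standard normal under the null hypothesis."""
    return 2 * (1 - NormalDist().cdf(abs(z_obs)))

p = two_sided_p_value(1.96)
print(round(p, 3))  # 0.05
```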
36
Parameter
Any summary number, like an average or percentage, that describes the entire population.
37
Percentiles
The (p^{th}) percentile is a value such that, after the data are ordered from smallest to largest, at most (p\%) of the data fall below it and at most ((100 - p)\%) fall above it.
38
Pie Chart
A graph where each sector of the circle represents the percentage of that category.
39
Population
Any large collection of objects or individuals about which information is desired.
40
Power
The probability the null hypothesis is rejected given that it is false (i.e., (1-\beta)).
41
Prevalence
The probability or proportion of occurrence of a disease or behavior in the population at a particular point in time.
42
Probability Methods
Methods for collecting data such as simple random sample, stratified random sample, or cluster sample.
43
Proportion
A fraction or part of the total that possesses a certain characteristic.
44
Qualitative (Categorical) (variable)
Data that serves the function of a name only. Sub-types include binary, ordinal, and nominal.
45
Quantitative (variable)
Data that take on numerical values with a meaningful measure of distance between them. Sub-types include discrete and continuous.
46
Randomization
The process of randomly dividing subjects into groups to avoid unintentional selection bias.
47
Range
The difference between the maximum and minimum values of a data set.
48
Rejection region
The set of values for the test statistic that leads to rejection of the null hypothesis.
49
Relative Frequency Concept of Probability
If an experiment is repeated a large number of times, the proportion of trials in which a particular outcome occurs is close to the true probability of that outcome.
50
Replication
Using a sufficient number of subjects to ensure that randomization creates groups that resemble each other closely and to increase the chances of detecting differences.
51
Response Bias
Bias that occurs when study participants either do not respond truthfully or give answers they feel the researcher wants to hear.
52
Response Variable
The variable about which the researcher is posing the question. Also called the outcome or dependent variable.
53
Sample
A representative group drawn from the population.
54
Selection Bias
Bias that occurs when the sample selected does not reflect the population of interest.
55
Sensitivity
The probability of a positive test result given the person is actually positive.
56
Specificity
The probability of a negative test result given the person is actually negative.
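
Sensitivity and specificity can be estimated from the four counts of a test-versus-truth table. The counts below come from a hypothetical screening study and are made up for this sketch.

```python
# Made-up counts from a hypothetical screening study.
true_pos, false_neg = 90, 10    # people who actually have the condition
true_neg, false_pos = 950, 50   # people who actually do not

sensitivity = true_pos / (true_pos + false_neg)  # P(test positive | actually positive)
specificity = true_neg / (true_neg + false_pos)  # P(test negative | actually negative)
print(sensitivity, specificity)  # 0.9 0.95
```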
57
Standard Deviation
Approximately the average distance the values of a data set are from the mean; the square root of the variance.
58
Standard Normal Distribution
A normal distribution with a mean of 0 and a variance of 1. Also known as a Z distribution.
59
Statistic
Any summary number, like an average or percentage, that describes the sample.
60
Statistical literacy
People’s ability to interpret and critically evaluate statistical information and data-based arguments appearing in diverse media channels.
61
Statistics
The art and science of answering questions and exploring ideas through the processes of gathering data, describing data, and making generalizations about a population on the basis of a sample.
62
Subjective Probability
Probability that reflects personal belief, which involves personal judgment, information, intuition, etc.
63
Test statistic
The sample statistic one uses to either reject (H_0) or fail to reject (H_0).
64
Treatments
If there is only one factor, then its levels are the treatments. If there are multiple factors, then the combinations of factor levels are called treatments.
65
Variable
Any characteristic, number, or quantity that can be measured, counted, or observed for record.
66
Variance
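The average squared distance of the data values from the mean. For a sample, the sum of squared deviations is divided by (n - 1) rather than (n).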
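
A brief sketch contrasting the population variance (divide by n) with the sample variance (divide by n - 1), and showing that the standard deviation is the square root of the variance. The data are made up; the functions are from Python's standard `statistics` module.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

pop_var = statistics.pvariance(data)  # sum of squared deviations (32) divided by n = 8
samp_var = statistics.variance(data)  # same sum divided by n - 1 = 7
pop_sd = statistics.pstdev(data)      # square root of the population variance
print(pop_var, samp_var, pop_sd)
```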