Advanced Statistics Flashcards

(229 cards)

1
Q

What is a variable in research?

A

Anything measured, observed, or manipulated in a study.

2
Q

What are data?

A

Measurements obtained from variables.

3
Q

What is qualitative data?

A

Data where individuals fall into non-numerically related categories.

4
Q

What is nominal data?

A

Named categories without intrinsic order.

5
Q

What is binary data?

A

Nominal data with only 2 categories.

6
Q

What is ordinal data?

A

Ordered categories where differences between levels are not equal.

7
Q

What is quantitative data?

A

Numerical data obtained by counting or measuring.

8
Q

What is discrete data?

A

Quantitative data restricted to distinct, countable values (typically whole numbers).

9
Q

What is continuous data?

A

Quantitative data that can take any value within a range.

10
Q

Why is the interval-ratio distinction rarely needed in medical research?

A

Most interval data can be treated similarly to ratio data.

11
Q

Can quantitative data be converted into categories?

A

Yes, by defining class intervals.

12
Q

What is an independent variable?

A

A variable controlled by the researcher that is presumed to cause change.

13
Q

What is a dependent variable?

A

A variable whose value depends on the independent variable.

14
Q

What is a mediator variable?

A

A variable that explains how or why an independent variable affects a dependent variable.

15
Q

What distinguishes complete and partial mediators?

A

When a complete mediator is controlled for, the effect of the independent variable on the dependent variable disappears entirely; controlling for a partial mediator reduces but does not eliminate the effect.

16
Q

Can a mediator cause the outcome independently?

A

No.

17
Q

Which tests are used for nominal and ordinal data?

A

Nonparametric tests.

18
Q

Which tests are used for most measured variables?

A

Parametric tests.

19
Q

What is descriptive statistics?

A

Methods used to summarise and communicate data without making inferences.

20
Q

What is the mean?

A

The arithmetic average, sensitive to extreme values.

21
Q

When is the mean inappropriate?

A

For skewed data.

22
Q

What is the median?

A

The value dividing data into 2 equal halves.

23
Q

Why is the median more stable than the mean?

A

It is less influenced by extreme values.

24
Q

What is the mode?

A

The most frequently occurring value.

25
What is skewness?
Asymmetry in a data distribution.
26
How do mean, median, and mode relate in normal distribution?
They are equal.
27
What characterises positive skew?
Mean greater than median with a long right tail.
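The effect of skew on the measures of central tendency can be demonstrated with a short sketch using Python's standard library (the data values are hypothetical):

```python
import statistics

# Hypothetical right-skewed sample (e.g. hospital lengths of stay in days)
stays = [2, 2, 3, 3, 3, 4, 5, 6, 30]

mean = statistics.mean(stays)      # pulled upward by the extreme value (30)
median = statistics.median(stays)  # resistant to the extreme value
mode = statistics.mode(stays)      # most frequently occurring value

# Positive skew: mean > median (long right tail)
print(mean, median, mode)
```

Removing the outlier barely moves the median, but pulls the mean down sharply.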
28
What characterises negative skew?
Mean less than median with a long left tail.
29
What is dispersion?
The spread of data values.
30
What is range?
Difference between highest and lowest values.
31
What is interquartile range?
Difference between the 75th and 25th percentiles.
32
What is variance?
Average squared deviation from the mean.
33
What is standard deviation (SD)?
Square root of variance, expressed in original units.
34
What is standard error (SE)?
SE = SD / √ n
35
What does SE represent?
Precision of the sample mean as an estimate of the population mean.
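The chain from variance to SD to SE can be sketched in a few lines (hypothetical blood-pressure readings, Python standard library):

```python
import statistics
import math

# Hypothetical sample of systolic BP readings (mmHg)
bp = [118, 122, 125, 130, 135, 140, 121, 128, 133, 138]

n = len(bp)
var = statistics.variance(bp)  # sample variance (n - 1 denominator)
sd = statistics.stdev(bp)      # square root of variance, in original units (mmHg)
se = sd / math.sqrt(n)         # SE = SD / sqrt(n): precision of the sample mean
```

Quadrupling the sample size halves the standard error, since SE shrinks with the square root of n.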
36
Which graphs are used for categorical data?
Bar charts and pie charts.
37
What graph is used for continuous data?
Histogram.
38
What does a box-whisker plot display?
Quartiles, median, range, and outliers.
39
Why is normal distribution important?
Many statistical tests assume normality.
40
What proportions lie within 1, 2, and 3 SDs in a normal distribution?
Approximately 68%, 95%, and 99.7% respectively.
41
What is the central limit theorem?
The distribution of sample means approaches a normal distribution as sample size increases, regardless of the shape of the underlying population distribution.
42
What is a standard normal distribution?
A normal distribution with mean 0 and SD 1.
43
What is a z-score?
Standardised value calculated as z = (x − mean) / SD.
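Both the 68/95/99.7 rule and the z-score can be checked with `statistics.NormalDist` from the Python standard library (the observation and population parameters below are hypothetical):

```python
from statistics import NormalDist

std = NormalDist(mu=0, sigma=1)  # standard normal: mean 0, SD 1

# Proportion lying within 1, 2, and 3 SDs of the mean
within = [std.cdf(k) - std.cdf(-k) for k in (1, 2, 3)]
# approximately 0.683, 0.954, 0.997

# z-score: how many SDs an observation lies from the population mean
x, mean, sd = 135, 120, 10  # hypothetical BP value and population parameters
z = (x - mean) / sd
```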
44
What is probability?
Relative frequency of an event, ranging from 0 to 1.
45
What are odds?
Ratio of an event's occurrence to its non-occurrence.
46
How are odds and probability related?
Odds = p / (1 − p); Probability = odds / (1 + odds)
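The two conversion formulas are inverses of each other, which a minimal sketch makes concrete (function names are illustrative):

```python
def prob_to_odds(p):
    """Odds = p / (1 - p)."""
    return p / (1 - p)

def odds_to_prob(odds):
    """Probability = odds / (1 + odds)."""
    return odds / (1 + odds)

# A probability of 0.2 corresponds to odds of 0.25 ("1 to 4"),
# and converting back recovers the original probability
odds = prob_to_odds(0.2)
p = odds_to_prob(odds)
```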
47
What is the target population?
The population to which study results are intended to apply.
48
What is the source population?
The accessible sampling frame.
49
What is the eligible population?
Individuals meeting study criteria.
50
Who are study entrants?
Eligible individuals who consent to participate.
51
Who are study completers?
Participants who complete all study requirements.
52
What is random sampling?
Every individual has equal selection probability.
53
What is stratified sampling?
Random sampling within predefined strata.
54
What is cluster sampling?
Sampling of pre-existing groups rather than individuals.
55
What is non-random sampling?
Sampling where selection probability is unknown.
56
What is purposive sampling?
Deliberate selection based on study objectives.
57
What is inferential statistics?
Methods used to draw conclusions about populations from samples.
58
What is point estimation?
Estimating a single value as the best estimate of the true population parameter.
59
What is a hypothesis?
A conjectural statement linking variables.
60
What is the null hypothesis (H₀)?
Statement of no difference or effect.
61
What is the alternative hypothesis (H₁)?
Statement opposing H₀ and reflecting the research belief.
62
What is a one-tailed hypothesis?
Hypothesis specifying direction of effect.
63
What is a two-tailed hypothesis?
Hypothesis specifying difference without direction.
64
What is a p-value?
The probability of obtaining a result at least as extreme as the one observed, assuming H₀ is true.
65
What p-value is commonly considered to be statistically significant?
p < 0.05.
66
What is a Type I error?
False positive; H₀ is incorrectly rejected in favour of H₁ when H₀ is in fact true.
67
How can a Type I error occur in practice?
When a researcher rejects H₀ and upholds H₁, despite H₀ being true.
68
What is the probability of committing a Type I error called?
Alpha (α).
69
What is the conventional threshold for α?
Less than 0.05, commonly accepted as the level of statistical significance (p-value).
70
How can repeated testing increase Type I error?
Multiple hypothesis testing, subgroup analyses, or secondary analyses increase the chance that at least one test will be falsely significant.
71
What is a Type II error?
False negative; H₀ is incorrectly accepted even though a true difference exists.
72
What is the probability of committing a Type II error called?
Beta (β).
73
What are common causes of Type II error?
Small sample size and large variance.
74
What does 1-β represent?
The power of a study.
75
How are Type I and Type II errors related?
Reducing Type I error generally increases Type II error, and vice versa.
76
What are the traditional values for α and β?
α = 5%, β = 20%.
77
What is the corresponding conventional study power?
80%.
78
What is statistical power?
The ability of a study to detect a true difference between groups when it exists.
79
What are the typical values chosen for power and beta?
β = 0.2 and power = 80% (0.8).
80
What factors influence study power?
Sample size, effect size, variability of observations, and chosen level of statistical significance (p-value).
81
Why is 'power calculation' considered a misnomer?
It is essentially a sample size calculation based on predefined values of α, power, effect size, and variance.
82
How should variance ideally be estimated?
From pilot studies or previously published literature.
83
What is the standardised difference?
Effect size expressed as the target difference in means divided by the standard deviation.
84
What is Altman's nomogram used for?
Estimating sample size using standardised difference and power.
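Altman's nomogram is a graphical tool, but the arithmetic behind it can be approximated with the standard normal-approximation sample size formula. A sketch under that assumption (function name and example values are illustrative, Python standard library):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(std_diff, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of
    2 means: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2, where d is
    the standardised difference."""
    z = NormalDist().inv_cdf
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / std_diff ** 2)

# Medium standardised difference (0.5), alpha 5%, power 80%
n = n_per_group(0.5)
```

Note how halving the standardised difference roughly quadruples the required sample size.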
85
How does increasing the significance level affect power?
It increases power but also increases the risk of Type I error.
86
How does increasing sample size affect power?
It increases power by reducing the standard error, but also increases cost.
87
How does effect size influence power?
Larger target effect sizes yield greater power, but setting the target too large risks missing smaller, clinically meaningful effects.
88
How can variability be reduced to increase power?
Through more precise measurements or matching subjects.
89
Why is a one-sided test more powerful than a two-sided test?
Because it concentrates statistical power in one direction, assuming strong prior justification.
90
How does test selection influence power?
Parametric tests are generally more powerful when assumptions are met.
91
Why are samples used in research?
Samples are used to draw inferences about the population from which they are drawn.
92
When can study results be generalised to the population?
When the sample is reasonably representative of the population.
93
Why can representativeness never be perfect?
Sample data are only approximations of true population values.
94
What is a confidence interval (CI)?
A range of values within which the true population parameter is likely to lie.
95
What does a 95% confidence interval indicate?
That 95 out of 100 similarly drawn samples would contain the true population value.
96
What does a narrower confidence interval indicate?
Greater precision and better representativeness of the sample.
97
For which measures can confidence intervals be calculated?
Means, effect sizes, relative risks, odds ratios, and number needed to treat (NNT).
98
What does the degree of confidence represent?
The complement of the significance level α (e.g. 95% confidence corresponds to α = 0.05).
99
How does increasing confidence level affect confidence interval width?
Higher confidence levels result in wider confidence intervals.
100
What does confidence interval width reflect?
Precision of the estimate, influenced by standard error and sample size.
101
How do confidence interval limits affect interpretation?
If the limits include the value of no difference, there is no evidence of a statistically significant difference.
102
What is the value of no difference?
0 for mean difference, 1 for ratio measures, and infinity for NNT.
103
What information do confidence intervals provide?
Degree of confidence, precision, clinical significance, and statistical significance.
104
How can confidence interval width be reduced?
Lower confidence level, reduced SD, or increased sample size.
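A 95% CI for a mean, using the normal approximation (z = 1.96), can be sketched as follows; the data are hypothetical, and for a sample this small a t critical value would strictly be more appropriate:

```python
import statistics
import math

# Hypothetical sample of fasting glucose values (mmol/L)
glucose = [4.8, 5.1, 5.3, 5.5, 4.9, 5.2, 5.6, 5.0, 5.4, 5.2]

n = len(glucose)
mean = statistics.mean(glucose)
se = statistics.stdev(glucose) / math.sqrt(n)

# 95% CI: mean +/- 1.96 * SE (normal approximation)
lower, upper = mean - 1.96 * se, mean + 1.96 * se
```

The width (upper − lower) shrinks as n grows or SD falls, and widens if a higher confidence level (larger z) is chosen.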
105
What is effect size?
Difference in outcomes between intervention and control divided by SD.
106
Why are effect sizes important?
They quantify magnitude of effect and are independent of sample size.
107
Why are effect sizes useful in meta-analyses?
They allow comparison across studies using different measurement scales.
108
What is Cohen's d?
A standardised difference between 2 means.
109
How are effect sizes conventionally graded?
Small (0.2), medium (0.5), and large (0.8).
110
How can effect size be interpreted clinically?
Interpreted as a z-score, the effect size gives the proportion of control-group scores that fall below the average experimental-group score.
111
What is the Common Language Effect Size (CLES)?
Probability that a randomly selected experimental score exceeds a control score.
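Cohen's d and the CLES can both be computed in a short sketch; the outcome scores are hypothetical, and the CLES formula assumes normally distributed scores with equal variances:

```python
import math
from statistics import NormalDist, mean, stdev

def cohens_d(group1, group2):
    """Standardised difference between 2 means, using a pooled SD."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(group1) - mean(group2)) / pooled_sd

def cles(d):
    """Probability that a randomly selected experimental score exceeds a
    randomly selected control score (normality assumed)."""
    return NormalDist().cdf(d / math.sqrt(2))

# Hypothetical outcome scores
treated = [14, 15, 16, 17, 18]
control = [11, 12, 13, 14, 15]
d = cohens_d(treated, control)  # large by Cohen's convention (> 0.8)
```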
112
Why is repeated statistical testing problematic?
It increases the likelihood of Type I errors.
113
What is the Bonferroni correction?
Adjusting significance level by dividing it by the number of tests performed.
114
What is a limitation of Bonferroni correction?
It is overly conservative and may increase false negatives.
115
What is family-wise error (FWE)?
Probability of at least one Type I error across multiple comparisons.
116
What is the False Discovery Rate (FDR)?
The expected proportion of false positives among significant findings.
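The Bonferroni adjustment and the Benjamini-Hochberg FDR procedure can be contrasted in a short sketch (the p-values below are hypothetical):

```python
def bonferroni_threshold(alpha, n_tests):
    """Bonferroni: divide the significance level by the number of tests."""
    return alpha / n_tests

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at q.
    Returns the p-values declared significant."""
    m = len(pvals)
    cutoff = 0.0
    for i, p in enumerate(sorted(pvals), start=1):
        if p <= i / m * q:   # compare each ranked p to its stepped threshold
            cutoff = p
    return [p for p in pvals if p <= cutoff]

pvals = [0.001, 0.012, 0.025, 0.041, 0.20]
bonf = bonferroni_threshold(0.05, len(pvals))  # 0.01 per test
sig = benjamini_hochberg(pvals)
```

Here Bonferroni (threshold 0.01) keeps only one result, while BH keeps three, illustrating why Bonferroni is considered overly conservative.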
117
What is inferential statistics?
Methods used to generalise findings from a sample to a population.
118
What are 2 main forms of statistical inference?
Estimation and hypothesis testing.
119
What is point estimation?
Estimation of a single population parameter value.
120
What is interval estimation?
Estimation of a parameter range with a defined confidence level.
121
What is reliability?
The consistency and replicability of a measurement instrument.
122
Does high reliability guarantee validity?
No, it guarantees consistency but not truth.
123
How is test-retest reliability assessed?
By administering the same test twice to the same population.
124
What is Cronbach's alpha?
A measure of internal consistency, with 0.70 commonly used as a cut-off.
125
What is interrater reliability?
Agreement between multiple raters using the same instrument.
126
What is intraclass correlation coefficient (ICC)?
Proportion of variance attributable to true differences between subjects.
127
What is validity?
The extent to which an instrument measures what it intends to measure.
128
What is face validity?
Subjective judgment of whether an instrument appears to measure the intended construct.
129
What is construct validity?
Whether an instrument measures the theoretical construct of interest.
130
What is content validity?
Degree to which test items represent all relevant domains.
131
What is criterion validity?
Performance of a test against an external criterion.
132
What is concurrent validity?
Validity based on current correlations.
133
What is predictive validity?
Validity based on future outcomes.
134
What is convergent validity?
Agreement between instruments measuring the same construct.
135
What is discriminant validity?
Low correlation between instruments measuring different constructs.
136
What is experimental validity?
Sensitivity of an instrument to detect change after intervention.
137
What is precision?
Degree of variability in repeated measurements.
138
What is accuracy?
Closeness of a measurement to the true population value.
139
What compromises accuracy?
Bias.
140
Why is percent agreement misleading?
It overestimates agreement by ignoring chance agreement.
141
What does kappa measure?
Agreement beyond chance for categorical variables.
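For two raters making a binary judgment, kappa can be computed from the four cell counts of their agreement table (hypothetical counts below):

```python
def cohens_kappa(a, b, c, d):
    """Cohen's kappa for 2 raters on a binary judgment.
    a = both 'yes', d = both 'no', b and c = disagreements."""
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from the raters' marginal proportions
    p_yes = ((a + b) / n) * ((a + c) / n)
    p_no = ((c + d) / n) * ((b + d) / n)
    expected = p_yes + p_no
    return (observed - expected) / (1 - expected)

# Hypothetical ratings: 40 both-yes, 10 + 10 disagreements, 40 both-no
kappa = cohens_kappa(40, 10, 10, 40)
```

Percent agreement here is 80%, but kappa is only 0.6 once chance agreement is discounted, which is exactly why percent agreement overstates reliability.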
142
When is weighted kappa used?
For ordinal data.
143
What is Bland-Altman analysis used for?
Assessing agreement for continuous variables.
144
What is the most important question when choosing a statistical test?
What is being tested: comparison of point estimates (Category 1) or demonstration of an association/relationship (Category 2)?
145
What defines Category 1 statistical questions?
Comparing samples for point estimates such as means, medians, or proportions.
146
What defines Category 2 statistical questions?
Demonstrating associations or relationships between variables.
147
What category do causal association studies usually fall into?
Category 2.
148
What is the first question to ask in Category 1 testing?
What is the nature of the point estimate: mean or proportion?
149
What is the second key question in Category 1 testing?
How many groups are being compared?
150
What is the third key question in Category 1 testing?
Are observations paired or unpaired (independent)?
151
What is the fourth key question in Category 1 testing?
Can a parametric distribution be assumed?
152
When can parametric tests be used?
When the outcome variable is quantitative and approximately normally distributed.
153
What assumption is made in exam settings if distribution is not stated?
Biological variables are assumed to be normally distributed.
154
Which parametric tests are commonly used to compare means?
t-test and ANOVA.
155
When are non-parametric tests used?
When variables are qualitative or when quantitative data are not normally distributed.
156
What do non-parametric tests compare?
Ranks rather than means or medians alone.
157
What is the non-parametric equivalent of the one-sample t-test?
Sign test.
158
What is the non-parametric equivalent of the paired t-test?
Wilcoxon signed-rank test.
159
What test is used for 2 independent groups in non-parametric testing?
Mann-Whitney U test.
160
What is the non-parametric equivalent of one-way ANOVA?
Kruskal-Wallis test.
161
Why are data transformations used?
To allow the use of more robust parametric tests.
162
What transformation is commonly used for right-skewed data?
Log transformation.
163
What transformation is used for Poisson distributions?
Square root transformation.
164
What transformation is often used for survival rates?
Reciprocal transformation.
165
What transformation is commonly used for proportions?
Logit transformation.
166
What is a limitation of transformed data?
Confidence intervals may be difficult to interpret.
167
What is the chi-square test primarily used for?
Comparing frequency counts or proportions.
168
What is a contingency table?
A table displaying frequency data for categorical variables.
169
What is required of rows and columns in a contingency table?
They must be mutually exclusive.
170
What are observed frequencies?
Actual outcomes observed in a study.
171
What are expected frequencies?
Outcomes predicted if the null hypothesis were true.
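Expected frequencies follow directly from the table's margins, and the chi-square statistic from the observed-expected discrepancies; a sketch with hypothetical counts:

```python
# Observed 2x2 contingency table (hypothetical counts):
# rows = exposure (yes/no), columns = outcome (yes/no)
observed = [[30, 20],
            [10, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency per cell under H0: (row total x column total) / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells
chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
```

Note the counts are raw frequencies, not percentages, and every expected cell comfortably exceeds 5, so the chi-square test is appropriate here.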
172
When should Fisher's exact test replace chi-square?
When expected frequencies are <5 in more than 20% of cells.
173
When is Yates' correction used?
When total sample size is <100 or any cell value is <10.
174
Why should chi-square not be applied to percentages?
Because chi-square requires raw frequency counts; converting to percentages alters cell values and yields incorrect results.
175
What is McNemar's test?
A chi-square test for paired categorical data.
176
What is the Mantel-Haenszel test?
A chi-square method assessing the association between 2 dichotomous variables while controlling for (stratifying by) a confounding variable.
177
What is log-linear analysis?
A chi-square-based method using log frequencies for multiple variables.
178
What is a one-sample t-test used for?
Comparing a sample mean with a known population mean.
179
What assumptions are required for a one-sample t-test?
Normal distribution and adequate sample size.
180
What is the purpose of a two-sample (Student's) t-test?
Comparing means of 2 samples.
181
When is an unpaired t-test used?
For independent samples measured once.
182
When is a paired t-test used?
For repeated measurements in the same subjects.
183
What assumption regarding variance applies to t-tests?
Equal variance between groups.
184
How can equality of variance be tested?
Using Levene's test.
185
What is ANOVA used for?
Comparing means across multiple groups.
186
What does ANOVA compare?
Variance between groups relative to variance within groups.
187
What is a one-way ANOVA?
Comparison of one independent variable across multiple groups.
188
What is a two-way ANOVA?
Analysis involving 2 independent variables.
189
When is repeated-measures ANOVA used?
When the same subjects are measured multiple times.
190
What statistic is used in ANOVA?
F-statistic.
191
What is a limitation of ANOVA?
It identifies that differences exist but not where they occur.
192
Why are post-hoc tests needed after ANOVA?
To identify specific group differences.
193
What assumptions underlie ANOVA?
Normality, equal variance, and independent observations.
194
What are degrees of freedom?
The number of values free to vary when estimating a statistic.
195
Why is n-1 used when estimating population SD from a sample?
One degree of freedom is lost due to estimation of the mean.
196
How are degrees of freedom lost in regression or ANOVA?
One df lost for each parameter estimated.
197
What is the df formula for chi-square tests?
df = (Rows - 1) x (Columns - 1)
198
What is the df for a two-sample t-test?
df = (n₁ + n₂ − 2)
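The two df formulas above amount to simple arithmetic, shown here with illustrative table dimensions and group sizes:

```python
def chi_square_df(rows, cols):
    """df = (rows - 1) x (columns - 1)."""
    return (rows - 1) * (cols - 1)

def two_sample_t_df(n1, n2):
    """df = n1 + n2 - 2: one df lost per estimated group mean."""
    return n1 + n2 - 2

df_chi = chi_square_df(2, 3)    # 2x3 contingency table
df_t = two_sample_t_df(20, 25)  # 2 groups of 20 and 25 subjects
```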
199
What does correlation measure?
Degree of association between 2 quantitative variables.
200
What visual tool illustrates correlation?
Scatterplot.
201
What correlation coefficient is used for parametric data?
Pearson's correlation coefficient (r).
202
What is the range of r?
-1 to +1.
203
What does the sign of r indicate?
Direction of the relationship.
204
What does the magnitude of r indicate?
Strength and linearity of association.
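Pearson's r can be computed from first principles as the covariance scaled by the two variables' spreads; the paired measurements below are hypothetical:

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between 2 quantitative variables."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical paired measurements (e.g. dose vs response)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.8]
r = pearson_r(x, y)  # close to +1: strong positive linear association
```

Negating y flips the sign of r without changing its magnitude: the sign carries direction, the magnitude carries strength.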
205
Does correlation imply causation?
No.
206
When is Spearman's rho preferred?
For ordinal data, non-normal distributions, non-linearity, or small samples.
207
When is Kendall's tau preferred?
When ordinal ranks are not equidistant.
208
Why is regression needed beyond correlation?
To predict values of one variable from another.
209
What is the equation for simple linear regression?
y = a + bx
210
What does the regression coefficient (b) represent?
The change in y per unit change in x.
211
What method is used to fit the regression line?
Least squares method.
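The least squares fit has a closed form: b = cov(x, y) / var(x) and a = mean(y) − b · mean(x). A sketch with hypothetical data chosen to lie exactly on a line:

```python
def least_squares(xs, ys):
    """Fit y = a + bx by minimising the sum of squared residuals."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope: covariance of x and y divided by variance of x
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    # Intercept: the fitted line passes through (mean x, mean y)
    a = my - b * mx
    return a, b

x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]        # exactly y = 1 + 2x
a, b = least_squares(x, y)  # recovers intercept 1 and slope 2
```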
212
What is multiple linear regression?
Prediction of one dependent variable using multiple independent variables.
213
What is collinearity?
High correlation between independent variables.
214
What does R² represent?
Proportion of variance in the dependent variable explained by predictors.
215
When is logistic regression used?
When the dependent variable is binary.
216
What is the '1 in 10' rule?
Number of predictors ≤10% of sample size or number of events.
217
What is multivariable analysis?
One dependent variable with multiple independent variables.
218
What is multivariate analysis?
Multiple dependent and independent variables.
219
When is ANCOVA used?
When both categorical and continuous independent variables are present.
220
What is factor analysis used for?
Identifying latent variables underlying observed correlations.
221
What are 2 main types of factor analysis?
Exploratory and confirmatory.
222
What is exploratory factor analysis used for?
Data reduction and identification of latent constructs.
223
What is confirmatory factor analysis used for?
Testing predefined factor structures and construct validity.
224
What eigenvalue criterion is commonly used?
Retain factors with eigenvalues >1 (Kaiser rule).
225
What factor loading is considered significant?
≥0.40.
226
What is stratification used for?
Controlling known confounders.
227
What is direct standardisation?
Applying study rates to a standard population.
228
What is indirect standardisation?
Applying standard population rates to the study sample.
229
Draw the 2x2 table to know for Sn, Sp, PPV and NPV.

|        | Disease + | Disease − |
|--------|-----------|-----------|
| Test + | TP        | FP        |
| Test − | FN        | TN        |

Sn = TP/(TP+FN)
Sp = TN/(TN+FP)
PPV = TP/(TP+FP)
NPV = TN/(TN+FN)

## Footnote
Start clockwise from top left: Sn and Sp run down the columns ↓; PPV and NPV run across the rows →. T's always in the numerator!
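The four metrics follow mechanically from the table's cells; a sketch with hypothetical screening counts:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 table.
    The 'T' cells (TP, TN) are always in the numerator."""
    return {
        "sensitivity": tp / (tp + fn),  # down the disease-positive column
        "specificity": tn / (tn + fp),  # down the disease-negative column
        "ppv": tp / (tp + fp),          # across the test-positive row
        "npv": tn / (tn + fn),          # across the test-negative row
    }

# Hypothetical screening results: 100 diseased, 900 healthy
m = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
```

With these counts the test is 90% sensitive, yet the PPV is only 75%, illustrating how predictive values depend on how common the disease is in the tested population.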