Statistics Flashcards

(107 cards)

1
Q

What is Mean?

A

Arithmetic average: the sum of all values divided by the number of values.

2
Q

What is Median?

A

Middle value when data is sorted.

3
Q

What is Mode?

A

Most frequent value.

4
Q

What is Variance?

A

Average squared deviation from the mean; measures how spread out the values are.

5
Q

What is Standard Deviation?

A

Square root of variance; spread of data from mean.
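
As a quick illustration of cards 1 to 5, here is a minimal sketch using Python's standard `statistics` module. The data values are made up for the example, and the population versions of variance and SD (divide by n) are used; `statistics.variance` and `statistics.stdev` would give the sample versions (divide by n − 1).

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, chosen only for illustration

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
variance = statistics.pvariance(data)   # population variance (divide by n)
std_dev = statistics.pstdev(data)       # square root of the population variance

print(mean, median, mode, variance, std_dev)
```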

6
Q

What is IQR?

A

Interquartile Range = Q3 − Q1.

7
Q

How to detect outliers?

A

Using IQR rule or Z-score.

8
Q

How to detect outliers with IQR?

A

Lower bound (Lb) = Q1 − 1.5 × IQR
Upper bound (Ub) = Q3 + 1.5 × IQR
Any value below Lb or above Ub is an outlier.
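
The IQR rule can be sketched in a few lines. This example assumes made-up data and uses `statistics.quantiles` for Q1 and Q3; other quartile conventions give slightly different bounds, so treat the exact fences as illustrative.

```python
import statistics

data = [10, 12, 13, 14, 15, 16, 18, 19, 21, 50]  # hypothetical values

# quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

# Keep only the values that fall outside the two fences.
outliers = [x for x in data if x < lower_bound or x > upper_bound]
print(outliers)
```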

9
Q

What is Z-score?

A

Number of SDs a value lies from the mean: z = (x − mean) / SD.
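
A tiny worked example of the Z-score, assuming a hypothetical exam with mean 70 and SD 10. The function name `z_score` is just an illustrative helper.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / sd

# Hypothetical numbers: a score of 85 on an exam with mean 70 and SD 10.
z = z_score(85, 70, 10)
print(z)
```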

10
Q

What does the Empirical Rule say for normally distributed data?

A

Mean ± 1 SD ≈ 68% of data
Mean ± 2 SD ≈ 95% of data
Mean ± 3 SD ≈ 99.7% of data
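
The three percentages can be checked against the normal CDF. This sketch uses the standard library's `statistics.NormalDist`; the helper name `coverage_within` is an assumption made for the example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def coverage_within(k):
    """Probability that a normal value lies within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {coverage_within(k):.4f}")
```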

11
Q

What is Standard Error (SE)?

A

SD of sample mean = SD / √n.
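
A minimal sketch of the SE formula, assuming a small made-up sample and using the sample SD (divide by n − 1) as the estimate of SD.

```python
import math
import statistics

sample = [48, 50, 52, 50, 46, 54, 50, 49, 51]  # hypothetical measurements

n = len(sample)
sd = statistics.stdev(sample)        # sample standard deviation
standard_error = sd / math.sqrt(n)   # SE = SD / sqrt(n)

print(round(standard_error, 4))
```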

12
Q

What is a distribution?

A

Pattern of data values.

13
Q

What is probability?

A

Chance of an event occurring.

14
Q

What are independent events in Prob.?

A

Events that don’t affect each other.

15
Q

What are mutually exclusive events (Prob.)?

A

Events that cannot happen together.

16
Q

What is Conditional probability?

A

Probability of A given B.

17
Q

What is Bayes' Theorem used for?

A

Reversing a conditional probability: P(A|B) = P(B|A) × P(A) / P(B).
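
To make the "reverse" idea concrete, here is a sketch with invented numbers for a medical test: it turns P(positive | disease) into P(disease | positive). The function name and all three input probabilities are assumptions for illustration only.

```python
def bayes_posterior(p_b_given_a, p_a, p_b_given_not_a):
    """Return P(A|B) from P(B|A), P(A) and P(B|not A)."""
    # Law of total probability gives the denominator P(B).
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# Hypothetical test: 1% prevalence, 95% sensitivity, 5% false-positive rate.
posterior = bayes_posterior(p_b_given_a=0.95, p_a=0.01, p_b_given_not_a=0.05)
print(round(posterior, 3))
```

Even with a fairly accurate test, the posterior stays low here because the condition is rare in this made-up scenario.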

18
Q

What is a random variable in Prob.?

A

Variable whose value is the numerical outcome of a random process.

19
Q

What is Normal distribution?

A

Bell-shaped, symmetric distribution.

20
Q

Why is the Normal distribution important?

A

Many real-life variables follow it.

21
Q

What is Standard Normal distribution?

A

Mean = 0, SD = 1.

22
Q

What is Binomial distribution?

A

Number of successes in a fixed number of independent success/failure trials with the same success probability.

23
Q

What is Poisson distribution?

A

Models the number of events in a fixed interval.

24
Q

What is CLT?

A

As sample size grows, the distribution of sample means approaches a normal distribution, and the average of all the sample means equals the population mean.

25
Minimum sample size for CLT?
Usually n ≥ 30.
26
Why is CLT important?
Allows use of Z & T tests even when population is not normal.
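The CLT cards can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is exponential with mean 1 (clearly not normal), each sample has n = 30, and the seed and the number of repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential population with mean 1."""
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

# Draw many samples of size 30 and keep each sample's mean.
sample_means = [sample_mean(30) for _ in range(2000)]

print(statistics.mean(sample_means))   # should be close to the population mean, 1
print(statistics.stdev(sample_means))  # should be close to SD / sqrt(n) = 1 / sqrt(30)
```

A histogram of `sample_means` would look roughly bell-shaped even though the underlying population is strongly skewed.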
27
What is Hypothesis Testing?
Procedure to test claims using sample data.
28
What is Null Hypothesis (H0)?
No effect, no difference.
29
What is Alternative Hypothesis (Ha)?
There is an effect or difference.
30
What is p-value?
Probability of observing data at least as extreme as the sample, assuming H0 is true.
31
When do you reject H0?
p-value < α.
32
What is significance level (α)?
Threshold for rejecting H0 (usually 0.05).
33
Type I Error?
Rejecting true H0 (false positive).
34
Type II Error?
Failing to reject false H0 (false negative).
35
One-tailed test?
Tests direction (greater/less).
36
Two-tailed test?
Tests for any difference (≠).
37
When to use Z-test?
Population SD known or sample size ≥ 30 (for 1 or 2 groups)
38
What does Z-test compare?
Means or proportions.
39
Types of Z-tests?
One-sample, two-sample, and proportion tests.
40
When to use 1-sample Z-test?
Comparing sample mean vs population mean.
41
When to use 2-sample Z-test?
Compare means of two independent groups.
42
When to use Z-test for proportions?
Comparing population proportions.
43
When to use T-test?
Population SD unknown, typically with small samples (1 or 2 groups).
44
Types of T-tests?
One-sample, independent, paired.
45
What does a One-sample T-test do?
Compares a sample mean with a population mean.
46
What does an Independent T-test do?
Compares the means of two independent groups.
47
What does a Paired T-test do?
Compares before–after measurements or matched pairs.
48
What is ANOVA?
Compares 3 or more means.
49
What is F-statistic?
MSB / MSW (between variance / within variance).
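The F-statistic can be computed by hand for a one-way layout. The sketch below assumes three small made-up groups; MSB is the between-group sum of squares divided by (k − 1) and MSW is the within-group sum of squares divided by (N − k).

```python
import statistics

# Hypothetical scores for three groups.
groups = {"A": [4, 5, 6], "B": [6, 7, 8], "C": [8, 9, 10]}

all_values = [x for values in groups.values() for x in values]
grand_mean = statistics.mean(all_values)
k = len(groups)            # number of groups
n_total = len(all_values)  # total number of observations

# Between-group sum of squares: group size times squared distance of group mean from grand mean.
ss_between = sum(
    len(values) * (statistics.mean(values) - grand_mean) ** 2
    for values in groups.values()
)
# Within-group sum of squares: squared distance of each value from its own group mean.
ss_within = sum(
    (x - statistics.mean(values)) ** 2
    for values in groups.values()
    for x in values
)

msb = ss_between / (k - 1)
msw = ss_within / (n_total - k)
f_stat = msb / msw
print(f_stat)
```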
50
When to use ANOVA?
3+ groups with 1 numeric outcome.
51
Types of ANOVA?
One-way, Two-way (with interaction), Repeated measures.
52
One-way ANOVA use?
1 factor with 3+ levels.
53
Two-way ANOVA use?
2 factors + interaction effect.
54
What is interaction effect (ANOVA)?
Effect of one factor depends on another.
55
What are Factor and Levels in ANOVA?
Factor = categorical variable (a column, e.g. Gender). Levels = groups inside that factor (Male, Female, Others).
56
What are Observed vs Expected values in Chi-Square?
Observed = actual counts in the data. Expected = counts we would expect under H0 (for goodness of fit with equal shares, E = total observations / number of categories).
57
What is Chi-square test?
Compares count/frequency of categorical variables.
58
Chi-square Goodness of Fit use?
Compare observed vs expected distribution.
59
Chi-square Test of Independence use?
Check relation between two categorical variables.
60
Assumptions of Chi-square?
Large sample; expected frequency of at least 5 in each category.
61
What is correlation?
Strength & direction of linear relationship.
62
Range of Pearson correlation?
-1 to +1. (Strong Negative to Strong Positive)
63
What is covariance?
Joint variability of two variables.
64
Difference between correlation & covariance?
Covariance is unscaled; correlation is standardized.
65
Pearson vs Spearman Correlation?
Pearson = linear relationship (suits normally distributed data); Spearman = rank-based, monotonic relationship that need not be linear (e.g. as marks go up, rank goes down).
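To see the difference, consider data that is monotonic but not linear. The sketch below implements both coefficients by hand (the helper names are illustrative, and the simple ranking assumes no tied values): Spearman is just Pearson applied to the ranks.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def ranks(values):
    """Rank 1 = smallest value. Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # always increasing, but curved rather than straight

print(pearson(x, y), spearman(x, y))
```

Because `y` always increases with `x`, the ranks line up perfectly, while the curvature pulls the Pearson value below 1.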
66
What is Simple Random Sampling?
Every member has equal chance.
67
What is Stratified Sampling?
Sampling from subgroups proportionally.
68
What is Systematic Sampling?
Selecting every k-th item.
69
What is Cluster Sampling?
Sampling by groups, not individuals.
70
How to check normality?
Histogram, Q-Q plot, Shapiro-Wilk test.
71
What is Homoscedasticity?
Equal variances across groups.
72
Which tests assume normality?
Z, T, ANOVA.
73
Which tests DO NOT require normality?
Chi-square.
74
What is confidence interval (CI)?
Range of values likely to contain the true population parameter (e.g. the mean).
75
What is effect size?
Magnitude of a difference or relationship.
76
What is sample vs population?
Sample = subset; population = whole group.
77
What is degrees of freedom (df)?
Number of independent pieces of info.
78
What is p-hacking?
Manipulating tests to get significant results.
79
What is A/B testing?
Comparing two groups to see which performs better.
80
Test for before–after measurements on the same people?
Paired T-test.
81
Test to compare Two categorical variables?
Chi-square Independence.
82
Test for a categorical variable vs an expected distribution?
Chi-square GOF.
83
What is Positive Correlation?
Both variables increase or decrease together.
84
What is Negative Correlation?
One variable increases while the other decreases.
85
Which plot do we use to see correlation between 2 variables?
Scatter Plot
86
Which plot do we use to see correlation among more than 2 variables?
Heatmap
87
What is a Percentile and its formula?
Divides sorted data into 100 equal parts. Position of the p-th percentile = p(n + 1) / 100.
88
What is a Quartile and its formula?
Divides sorted data into 4 equal parts (Q1, Q2, Q3). Position of the k-th quartile = k(n + 1) / 4.
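The position formulas can be turned into a small function. This is a sketch, assuming linear interpolation when the position falls between two data points and clamping positions that fall off either end; the function name and data are illustrative.

```python
def percentile(data, p):
    """Value at the p-th percentile using the position rule p * (n + 1) / 100."""
    values = sorted(data)
    n = len(values)
    position = p * (n + 1) / 100   # 1-based position in the sorted data
    lower = int(position)          # whole part of the position
    fraction = position - lower    # how far to move toward the next value
    if lower < 1:
        return values[0]           # position before the first value
    if lower >= n:
        return values[-1]          # position at or past the last value
    return values[lower - 1] + fraction * (values[lower] - values[lower - 1])

data = [3, 5, 7, 8, 9, 11, 13, 15, 17]  # hypothetical values, n = 9

q1 = percentile(data, 25)      # position 2.5
median = percentile(data, 50)  # position 5
q3 = percentile(data, 75)      # position 7.5
print(q1, median, q3)
```

The default `exclusive` method of `statistics.quantiles` uses the same (n + 1) position rule, so it can serve as a cross-check.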
89
50th Percentile = ?
P50 = Q2 = Median
90
Three Main Types of Skewness?
Negative (tail on left, mean < median < mode), Positive (tail on right, mean > median > mode), Normal (no skewness).
91
Shapes of Data Distribution?
Normal, Skewed, Uniform, Bimodal (2 peaks)
92
What is "UNION" Probability?
Probability of A or B or both happening (A ∪ B).
93
What is "Intersection" Probability?
Probability of A and B happening together (A ∩ B).
94
Types of "Intersection Prob"?
Independent: P(A∩B) = P(A) × P(B). Dependent: P(A∩B) = P(A) × P(B∣A).
95
Difference between Conditional and Dependent Prob.?
Conditional = the "given" case, P(B∣A); Dependent = the "and" case, P(A∩B) = P(A) × P(B∣A).
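Union, intersection and conditional probability can all be counted out on a fair six-sided die. The events below (A = even roll, B = roll greater than 3) are assumptions picked for the example; `Fraction` keeps the results exact.

```python
from fractions import Fraction

outcomes = range(1, 7)  # faces of a fair six-sided die

def prob(event):
    """Probability of an event = favourable outcomes / total outcomes."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def is_even(o):   # event A
    return o % 2 == 0

def is_gt3(o):    # event B
    return o > 3

p_a = prob(is_even)
p_b = prob(is_gt3)
p_a_and_b = prob(lambda o: is_even(o) and is_gt3(o))  # intersection: {4, 6}
p_a_or_b = prob(lambda o: is_even(o) or is_gt3(o))    # union: {2, 4, 5, 6}
p_b_given_a = p_a_and_b / p_a                         # conditional: P(B|A)

print(p_a_and_b, p_a_or_b, p_b_given_a)
```

Here P(A∩B) = 1/3 differs from P(A) × P(B) = 1/4, so A and B are dependent events.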
96
PMF is used for..?
Probability of each value of a discrete variable (each between 0 and 1).
97
PDF is used for..?
Prob. of Continuous Data (Area under the curve = 1).
98
What is CDF..?
Cumulative Distribution Function. It's like a running total of probability.
99
How do we get the P-value in a Right-Tailed Test (z/t)?
1 - table_value of Test
100
How do we get the P-value in a Left-Tailed Test (z/t)?
P-value = table_value of Test
101
How do we get the P-value in a Two-Tailed Test (z/t)?
2*(1 - table_value of abs(Test))
102
How do we get the table value of Z-calculated?
norm.cdf(z_cal) -> import norm from scipy.stats
103
How do we get the table value of T-calculated?
t.cdf(t_cal, df) -> import t from scipy.stats
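Putting cards 99 to 103 together, here is a sketch that assumes SciPy is installed. The values of `z_cal`, `t_cal` and `df` are hypothetical; in practice they come from the test-statistic formulas.

```python
from scipy.stats import norm, t

z_cal = 1.96  # hypothetical calculated z value

table_value = norm.cdf(z_cal)                 # area to the left of z_cal
p_left = table_value                          # left-tailed test
p_right = 1 - table_value                     # right-tailed test
p_two = 2 * (1 - norm.cdf(abs(z_cal)))        # two-tailed test uses |z_cal|

t_cal, df = 2.0, 10  # hypothetical calculated t value and degrees of freedom
p_two_t = 2 * (1 - t.cdf(abs(t_cal), df))     # same pattern with the t distribution

print(p_right, p_two, p_two_t)
```

Taking the absolute value before the CDF lookup matters: without it, a negative test statistic would produce a two-tailed "p-value" larger than 1.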
104
How do we get the P-value in ANOVA?
Fit a model and build its ANOVA table with the statsmodels library (e.g. `ols(...).fit()` followed by `anova_lm`).
105
How do we get the P-value in Chi-Square?
chi2.sf(chi_cal, df) -> import chi2 from scipy.stats
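A goodness-of-fit sketch, again assuming SciPy is installed. The observed counts are invented, and the expected counts assume every category is equally likely, which is only one possible null hypothesis.

```python
from scipy.stats import chi2

observed = [18, 22, 20, 25, 15]  # hypothetical counts in 5 categories

# Equal-share expectation: total observations / number of categories.
expected = sum(observed) / len(observed)

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_cal = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1

p_value = chi2.sf(chi_cal, df)  # right-tail area beyond chi_cal
print(chi_cal, p_value)
```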
106
How to get calculated values of z,t,chi? (To get P value)
107
How to get the T_cal, z_cal and chi_calculated values (which are used to get the P-value)?
Using their corresponding formulas, according to the question/condition.
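As one example of "corresponding formulas", the one-sample z and t statistics can be sketched as below. The function names and all numbers are hypothetical; two-sample, paired and proportion tests use different formulas.

```python
import math
import statistics

def z_calculated(sample_mean, pop_mean, pop_sd, n):
    """One-sample z: (sample mean - population mean) / (population SD / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

def t_calculated(sample, pop_mean):
    """One-sample t and its degrees of freedom, using the sample SD."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - pop_mean) / standard_error, n - 1

# Hypothetical z-test: sample mean 52 from n = 100, population mean 50, population SD 10.
z_cal = z_calculated(52, 50, 10, 100)

# Hypothetical t-test: small sample compared with a claimed mean of 49.
t_cal, df = t_calculated([48, 50, 52, 50, 46, 54, 50, 49, 51], 49)

print(z_cal, t_cal, df)
```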