What is the main goal of an A/B test?
To compare outcomes between a control (A) and a treatment (B) group in a randomized experiment to estimate the causal effect of a change.
What is randomization in A/B testing?
Assigning units (such as users or sessions) to treatment or control at random so that, on average, the groups are comparable on all other factors.
Why is randomization critical for causal interpretation?
It breaks systematic links between treatment assignment and confounders, so differences in outcomes can be attributed to the treatment under reasonable assumptions.
What are common units of randomization in online experiments?
Users, sessions, or requests, chosen so that interference between units is minimized and exposure is clearly defined.
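In practice, user-level randomization is often implemented with deterministic hashing rather than a stored coin flip, so the same user always lands in the same group. A minimal sketch (the function name and scheme are illustrative, not from any particular experimentation framework):

```python
import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants: tuple = ("control", "treatment")) -> str:
    """Deterministically map a unit to a variant via hashing.

    Hashing (experiment, user_id) gives a stable, effectively random
    assignment with no per-user state; salting with the experiment name
    keeps assignments independent across different experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]
```

Because the hash is a pure function of the inputs, a user who returns tomorrow sees the same variant, which keeps exposure clearly defined.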
What is a treatment effect at a high level?
The difference in expected outcomes between the treatment and control conditions for the same population.
What is uplift or lift in A/B tests?
The relative change in a metric between treatment and control, often expressed as a percentage increase or decrease.
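The computation itself is a one-liner; a hypothetical helper to make the formula concrete:

```python
def lift(control_rate: float, treatment_rate: float) -> float:
    """Relative change of treatment vs. control, as a fraction.

    A control rate of 0.10 and a treatment rate of 0.11 is a lift
    of +0.10, i.e. a 10% relative increase.
    """
    if control_rate == 0:
        raise ValueError("lift is undefined when the control rate is zero")
    return (treatment_rate - control_rate) / control_rate
```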
What is a primary metric in experimental design?
The main outcome of interest that the experiment is aimed at improving, such as conversion rate or retention.
What are guardrail metrics?
Metrics monitored to ensure that the treatment does not harm important aspects of the product, such as latency or error rates.
What is a null hypothesis H₀ in the context of an A/B test?
The assumption that there is no difference in the metric between treatment and control (effect size zero).
What is an alternative hypothesis H₁ in an A/B test?
The claim that there is a nonzero difference (positive or negative) between treatment and control.
What is a significance level α in experiment design?
A pre-specified maximum probability of rejecting a true null hypothesis (Type I error), often chosen as 0.05.
What is statistical power in an experiment?
The probability of correctly rejecting the null hypothesis when the true effect is at least the minimum effect size of interest.
Why is power important when planning experiments?
Low power means a high chance of missing real effects, leading to wasted experimentation and misleading ‘no effect’ conclusions.
What factors influence the required sample size for a given power?
Baseline metric level, minimum detectable effect size, variance, desired power, and chosen significance level.
Why does a smaller minimum detectable effect size require a larger sample size?
Detecting subtler differences requires more data to distinguish them from random noise in metrics.
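Under the usual Normal approximation for comparing two proportions, these inputs combine into a standard sample-size formula. A stdlib-only sketch (the helper name is made up):

```python
import math
from statistics import NormalDist

def sample_size_per_group(p_base: float, mde_abs: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided two-proportion z-test.

    n >= (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p2-p1)^2
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # e.g. ~1.96 for alpha = 0.05
    z_power = z.inv_cdf(power)           # e.g. ~0.84 for 80% power
    p1, p2 = p_base, p_base + mde_abs
    var_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * var_sum / mde_abs ** 2)
```

Note that the minimum detectable effect enters the denominator squared: halving it roughly quadruples the required sample size, which is why chasing very small effects is expensive.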
What is a two-sample test for proportions in an A/B test?
A statistical test that compares conversion or success rates between treatment and control, using either exact binomial methods or a Normal (z-test) approximation.
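A minimal version of such a z-test with a pooled standard error, using only the standard library (the function name is illustrative):

```python
import math
from statistics import NormalDist

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test comparing two conversion rates (Normal approx.).

    Pools the rate under H0 (no difference) to estimate the standard
    error; returns (z statistic, two-sided p-value).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

The Normal approximation is reasonable when counts are not too small; a common rule of thumb is at least a handful of successes and failures in each group.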
Why are confidence intervals often more informative than just p-values in A/B tests?
They show a range of plausible effect sizes and convey both magnitude and uncertainty, not just significance yes/no.
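For two proportions, a simple Wald-style interval for the absolute difference can be sketched as follows (Wald intervals can misbehave for rates near 0 or 1, so treat this as an approximation):

```python
import math
from statistics import NormalDist

def diff_confidence_interval(conv_a: int, n_a: int, conv_b: int, n_b: int,
                             alpha: float = 0.05):
    """Wald confidence interval for the difference in rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    diff = p_b - p_a
    return diff - z * se, diff + z * se
```

An interval such as (+0.3%, +2.1%) says both that the effect is likely positive and how large it plausibly is, which a bare p-value does not.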
What is a fixed-horizon (classical) A/B test?
An experiment where the sample size and analysis time are set in advance and results are evaluated only after data collection is complete.
Why does repeatedly checking p-values before the planned sample size inflate Type I error?
Each additional look at the data effectively adds another hypothesis test, increasing the chance of false positives unless corrected.
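This inflation is easy to demonstrate by simulating A/A experiments (no true effect) and declaring a win at the first "significant" interim look. A sketch with made-up parameter values:

```python
import math
import random
from statistics import NormalDist

def peeking_false_positive_rate(n_sims: int = 1000, n_per_arm: int = 1000,
                                looks: int = 10, p: float = 0.3,
                                alpha: float = 0.05, seed: int = 0) -> float:
    """Fraction of A/A tests declared significant at ANY of `looks` looks.

    With a single pre-planned look this is close to alpha; with repeated
    uncorrected looks it is substantially higher.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    checkpoints = [n_per_arm * (i + 1) // looks for i in range(looks)]
    false_positives = 0
    for _ in range(n_sims):
        a = b = prev = 0
        for cp in checkpoints:
            inc = cp - prev
            a += sum(rng.random() < p for _ in range(inc))  # arm A successes
            b += sum(rng.random() < p for _ in range(inc))  # arm B successes
            prev = cp
            pool = (a + b) / (2 * cp)
            se = math.sqrt(pool * (1 - pool) * 2 / cp)
            if se > 0 and abs(b - a) / cp / se > z_crit:
                false_positives += 1  # stop at the first "significant" look
                break
    return false_positives / n_sims
```

Running this with ten uncorrected looks typically yields a false-positive rate several times the nominal 5%, while a single final look stays near 5%.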
What is sequential or online testing?
Approaches that allow monitoring results over time with statistical corrections to maintain error control while peeking at data.
What is a simple good practice for stopping rules if you use fixed-horizon tests?
Commit to a sample size and analysis plan in advance and avoid acting on interim p-values unless using a sequential method.
What is a one-sided vs two-sided test in A/B experiments?
A one-sided test looks for a difference in one pre-specified direction; a two-sided test looks for any difference, positive or negative.
Why are two-sided tests often preferable in product experiments?
They guard against harmful changes by detecting significant decreases as well as increases in the metric.
What is a pre-analysis plan?
A document or specification that defines hypotheses, metrics, sample size, and analysis methods before running the experiment.