What happens if CUPED baseline (X) has missing values for some users?
Those users cannot be adjusted as-is; either drop them (restricting the analysis to users with a baseline) or impute X with the mean so their metric stays unadjusted.
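A minimal sketch of the mean-imputation option, assuming missing baselines are encoded as NaN and that the coefficient is estimated on pooled data from both arms. The function name `cuped_adjust_with_missing` is hypothetical, not from any library.

```python
import numpy as np

def cuped_adjust_with_missing(y, x):
    """CUPED adjustment where missing baselines (NaN) are mean-imputed.

    Users with a missing X receive a zero correction, so they keep their raw Y.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    observed = ~np.isnan(x)
    x_mean = x[observed].mean()
    # theta = cov(X, Y) / var(X), estimated only from users with an observed baseline
    theta = np.cov(x[observed], y[observed])[0, 1] / x[observed].var(ddof=1)
    # Imputing the mean makes (x - x_mean) equal to zero for missing users
    x_filled = np.where(observed, x, x_mean)
    return y - theta * (x_filled - x_mean)
```

Because missingness is handled identically in both arms, the adjustment does not depend on assignment; the cost is only a smaller variance reduction.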
Why must users exist in both pre-period and experiment period for CUPED?
CUPED pairs each user's pre-period baseline X with their experiment metric Y; new users have no X, so they cannot be adjusted and contribute no variance reduction.
What happens if correlation between X and Y differs across segments (e.g., device, geography)?
Apply CUPED separately within segments, each with its own adjustment coefficient; a single pooled coefficient gives suboptimal variance reduction when correlations differ.
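One way to sketch the per-segment adjustment, assuming each user carries a segment label and every segment has at least two users. The function name `cuped_adjust_by_segment` is hypothetical.

```python
import numpy as np

def cuped_adjust_by_segment(y, x, segment):
    """Apply CUPED with a separate coefficient (theta) inside each segment."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    segment = np.asarray(segment)
    adjusted = np.empty_like(y)
    for s in np.unique(segment):
        mask = segment == s
        xs, ys = x[mask], y[mask]
        var_x = xs.var(ddof=1)
        # Fall back to no adjustment when the baseline has no usable variance
        theta = np.cov(xs, ys)[0, 1] / var_x if var_x > 0 else 0.0
        # Centering on the segment mean keeps each segment's mean of Y unchanged
        adjusted[mask] = ys - theta * (xs - xs.mean())
    return adjusted
```

The treatment effect is then estimated on the adjusted values as usual, with both arms pooled when estimating each segment's theta.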
Does CUPED change the estimated treatment effect?
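Not in expectation: the adjustment has the same mean in both arms, so the estimator stays unbiased and only its variance shrinks (narrower CIs). The point estimate in a given sample can still shift slightly.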
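A small simulation can illustrate this. It is a sketch on synthetic data: the effect size of 0.1, the noise levels, and the function name `cuped_adjust` are all assumptions chosen for illustration.

```python
import numpy as np

def cuped_adjust(y, x):
    """Return Y - theta * (X - mean(X)) with theta = cov(X, Y) / var(X)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    theta = np.cov(x, y)[0, 1] / x.var(ddof=1)
    return y - theta * (x - x.mean())

rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(10.0, 2.0, size=n)                      # pre-period metric
treated = rng.integers(0, 2, size=n).astype(bool)      # random assignment
y = x + rng.normal(0.0, 1.0, size=n) + 0.1 * treated   # assumed true effect = 0.1

y_cuped = cuped_adjust(y, x)  # theta estimated on pooled data from both arms

raw_effect = y[treated].mean() - y[~treated].mean()
cuped_effect = y_cuped[treated].mean() - y_cuped[~treated].mean()
print(f"raw effect:   {raw_effect:.4f}  (metric variance {y.var():.3f})")
print(f"CUPED effect: {cuped_effect:.4f}  (metric variance {y_cuped.var():.3f})")
```

Both estimates target the same effect; the CUPED version is computed from a metric with much lower variance, so it is the more precise of the two.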
If correlation is high in pre-period but unstable over time (seasonality), should CUPED be used?
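Only with caution: CUPED stays unbiased as long as X is measured before treatment, but the variance reduction depends on the correlation between X and Y during the experiment, so an unstable relationship can erase most of the gain.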
What is baseline contamination?
Treatment affecting the pre-period metric, making X invalid for variance reduction.
Pre-period logs were partially missing. Should you apply CUPED?
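Not without checking: if the logging gaps are unrelated to assignment, CUPED loses variance reduction but stays unbiased; if the gaps or their handling differ between arms, the adjusted metric is biased.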
Why can CUPED dramatically reduce required sample size?
Lower metric variance → more power → smaller minimum detectable effect → shorter experiment.
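The chain above can be made concrete with the standard two-sample sample-size approximation for a difference in means. This is a sketch assuming equal arm sizes, a two-sided test, and a CUPED variance factor of (1 - ρ²); the function name `users_per_arm` is hypothetical.

```python
from statistics import NormalDist

def users_per_arm(sigma, mde, rho=0.0, alpha=0.05, power=0.8):
    """Approximate users needed per arm to detect an absolute effect `mde`.

    sigma: standard deviation of the unadjusted metric
    rho:   correlation between pre-period X and experiment metric Y
           (rho=0.0 means no CUPED adjustment)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = sigma ** 2 * (1 - rho ** 2)  # CUPED shrinks variance by rho^2
    return 2 * variance * (z_alpha + z_beta) ** 2 / mde ** 2

baseline = users_per_arm(sigma=1.0, mde=0.1)
with_cuped = users_per_arm(sigma=1.0, mde=0.1, rho=0.75)
print(f"without CUPED: {baseline:.0f} users per arm")
print(f"with CUPED:    {with_cuped:.0f} users per arm")
```

The required sample scales linearly with the metric variance, so any fractional variance reduction translates into the same fractional reduction in users needed.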
What is the main assumption behind CUPED?
Pre-period metric X predicts part of the variation in experiment metric Y, and treatment does not affect X.
If correlation(X,Y)=0.75, what variance reduction can CUPED offer?
About 56% variance reduction: the reduction equals correlation², and 0.75² ≈ 0.56, so the adjusted metric keeps about 44% of the original variance.
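The correlation² rule follows from a short derivation, sketched here for the standard CUPED-adjusted metric with the variance-minimizing coefficient θ and with ρ denoting the correlation between X and Y (the symbols are introduced here for illustration).

```latex
\begin{aligned}
Y_{\text{cuped}} &= Y - \theta\,(X - \mathbb{E}[X]), \qquad
\theta = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)} \\
\operatorname{Var}(Y_{\text{cuped}})
  &= \operatorname{Var}(Y) - 2\theta\,\operatorname{Cov}(X, Y) + \theta^{2}\operatorname{Var}(X) \\
  &= \operatorname{Var}(Y) - \frac{\operatorname{Cov}(X, Y)^{2}}{\operatorname{Var}(X)}
   = \operatorname{Var}(Y)\,(1 - \rho^{2}) \\
\rho = 0.75 &\;\Rightarrow\; 1 - \rho^{2} = 1 - 0.5625 = 0.4375
\end{aligned}
```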