Advanced CUPED Flashcards

Question 1

Q

What happens if CUPED baseline (X) has missing values for some users?

Answer

A

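CUPED cannot adjust those users directly, but it does not become invalid. Because X is measured before assignment, you can impute a constant (for example, the pooled mean of X), optionally with a missing-value indicator as an extra covariate; imputed users simply get no variance reduction. Dropping them also works, but it shrinks the sample and narrows the population the result applies to.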
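
One possible alternative adjustment is mean imputation with a missing-value flag. The sketch below is illustrative only: the helper name `impute_baseline` and the use of `None` for a missing baseline are assumptions, and the fill value is meant to be computed on both arms pooled so that it cannot depend on assignment.

```python
from statistics import mean

def impute_baseline(x_values):
    """Replace missing baselines (None) with the mean of the observed ones.

    Returns the filled list plus a 0/1 missing indicator per user.
    Hypothetical helper; pass X from both arms pooled together.
    """
    observed = [x for x in x_values if x is not None]
    fill = mean(observed) if observed else 0.0
    filled = [fill if x is None else x for x in x_values]
    missing = [1 if x is None else 0 for x in x_values]
    return filled, missing

filled, missing = impute_baseline([4.0, None, 6.0, None])
```

Users with an imputed X sit exactly at the mean, so the CUPED term is zero for them and their metric passes through unadjusted.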

Question 2

Q

Why must users exist in both pre-period and experiment period for CUPED?

Answer

A

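CUPED pairs each user's baseline X with their experiment metric Y, so only users with both contribute to variance reduction. New users have no X: the adjustment does nothing for them unless X is imputed, and if most users are new, CUPED gains little.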

Question 3

Q

What happens if correlation between X and Y differs across segments (e.g., device, geography)?

Answer

A

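Estimate the CUPED coefficient θ = cov(X, Y) / var(X) separately within each segment (or add segment-by-X interaction terms). A single pooled θ is still unbiased, but it yields less variance reduction when the X-Y relationship differs across segments.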
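
A minimal sketch of the per-segment approach, assuming each row is a `(segment, x, y)` tuple and that the coefficient θ = cov(X, Y) / var(X) is estimated inside each segment on both arms pooled. The function name `cuped_by_segment` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

def cuped_by_segment(rows):
    """rows: list of (segment, x, y). Returns adjusted y in input order."""
    groups = defaultdict(list)
    for seg, x, y in rows:
        groups[seg].append((x, y))

    params = {}  # segment -> (theta, mean of x in that segment)
    for seg, pairs in groups.items():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        var = sum((x - mx) ** 2 for x, _ in pairs)
        cov = sum((x - mx) * (y - my) for x, y in pairs)
        theta = cov / var if var > 0 else 0.0  # no adjustment if x is constant
        params[seg] = (theta, mx)

    return [y - params[seg][0] * (x - params[seg][1]) for seg, x, y in rows]

# Toy data: y = 2x in segment "a", y = 3 - x in segment "b".
rows = [("a", 1, 2), ("a", 2, 4), ("a", 3, 6), ("b", 1, 2), ("b", 3, 0)]
adjusted = cuped_by_segment(rows)
```

In this toy data X explains Y perfectly within each segment, so the adjusted values are constant per segment; a single pooled θ could not achieve that because the slopes have opposite signs.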

Question 4

Q

Does CUPED change the estimated treatment effect?

Answer

A

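Not in expectation: the adjustment term has the same expected value in both arms, so the estimator stays unbiased. The point estimate in a given experiment can shift slightly, while its variance drops (narrower CIs).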
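
A small simulation can make this concrete. It is a sketch under stated assumptions: the data are synthetic, the true lift of 0.5 and the noise levels are made up for illustration, and θ = cov(X, Y) / var(X) is estimated on both arms pooled.

```python
import random
from statistics import mean, pvariance

def cuped_adjust(x, y):
    """Return y - theta * (x - mean(x)) with theta = cov(x, y) / var(x)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    theta = cov / var
    return [b - theta * (a - mx) for a, b in zip(x, y)]

random.seed(0)
n = 2000  # users per arm (illustrative)
treated = [0] * n + [1] * n
x = [random.gauss(10, 2) for _ in treated]  # pre-period metric
# Experiment metric: baseline plus noise plus a true lift of 0.5 for treated users.
y = [xi + random.gauss(0, 1) + 0.5 * t for xi, t in zip(x, treated)]

y_adj = cuped_adjust(x, y)

def lift(values):
    """Difference in means, treatment minus control."""
    return mean(values[n:]) - mean(values[:n])

raw_lift, adj_lift = lift(y), lift(y_adj)
print(raw_lift, adj_lift, pvariance(y), pvariance(y_adj))
```

Both lifts estimate the same quantity; the adjusted one is computed from a metric with much lower variance, which is where the narrower confidence interval comes from.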

Question 5

Q

If correlation is high in pre-period but unstable over time (seasonality), should CUPED be used?

Answer

A

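Use it with caution. CUPED stays unbiased because seasonality hits both arms equally, but the variance reduction depends on the correlation between X and Y during the experiment, not the historical one. If that correlation has decayed, the gain shrinks, so re-estimate it on current data before relying on it.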

Question 6

Q

What is baseline contamination?

Answer

A

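Treatment affecting the metric used as X, for example when the baseline window overlaps the exposure period. X is then no longer independent of assignment, and the adjustment biases the effect estimate.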

Question 7

Q

Pre-period logs were partially missing. Should you apply CUPED?

Answer

A

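It depends on why the logs are missing. Pre-period data predates assignment, so gaps unrelated to assignment do not bias the estimate; they only weaken the correlation and the variance reduction. Avoid CUPED if the gaps could differ between arms.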

Question 8

Q

Why can CUPED dramatically reduce required sample size?

Answer

A

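Required sample size scales with metric variance. Lower variance gives more power at a fixed sample size, or the same power and minimum detectable effect with fewer users, hence a shorter experiment. With correlation ρ between X and Y, the required sample size shrinks by a factor of about 1 − ρ².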
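
The effect on sample size can be sketched with the usual two-sample z-test approximation, n per arm = 2 (z_{α/2} + z_β)² σ² / δ². The function name `users_per_arm` and the default α and power are assumptions for illustration; CUPED enters only by scaling the variance by (1 − ρ²).

```python
from statistics import NormalDist

def users_per_arm(sigma, mde, alpha=0.05, power=0.8, rho=0.0):
    """Approximate users per arm for a two-sided, two-sample z-test on means.

    rho is the assumed correlation between baseline X and metric Y;
    CUPED scales the metric variance by (1 - rho**2).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    variance = sigma ** 2 * (1 - rho ** 2)
    return 2 * (z_alpha + z_beta) ** 2 * variance / mde ** 2

n_plain = users_per_arm(sigma=1.0, mde=1.0)
n_cuped = users_per_arm(sigma=1.0, mde=1.0, rho=0.75)
ratio = n_cuped / n_plain
```

Under these assumptions the ratio depends only on ρ, not on σ, the minimum detectable effect, α, or power.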

Question 9

Q

What is the main assumption behind CUPED?

Answer

A

Pre-period metric X predicts part of the variation in experiment metric Y, and treatment does not affect X.

Question 10

Q

If correlation(X,Y)=0.75, what variance reduction can CUPED offer?

Answer

A

About 56% variance reduction: the reduction equals correlation², and 0.75² = 0.5625.
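
The correlation² rule follows from the variance of the adjusted metric. A short derivation, assuming θ is the variance-minimizing coefficient and ρ denotes the correlation between X and Y:

```latex
\begin{aligned}
Y_{\text{cuped}} &= Y - \theta\,\bigl(X - \mathbb{E}[X]\bigr),
  \qquad \theta = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)} \\
\operatorname{Var}(Y_{\text{cuped}})
  &= \operatorname{Var}(Y) - 2\theta\,\operatorname{Cov}(X, Y) + \theta^{2}\operatorname{Var}(X) \\
  &= \operatorname{Var}(Y) - \frac{\operatorname{Cov}(X, Y)^{2}}{\operatorname{Var}(X)}
   = \operatorname{Var}(Y)\,\bigl(1 - \rho^{2}\bigr)
\end{aligned}
```

With ρ = 0.75 the remaining variance is 1 − 0.5625 = 0.4375 of the original.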

(10 cards)