Does a small imbalance (e.g., 51% vs 49%) automatically mean SRM?
No — small imbalances occur by chance, and whether a given split is suspicious depends on sample size. Run a chi-square goodness-of-fit test against the intended ratio; only a statistically significant deviation confirms SRM.
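A minimal stdlib-only sketch of the chi-square SRM check (the helper name `srm_pvalue` is mine, not a standard API; for 1 degree of freedom the p-value reduces to `erfc(sqrt(chi2/2))`, so no scipy is needed). It shows the same 51/49 split being harmless at one sample size and damning at another:

```python
import math

def srm_pvalue(count_a: int, count_b: int, ratio_a: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value (df=1) for a two-bucket split."""
    n = count_a + count_b
    exp_a, exp_b = n * ratio_a, n * (1 - ratio_a)
    chi2 = (count_a - exp_a) ** 2 / exp_a + (count_b - exp_b) ** 2 / exp_b
    # For 1 degree of freedom: P(X > chi2) = erfc(sqrt(chi2 / 2))
    return math.erfc(math.sqrt(chi2 / 2))

# The same 51/49 split, two very different conclusions:
print(srm_pvalue(510, 490))        # n = 1,000   -> p ~ 0.53, consistent with chance
print(srm_pvalue(51_000, 49_000))  # n = 100,000 -> p far below 0.001, almost certainly SRM
```

In practice teams use a strict threshold (often p < 0.001 rather than 0.05) because the SRM check is run on many experiments and metrics at once.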
Should chi-square for SRM be run on user-level assignments or on events?
User-level assignments — one observation per randomization unit. Event-level counts inflate the sample size because each user is counted many times, so the chi-square test fires false positives.
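A toy illustration of that inflation, under the assumption that every user fires the same number of events (numbers are invented for the demo): the user-level split passes the test, but the identical split counted at the event level looks like severe SRM.

```python
import math

def chi2_pvalue(a: int, b: int) -> float:
    """p-value of a 50/50 chi-square test with 1 df (stdlib only)."""
    exp = (a + b) / 2
    chi2 = (a - exp) ** 2 / exp + (b - exp) ** 2 / exp
    return math.erfc(math.sqrt(chi2 / 2))

users_a, users_b = 515, 485   # user-level counts: a plausible chance deviation
events_per_user = 20          # assume every user fires 20 events
events_a, events_b = users_a * events_per_user, users_b * events_per_user

print(chi2_pvalue(users_a, users_b))    # ~0.34 -> no evidence of SRM
print(chi2_pvalue(events_a, events_b))  # ~2e-5 -> spurious "SRM" from double counting
```

The chi-square statistic scales linearly with the per-user multiplier, so any repeated counting of the same randomization unit manufactures significance.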
SRM appears only in conversion events, not assignment events. What does this mean?
Assignment itself is correct; the failure is downstream — event logging, pipeline joins, or ETL ingestion is dropping data for certain buckets.
Your experiment shows perfect 50/50 assignment but checkout events show 60/40. Why?
Likely causes: event logging failures, broken client-side tracking, bucket-dependent differences in event firing (e.g., the treatment changes the checkout flow and drops a tracking call), or inconsistent pipeline joins.
SRM appears only on mobile traffic, not desktop. Likely cause?
Likely a mobile app version mismatch (older clients lack the experiment or logging code), missing client-side logging on mobile, or mobile-specific caching/stickiness.
SRM reversed mid-experiment (A bigger first 3 days, B bigger last 3 days). What does it suggest?
A mid-experiment change: rollout issues, partial ramping, a deployment or configuration change, or a logging regression introduced midweek. Check deploy and config histories against the date the ratio flipped.
Assignment is correct but session counts are wrong by bucket. Why?
Sticky sessions or caching route certain users' sessions disproportionately to one bucket, so session counts diverge even though user-level assignment is balanced.
If filtering is applied after assignment (e.g., eligible users only), what happens?
Post-assignment filtering breaks randomization: if eligibility correlates with treatment (e.g., the treatment itself changes who qualifies), the filtered buckets have different eligibility distributions, producing SRM and biased comparisons. Filter before assignment, or define eligibility from pre-exposure data.
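A small simulation of this failure mode, under an invented assumption that the treatment in bucket B makes 10% of users bounce before reaching the "eligible" step (all rates and names here are illustrative, not from any real experiment):

```python
import random
from collections import Counter

random.seed(42)
counts_all, counts_filtered = Counter(), Counter()
for _ in range(100_000):
    bucket = "A" if random.random() < 0.5 else "B"
    counts_all[bucket] += 1
    # "Eligible" = user reached the checkout page. Hypothetically, the
    # treatment in B causes 10% more users to bounce before reaching it.
    reached = random.random() < (0.90 if bucket == "A" else 0.81)
    if reached:
        counts_filtered[bucket] += 1

print(counts_all)       # ~50/50: the randomization itself is fine
print(counts_filtered)  # skewed toward A: the post-assignment filter induced SRM
```

The SRM here is a symptom of a deeper problem: the filtered populations are no longer exchangeable, so any metric comparison on them is biased regardless of the ratio.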
Bots hit one bucket much more due to caching. How does this affect SRM?
Artificial SRM — bot traffic distorts the observed assignment ratio even though human assignment is correct. Exclude identified bots before running the SRM test.
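A toy simulation of the cached-bot scenario (counts and seed are arbitrary): humans are randomized 50/50, while bots repeatedly hit a cached page pinned to bucket A, so the combined counts show SRM that disappears once bot traffic is excluded.

```python
import random
from collections import Counter

random.seed(7)
counts = Counter()
for _ in range(20_000):  # humans: properly randomized
    counts["A" if random.random() < 0.5 else "B"] += 1
human_counts = dict(counts)

for _ in range(3_000):   # bots: all served a cached response assigned to A
    counts["A"] += 1

print(human_counts)  # roughly balanced -> assignment logic is fine
print(dict(counts))  # A inflated -> the SRM is an artifact of bot traffic
```

The remedy is bot filtering (user-agent rules, behavioral heuristics) applied before the SRM check — not changes to the assignment code.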
Is the presence of SRM always a showstopper?
Yes — a confirmed SRM means the buckets are no longer comparable, so the results cannot be interpreted until the cause is diagnosed and fixed; usually the experiment must then be rerun.