Bug 137396

Summary: Bias in first few random numbers generated by WeakRandom
Product: WebKit Reporter: ajith
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: NEW ---    
Severity: Normal CC: ap, bastien, ggaren
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Mac (Intel)   
OS: Unspecified   
Attachments:
Description Flags
C++ code to reproduce bug. none

Description ajith 2014-10-03 12:19:32 PDT
The first few random numbers generated after WeakRandom is initialized show a bias. While this is may not be an issue for a web-page calling Math.random many more times (the random stream has good uniformity properties), it is a problem in the following scenario. 

Consider a web-app that runs experiments and shows different content dynamically based on 2 successive random numbers returned by Math.random.  The first number decides if that page load is eligible for the experiment or not; if eligible, the second number decides on showing one of two content variations with 50-50 probability. The attached file 'RandomBucketingBias.cpp' simulates these bucketing trials 100,000 times, initializing WeakRandom at each iteration to simulate a distributed scenario where many different end-users load the page independently.

Compile 'RandomBucketingBias': g++ -o RandomBucketingBias RandomBucketingBias.cpp
Running 'RandomBucketingBias': ./RandomBucketingBias
shows something like:
Running bucketing tests:
Of 100000 tests, 50004 ignored, 49996 not ignored
Bucket counts 31235 18761, ratio = 1.664890

PASS: Ignores trials as expected
FAIL: First bucket count as expected
FAIL: Second bucket count as expected

One can skip the first k random numbers after initializing WeakRandom. Interestingly, skipping 3 numbers seems to get rid of the bias.
RandomBucketingBias takes 2 arguments. 
The first is the ignore percentage, an integer between 0 and 100 to specify what percentage of the trials must proceed to the second bucketing stage.
The second is the number of random numbers to skip before running the experiment. So, ./RandomBucketingBias 50 3 shows:
Running bucketing tests:
Of 100000 tests, 50020 ignored, 49980 not ignored
Bucket counts 24841 25139, ratio = 0.988146

PASS: Ignores trials as expected
PASS: First bucket count as expected
PASS: Second bucket count as expected

I also tested this against the Chromium RNG code; it shows no bias. For the full test see:  https://gist.github.com/ajithopti/bdd3cb21c89e203f569a
Comment 1 ajith 2014-10-03 12:20:46 PDT
Created attachment 239219 [details]
C++ code to reproduce bug.
Comment 2 ajith 2014-10-15 14:20:38 PDT
Any progress on this? There's a blog post in the works about this, so in the ideal case, am hoping for a fix before that is published.
Comment 3 Geoffrey Garen 2014-10-16 19:18:19 PDT
Looks like initializeSeed can correct for bias by using a different constant.

Can your experiment work around this limitation by explicitly skipping the first few random numbers?
Comment 4 ajith 2014-10-20 09:56:39 PDT
We currently have a fix that avoids the bucketing issues. If a different constant does indeed fixes this issue, it'd be great to have this fix in. Thanks for checking this.
Comment 5 ajith 2014-11-07 09:43:00 PST
Geoffrey,
Just checking. Will you or someone else be applying a fix for this?
Comment 6 Bastien Caudan 2016-06-14 06:46:45 PDT
It seems that we have this particular issue with our a/b test algorithm. We have this kind of repartition between test versions on safari:
Version 0:    800 
Version 1: 11 000  
Version 2: 35 000 
Version 3: 24 000 

We observe this behavior with all safari versions with significant traffic: 9.x, 8.0, 7.0, 6.1.6
However, Safari 4.0 does not seem to have this issue. 

We have reproduced this behavior on all safari versions available on saucelabs with this piece of code : https://gist.github.com/bcaudan/bfab026fa2416af9fc48dfacc4a59938

Do you have any news on this issue?
Comment 7 Geoffrey Garen 2016-06-14 10:50:00 PDT
Have you tested Safari Technology Preview?
Comment 8 Bastien Caudan 2016-06-15 05:54:07 PDT
No. Is it supposed to be fixed?
Comment 9 Geoffrey Garen 2016-06-15 12:36:59 PDT
Yes.
Comment 10 Bastien Caudan 2016-06-16 01:22:02 PDT
Great, then! Thx.