Bug 137139

Summary: Failed to run SunSpider 1.0.2 in Chrome
Product: WebKit Reporter: Yang Gu <yang.gu>
Component: Tools / TestsAssignee: Nobody <webkit-unassigned>
Status: RESOLVED INVALID    
Severity: Major CC: barraclough, benjamin, commit-queue, darin, dewei_zhu, fpizlo, ggaren, mjs, msaboff, rniwa
Priority: P1    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch darin: review-

Description Yang Gu 2014-09-25 22:31:29 PDT
I tried both Chrome for Linux and Android, and it had high possibility I could not get the final result. The reason is though we should have 26 cases for each round, but actually I could not get the results of all the them. If some of them are "undefined", the results.html could not parse the result. The issue is very easy to reproduce if you directly (not using web server) run sunspider 1.0.2  in Chrome. 

The root cause as I think is below: 
Snippet of code in function next and recordResult

function next()
{
...
    testFrame.contentDocument.open();
    testFrame.contentDocument.write(testContents[testIndex]);
    testFrame.contentDocument.close();
    window.setTimeout(next, 0);
...
}

function recordResult(time)
{
    if (currentRepeat >= 0) // negative repeats are warmups
        output[currentRepeat][tests[testIndex]] = time;
}

Chrome uses standalone process to handle iframe. Thus there is a race condition here that before the recordResult is executed, function next is triggered in a new round and the iframe is filled with new content. The result from previous round gets no chance to be recorded. 

I changed the logic as below (will send out a patch), and everything recovers for me. 

function next()
{
...
    testFrame.contentDocument.open();
    testFrame.contentDocument.write(testContents[testIndex]);
    testFrame.contentDocument.close();
...
}

function recordResult(time)
{
    if (currentRepeat >= 0) // negative repeats are warmups
        output[currentRepeat][tests[testIndex]] = time;

    window.setTimeout(next, 0);
}

Please verify this issue.
Comment 1 Yang Gu 2014-09-25 23:41:52 PDT
Created attachment 238698 [details]
Patch
Comment 2 Yang Gu 2014-09-25 23:51:27 PDT
Not sure if I have submitted the patch in a correct way, as I couldn't find commit message in my git log as below:

Fix race condition in SunSpider 1.0.2 for Chrome

Chrome uses standalone process to handle iframe. Thus there is a chance
before recordResult is executed, a new round of test is triggered and
the content of iframe is replaced. The result of previous case would
never be recorded and result in errors in results.html.

What should I do?
Comment 3 Benjamin Poulain 2014-09-26 12:27:12 PDT
(In reply to comment #2)
> Not sure if I have submitted the patch in a correct way, as I couldn't find commit message in my git log as below:
> 
> Fix race condition in SunSpider 1.0.2 for Chrome
> 
> Chrome uses standalone process to handle iframe. Thus there is a chance
> before recordResult is executed, a new round of test is triggered and
> the content of iframe is replaced. The result of previous case would
> never be recorded and result in errors in results.html.
> 
> What should I do?

The complete explanation should be in your changelog.
Comment 4 Yang Gu 2014-09-27 17:46:53 PDT
Created attachment 238795 [details]
Patch
Comment 5 Yang Gu 2014-09-27 17:48:28 PDT
I updated the patch with new ChangeLog. Please take a review again. Thanks!
Comment 6 Geoffrey Garen 2014-10-03 17:11:56 PDT
The document.open/write/close APIs are synchronous. So, the target document should call back to recordResult synchronously before any new loading takes place. What is it about Chrome's process model that prevents document.open/write/close from being synchronous? Is this a bug in Chrome?

Why is a timeout of 0 sufficient? If the target iframe is truly decoupled and asynchronous, the required timeout could be unpredictable, and much longer than 0.
Comment 7 Darin Adler 2016-03-09 09:00:37 PST
Comment on attachment 238795 [details]
Patch

Nk we won't make this change. SunSpider is mostly obsolete and people should use JetStream http://browserbench.org/JetStream/
Comment 8 Darin Adler 2016-03-09 09:00:59 PST
I meant to say I think.
Comment 9 Filip Pizlo 2016-03-09 10:06:15 PST
(In reply to comment #7)
> Comment on attachment 238795 [details]
> Patch
> 
> Nk we won't make this change. SunSpider is mostly obsolete and people should
> use JetStream http://browserbench.org/JetStream/

I think that if this can be fixed easily then we would do it. We still let people run SunSpider and it's a shame if the benchmark itself has a bug that causes it to not give any results in some cases.
Comment 10 Darin Adler 2016-03-09 11:58:10 PST
OK, would you like to review the patch then?
Comment 11 Filip Pizlo 2016-03-09 12:09:41 PST
Comment on attachment 238795 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=238795&action=review

I'll let the r- stand because I don't agree with the diagnosis or fix.

> PerformanceTests/SunSpider/ChangeLog:11
> +        Chrome uses standalone process to handle iframe. Thus there is a
> +        chance before recordResult is executed, a new round of test is
> +        triggered and the content of iframe is replaced. The result of
> +        previous case would never be recorded and result in errors in
> +        results.html.

I don't buy this.  To my understanding, Chrome doesn't do it this way.

> PerformanceTests/SunSpider/ChangeLog:12
> +        The commit also removes some unexpected trailing whitespaces.

We usually don't mix style changes with functional changes like this, especially when it's just whitespace.

> PerformanceTests/SunSpider/resources/driver-TEMPLATE.html:29
> - OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
> + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Please revert.

> PerformanceTests/SunSpider/resources/driver-TEMPLATE.html:57
> -    for (var start = new Date; new Date - start < warmupMS; ) {
> +    for (var start = new Date; new Date - start < warmupMS;) {

Revert.

> PerformanceTests/SunSpider/resources/driver-TEMPLATE.html:65
> -function start() 
> +function start()

Revert.
Comment 12 dewei_zhu 2016-03-09 13:30:28 PST
Comment on attachment 238795 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=238795&action=review

>> PerformanceTests/SunSpider/ChangeLog:11
>> +        results.html.
> 
> I don't buy this.  To my understanding, Chrome doesn't do it this way.

Correct me if I'm wrong. According to our observation, the problem is that we are loading sub-resources (<link rel="stylesheet" href="../sunspider.css">)  in the tests. This will trigger a separate thread loading the resource. And next round could happen before resource loading is finished.
I submitted a related patch for benchmark automation. https://bugs.webkit.org/show_bug.cgi?id=152938
Solution proposed by this patch is one of the options when we tried to fix this. But since 'sunspider.css' is not used, and changing the the way runs the test sounds more risky, we decide to remove those resource in the tests.