Bug 70246 - REGRESSION(r96189): Cappuccino applications don't work anymore
Summary: REGRESSION(r96189): Cappuccino applications don't work anymore
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac (Intel) All
: P1 Normal
Assignee: Filip Pizlo
URL:
Keywords: InRadar
Depends on: 70770 70777 70854 97373
Blocks:
  Show dependency treegraph
 
Reported: 2011-10-17 09:52 PDT by Antoine Mercadal
Modified: 2012-09-21 16:03 PDT (History)
4 users (show)

See Also:


Attachments
bug 70246 test case (3.08 MB, application/zip)
2012-09-20 18:40 PDT, Antoine Mercadal
no flags Details
simple reduced case (153 bytes, text/plain)
2012-09-21 15:33 PDT, Filip Pizlo
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Antoine Mercadal 2011-10-17 09:52:09 PDT
Between this two revisions, something happened in JavaScriptCore that prevents Cappuccino applications to work properly in WebKit Nightlies.
I've manually bisected between these two nighty builds, and looking at the changesets, and the culprit may be (but it is just an assumption) r96164, as there is lot of change in JSC. I've tried to revert this commit, but I gave up after too much conflicts and no sufficient understanding of the impacted code to resolve them.

The errors are the following:

 - sometimes it works
 - most of the time, it crashes at random place in the code, throwing random syntax errors
 - sometimes it crashes telling isa.info is not an object (the isa.info from Cappuccino runtime)
 - with the JS Debugger activated, it improves drastically the chance to get the app working fine (but sometimes, it fails)
 - when JS console is opened, it sometimes crashes the SafariWebContent process

It works well on standard Safari 5.1.1 shipped with Mac OS X 10.7.2, and latest revision of Chromium (downloaded a nightly today)
You can find Cappuccino application here:

 - http://app.archipelproject.org (you should not be able to reach the login form)
 - http://githubissues.heroku.com

The size of the application and the amount of data loaded/Ajax requests seems to have a direct relation with the crashes: The default Cappuccino Hello World application loads most of the time fine (but sometimes crashes).

It is really hard to dig further as enabling JS debugger make it work, or make the content process to crash.

Please tell me if you need any further information.

Thanks,
Comment 1 Mark Hahnenberg 2011-10-17 10:31:46 PDT
The range of revisions you've specified is still pretty wide.  There are several JSC changes within that range.  What leads you to believe it was r96164?
Comment 2 Darin Adler 2011-10-17 10:42:59 PDT
(In reply to comment #0)
> I've manually bisected between these two nighty builds

Could you say a little more about what you learned when bisecting manually?
Comment 3 Antoine Mercadal 2011-10-17 10:45:09 PDT
> Could you say a little more about what you learned when bisecting manually?

Basically using the nightly build r96162 makes Cappuccino apps working like a charm and when upgrading to the very next rr96196 makes it not working anymore, randomly throwing the errors mentioned above.
Comment 4 Alexey Proskuryakov 2011-10-17 11:36:56 PDT
> - http://app.archipelproject.org (you should not be able to reach the login form)

Further bisecting this example, it's <http://trac.webkit.org/changeset/96189>.
Comment 5 Antoine Mercadal 2011-10-17 11:39:52 PDT
> The range of revisions you've specified is still pretty wide.  There are several JSC changes within that range.  What leads you to believe it was r96164?

It was just an assumption. I'm tired, so I haven't thought to simply checkout r96162, and build commit by commit from it. Let me do this. I'll post the result in a few minutes
Comment 6 Antoine Mercadal 2011-10-17 12:39:22 PDT
(In reply to comment #4)
> > - http://app.archipelproject.org (you should not be able to reach the login form)
> 
> Further bisecting this example, it's <http://trac.webkit.org/changeset/96189>.

I confirm, I just finish my bisect and I run in the same commit.
Comment 7 Geoffrey Garen 2011-10-17 12:47:19 PDT
<rdar://problem/10297908>
Comment 8 Filip Pizlo 2011-10-25 16:27:23 PDT
I've fixed three bugs that I found by loading http://app.archipelproject.org.  These bugs are linked to this one as dependencies.  It appears that I can now reach the login form.  I have not checked the other URL.  If someone can verify that as of http://trac.webkit.org/changeset/98398, http://app.archipelproject.org does indeed load correctly, that would be great!

I will leave this bug open until I have time to also look at http://githubissues.heroku.com/.  I have other bugs to address right now so I don't know when I'll look at it.  But if anyone can confirm that the Cappuccino issues are fixed, then feel free to close the bug.
Comment 10 Antoine Mercadal 2012-09-20 11:43:41 PDT
The exact same thing is happening again, since r128441
Comment 11 Filip Pizlo 2012-09-20 16:45:46 PDT
(In reply to comment #10)
> The exact same thing is happening again, since r128441

Looking at it.
Comment 12 Antoine Mercadal 2012-09-20 16:55:16 PDT
I'll add that this time, 

- Githubissues is working (I think this is because the it's a "built" version i.e. the Cappuccino runtime is not used)

- On many other apps, I got the a "Syntax Error: Unexpected EOF"

- On app.archipelproject.org (not "built" version) we have the EOF error.

- Activating the Debugger (Webkit or Safari one) makes the page loading.
Comment 13 Filip Pizlo 2012-09-20 17:38:40 PDT
(In reply to comment #12)
> I'll add that this time, 
> 
> - Githubissues is working (I think this is because the it's a "built" version i.e. the Cappuccino runtime is not used)
> 
> - On many other apps, I got the a "Syntax Error: Unexpected EOF"

What is an example of another app that doesn't work?

> 
> - On app.archipelproject.org (not "built" version) we have the EOF error.

http://app.archipelproject.org loads fine for me, up to the login screen, in r129158.

Maybe this has since been fixed?  We did have regressions after r128400 but they were fixed quite quickly, however r128441 was still in the danger zone.

> 
> - Activating the Debugger (Webkit or Safari one) makes the page loading.
Comment 14 Antoine Mercadal 2012-09-20 17:56:03 PDT
(In reply to comment #13)

> > - On many other apps, I got the a "Syntax Error: Unexpected EOF"
> What is an example of another app that doesn't work?

Some internal apps sadly cannot share here.

> > - On app.archipelproject.org (not "built" version) we have the EOF error.
> http://app.archipelproject.org loads fine for me, up to the login screen, in r129158.
> 
> Maybe this has since been fixed?  We did have regressions after r128400 but they were fixed quite quickly, however r128441 was still in the danger zone.
> 

Wo.. yeah, you are right it's working on app.archipelproject.org.

This is still happening with r128959 my apps. this is very weird
Comment 15 Antoine Mercadal 2012-09-20 17:58:39 PDT
well maybe there is something bad in my files, but the problem is that starting the debugger makes it work, so I'm not from where to start.
Comment 16 Filip Pizlo 2012-09-20 17:59:53 PDT
(In reply to comment #15)
> well maybe there is something bad in my files, but the problem is that starting the debugger makes it work, so I'm not from where to start.

If there is a way that you can share an app that doesn't work, then that would be great.
Comment 17 Antoine Mercadal 2012-09-20 18:40:18 PDT
Created attachment 165033 [details]
bug 70246 test case

Ok, here is a sample app causing problem.

Procedure: open index-debug.html (this application will throw lot of errors, but when it's "working" you should see a window with a textfield and a slider)

Case:

In this extracted code, there are 3 referenced block identified by:

 /* BEGIN: REFERENCED BLOCK [A|B|C] */
 /* END: REFERENCED BLOCK [A|B|C] */


The error happens differently according to which blocks are commented out.

Here is what I found so far:


nothing commented: KO

A commented: OK
B commented: KO
C commented: KO

A,B commented: OK
A,C commented: OK
B,C commented: OK

A,B,C commented: OK


I cannot find any obvious logic in this.
Comment 18 Filip Pizlo 2012-09-21 12:46:43 PDT
(In reply to comment #17)
> Created an attachment (id=165033) [details]
> bug 70246 test case
> 
> Ok, here is a sample app causing problem.
> 
> Procedure: open index-debug.html (this application will throw lot of errors, but when it's "working" you should see a window with a textfield and a slider)
> 
> Case:
> 
> In this extracted code, there are 3 referenced block identified by:
> 
>  /* BEGIN: REFERENCED BLOCK [A|B|C] */
>  /* END: REFERENCED BLOCK [A|B|C] */
> 
> 
> The error happens differently according to which blocks are commented out.
> 
> Here is what I found so far:
> 
> 
> nothing commented: KO
> 
> A commented: OK
> B commented: KO
> C commented: KO
> 
> A,B commented: OK
> A,C commented: OK
> B,C commented: OK
> 
> A,B,C commented: OK
> 
> 
> I cannot find any obvious logic in this.

Awesome. I can repro.

It looks like it's our JIT's fault, as disabling it makes the bug go away.
Comment 19 Antoine Mercadal 2012-09-21 13:42:58 PDT
> Awesome. I can repro.
> 
> It looks like it's our JIT's fault, as disabling it makes the bug go away.

Cool. Any way to temporary disable JIT without rebuilding everything?
Comment 20 Filip Pizlo 2012-09-21 15:25:45 PDT
(In reply to comment #19)
> > Awesome. I can repro.
> > 
> > It looks like it's our JIT's fault, as disabling it makes the bug go away.
> 
> Cool. Any way to temporary disable JIT without rebuilding everything?

Source/JavaScriptCore/runtime/Options.cpp

add "useJIT() = false;" somewhere below the "Deconfuse editors..." line.

Then recompile.  Shouldn't take too long since it's just one file.

Other than that, there's no easy way.
Comment 21 Filip Pizlo 2012-09-21 15:26:50 PDT
It appears that I've found the cause, it was broken CSE on GetArrayLength.  The fix is done but I'm just writing a regression test.
Comment 22 Antoine Mercadal 2012-09-21 15:31:41 PDT
> add "useJIT() = false;" somewhere below the "Deconfuse editors..." line.
> Then recompile.  Shouldn't take too long since it's just one file.

Well, I don't have a local copy of the source now, so I'll just wait for the fix :)

(In reply to comment #21)
> It appears that I've found the cause, it was broken CSE on GetArrayLength.  The fix is done but I'm just writing a regression test.

Awesome!
Comment 23 Filip Pizlo 2012-09-21 15:33:13 PDT
Created attachment 165212 [details]
simple reduced case

Note that it will start to print '1,1' after about 66 iterations.  That's when we tier up to the optimizing JIT, and then the optimizing JIT mistakenly assumes that 'array.length' was invariant under 'array[1] = 42'.  It makes this mistake because of my array hole optimizations:

- 'array[1] = 42' is an in-bounds store, since the original array allocation would have created an array with enough room for 4 elements.

- 'array[1] = 42' is a holy store, since it stores to an empty value (a hole).

- We correctly infer that this does not clobber the world.  So for example accesses to unrelated things (like regular object properties, global variables, etc) would not see this array store.

- But then we incorrectly infer that this does not clobber the length.  It clearly does - in this case it changes the length from 1 to 2.

The more fundamental bug was that CSE was being way overaggressive for GetArrayLength in general. It was assuming that if in between two GetArrayLengths there wasn't anything that "clobbered the world" (function calls, crazy polymorphism, eval, etc) and the two GetArrayLengths accessed the same array, then the second one was redundant. It was saved a little bit by a separate bug where it would fall back if the butterfly CSE failed (so anything that could reallocate backing storage would seem to clobber length).  In this case, the PutByVal for 'array[1] = 42' doesn't clobber the world and doesn't lead to array storage reallocation - so GetArrayLength CSE would kick in, and boom, wrong result.

The solution is to actually write a proper interference analysis for GetArrayLength.  That will fix it.
Comment 24 Filip Pizlo 2012-09-21 16:00:13 PDT
Closing as fixed following https://bugs.webkit.org/show_bug.cgi?id=97373, http://trac.webkit.org/changeset/129266.  Please reopen if you encounter more issues!
Comment 25 Antoine Mercadal 2012-09-21 16:03:27 PDT
Thank you very much for the quick fix, as usual!