RESOLVED FIXED 16909
exec-crash REGRESSION: Amazon.com crash (ActivationImp)
https://bugs.webkit.org/show_bug.cgi?id=16909
Summary REGRESSION: Amazon.com crash (ActivationImp)
Cameron Zwarich (cpst)
Reported 2008-01-17 12:22:35 PST
Navigate to http://www.amazon.com/ and click on the Amazon logo in the top left corner. Go back and do it again. Repeating this a small number of times leads to a crash. I haven't tested it yet on an old nightly build, but this crash was probably introduced by the ActivationImp tear-off patch r29425. The address it dies on is at a small offset from 0, it can happen at a number of places in the code, and it does not quite happen deterministically, so it looks like a missed GC mark along the lines of bug 16868 or bug 16871. I'll post a stack trace with a modified GC that collects after every allocation.
Attachments
Stack trace (27.75 KB, text/plain)
2008-01-17 13:23 PST, Cameron Zwarich (cpst)
no flags
ExecState activity log (477.88 KB, text/plain)
2008-01-17 17:34 PST, Cameron Zwarich (cpst)
no flags
execstate activity log annotated with GC START and GC END markers (477.94 KB, text/plain)
2008-01-17 17:54 PST, Maciej Stachowiak
no flags
perl script to find inconsistencies (573 bytes, text/plain)
2008-01-17 17:55 PST, Maciej Stachowiak
no flags
results of perl script (look for the XXX) (478.89 KB, text/plain)
2008-01-17 17:56 PST, Maciej Stachowiak
no flags
this patch is a .4% SunSpider regression so it can't go in as-is but I think I know how to make it viable (4.91 KB, patch)
2008-01-18 01:55 PST, Maciej Stachowiak
no flags
.7% speedup - do we have a winner? (7.73 KB, patch)
2008-01-19 12:58 PST, Geoffrey Garen
no flags
The accompanying WebCore patch (3.02 KB, patch)
2008-01-19 13:20 PST, Geoffrey Garen
no flags
This patch builds and passes regression tests (12.69 KB, patch)
2008-01-19 14:25 PST, Geoffrey Garen
mjs: review+
Cameron Zwarich (cpst)
Comment 1 2008-01-17 12:23:13 PST
Cameron Zwarich (cpst)
Comment 2 2008-01-17 13:23:02 PST
Created attachment 18509 [details] Stack trace Strangely enough, I can't actually get it to crash with the GC modified to always collect after an allocation. Perhaps it is a timing issue that the extra garbage collection eliminates. Here is a stack trace of a typical crash.
Maciej Stachowiak
Comment 3 2008-01-17 15:40:40 PST
Alternate steps for a crash on Amazon (which are more reliable for me): 1. Go to Amazon.com 2. Search for "Hoboken Chicken Emergency" 3. Open the link for the first product result in a background tab.
Cameron Zwarich (cpst)
Comment 4 2008-01-17 16:12:18 PST
Maciej's instructions reliably reproduce the crash for me, with a similar looking stack trace to what I posted. Still, changing it to garbage collect every allocation makes the crash impossible to reproduce. Instead of crashing, it prints a JavaScript object such as function n2RunEvent(sWhen) { goN2Initializer.run(sWhen); } or [object Object] at the top of the page.
Cameron Zwarich (cpst)
Comment 5 2008-01-17 17:34:08 PST
Created attachment 18518 [details] ExecState activity log Tearing off every ActivationImp as soon as it is created fixes the crash, which suggests that the problem is due to a missed reference to an ActivationImp from an ExecState that should be explicitly mark()'d but isn't. Tearing off the activation puts a pointer into the GC heap in m_activation, which is caught by the conservative collector. Maciej suggested I log ExecState creations, deletions, and markings so that we can check for anything suspicious. I've attached such a log from a session that crashed.
Maciej Stachowiak
Comment 6 2008-01-17 17:54:59 PST
Created attachment 18520 [details] execstate activity log annotated with GC START and GC END markers
Maciej Stachowiak
Comment 7 2008-01-17 17:55:23 PST
Created attachment 18521 [details] perl script to find inconsistencies
Maciej Stachowiak
Comment 8 2008-01-17 17:56:24 PST
Created attachment 18522 [details] results of perl script (look for the XXX)
Maciej Stachowiak
Comment 9 2008-01-18 01:36:52 PST
With more investigation, I found that some live ExecStates fail to get marked specifically because script execution (via the <script> element) sometimes breaks both the callingExec and savedExec chains. I have a fix but it might be bad for performance - will have to test.
Maciej Stachowiak
Comment 10 2008-01-18 01:55:27 PST
Created attachment 18523 [details] this patch is a .4% SunSpider regression so it can't go in as-is but I think I know how to make it viable
David Kilzer (:ddkilzer)
Comment 11 2008-01-19 11:22:28 PST
David Kilzer (:ddkilzer)
Comment 12 2008-01-19 11:24:33 PST
(In reply to comment #1) > <rdar://problem/5691998> Cameron, please add the "InRadar" keyword to a bug if you file them in Radar. Thanks!
Geoffrey Garen
Comment 13 2008-01-19 12:58:47 PST
Created attachment 18549 [details] .7% speedup - do we have a winner?
Geoffrey Garen
Comment 14 2008-01-19 13:20:47 PST
Created attachment 18552 [details] The accompanying WebCore patch
Geoffrey Garen
Comment 15 2008-01-19 14:25:55 PST
Created attachment 18556 [details] This patch builds and passes regression tests
Maciej Stachowiak
Comment 16 2008-01-20 16:24:54 PST
Comment on attachment 18556 [details] This patch builds and passes regression tests r=me Thanks for finishing this off. I'd guess we can also reinstate the LocalStorage optimization since all ExecStates will be tracked?
Geoffrey Garen
Comment 17 2008-01-21 15:12:58 PST
I'm working on landing this now. With a little work, yes, I believe we can re-instate the LocalStorage buffer caching optimization.
Geoffrey Garen
Comment 18 2008-01-21 22:22:57 PST
Committed revision 29710.
Note You need to log in before you can comment on or make changes to this bug.