Bug 67682 - Accessibility tests crashing in BasicRawSentinelNode code
Summary: Accessibility tests crashing in BasicRawSentinelNode code
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P1 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-06 15:59 PDT by Alexey Proskuryakov
Modified: 2011-09-06 19:04 PDT (History)
5 users (show)

See Also:


Attachments
proposed patch (1.56 KB, patch)
2011-09-06 16:18 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff
the patch - fixed comment to more accurately reflect the change (1.96 KB, patch)
2011-09-06 16:32 PDT, Filip Pizlo
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alexey Proskuryakov 2011-09-06 15:59:19 PDT
When I run regression tests with GuardMalloc, there are semi-reproducible crashes like the below.

$ run-webkit-tests -g accessibility/ --repeat 3

This doesn't seem to be a failure from any single test - when I run them each 20 times, I sometimes don't get any crash at all.

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   com.apple.JavaScriptCore      	0x0000000107351d74 WTF::BasicRawSentinelNode<JSC::CallLinkInfo>::setNext(WTF::BasicRawSentinelNode<JSC::CallLinkInfo>*) + 20 (SentinelLinkedList.h:60)
1   com.apple.JavaScriptCore      	0x000000010735351f WTF::SentinelLinkedList<JSC::CallLinkInfo, WTF::BasicRawSentinelNode<JSC::CallLinkInfo> >::remove(JSC::CallLinkInfo*) + 319 (SentinelLinkedList.h:147)
2   com.apple.JavaScriptCore      	0x000000010734d535 WTF::BasicRawSentinelNode<JSC::CallLinkInfo>::remove() + 21 (SentinelLinkedList.h:98)
3   com.apple.JavaScriptCore      	0x000000010734ef98 JSC::CallLinkInfo::~CallLinkInfo() + 56 (CodeBlock.h:114)
4   com.apple.JavaScriptCore      	0x000000010734ef55 JSC::CallLinkInfo::~CallLinkInfo() + 21 (CodeBlock.h:114)
...
Comment 1 Filip Pizlo 2011-09-06 16:12:12 PDT
Yup, that's my fault.  CodeBlock is assuming that it won't be destroyed until after any CodeBlocks that have calls into it are destroyed, which is a demonstrably wrong assumption.

This bug will manifest itself if you have two CodeBlocks, A and B, where A contains a call to code owned by B.  Then the GC decides to first destroy B and then destroy A.  This can happen if both A and B became unreachable at the same time, if the heap is being destroyed, or if we call recompileAllJSFunctions().

At this stage A has a node on a linked list owned by B.  Thus A likely has multiple pointers into memory that will be destroyed when B's destructor runs.  Then the following dangerous and unpredictable course of events unfolds:

1) B gets destroyed, but does not remove A's node from its linked list.

2) A gets destroyed, and decides to be a good citizen and remove its node from B's linked list.  Linked list removal requires accessing B's linked list, which no longer exists.  Some manner of memory corruption ensues.

This bug occurs because of a misguided optimization on my part.  CodeBlock::~CodeBlock should ensure that any incoming calls (CallLinkInfos) that are on this's linked list are removed form the linked list and are marked as having been removed, so that when those CodeBlock's destructors run, they don't try to access garbage memory.  For some reason I had convinced myself that this was unnecessary and removed it.  The fix is to bring this back.

I'm rolling a patch for this now.
Comment 2 Filip Pizlo 2011-09-06 16:18:07 PDT
Created attachment 106511 [details]
proposed patch

Still testing this but it's ready for review.
Comment 3 Filip Pizlo 2011-09-06 16:32:06 PDT
Created attachment 106515 [details]
the patch - fixed comment to more accurately reflect the change
Comment 4 Filip Pizlo 2011-09-06 16:39:44 PDT
Comment on attachment 106515 [details]
the patch - fixed comment to more accurately reflect the change

This appears to fix the problem.  I cannot get crashes anymore, and nothing else seems to have regressed.
Comment 5 Geoffrey Garen 2011-09-06 16:58:00 PDT
Comment on attachment 106515 [details]
the patch - fixed comment to more accurately reflect the change

r=me
Comment 6 WebKit Review Bot 2011-09-06 19:04:09 PDT
Comment on attachment 106515 [details]
the patch - fixed comment to more accurately reflect the change

Clearing flags on attachment: 106515

Committed r94623: <http://trac.webkit.org/changeset/94623>
Comment 7 WebKit Review Bot 2011-09-06 19:04:14 PDT
All reviewed patches have been landed.  Closing bug.