<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>155678</bug_id>
          
          <creation_ts>2016-03-18 21:30:12 -0700</creation_ts>
          <short_desc>GuardMalloc crash on DFG::WorkList thread in JSC::Heap::isCollecting for destroyed Web Worker</short_desc>
          <delta_ts>2016-03-21 13:41:33 -0700</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>JavaScriptCore</component>
          <version>WebKit Nightly Build</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>InRadar</keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Joseph Pecoraro">joepeck</reporter>
          <assigned_to name="Nobody">webkit-unassigned</assigned_to>
          <cc>ddkilzer</cc>
    
    <cc>fpizlo</cc>
    
    <cc>ggaren</cc>
    
    <cc>joepeck</cc>
    
    <cc>mark.lam</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>1176445</commentid>
    <comment_count>0</comment_count>
    <who name="Joseph Pecoraro">joepeck</who>
    <bug_when>2016-03-18 21:30:12 -0700</bug_when>
    <thetext>* SUMMARY
GuardMalloc crash on DFGWorkList thread in JSC::Heap::isCollecting for destroyed Web Worker.

* STEPS TO REPRODUCE
shell&gt; run-webkit-tests --release --no-retry-failures --time-out-ms=30000 -g -1 --iterations=50 LayoutTests/inspector/heap/snapshot.html

* NOTES
- In Release it is very quick to crash (almost always the 2nd test run).
- In Debug it is very rare to crash, I got it after ~20 runs.
- The Web Inspector was recently made to use Web Workers here &lt;http://trac.webkit.org/changeset/198353&gt;

* CRASH SNIPPET
Crashed Thread:        15  FTL Worklist Worker Thread

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000358989b30
Exception Note:        EXC_CORPSE_NOTIFY

Application Specific Information:
CRASHING TEST: inspector/heap/snapshot.html
This process is running with libgmalloc.dylib (GuardMalloc) which may have forced the crash due to a memory access error.
 
Thread 15 Crashed:: FTL Worklist Worker Thread
0   com.apple.JavaScriptCore      	0x000000010bb7fc9e JSC::Heap::isCollecting() + 14
1   com.apple.JavaScriptCore      	0x000000010c1ab0c5 JSC::DFG::Worklist::runThread(JSC::DFG::ThreadData*) + 757
2   com.apple.JavaScriptCore      	0x000000010c1a9404 JSC::DFG::Worklist::threadFunction(void*) + 36
3   com.apple.JavaScriptCore      	0x000000010ca53dc9 WTF::createThread(void (*)(void*), void*, char const*)::$_0::operator()() const + 25
4   com.apple.JavaScriptCore      	0x000000010ca53d9d void std::__1::__invoke_void_return_wrapper&lt;void&gt;::__call&lt;WTF::createThread(void (*)(void*), void*, char const*)::$_0&amp;&gt;(WTF::createThread(void (*)(void*), void*, char const*)::$_0&amp;&amp;&amp;) + 45
5   com.apple.JavaScriptCore      	0x000000010ca53d3c std::__1::__function::__func&lt;WTF::createThread(void (*)(void*), void*, char const*)::$_0, std::__1::allocator&lt;WTF::createThread(void (*)(void*), void*, char const*)::$_0&gt;, void ()&gt;::operator()() + 44
6   com.apple.JavaScriptCore      	0x000000010c2eef5a std::__1::function&lt;void ()&gt;::operator()() const + 26
7   com.apple.JavaScriptCore      	0x000000010ca5299e WTF::threadEntryPoint(void*) + 158
8   com.apple.JavaScriptCore      	0x000000010ca54331 WTF::wtfThreadEntryPoint(void*) + 289
9   libsystem_pthread.dylib       	0x00007fff966b799d _pthread_body + 131
10  libsystem_pthread.dylib       	0x00007fff966b791a _pthread_start + 168
11  libsystem_pthread.dylib       	0x00007fff966b5351 thread_start + 13</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1176446</commentid>
    <comment_count>1</comment_count>
    <who name="Joseph Pecoraro">joepeck</who>
    <bug_when>2016-03-18 21:30:28 -0700</bug_when>
    <thetext>&lt;rdar://problem/25251439&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1176447</commentid>
    <comment_count>2</comment_count>
    <who name="Joseph Pecoraro">joepeck</who>
    <bug_when>2016-03-18 21:30:36 -0700</bug_when>
    <thetext>Caught in lldb I was able to confirm that this is a Cancelled DFG::Plan holding a reference to now gone Web Worker VM that has since destructed.

1. VM::~VM for the Web Worker on thread #289:

    (lldb)  bt
      thread #289: tid = 0x69ded, 0x0000000109170b10 JavaScriptCore`JSC::VM::~VM(this=0x0000001715b67a90) + 16 at VM.cpp:335, name = &apos;WebCore: Worker&apos;, stop reason = breakpoint 3.2
        frame #0: JavaScriptCore`JSC::VM::~VM(this=0x0000001715b67a90) + 16 at VM.cpp:335
        frame #1: JavaScriptCore`WTF::ThreadSafeRefCounted&lt;JSC::VM&gt;::deref
        frame #2: JavaScriptCore`void WTF::derefIfNotNull&lt;JSC::VM&gt;
        frame #3: JavaScriptCore`WTF::RefPtr&lt;JSC::VM&gt;::operator=
        frame #4: JavaScriptCore`JSC::JSLockHolder::~JSLockHolder
        frame #5: JavaScriptCore`JSC::JSLockHolder::~JSLockHolder
        frame #6: WebCore`WebCore::WorkerScriptController::~WorkerScriptController
        frame #7: WebCore`WebCore::WorkerScriptController::~WorkerScriptController
        frame #8: WebCore`WebCore::WorkerGlobalScope::clearScript
        frame #9: WebCore`WebCore::WorkerGlobalScope::clearScript
        frame #10: WebCore`WebCore::WorkerGlobalScope::clearScript
        frame #11: WebCore`WebCore::WorkerGlobalScope::clearScript
        frame #12: WebCore`WebCore::WorkerThread::stop
    (lldb) c


2. Hitting the Crash

    (lldb) Process 35098 stopped
    * thread #41: tid = 0x6934f, 0x000000010845c6ae JavaScriptCore`JSC::Heap::isCollecting(this=0x0000001715b67aa8) + 14 at HeapInlines.h:58, name = &apos;FTL Worklist Worker Thread&apos;, stop reason = EXC_BAD_ACCESS (code=1, address=0x1715b67b30)
        frame #0: 0x000000010845c6ae JavaScriptCore`JSC::Heap::isCollecting(this=0x0000001715b67aa8) + 14 at HeapInlines.h:58
       55  	
       56  	inline bool Heap::isCollecting()
       57  	{
    -&gt; 58  	    return m_operationInProgress == FullCollection || m_operationInProgress == EdenCollection;
       59  	}
       60  	
       61  	inline Heap* Heap::heap(const JSCell* cell)

3. Inspect the DFG::Plan it contains the Web Worker&apos;s VM, and the Plan has been Cancelled, so none of its members are safe to use.


    (lldb) up
    frame #1: 0x0000000108a75e95 JavaScriptCore`JSC::DFG::Worklist::runThread(this=0x00000003c4c73ef0, data=0x00000003c4ee5fe0) + 757 at DFGWorklist.cpp:370
       367 	        
       368 	            RELEASE_ASSERT(!plan-&gt;vm.heap.isCollecting());
       369 	            plan-&gt;compileInThread(longLivedState, data);
    -&gt; 370 	            RELEASE_ASSERT(!plan-&gt;vm.heap.isCollecting());
       371 	            
       372 	            {
       373 	                LockHolder locker(m_lock);

    (lldb) frame variable plan.m_ptr-&gt;vm
    (JSC::VM &amp;) plan.m_ptr-&gt;vm = 0x0000001715b67a90: {...}

    (lldb) frame variable plan.m_ptr-&gt;stage
    (JSC::DFG::Plan::Stage) plan.m_ptr-&gt;stage = Cancelled</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1176449</commentid>
    <comment_count>3</comment_count>
      <attachid>274500</attachid>
    <who name="Joseph Pecoraro">joepeck</who>
    <bug_when>2016-03-18 21:42:00 -0700</bug_when>
    <thetext>Created attachment 274500
[PATCH] Proposed Fix

Proposed fix.

This DFG::Plan is not seen by VM::~VM&apos;s attempt to remove all plans for the destructing VM because that path loops over `m_plans`. A Cancelled DFG::Plan would have already been removed from `m_plans`.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1176450</commentid>
    <comment_count>4</comment_count>
    <who name="Filip Pizlo">fpizlo</who>
    <bug_when>2016-03-18 21:43:04 -0700</bug_when>
    <thetext>(In reply to comment #2)
&gt; Caught in lldb I was able to confirm that this is a Cancelled DFG::Plan
&gt; holding a reference to now gone Web Worker VM that has since destructed.
&gt; 
&gt; 1. VM::~VM for the Web Worker on thread #289:
&gt; 
&gt;     (lldb)  bt
&gt;       thread #289: tid = 0x69ded, 0x0000000109170b10
&gt; JavaScriptCore`JSC::VM::~VM(this=0x0000001715b67a90) + 16 at VM.cpp:335,
&gt; name = &apos;WebCore: Worker&apos;, stop reason = breakpoint 3.2
&gt;         frame #0: JavaScriptCore`JSC::VM::~VM(this=0x0000001715b67a90) + 16
&gt; at VM.cpp:335
&gt;         frame #1: JavaScriptCore`WTF::ThreadSafeRefCounted&lt;JSC::VM&gt;::deref
&gt;         frame #2: JavaScriptCore`void WTF::derefIfNotNull&lt;JSC::VM&gt;
&gt;         frame #3: JavaScriptCore`WTF::RefPtr&lt;JSC::VM&gt;::operator=
&gt;         frame #4: JavaScriptCore`JSC::JSLockHolder::~JSLockHolder
&gt;         frame #5: JavaScriptCore`JSC::JSLockHolder::~JSLockHolder
&gt;         frame #6:
&gt; WebCore`WebCore::WorkerScriptController::~WorkerScriptController
&gt;         frame #7:
&gt; WebCore`WebCore::WorkerScriptController::~WorkerScriptController
&gt;         frame #8: WebCore`WebCore::WorkerGlobalScope::clearScript
&gt;         frame #9: WebCore`WebCore::WorkerGlobalScope::clearScript
&gt;         frame #10: WebCore`WebCore::WorkerGlobalScope::clearScript
&gt;         frame #11: WebCore`WebCore::WorkerGlobalScope::clearScript
&gt;         frame #12: WebCore`WebCore::WorkerThread::stop
&gt;     (lldb) c
&gt; 
&gt; 
&gt; 2. Hitting the Crash
&gt; 
&gt;     (lldb) Process 35098 stopped
&gt;     * thread #41: tid = 0x6934f, 0x000000010845c6ae
&gt; JavaScriptCore`JSC::Heap::isCollecting(this=0x0000001715b67aa8) + 14 at
&gt; HeapInlines.h:58, name = &apos;FTL Worklist Worker Thread&apos;, stop reason =
&gt; EXC_BAD_ACCESS (code=1, address=0x1715b67b30)
&gt;         frame #0: 0x000000010845c6ae
&gt; JavaScriptCore`JSC::Heap::isCollecting(this=0x0000001715b67aa8) + 14 at
&gt; HeapInlines.h:58
&gt;        55  	
&gt;        56  	inline bool Heap::isCollecting()
&gt;        57  	{
&gt;     -&gt; 58  	    return m_operationInProgress == FullCollection ||
&gt; m_operationInProgress == EdenCollection;
&gt;        59  	}
&gt;        60  	
&gt;        61  	inline Heap* Heap::heap(const JSCell* cell)
&gt; 
&gt; 3. Inspect the DFG::Plan it contains the Web Worker&apos;s VM, and the Plan has
&gt; been Cancelled, so none of its members are safe to use.
&gt; 
&gt; 
&gt;     (lldb) up
&gt;     frame #1: 0x0000000108a75e95
&gt; JavaScriptCore`JSC::DFG::Worklist::runThread(this=0x00000003c4c73ef0,
&gt; data=0x00000003c4ee5fe0) + 757 at DFGWorklist.cpp:370
&gt;        367 	        
&gt;        368 	            RELEASE_ASSERT(!plan-&gt;vm.heap.isCollecting());
&gt;        369 	            plan-&gt;compileInThread(longLivedState, data);
&gt;     -&gt; 370 	            RELEASE_ASSERT(!plan-&gt;vm.heap.isCollecting());
&gt;        371 	            
&gt;        372 	            {
&gt;        373 	                LockHolder locker(m_lock);
&gt; 
&gt;     (lldb) frame variable plan.m_ptr-&gt;vm
&gt;     (JSC::VM &amp;) plan.m_ptr-&gt;vm = 0x0000001715b67a90: {...}
&gt; 
&gt;     (lldb) frame variable plan.m_ptr-&gt;stage
&gt;     (JSC::DFG::Plan::Stage) plan.m_ptr-&gt;stage = Cancelled

Fascinating.

VM destruction should wait for all plans for that VM to be retired from the worklist.  I believe that this code is fine.

But note that the GC has a magical power, which I believe comes into play here: it can run concurrently to compilation and cancel a plan while that plan is compiling. We take advantage of the fact that some parts of the compilation pipeline, like the B3 prepareForGeneration() phase, are both long and have no need for the VM&amp; or any JS-ey things.  In other words, we can run GC during this &quot;safe&quot; phase.  If the GC finds that the plan is compiling something that is garbage, the GC can cancel the plan. B3 will keep running, but when it finishes, we will &quot;exit&quot; the safepoint, which will wait for GC to finish, and then observe that the plan has been cancelled. The plan will then just tear down all of its data and return.

And then this awesomeness happens: the VM has also been destroyed in the meantime.  Normally, the VM would wait for compilation threads to finish.  But it doesn&apos;t wait for cancelled ones to finish.

So, the VM was destroyed and then this idiotic assertion happens!  It&apos;s probably the only darn thing on the cancelled-plan-gives-up-and-dies path that requires the VM!  Since it&apos;s a release assertion we always access the isCollecting bits, and so in this case we die.

The fix is to remove the assertion. :-)  That way, there won&apos;t be anything on the &quot;give-up-and-die&quot; path that needs the VM to still exist.  And yes, I just read the code for this, and it does look like everyone else just speedily bails without touching anything, except for these bogus assertions.  There are a handful of them.  I believe they should all be burned.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1176453</commentid>
    <comment_count>5</comment_count>
      <attachid>274500</attachid>
    <who name="Filip Pizlo">fpizlo</who>
    <bug_when>2016-03-18 21:44:53 -0700</bug_when>
    <thetext>Comment on attachment 274500
[PATCH] Proposed Fix

Nice.  Better than removing the assertion.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1176622</commentid>
    <comment_count>6</comment_count>
    <who name="Joseph Pecoraro">joepeck</who>
    <bug_when>2016-03-20 15:43:16 -0700</bug_when>
    <thetext>http://trac.webkit.org/changeset/198477</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1176835</commentid>
    <comment_count>7</comment_count>
    <who name="David Kilzer (:ddkilzer)">ddkilzer</who>
    <bug_when>2016-03-21 13:41:11 -0700</bug_when>
    <thetext>This is not a security issue since it resulted in a RELEASE_ASSERT().</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>274500</attachid>
            <date>2016-03-18 21:42:00 -0700</date>
            <delta_ts>2016-03-18 21:44:53 -0700</delta_ts>
            <desc>[PATCH] Proposed Fix</desc>
            <filename>guard-malloc.patch</filename>
            <type>text/plain</type>
            <size>1613</size>
            <attacher name="Joseph Pecoraro">joepeck</attacher>
            
              <data encoding="base64">ZGlmZiAtLWdpdCBhL1NvdXJjZS9KYXZhU2NyaXB0Q29yZS9DaGFuZ2VMb2cgYi9Tb3VyY2UvSmF2
YVNjcmlwdENvcmUvQ2hhbmdlTG9nCmluZGV4IDI0MDgyMTUuLjYwYzNlZGQgMTAwNjQ0Ci0tLSBh
L1NvdXJjZS9KYXZhU2NyaXB0Q29yZS9DaGFuZ2VMb2cKKysrIGIvU291cmNlL0phdmFTY3JpcHRD
b3JlL0NoYW5nZUxvZwpAQCAtMSwzICsxLDE5IEBACisyMDE2LTAzLTE4ICBKb3NlcGggUGVjb3Jh
cm8gIDxwZWNvcmFyb0BhcHBsZS5jb20+CisKKyAgICAgICAgQ3Jhc2ggb24gREZHOjpXb3JrTGlz
dCB0aHJlYWQgaW4gSlNDOjpIZWFwOjppc0NvbGxlY3RpbmcgZm9yIGRlc3Ryb3llZCBXZWIgV29y
a2VyCisgICAgICAgIGh0dHBzOi8vYnVncy53ZWJraXQub3JnL3Nob3dfYnVnLmNnaT9pZD0xNTU2
NzgKKyAgICAgICAgPHJkYXI6Ly9wcm9ibGVtLzI1MjUxNDM5PgorCisgICAgICAgIFJldmlld2Vk
IGJ5IE5PQk9EWSAoT09QUyEpLgorCisgICAgICAgIFRoaXMgZml4ZXMgYSBjcmFzaCB0aGF0IHdl
IHNhdyB3aXRoIEd1YXJkTWFsbG9jLiBJZiB0aGUgUGxhbiB3YXMKKyAgICAgICAgQ2FuY2VsbGVk
IGl0IG1heSBub3QgYmUgc2FmZSB0byBhY2Nlc3MgdGhlIFZNLiBJZiB0aGUgUGxhbiB3YXMKKyAg
ICAgICAgY2FuY2VsbGVkIHdlIGFyZSBqdXN0IGdvaW5nIHRvIGJhaWwgYW55d2F5cywgc28ga2Vl
cCB0aGUgQVNTRVJUIGJ1dAorICAgICAgICBzaG9ydC1jaXJjdWl0IGlmIHRoZSBwbGFuIHdhcyBD
YW5jZWxsZWQuCisKKyAgICAgICAgKiBkZmcvREZHV29ya2xpc3QuY3BwOgorICAgICAgICAoSlND
OjpERkc6OldvcmtsaXN0OjpydW5UaHJlYWQpOgorCiAyMDE2LTAzLTE3ICBKb3NlcGggUGVjb3Jh
cm8gIDxwZWNvcmFyb0BhcHBsZS5jb20+CiAKICAgICAgICAgV2ViIEluc3BlY3RvcjogV2Ugc2hv
dWxkIGhhdmUgYSB3YXkgdG8gY2FwdHVyZSBoZWFwIHNuYXBzaG90cyBwcm9ncmFtYXRpY2FsbHku
CmRpZmYgLS1naXQgYS9Tb3VyY2UvSmF2YVNjcmlwdENvcmUvZGZnL0RGR1dvcmtsaXN0LmNwcCBi
L1NvdXJjZS9KYXZhU2NyaXB0Q29yZS9kZmcvREZHV29ya2xpc3QuY3BwCmluZGV4IDMxOGE5ZjAu
LjlkN2U2OGEgMTAwNjQ0Ci0tLSBhL1NvdXJjZS9KYXZhU2NyaXB0Q29yZS9kZmcvREZHV29ya2xp
c3QuY3BwCisrKyBiL1NvdXJjZS9KYXZhU2NyaXB0Q29yZS9kZmcvREZHV29ya2xpc3QuY3BwCkBA
IC0zNjcsNyArMzY3LDcgQEAgdm9pZCBXb3JrbGlzdDo6cnVuVGhyZWFkKFRocmVhZERhdGEqIGRh
dGEpCiAgICAgICAgIAogICAgICAgICAgICAgUkVMRUFTRV9BU1NFUlQoIXBsYW4tPnZtLmhlYXAu
aXNDb2xsZWN0aW5nKCkpOwogICAgICAgICAgICAgcGxhbi0+Y29tcGlsZUluVGhyZWFkKGxvbmdM
aXZlZFN0YXRlLCBkYXRhKTsKLSAgICAgICAgICAgIFJFTEVBU0VfQVNTRVJUKCFwbGFuLT52bS5o
ZWFwLmlzQ29sbGVjdGluZygpKTsKKyAgICAgICAgICAgIFJFTEVBU0VfQVNTRVJUKHBsYW4tPnN0
YWdlID09IFBsYW46OkNhbmNlbGxlZCB8fCAhcGxhbi0+dm0uaGVhcC5pc0NvbGxlY3RpbmcoKSk7
CiAgICAgICAgICAgICAKICAgICAgICAgICAgIHsKICAgICAgICAgICAgICAgICBMb2NrSG9sZGVy
IGxvY2tlcihtX2xvY2spOwo=
</data>
<flag name="review"
          id="298936"
          type_id="1"
          status="+"
          setter="fpizlo"
    />
          </attachment>
      

    </bug>

</bugzilla>