Bug 175852 - Race condition in StartWebThread causing crash
Summary: Race condition in StartWebThread causing crash
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit API (show other bugs)
Version: Other
Hardware: iPhone / iPad iOS 10.3
: P2 Major
Assignee: Yusuke Suzuki
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2017-08-22 14:22 PDT by Ben Nham
Modified: 2017-08-23 10:42 PDT (History)
6 users (show)

See Also:


Attachments
Patch (3.95 KB, patch)
2017-08-23 09:57 PDT, Yusuke Suzuki
mark.lam: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ben Nham 2017-08-22 14:22:02 PDT
We have been seeing an increasing number of crashes in the Facebook app due to a race condition in StartWebThread. The crash involves a race between the main thread and the web thread.

The main thread crashes in this RELEASE_ASSERT in ThreadIdentifierData:

  https://opensource.apple.com/source/WTF/WTF-7603.1.30.1.33/wtf/ThreadIdentifierDataPthreads.cpp.auto.html (line 78)

The main thread's stack is:

  Thread #0 Crashed:
  0 JavaScriptCore 0x1882aabf8 WTF::ThreadIdentifierData::initialize(unsigned int) + 72
  1 JavaScriptCore 0x1882a9188 WTF::currentThread() + 48
  2 JavaScriptCore 0x1882a9188 WTF::currentThread() + 48
  3 JavaScriptCore 0x1882aac04 WTF::initializeApplicationUIThreadIdentifier() + 8
  4 WebCore 0x188f76bec StartWebThread() + 504
  5 libsystem_pthread.dylib 0x18348f418 __pthread_once_handler + 76
  6 libsystem_platform.dylib 0x183484a44 _os_once + 48
  7 libsystem_pthread.dylib 0x18348c2fc pthread_once + 64
  8 WebKitLegacy 0x18a15ba54 +[WebView(WebPrivate) enableWebThread] + 268
  9 WebKitLegacy 0x18a15b68c WebKitInitialize + 68
  10 UIKit 0x18a781dd0 ___UIApplicationLoadWebKit_block_invoke + 212
  11 libdispatch.dylib 0x18328299c _dispatch_client_callout + 12
  12 libdispatch.dylib 0x1832836c8 dispatch_once_f + 52
  13 libdispatch.dylib 0x18328299c _dispatch_client_callout + 12
  14 libdispatch.dylib 0x1832937a4 _dispatch_barrier_sync_f_slow_invoke + 300
  15 libdispatch.dylib 0x18328299c _dispatch_client_callout + 12
  16 libdispatch.dylib 0x1832875e4 _dispatch_main_queue_callback_4CF + 992
  17 CoreFoundation 0x1843790c4 __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 8
  18 CoreFoundation 0x184376ce0 __CFRunLoopRun + 1568
  19 CoreFoundation 0x1842a6da0 CFRunLoopRunSpecific + 420
  20 GraphicsServices 0x185d11070 GSEventRunModal + 96
  21 UIKit 0x18a561c98 UIApplicationMain + 204
  22 Facebook 0x1000b5738 FBGenericMain (FBGenericMain.m:224)
  23 Facebook 0x1000b5240 main (main.m:150)
  24 libdyld.dylib 0x1832b5598 start + 0

This RELEASE_ASSERT fires if the pthread TSD key hasn't yet been created. The TSD key is created by WTF::initializeThreading, which is called by JSC::initializeThreading. So basically, the crash occurs because the MT called WTF::ThreadIdentifierData::initialize before another thread finished calling WTF::initializeThreading.

The thread that's responsible for calling WTF::initializeThreading is the WebThread (which is forked off by StartWebThread). The WebThread has this call stack in the crash:

  Thread #63:
  0 libsystem_kernel.dylib 0x1833c4f60 __psynch_rw_wrlock + 8
  1 libsystem_pthread.dylib 0x18348b5c8 _pthread_rwlock_lock + 464
  2 libobjc.A.dylib 0x182e35764 rwlock_tt<false>::write() + 100
  3 libobjc.A.dylib 0x182e2ebc4 _class_getNonMetaClass + 36
  4 libobjc.A.dylib 0x182e354ac lookUpImpOrForward + 220
  5 libobjc.A.dylib 0x182e40474 _objc_msgSend_uncached + 52
  6 libobjc.A.dylib 0x182e2d414 CALLING_SOME_+initialize_METHOD + 20
  7 libobjc.A.dylib 0x182e2d680 _class_initialize + 608
  8 libobjc.A.dylib 0x182e354b0 lookUpImpOrForward + 224
  9 libobjc.A.dylib 0x182e40474 _objc_msgSend_uncached + 52
  10 JavaScriptCore 0x1882a9958 WTF::initializeMainThreadPlatform() + 24
  11 libsystem_pthread.dylib 0x18348f418 __pthread_once_handler + 76
  12 libsystem_platform.dylib 0x183484a44 _os_once + 48
  13 libsystem_pthread.dylib 0x18348c2fc pthread_once + 64
  14 WebCore 0x188f78b94 RunWebThread(void*) + 32
  15 libsystem_pthread.dylib 0x18348d688 _pthread_body + 236
  16 libsystem_pthread.dylib 0x18348d598 _pthread_start + 280
  17 libsystem_pthread.dylib 0x18348acb0 thread_start + 0

If you look at the code, the WebThread is stuck two lines *before* the call to JSC::initializeThreading (which is what calls WTF::initializeThreading)--it's executing WTF::initializeMainThreadPlatform:

  https://opensource.apple.com/source/WebCore/WebCore-7603.1.30.1.33/platform/ios/wak/WebCoreThread.mm.auto.html (line 653)

In theory, there is some synchronization in the existing code that should prevent this from happening. In pseudo-code, the synchronization code looks like this:

  RunWebThread: (running on the web thread)
    JSC::initializeThreading()
    // other init...

    pthread_mutex_lock(&lock)
    pthread_cond_signal(&cond, &lock)
    pthread_mutex_unlock(&lock)

  StartWebThread: (running on the main thread)
    pthread_lock(&lock)
    pthread_cond_wait(&cond, &lock)
    pthread_unlock(&lock)

    WTF::initializeApplicationUIThreadIdentifier() // calls code which eventually asserts

However, this synchronization code is buggy because pthread_cond_wait is allowed to spuriously wake up, and pthread_cond_wait is only called once. It is very easy to demonstrate that spurious wakeups exist with pthread condition variables, e.g. here is a simple test program: https://gist.github.com/bnham/f2ca3430a7c4f6f714a918b62702125e

The fix is to add a boolean flag variable and to check whether that boolean variable is set while calling pthread_cond_wait in a loop, i.e. textbook usage of Mesa condition variables.

A possible mitigation for affected apps is to call some JSC API that eventually calls JSC::initializeThreading early on in app initialization.
Comment 1 Ben Nham 2017-08-22 14:30:30 PDT
Also filed a radar: <rdar://34021573>
Comment 2 Alexey Proskuryakov 2017-08-23 09:09:57 PDT
rdar://problem/34021570
Comment 3 Yusuke Suzuki 2017-08-23 09:11:56 PDT
Good catch!
Comment 4 Yusuke Suzuki 2017-08-23 09:37:58 PDT
Recently[1], I changed this code to call WTF::initailizeThreading() in StartWebThread before spawning WebThread :)
This is OK because now WTF::initializeThreading does not have the limitation that requires WTF::initializeThreading() should be called in the main thread.

So in the trunk, I think this issue is fixed. What do you think of Mark?

[1]: http://trac.webkit.org/r219647
Comment 5 Yusuke Suzuki 2017-08-23 09:42:27 PDT
(In reply to Ben Nham from comment #0)
> We have been seeing an increasing number of crashes in the Facebook app due
> to a race condition in StartWebThread. The crash involves a race between the
> main thread and the web thread.
> 
> The main thread crashes in this RELEASE_ASSERT in ThreadIdentifierData:
> 
>  
> https://opensource.apple.com/source/WTF/WTF-7603.1.30.1.33/wtf/
> ThreadIdentifierDataPthreads.cpp.auto.html (line 78)
> 
> The main thread's stack is:
> 
>   Thread #0 Crashed:
>   0 JavaScriptCore 0x1882aabf8
> WTF::ThreadIdentifierData::initialize(unsigned int) + 72
>   1 JavaScriptCore 0x1882a9188 WTF::currentThread() + 48
>   2 JavaScriptCore 0x1882a9188 WTF::currentThread() + 48
>   3 JavaScriptCore 0x1882aac04
> WTF::initializeApplicationUIThreadIdentifier() + 8
>   4 WebCore 0x188f76bec StartWebThread() + 504
>   5 libsystem_pthread.dylib 0x18348f418 __pthread_once_handler + 76
>   6 libsystem_platform.dylib 0x183484a44 _os_once + 48
>   7 libsystem_pthread.dylib 0x18348c2fc pthread_once + 64
>   8 WebKitLegacy 0x18a15ba54 +[WebView(WebPrivate) enableWebThread] + 268
>   9 WebKitLegacy 0x18a15b68c WebKitInitialize + 68
>   10 UIKit 0x18a781dd0 ___UIApplicationLoadWebKit_block_invoke + 212
>   11 libdispatch.dylib 0x18328299c _dispatch_client_callout + 12
>   12 libdispatch.dylib 0x1832836c8 dispatch_once_f + 52
>   13 libdispatch.dylib 0x18328299c _dispatch_client_callout + 12
>   14 libdispatch.dylib 0x1832937a4 _dispatch_barrier_sync_f_slow_invoke + 300
>   15 libdispatch.dylib 0x18328299c _dispatch_client_callout + 12
>   16 libdispatch.dylib 0x1832875e4 _dispatch_main_queue_callback_4CF + 992
>   17 CoreFoundation 0x1843790c4
> __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 8
>   18 CoreFoundation 0x184376ce0 __CFRunLoopRun + 1568
>   19 CoreFoundation 0x1842a6da0 CFRunLoopRunSpecific + 420
>   20 GraphicsServices 0x185d11070 GSEventRunModal + 96
>   21 UIKit 0x18a561c98 UIApplicationMain + 204
>   22 Facebook 0x1000b5738 FBGenericMain (FBGenericMain.m:224)
>   23 Facebook 0x1000b5240 main (main.m:150)
>   24 libdyld.dylib 0x1832b5598 start + 0
> 
> This RELEASE_ASSERT fires if the pthread TSD key hasn't yet been created.
> The TSD key is created by WTF::initializeThreading, which is called by
> JSC::initializeThreading. So basically, the crash occurs because the MT
> called WTF::ThreadIdentifierData::initialize before another thread finished
> calling WTF::initializeThreading.
> 
> The thread that's responsible for calling WTF::initializeThreading is the
> WebThread (which is forked off by StartWebThread). The WebThread has this
> call stack in the crash:
> 
>   Thread #63:
>   0 libsystem_kernel.dylib 0x1833c4f60 __psynch_rw_wrlock + 8
>   1 libsystem_pthread.dylib 0x18348b5c8 _pthread_rwlock_lock + 464
>   2 libobjc.A.dylib 0x182e35764 rwlock_tt<false>::write() + 100
>   3 libobjc.A.dylib 0x182e2ebc4 _class_getNonMetaClass + 36
>   4 libobjc.A.dylib 0x182e354ac lookUpImpOrForward + 220
>   5 libobjc.A.dylib 0x182e40474 _objc_msgSend_uncached + 52
>   6 libobjc.A.dylib 0x182e2d414 CALLING_SOME_+initialize_METHOD + 20
>   7 libobjc.A.dylib 0x182e2d680 _class_initialize + 608
>   8 libobjc.A.dylib 0x182e354b0 lookUpImpOrForward + 224
>   9 libobjc.A.dylib 0x182e40474 _objc_msgSend_uncached + 52
>   10 JavaScriptCore 0x1882a9958 WTF::initializeMainThreadPlatform() + 24
>   11 libsystem_pthread.dylib 0x18348f418 __pthread_once_handler + 76
>   12 libsystem_platform.dylib 0x183484a44 _os_once + 48
>   13 libsystem_pthread.dylib 0x18348c2fc pthread_once + 64
>   14 WebCore 0x188f78b94 RunWebThread(void*) + 32
>   15 libsystem_pthread.dylib 0x18348d688 _pthread_body + 236
>   16 libsystem_pthread.dylib 0x18348d598 _pthread_start + 280
>   17 libsystem_pthread.dylib 0x18348acb0 thread_start + 0
> 
> If you look at the code, the WebThread is stuck two lines *before* the call
> to JSC::initializeThreading (which is what calls
> WTF::initializeThreading)--it's executing WTF::initializeMainThreadPlatform:
> 
>  
> https://opensource.apple.com/source/WebCore/WebCore-7603.1.30.1.33/platform/
> ios/wak/WebCoreThread.mm.auto.html (line 653)
> 
> In theory, there is some synchronization in the existing code that should
> prevent this from happening. In pseudo-code, the synchronization code looks
> like this:
> 
>   RunWebThread: (running on the web thread)
>     JSC::initializeThreading()
>     // other init...
> 
>     pthread_mutex_lock(&lock)
>     pthread_cond_signal(&cond, &lock)
>     pthread_mutex_unlock(&lock)
> 
>   StartWebThread: (running on the main thread)
>     pthread_lock(&lock)
>     pthread_cond_wait(&cond, &lock)
>     pthread_unlock(&lock)
> 
>     WTF::initializeApplicationUIThreadIdentifier() // calls code which
> eventually asserts
> 
> However, this synchronization code is buggy because pthread_cond_wait is
> allowed to spuriously wake up, and pthread_cond_wait is only called once. It
> is very easy to demonstrate that spurious wakeups exist with pthread
> condition variables, e.g. here is a simple test program:
> https://gist.github.com/bnham/f2ca3430a7c4f6f714a918b62702125e
> 
> The fix is to add a boolean flag variable and to check whether that boolean
> variable is set while calling pthread_cond_wait in a loop, i.e. textbook
> usage of Mesa condition variables.
> 
> A possible mitigation for affected apps is to call some JSC API that
> eventually calls JSC::initializeThreading early on in app initialization.

BTW, I think your pointing is true that this pthread_cond_t use is broken due to the existence of the suspicious wakeup. We should fix this while the trunk do not show this problem. Our web threading's following code depends on that web thread is already initialized.
Comment 6 Yusuke Suzuki 2017-08-23 09:57:40 PDT
Created attachment 318880 [details]
Patch
Comment 7 Mark Lam 2017-08-23 10:34:37 PDT
Comment on attachment 318880 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=318880&action=review

r=me.

> Source/WebCore/ChangeLog:9
> +        by using pthread_cond_t. However, the main thread may be waken up due to the the existence of

/waken/woken/.

> Source/WebCore/ChangeLog:10
> +        the suspicious wakeup of the pthread_cond_t.

/suspicious/spurious/.

> Source/WebCore/ChangeLog:14
> +        does not have the suspicious wakup problem as described in Condition.h.

/suspicious/spurious/.
/wakup/wake up/.
Comment 8 Yusuke Suzuki 2017-08-23 10:37:13 PDT
Comment on attachment 318880 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=318880&action=review

Thank you!

>> Source/WebCore/ChangeLog:9
>> +        by using pthread_cond_t. However, the main thread may be waken up due to the the existence of
> 
> /waken/woken/.

Oops, thanks. Fixed.

>> Source/WebCore/ChangeLog:10
>> +        the suspicious wakeup of the pthread_cond_t.
> 
> /suspicious/spurious/.

Fixed.

>> Source/WebCore/ChangeLog:14
>> +        does not have the suspicious wakup problem as described in Condition.h.
> 
> /suspicious/spurious/.
> /wakup/wake up/.

Fixed.
Comment 9 Yusuke Suzuki 2017-08-23 10:42:10 PDT
Committed r221083: <http://trac.webkit.org/changeset/221083>