LayoutTest workers/wasm-hashset-many.html is a flaky crash (with no crashlog) https://build.webkit.org/results/Apple%20Sierra%20Release%20WK2%20(Tests)/r218718%20(2469)/results.html https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=workers%2Fwasm-hashset-many.html
I can reproduce a timeout 100% of the time by running: ./Tools/Scripts/run-webkit-tests LayoutTests/workers/wasm-hashset-many.html --repeat 150 --fully-parallel --child-processes 6 --no-retry-failure However, the sample of the process isn't very telling. I'm not sure what's happening: Call graph: 986 Thread_18334665 DispatchQueue_1: com.apple.main-thread (serial) + 986 start (in libdyld.dylib) + 1 [0x7fffaa6f9235] + 986 main (in com.apple.WebKit.WebContent.Development) + 492 [0x10bc29695] XPCServiceMain.mm:148 + 986 xpc_main (in libxpc.dylib) + 494 [0x7fffaa9512e4] + 986 _xpc_objc_main (in libxpc.dylib) + 775 [0x7fffaa9528c7] + 986 NSApplicationMain (in AppKit) + 1237 [0x7fff92a06e0e] + 986 -[NSApplication run] (in AppKit) + 926 [0x7fff92a3c3db] + 986 -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] (in AppKit) + 2796 [0x7fff931c37ee] + 986 _DPSNextEvent (in AppKit) + 1120 [0x7fff92a47a54] + 986 _BlockUntilNextEventMatchingListInModeWithFilter (in HIToolbox) + 71 [0x7fff944aeb26] + 986 ReceiveNextEventCommon (in HIToolbox) + 432 [0x7fff944aecf1] + 986 RunCurrentEventLoopInMode (in HIToolbox) + 240 [0x7fff944aeebc] + 986 CFRunLoopRunSpecific (in CoreFoundation) + 420 [0x7fff94f4e114] + 985 __CFRunLoopRun (in CoreFoundation) + 1361 [0x7fff94f4e8c1] + ! 985 __CFRunLoopServiceMachPort (in CoreFoundation) + 212 [0x7fff94f4f434] + ! 985 mach_msg (in libsystem_kernel.dylib) + 55 [0x7fffaa81f797] + ! 985 mach_msg_trap (in libsystem_kernel.dylib) + 10 [0x7fffaa82034a] + 1 __CFRunLoopRun (in CoreFoundation) + 2065 [0x7fff94f4eb81] + 1 __CFRunLoopDoTimers (in CoreFoundation) + 177 [0x7fff94f573c1] + 1 CFArrayCreateMutable (in CoreFoundation) + 113 [0x7fff94eeca11] + 1 +[__NSPlaceholderArray mutablePlaceholder] (in CoreFoundation) + 11 [0x7fff94eeca6b] 986 Thread_18334777: com.apple.NSEventThread + 986 thread_start (in libsystem_pthread.dylib) + 13 [0x7fffaa91208d] + 986 _pthread_start (in libsystem_pthread.dylib) + 286 [0x7fffaa912887] + 986 _pthread_body (in libsystem_pthread.dylib) + 180 [0x7fffaa91293b] + 986 _NSEventThread (in AppKit) + 205 [0x7fff92b94f02] + 986 CFRunLoopRunSpecific (in CoreFoundation) + 420 [0x7fff94f4e114] + 986 __CFRunLoopRun (in CoreFoundation) + 1361 [0x7fff94f4e8c1] + 986 __CFRunLoopServiceMachPort (in CoreFoundation) + 212 [0x7fff94f4f434] + 986 mach_msg (in libsystem_kernel.dylib) + 55 [0x7fffaa81f797] + 986 mach_msg_trap (in libsystem_kernel.dylib) + 10 [0x7fffaa82034a] 986 Thread_18334778 + 986 thread_start (in libsystem_pthread.dylib) + 13 [0x7fffaa91208d] + 986 _pthread_start (in libsystem_pthread.dylib) + 286 [0x7fffaa912887] + 986 _pthread_body (in libsystem_pthread.dylib) + 180 [0x7fffaa91293b] + 986 void* std::__1::__thread_proxy<std::__1::tuple<void (*)(bmalloc::AsyncTask<bmalloc::Heap, void (bmalloc::Heap::*)()>*), bmalloc::AsyncTask<bmalloc::Heap, void (bmalloc::Heap::*)()>*> >(void*) (in JavaScriptCore) + 93 [0x1111f90cd] __functional_base:416 + 986 bmalloc::AsyncTask<bmalloc::Heap, void (bmalloc::Heap::*)()>::threadEntryPoint(bmalloc::AsyncTask<bmalloc::Heap, void (bmalloc::Heap::*)()>*) (in JavaScriptCore) + 29 [0x1111f8bfd] AsyncTask.h:114 + 504 bmalloc::AsyncTask<bmalloc::Heap, void (bmalloc::Heap::*)()>::threadRunLoop() (in JavaScriptCore) + 197 [0x1111f8d55] condition_variable:235 + ! 504 std::__1::cv_status std::__1::condition_variable_any::wait_until<std::__1::unique_lock<bmalloc::Mutex>, std::__1::chrono::steady_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > >(std::__1::unique_lock<bmalloc::Mutex>&, std::__1::chrono::time_point<std::__1::chrono::steady_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > > const&) (in JavaScriptCore) + 116 [0x1111f8f14] condition_variable:224 + ! 504 std::__1::cv_status std::__1::condition_variable::wait_until<std::__1::chrono::steady_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > >(std::__1::unique_lock<std::__1::mutex>&, std::__1::chrono::time_point<std::__1::chrono::steady_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > > const&) (in JavaScriptCore) + 128 [0x1111f9040] __mutex_base:388 + ! 504 std::__1::condition_variable::__do_timed_wait(std::__1::unique_lock<std::__1::mutex>&, std::__1::chrono::time_point<std::__1::chrono::system_clock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > >) (in libc++.1.dylib) + 126 [0x7fffa9291572] + ! 504 _pthread_cond_wait (in libsystem_pthread.dylib) + 712 [0x7fffaa9137fa] + ! 504 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fffaa827bf2] + 481 bmalloc::AsyncTask<bmalloc::Heap, void (bmalloc::Heap::*)()>::threadRunLoop() (in JavaScriptCore) + 107 [0x1111f8cfb] condition_variable:211 + ! 481 void std::__1::condition_variable_any::wait<std::__1::unique_lock<bmalloc::Mutex> >(std::__1::unique_lock<bmalloc::Mutex>&) (in JavaScriptCore) + 108 [0x1111f8dfc] condition_variable:202 + ! 481 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) (in libc++.1.dylib) + 47 [0x7fffa92914cd] + ! 481 _pthread_cond_wait (in libsystem_pthread.dylib) + 712 [0x7fffaa9137fa] + ! 481 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fffaa827bf2] + 1 bmalloc::AsyncTask<bmalloc::Heap, void (bmalloc::Heap::*)()>::threadRunLoop() (in JavaScriptCore) + 50 [0x1111f8cc2] AsyncTask.h:138 + 1 bmalloc::Heap::concurrentScavenge() (in JavaScriptCore) + 156 [0x1111f6fec] Heap.cpp:135 + 1 bmalloc::Heap::scavenge(std::__1::lock_guard<bmalloc::StaticMutex>&) (in JavaScriptCore) + 656 [0x1111f73e0] VMAllocate.h:188 + 1 madvise (in libsystem_kernel.dylib) + 10 [0x7fffaa828e9e] 986 Thread_18334800: com.apple.NSURLConnectionLoader + 986 thread_start (in libsystem_pthread.dylib) + 13 [0x7fffaa91208d] + 986 _pthread_start (in libsystem_pthread.dylib) + 286 [0x7fffaa912887] + 986 _pthread_body (in libsystem_pthread.dylib) + 180 [0x7fffaa91293b] + 986 __NSThread__start__ (in Foundation) + 1243 [0x7fff969718ed] + 986 +[NSURLConnection(Loader) _resourceLoadLoop:] (in CFNetwork) + 313 [0x7fff9408b734] + 986 CFRunLoopRunSpecific (in CoreFoundation) + 420 [0x7fff94f4e114] + 986 __CFRunLoopRun (in CoreFoundation) + 1361 [0x7fff94f4e8c1] + 986 __CFRunLoopServiceMachPort (in CoreFoundation) + 212 [0x7fff94f4f434] + 986 mach_msg (in libsystem_kernel.dylib) + 55 [0x7fffaa81f797] + 986 mach_msg_trap (in libsystem_kernel.dylib) + 10 [0x7fffaa82034a] 986 Thread_18334851: WebCore: Scrolling + 986 thread_start (in libsystem_pthread.dylib) + 13 [0x7fffaa91208d] + 986 _pthread_start (in libsystem_pthread.dylib) + 286 [0x7fffaa912887] + 986 _pthread_body (in libsystem_pthread.dylib) + 180 [0x7fffaa91293b] + 986 WTF::wtfThreadEntryPoint(void*) (in JavaScriptCore) + 74 [0x1111e87fa] ThreadingPthreads.cpp:209 + 986 WTF::threadEntryPoint(void*) (in JavaScriptCore) + 111 [0x1111e83df] Function.h:56 + 986 WebCore::ScrollingThread::initializeRunLoop() (in WebCore) + 253 [0x10df9022d] ScrollingThreadMac.mm:52 + 986 CFRunLoopRun (in CoreFoundation) + 97 [0x7fff94f8db91] + 986 CFRunLoopRunSpecific (in CoreFoundation) + 420 [0x7fff94f4e114] + 986 __CFRunLoopRun (in CoreFoundation) + 1361 [0x7fff94f4e8c1] + 986 __CFRunLoopServiceMachPort (in CoreFoundation) + 212 [0x7fff94f4f434] + 986 mach_msg (in libsystem_kernel.dylib) + 55 [0x7fffaa81f797] + 986 mach_msg_trap (in libsystem_kernel.dylib) + 10 [0x7fffaa82034a] 986 Thread_18381792 + 986 start_wqthread (in libsystem_pthread.dylib) + 13 [0x7fffaa91207d] + 986 _pthread_wqthread (in libsystem_pthread.dylib) + 1023 [0x7fffaa91248e] + 986 __workq_kernreturn (in libsystem_kernel.dylib) + 10 [0x7fffaa82844e] 554 Thread_18377504 554 start_wqthread (in libsystem_pthread.dylib) + 13 [0x7fffaa91207d] 553 _pthread_wqthread (in libsystem_pthread.dylib) + 1426 [0x7fffaa912621] ! 553 __workq_kernreturn (in libsystem_kernel.dylib) + 10 [0x7fffaa82844e] 1 _pthread_wqthread (in libsystem_pthread.dylib) + 1481 [0x7fffaa912658] Total number in stack (recursive counted multiple, when >=5): Sort by top of stack, same collapsed (when >= 5): mach_msg_trap (in libsystem_kernel.dylib) 3943 __workq_kernreturn (in libsystem_kernel.dylib) 1539 __psynch_cvwait (in libsystem_kernel.dylib) 985
The expected output: ``` CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 8: Finished running PASS successfullyParsed is true TEST COMPLETE ``` Actual output: ``` AC Plugin 2.0Content-Type: text/plain CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished CONSOLE MESSAGE: line 27: Worker finished #PID UNRESPONSIVE - com.apple.WebKit.WebContent.Development (pid 46688) FAIL: Timed out waiting for notifyDone to be called #EOF #EOF ```` So it looks like just a single worker isn't finishing for some reason.
Ignore the above logs. It's not seeing a timeout. This log was from something else. I'm running into another issue where I periodically see "AC Plugin 2.0" in the diff. This is a known problem. The above sample is not related to this bug.
Jonathan, do you have any idea why we would report a test crash but not actually get a crash log?
(In reply to Ryan Haddad from comment #4) > Jonathan, do you have any idea why we would report a test crash but not > actually get a crash log? Looking to see if I can find any other information. Failure to attach a crash-log usually means that a crash has (In reply to Ryan Haddad from comment #4) > Jonathan, do you have any idea why we would report a test crash but not > actually get a crash log? It's possible that this is a test harness failure. Here is the stdout of that test run: https://build.webkit.org/builders/Apple%20Sierra%20Release%20WK2%20%28Tests%29/builds/2469/steps/layout-test/logs/stdio. Looking at the logging, we are actually killing the test runner because it doesn't have output when we expect it to. Saam, is this test expected to take a long time? Are their possible race conditions? Or do we expect there to be a lot of output on this test? I'd bet that this is something time-outy, we're hitting the code-path where we attempt to read from the test runner's stdout but fail.
It is definitely possible there are races here. The test is not super quick, the bot seems to finish most of the time in 10s or so. It seems possible we're hitting a deadlock we don't know about, but it'd be really nice to have the proper logs here so we can tell if that's happening or not. As is, it's probably a good bet to make that we're not deadlocking. But I can't say for certain.
(In reply to Saam Barati from comment #6) > It is definitely possible there are races here. The test is not super quick, > the bot seems to finish most of the time in 10s or so. It seems possible > we're hitting a deadlock we don't know about, but it'd be really nice to > have the proper logs here so we can tell if that's happening or not. As is, > it's probably a good bet to make that we're not deadlocking. But I can't say > for certain. I'm not sure what else to do about logs. This test doesn't have much stdout or stderr and something closes the stdout and stderr file handle Python uses to communicate with the test runner. We aren't picking up a crash because no crash-log is generated, which means that whatever this crash is, the OS is not considering it a crash.
<rdar://problem/33829549>