RESOLVED FIXED 90336
REGRESSION(r121605): Changes caused flaky crashes in sputnik/Unicode tests on Apple WK1 and GTK Linux builders
https://bugs.webkit.org/show_bug.cgi?id=90336
Zan Dobersek
Reported 2012-06-30 04:45:24 PDT
http://trac.webkit.org/changeset/121605 broke the build: Changes caused flaky crashes in sputnik/Unicode tests on Apple WK1 and GTK Linux builders
Attachments
ROLLOUT of r121605 (27.11 KB, patch)
2012-06-30 04:45 PDT, Zan Dobersek
no flags
Zan Dobersek
Comment 1 2012-06-30 04:45:45 PDT
Created attachment 150314
ROLLOUT of r121605

Any committer can land this patch automatically by marking it commit-queue+. The commit-queue will build and test the patch before landing to ensure that the rollout will be successful. This process takes approximately 15 minutes.

If you would like to land the rollout faster, you can use the following command:

  webkit-patch land-attachment ATTACHMENT_ID

where ATTACHMENT_ID is the ID of this attachment.
Zan Dobersek
Comment 2 2012-06-30 05:07:02 PDT
Flakiness dashboard shows various sputnik/Unicode tests have started crashing randomly after r121605:
http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=%40ToT%20-%20webkit.org&tests=sputnik%2FUnicode

Crashes occur in various ways:

http://build.webkit.org/results/GTK%20Linux%2064-bit%20Debug/r121619%20(34484)/sputnik/Unicode/Unicode_320/S7.6_A3.2-crash-log.txt

Crash log for DumpRenderTree (pid 16652):
[New LWP 16652]
[New LWP 17753]
[New LWP 16671]
[New LWP 20825]
[Thread debugging using libthread_db enabled]
Core was generated by `/home/slave/webkitgtk/gtk-linux-64-debug/build/WebKitBuild/Debug/Programs/DumpR'.
Program terminated with signal 11, Segmentation fault.
#0 0x000000000046bb9c in JSC::WriteBarrierBase<JSC::Unknown>::get (this=0x7fef21403108) at ../../Source/JavaScriptCore/runtime/WriteBarrier.h:161
161 return JSValue::decode(m_value);
...
Thread 1 (Thread 0x7fef7ab3e900 (LWP 16652)):
#0 0x000000000046bb9c in JSC::WriteBarrierBase<JSC::Unknown>::get (this=0x7fef21403108) at ../../Source/JavaScriptCore/runtime/WriteBarrier.h:161
#1 0x000000000046cce3 in JSC::JSObject::getDirectOffset (this=0x7fef282c5680, offset=23) at ../../Source/JavaScriptCore/runtime/JSObject.h:208
#2 0x000000000046cc9b in JSC::JSObject::getDirect (this=0x7fef282c5680, globalData=..., propertyName=...) at ../../Source/JavaScriptCore/runtime/JSObject.h:173
#3 0x000000000046b5e2 in WebCoreTestSupport::resetInternalsObject (context=0x7fef282c5890) at ../../Source/WebCore/testing/js/WebCoreTestSupport.cpp:55
#4 0x0000000000460222 in runTest (testPathOrURL=...) at ../../Tools/DumpRenderTree/gtk/DumpRenderTree.cpp:731
#5 0x000000000045f792 in runTestingServerLoop () at ../../Tools/DumpRenderTree/gtk/DumpRenderTree.cpp:498
#6 0x00000000004627f3 in main (argc=2, argv=0x7fff2c872918) at ../../Tools/DumpRenderTree/gtk/DumpRenderTree.cpp:1403

http://build.webkit.org/results/Apple%20Lion%20Debug%20WK1%20(Tests)/r121607%20(525)/sputnik/Unicode/Unicode_320/S7.6_A5.2_T1-crash-log.txt

Process: DumpRenderTree [16493]
Path: /Volumes/VOLUME/*/DumpRenderTree
Identifier: DumpRenderTree
Version: ??? (???)
Code Type: X86-64 (Native)
Parent Process: Python [16489]
Date/Time: 2012-06-29 19:13:07.544 -0700
OS Version: Mac OS X 10.7.3 (11D50)
Report Version: 9
Anonymous UUID: FE181E3A-99A8-4646-869B-A5119B6D0850
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x000000010feaa0e8
VM Regions Near 0x10feaa0e8:
    __LINKEDIT 000000010fe97000-000000010feaa000 [ 76K] r--/rwx SM=COW /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/CoreGraphics.framework/Versions/A/Resources/libPDFRIP.A.dylib
--> MALLOC_LARGE 000000010fec0000-000000010fecb000 [ 44K] rw-/rwx SM=PRV
Application Specific Information:
objc[16493]: garbage collection is OFF
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libWebCoreTestSupport.dylib 0x0000000102ac1a10 JSC::WriteBarrierBase<JSC::Unknown>::get() const + 16 (WriteBarrier.h:161)
1 libWebCoreTestSupport.dylib 0x0000000102ad19c4 JSC::JSObject::getDirectOffset(unsigned long) const + 52 (JSObject.h:208)
2 libWebCoreTestSupport.dylib 0x0000000102ad194e JSC::JSObject::getDirect(JSC::JSGlobalData&, JSC::PropertyName) const + 94 (JSObject.h:173)
3 libWebCoreTestSupport.dylib 0x0000000102ad15fb WebCoreTestSupport::resetInternalsObject(OpaqueJSContext const*) + 139 (WebCoreTestSupport.cpp:55)
4 DumpRenderTree 0x0000000102943f61 _ZL42resetWebViewToConsistentStateBeforeTestingv + 769 (DumpRenderTree.mm:1263)
5 DumpRenderTree 0x00000001029424c2 _ZL7runTestRKNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEE + 6770 (DumpRenderTree.mm:1400)
6 DumpRenderTree 0x00000001029409aa _ZL20runTestingServerLoopv + 282 (DumpRenderTree.mm:830)
7 DumpRenderTree 0x000000010294023a dumpRenderTree(int, char const**) + 394 (DumpRenderTree.mm:879)
8 DumpRenderTree 0x00000001029428a9 main + 105 (DumpRenderTree.mm:916)
9 DumpRenderTree 0x000000010292b0c4 start + 52

http://build.webkit.org/results/Apple%20Lion%20Release%20WK1%20(Tests)/r121613%20(700)/sputnik/Unicode/Unicode_500/S7.6_A3.2-crash-log.txt

Process: DumpRenderTree [7551]
Path: /Volumes/VOLUME/*/DumpRenderTree
Identifier: DumpRenderTree
Version: ??? (???)
Code Type: X86-64 (Native)
Parent Process: Python [7542]
Date/Time: 2012-06-29 20:56:27.088 -0700
OS Version: Mac OS X 10.7.3 (11D50)
Report Version: 9
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x000000011071c0e8
VM Regions Near 0x11071c0e8:
    __LINKEDIT 0000000110716000-000000011071c000 [ 24K] r--/rwx SM=COW /System/Library/Frameworks/OpenGL.framework/Versions/A/Resources/GLRendererFloat.bundle/GLRendererFloat
--> TC malloc 0000000110730000-0000000110738000 [ 32K] rw-/rwx SM=PRV
Application Specific Information:
objc[7551]: garbage collection is OFF
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libWebCoreTestSupport.dylib 0x0000000107fb3070 JSC::JSObject::getDirect(JSC::JSGlobalData&, JSC::PropertyName) const + 208 (WriteBarrier.h:161)
1 libWebCoreTestSupport.dylib 0x0000000107fb2f20 WebCoreTestSupport::resetInternalsObject(OpaqueJSContext const*) + 80 (WebCoreTestSupport.cpp:55)
2 DumpRenderTree 0x0000000107ea2aa7 _ZL42resetWebViewToConsistentStateBeforeTestingv + 466 (RefPtr.h:64)
3 DumpRenderTree 0x0000000107ea1a3a _ZL7runTestRKNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEE + 2183 (DumpRenderTree.mm:1402)
4 DumpRenderTree 0x0000000107ea0f86 dumpRenderTree(int, char const**) + 1848 (DumpRenderTree.mm:830)
5 DumpRenderTree 0x0000000107ea1c53 main + 86 (DumpRenderTree.mm:917)
6 DumpRenderTree 0x0000000107e960b4 start + 52

http://build.webkit.org/results/Apple%20Lion%20Release%20WK1%20(Tests)/r121607%20(695)/sputnik/Unicode/Unicode_320/S7.6_A3.2-crash-log.txt

Process: DumpRenderTree [86047]
Path: /Volumes/VOLUME/*/DumpRenderTree
Identifier: DumpRenderTree
Version: ??? (???)
Code Type: X86-64 (Native)
Parent Process: Python [86040]
Date/Time: 2012-06-29 18:45:08.844 -0700
OS Version: Mac OS X 10.7.3 (11D50)
Report Version: 9
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x00000001145f10e8
VM Regions Near 0x1145f10e8:
    mapped file 00000001145ae000-00000001145f1000 [ 268K] r--/rwx SM=COW /System/Library/Fonts/Geeza Pro.ttf
--> ATS (font support) 0000000114605000-0000000116585000 [ 31.5M] rw-/rwx SM=COW
Application Specific Information:
objc[86047]: garbage collection is OFF
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libWebCoreTestSupport.dylib 0x0000000107b6a070 JSC::JSObject::getDirect(JSC::JSGlobalData&, JSC::PropertyName) const + 208 (WriteBarrier.h:161)
1 libWebCoreTestSupport.dylib 0x0000000107b69f20 WebCoreTestSupport::resetInternalsObject(OpaqueJSContext const*) + 80 (WebCoreTestSupport.cpp:55)
2 DumpRenderTree 0x0000000107a5caa7 _ZL42resetWebViewToConsistentStateBeforeTestingv + 466 (RefPtr.h:64)
3 DumpRenderTree 0x0000000107a5ba3a _ZL7runTestRKNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEE + 2183 (DumpRenderTree.mm:1402)
4 DumpRenderTree 0x0000000107a5af86 dumpRenderTree(int, char const**) + 1848 (DumpRenderTree.mm:830)
5 DumpRenderTree 0x0000000107a5bc53 main + 86 (DumpRenderTree.mm:917)
6 DumpRenderTree 0x0000000107a500b4 start + 52
Zan Dobersek
Comment 3 2012-06-30 05:09:24 PDT
Comment on attachment 150314
ROLLOUT of r121605

Clearing flags on attachment: 150314

Committed r121627: <http://trac.webkit.org/changeset/121627>
Zan Dobersek
Comment 4 2012-06-30 05:09:32 PDT
All reviewed patches have been landed. Closing bug.
Filip Pizlo
Comment 5 2012-06-30 12:24:08 PDT
I think you're a bit too quick with the rollout trigger. The patch did not in fact break the build, and as you indicate, it caused a flaky crash in one test. What I would have done if I was on bot-watching shift is:

1) Immediately skip the test and file a bug assigning it to the person at fault.

2) Wait a bit, give them an opportunity to fix the bug. If it was during business hours for the person committing the change, I'd probably give them 4 hours. Since it's 5am on a Saturday over here, I'd probably leave it until Saturday afternoon PST.

3) Roll out only if they are unable to fix or if they are unresponsive.

Rolling out a large-ish patch at 5 in the morning PST without any warning to anyone carries more cost than you think. First, it annoys people, which is not something you want to do in a collaborative project. Second, it slows down progress. Fixing a flaky crasher in tip of tree when the faulting revision is known is a lot faster than fixing a flaky crasher in an unlanded (or previously rolled out) patch. Hence, though you certainly accomplished the immediate goal of getting some bots green, you've increased the time it will take to have a correct fix.

I think this is particularly true for JavaScriptCore changes, since there is *always* a bug tail. Just look at my commits - 1 in 3 patches I land is a performance improvement and the rest are fixes for recent regressions. If you rolled out each of my performance improvement patches just as soon as you found that it caused a crash, then I'd never get any work done.

I think that you should take some more lessons from the other gardeners, who typically only trigger rollouts for much bigger offenses (*lots* of tests failing or total build failure with no obvious fix). For smaller things they're more likely to help out instead of rapid-fire rollouts.