Bug 90336 - REGRESSION(r121605): Changes caused flaky crashes in sputnik/Unicode tests on Apple WK1 and GTK Linux builders
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Zan Dobersek
URL:
Keywords:
Depends on:
Blocks: 90255
Reported: 2012-06-30 04:45 PDT by Zan Dobersek
Modified: 2012-06-30 12:24 PDT

See Also:


Attachments
ROLLOUT of r121605 (27.11 KB, patch)
2012-06-30 04:45 PDT, Zan Dobersek

Description Zan Dobersek 2012-06-30 04:45:24 PDT
http://trac.webkit.org/changeset/121605 broke the build:
Changes caused flaky crashes in sputnik/Unicode tests on Apple WK1 and GTK Linux builders
Comment 1 Zan Dobersek 2012-06-30 04:45:45 PDT
Created attachment 150314
ROLLOUT of r121605

Any committer can land this patch automatically by marking it commit-queue+.  The commit-queue will build and test the patch before landing to ensure that the rollout will be successful.  This process takes approximately 15 minutes.

If you would like to land the rollout faster, you can use the following command:

  webkit-patch land-attachment ATTACHMENT_ID

where ATTACHMENT_ID is the ID of this attachment.
Comment 2 Zan Dobersek 2012-06-30 05:07:02 PDT
Flakiness dashboard shows various sputnik/Unicode tests have started crashing randomly after r121605:
http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=%40ToT%20-%20webkit.org&tests=sputnik%2FUnicode

Crashes occur in various ways:

http://build.webkit.org/results/GTK%20Linux%2064-bit%20Debug/r121619%20(34484)/sputnik/Unicode/Unicode_320/S7.6_A3.2-crash-log.txt
Crash log for DumpRenderTree (pid 16652):

[New LWP 16652]
[New LWP 17753]
[New LWP 16671]
[New LWP 20825]
[Thread debugging using libthread_db enabled]
Core was generated by `/home/slave/webkitgtk/gtk-linux-64-debug/build/WebKitBuild/Debug/Programs/DumpR'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000000046bb9c in JSC::WriteBarrierBase<JSC::Unknown>::get (this=0x7fef21403108) at ../../Source/JavaScriptCore/runtime/WriteBarrier.h:161
161	        return JSValue::decode(m_value);

...

Thread 1 (Thread 0x7fef7ab3e900 (LWP 16652)):
#0  0x000000000046bb9c in JSC::WriteBarrierBase<JSC::Unknown>::get (this=0x7fef21403108) at ../../Source/JavaScriptCore/runtime/WriteBarrier.h:161
#1  0x000000000046cce3 in JSC::JSObject::getDirectOffset (this=0x7fef282c5680, offset=23) at ../../Source/JavaScriptCore/runtime/JSObject.h:208
#2  0x000000000046cc9b in JSC::JSObject::getDirect (this=0x7fef282c5680, globalData=..., propertyName=...) at ../../Source/JavaScriptCore/runtime/JSObject.h:173
#3  0x000000000046b5e2 in WebCoreTestSupport::resetInternalsObject (context=0x7fef282c5890) at ../../Source/WebCore/testing/js/WebCoreTestSupport.cpp:55
#4  0x0000000000460222 in runTest (testPathOrURL=...) at ../../Tools/DumpRenderTree/gtk/DumpRenderTree.cpp:731
#5  0x000000000045f792 in runTestingServerLoop () at ../../Tools/DumpRenderTree/gtk/DumpRenderTree.cpp:498
#6  0x00000000004627f3 in main (argc=2, argv=0x7fff2c872918) at ../../Tools/DumpRenderTree/gtk/DumpRenderTree.cpp:1403


http://build.webkit.org/results/Apple%20Lion%20Debug%20WK1%20(Tests)/r121607%20(525)/sputnik/Unicode/Unicode_320/S7.6_A5.2_T1-crash-log.txt
Process:         DumpRenderTree [16493]
Path:            /Volumes/VOLUME/*/DumpRenderTree
Identifier:      DumpRenderTree
Version:         ??? (???)
Code Type:       X86-64 (Native)
Parent Process:  Python [16489]

Date/Time:       2012-06-29 19:13:07.544 -0700
OS Version:      Mac OS X 10.7.3 (11D50)
Report Version:  9

Anonymous UUID:                      FE181E3A-99A8-4646-869B-A5119B6D0850

Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x000000010feaa0e8

VM Regions Near 0x10feaa0e8:
    __LINKEDIT             000000010fe97000-000000010feaa000 [   76K] r--/rwx SM=COW  /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/CoreGraphics.framework/Versions/A/Resources/libPDFRIP.A.dylib
--> 
    MALLOC_LARGE           000000010fec0000-000000010fecb000 [   44K] rw-/rwx SM=PRV  

Application Specific Information:
objc[16493]: garbage collection is OFF

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libWebCoreTestSupport.dylib   	0x0000000102ac1a10 JSC::WriteBarrierBase<JSC::Unknown>::get() const + 16 (WriteBarrier.h:161)
1   libWebCoreTestSupport.dylib   	0x0000000102ad19c4 JSC::JSObject::getDirectOffset(unsigned long) const + 52 (JSObject.h:208)
2   libWebCoreTestSupport.dylib   	0x0000000102ad194e JSC::JSObject::getDirect(JSC::JSGlobalData&, JSC::PropertyName) const + 94 (JSObject.h:173)
3   libWebCoreTestSupport.dylib   	0x0000000102ad15fb WebCoreTestSupport::resetInternalsObject(OpaqueJSContext const*) + 139 (WebCoreTestSupport.cpp:55)
4   DumpRenderTree                	0x0000000102943f61 _ZL42resetWebViewToConsistentStateBeforeTestingv + 769 (DumpRenderTree.mm:1263)
5   DumpRenderTree                	0x00000001029424c2 _ZL7runTestRKNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEE + 6770 (DumpRenderTree.mm:1400)
6   DumpRenderTree                	0x00000001029409aa _ZL20runTestingServerLoopv + 282 (DumpRenderTree.mm:830)
7   DumpRenderTree                	0x000000010294023a dumpRenderTree(int, char const**) + 394 (DumpRenderTree.mm:879)
8   DumpRenderTree                	0x00000001029428a9 main + 105 (DumpRenderTree.mm:916)
9   DumpRenderTree                	0x000000010292b0c4 start + 52


http://build.webkit.org/results/Apple%20Lion%20Release%20WK1%20(Tests)/r121613%20(700)/sputnik/Unicode/Unicode_500/S7.6_A3.2-crash-log.txt
Process:         DumpRenderTree [7551]
Path:            /Volumes/VOLUME/*/DumpRenderTree
Identifier:      DumpRenderTree
Version:         ??? (???)
Code Type:       X86-64 (Native)
Parent Process:  Python [7542]

Date/Time:       2012-06-29 20:56:27.088 -0700
OS Version:      Mac OS X 10.7.3 (11D50)
Report Version:  9

Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x000000011071c0e8

VM Regions Near 0x11071c0e8:
    __LINKEDIT             0000000110716000-000000011071c000 [   24K] r--/rwx SM=COW  /System/Library/Frameworks/OpenGL.framework/Versions/A/Resources/GLRendererFloat.bundle/GLRendererFloat
--> 
    TC malloc              0000000110730000-0000000110738000 [   32K] rw-/rwx SM=PRV  

Application Specific Information:
objc[7551]: garbage collection is OFF

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libWebCoreTestSupport.dylib   	0x0000000107fb3070 JSC::JSObject::getDirect(JSC::JSGlobalData&, JSC::PropertyName) const + 208 (WriteBarrier.h:161)
1   libWebCoreTestSupport.dylib   	0x0000000107fb2f20 WebCoreTestSupport::resetInternalsObject(OpaqueJSContext const*) + 80 (WebCoreTestSupport.cpp:55)
2   DumpRenderTree                	0x0000000107ea2aa7 _ZL42resetWebViewToConsistentStateBeforeTestingv + 466 (RefPtr.h:64)
3   DumpRenderTree                	0x0000000107ea1a3a _ZL7runTestRKNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEE + 2183 (DumpRenderTree.mm:1402)
4   DumpRenderTree                	0x0000000107ea0f86 dumpRenderTree(int, char const**) + 1848 (DumpRenderTree.mm:830)
5   DumpRenderTree                	0x0000000107ea1c53 main + 86 (DumpRenderTree.mm:917)
6   DumpRenderTree                	0x0000000107e960b4 start + 52


http://build.webkit.org/results/Apple%20Lion%20Release%20WK1%20(Tests)/r121607%20(695)/sputnik/Unicode/Unicode_320/S7.6_A3.2-crash-log.txt
Process:         DumpRenderTree [86047]
Path:            /Volumes/VOLUME/*/DumpRenderTree
Identifier:      DumpRenderTree
Version:         ??? (???)
Code Type:       X86-64 (Native)
Parent Process:  Python [86040]

Date/Time:       2012-06-29 18:45:08.844 -0700
OS Version:      Mac OS X 10.7.3 (11D50)
Report Version:  9

Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x00000001145f10e8

VM Regions Near 0x1145f10e8:
    mapped file            00000001145ae000-00000001145f1000 [  268K] r--/rwx SM=COW  /System/Library/Fonts/Geeza Pro.ttf
--> 
    ATS (font support)     0000000114605000-0000000116585000 [ 31.5M] rw-/rwx SM=COW  

Application Specific Information:
objc[86047]: garbage collection is OFF

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libWebCoreTestSupport.dylib   	0x0000000107b6a070 JSC::JSObject::getDirect(JSC::JSGlobalData&, JSC::PropertyName) const + 208 (WriteBarrier.h:161)
1   libWebCoreTestSupport.dylib   	0x0000000107b69f20 WebCoreTestSupport::resetInternalsObject(OpaqueJSContext const*) + 80 (WebCoreTestSupport.cpp:55)
2   DumpRenderTree                	0x0000000107a5caa7 _ZL42resetWebViewToConsistentStateBeforeTestingv + 466 (RefPtr.h:64)
3   DumpRenderTree                	0x0000000107a5ba3a _ZL7runTestRKNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEE + 2183 (DumpRenderTree.mm:1402)
4   DumpRenderTree                	0x0000000107a5af86 dumpRenderTree(int, char const**) + 1848 (DumpRenderTree.mm:830)
5   DumpRenderTree                	0x0000000107a5bc53 main + 86 (DumpRenderTree.mm:917)
6   DumpRenderTree                	0x0000000107a500b4 start + 52
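
One way to compare the three Mac reports is to measure how far each faulting address lies past the end of the mapped region listed just before the `-->` marker. The sketch below does that arithmetic in Python using only the addresses copied from the crash logs above. The labels and the helper name `fault_offsets` are illustrative, and a matching offset is only a hint that the crashes share a cause, not a diagnosis.

```python
# Faulting address and end of the nearest preceding VM region,
# copied from the three Mac crash logs in this comment.
# The labels are illustrative shorthand for the builder and revision.
CRASHES = [
    ("Lion Debug r121607",   0x000000010feaa0e8, 0x000000010feaa000),
    ("Lion Release r121613", 0x000000011071c0e8, 0x000000011071c000),
    ("Lion Release r121607", 0x00000001145f10e8, 0x00000001145f1000),
]


def fault_offsets(crashes):
    """Map each label to the byte distance from the region end to the fault."""
    return {label: fault - region_end for label, fault, region_end in crashes}


offsets = fault_offsets(CRASHES)
for label, offset in offsets.items():
    print(f"{label}: fault is {offset:#x} bytes past the previous region")
```

If the offsets agree across builders, that suggests the same out-of-bounds read in each run rather than three unrelated failures.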
Comment 3 Zan Dobersek 2012-06-30 05:09:24 PDT
Comment on attachment 150314
ROLLOUT of r121605

Clearing flags on attachment: 150314

Committed r121627: <http://trac.webkit.org/changeset/121627>
Comment 4 Zan Dobersek 2012-06-30 05:09:32 PDT
All reviewed patches have been landed.  Closing bug.
Comment 5 Filip Pizlo 2012-06-30 12:24:08 PDT
I think you're a bit too quick with the rollout trigger.  The patch did not in fact break the build, and as you indicate, it caused a flaky crash in one test.  What I would have done if I were on a bot-watching shift is:

1) Immediately skip the test and file a bug assigning it to the person at fault.

2) Wait a bit, give them an opportunity to fix the bug.  If it was during business hours for the person committing the change, I'd probably give them 4 hours.  Since it's 5am on a Saturday over here, I'd probably leave it until Saturday afternoon PST.

3) Roll out only if they are unable to fix or if they are unresponsive.

Rolling out a large-ish patch at 5 in the morning PST without any warning to anyone carries more cost than you think.  First, it annoys people, which is not something you want to do in a collaborative project.  Second, it slows down progress.  Fixing a flaky crasher in tip of tree when the faulting revision is known is a lot faster than fixing a flaky crasher in an unlanded (or previously rolled out) patch.  Hence, though you certainly accomplished the immediate goal of getting some bots green, you've increased the time it will take to have a correct fix.

I think this is particularly true for JavaScriptCore changes, since there is *always* a bug tail.  Just look at my commits: one in three patches I land is a performance improvement and the rest are fixes for recent regressions.  If you rolled out each of my performance improvement patches as soon as you found that it caused a crash, then I'd never get any work done.

I think that you should take some more lessons from the other gardeners, who typically only trigger rollouts for much bigger offenses (*lots* of tests failing or total build failure with no obvious fix).  For smaller things they're more likely to help out instead of rapid-fire rollouts.