RESOLVED FIXED 126432
REGRESSION (r161195): Acid2 regression tests frequently fail
https://bugs.webkit.org/show_bug.cgi?id=126432
Summary REGRESSION (r161195): Acid2 regression tests frequently fail
Alexey Proskuryakov
Reported 2014-01-03 09:42:44 PST
fast/css/acid2-pixel.html and fast/css/acid2.html started to frequently fail on January 1st. Looks like OBJECT content fails to load: @@ -58,9 +58,9 @@ layer at (84,2710) size 144x24 RenderBlock (positioned) {DIV} at (48,72) size 144x24 [bgcolor=#FF0000] RenderBlock {DIV} at (0,0) size 144x0 - RenderInline {OBJECT} at (0,0) size 131x14 - RenderInline {OBJECT} at (0,0) size 131x14 - RenderImage {OBJECT} at (13,0) size 131x24 [border: none (12px solid #000000) none] + RenderInline {OBJECT} at (0,0) size 35x14 + RenderInline {OBJECT} at (0,0) size 35x14 + RenderImage {OBJECT} at (109,24) size 35x0 [border: none (12px solid #000000) none] RenderBlock (floating) {DIV} at (0,0) size 144x24 [border: none (12px solid #FF0000) none (12px solid #000000)] RenderBlock {DIV} at (0,0) size 144x24 [border: none (24px solid #FFFF00)] layer at (84,2782) size 144x24 http://build.webkit.org/results/Apple%20Mavericks%20Debug%20WK2%20(Tests)/r161259%20(1313)/results.html
Attachments
patch (3.00 KB, patch)
2014-01-07 13:59 PST, Antti Koivisto
andersca: review+
buildbot: commit-queue-
Archive of layout-test-results from webkit-ews-06 for mac-mountainlion (540.42 KB, application/zip)
2014-01-07 15:38 PST, Build Bot
no flags
Archive of layout-test-results from webkit-ews-15 for mac-mountainlion-wk2 (454.94 KB, application/zip)
2014-01-07 19:52 PST, Build Bot
no flags
Alexey Proskuryakov
Comment 1 2014-01-03 09:42:59 PST
Andreas Kling
Comment 2 2014-01-03 10:44:40 PST
Andreas Kling
Comment 3 2014-01-03 10:46:05 PST
(In reply to comment #2) > My guess would be <http://trac.webkit.org/changeset/161208> Hrm. Or maybe <http://trac.webkit.org/changeset/161195> Antti?
Alexey Proskuryakov
Comment 4 2014-01-03 10:50:08 PST
Unless I'm fooled by time zones, r161208 was after this started, so r161195 seems more likely.
Antti Koivisto
Comment 5 2014-01-04 09:19:26 PST
Surely r161195 as it made plugin loads fail asynchronously in all cases. Acid2 render tree snapshot is probably just taken before things have truly settled.
Alexey Proskuryakov
Comment 6 2014-01-06 09:32:10 PST
Antti, what is the next step here? Marking Acid2 as flaky would be a silly thing to do, as this basically means that we ignore its results unless it starts to crash. This has been known for several days now, with no action. Rollout time?
Antti Koivisto
Comment 7 2014-01-06 10:01:08 PST
(In reply to comment #6) > Antti, what is the next step here? Marking Acid2 as flaky would be a silly thing to do, as this basically means that we ignore its results unless it starts to crash. > > This has been known for several days now, with no action. Rollout time? Probably just need to change the snapshot timings either by waitUntilDone or changes to the test framework. I doubt synchronous failures are important for web compatibility. I'll take a look tomorrow when I'm back to work. Please don't roll out anything.
Alexey Proskuryakov
Comment 8 2014-01-06 14:10:38 PST
> Probably just need to change the snapshot timings either by waitUntilDone or changes to the test framework. I doubt synchronous failures are important for web compatibility. I'll take a look tomorrow when I'm back to work. Thank you for letting me know that you are not going to work on this until tomorrow. I marked the tests as flaky in <http://trac.webkit.org/r161371>. It would have helped if you could state it earlier that you did not intend to do anything about the regressions for several days. > Please don't roll out anything. Why? There needs to be a specific reason to keep brokenness in the tree. This situation is an example of why immediate rollout is almost always best - I should have done that three days ago, so that we wouldn't have been left in a broken state for so long.
Antti Koivisto
Comment 9 2014-01-07 08:18:58 PST
(In reply to comment #8) > Why? > > There needs to be a specific reason to keep brokenness in the tree. This situation is an example of why immediate rollout is almost always best - I should have done that three days ago, so that we wouldn't have been left in a broken state for so long. - The problem does not reproduce locally so I'll have to try landing fixes for bots to test. Repeatedly relanding the original creates churn and confusing history. It may break other peoples work-in-progress patches. - One ~5-10% flaky test doesn't really affect anyone else's work much.
Alexey Proskuryakov
Comment 10 2014-01-07 08:44:38 PST
> > There needs to be a specific reason to keep brokenness in the tree. > - The problem does not reproduce locally so I'll have to try landing fixes for bots to test. This is one specific reason that could validate keeping the tree in bad state while fixing it. It also almost certainly doesn't apply here - the problem is well localized, and should be easy to attack locally. Some ideas: - run the tests many times in fully parallel mode, forcing heavy CPU use; - change the delay from 0 to some positive value. > Repeatedly remanding the original creates churn and confusing history. It may break other peoples work-in-progress patches. These general reasons are valid, but quite weak. Any and all changes landed can break others' work in progress patches, and that's not a reason to halt all work. Also, the faster you roll out, the less likely it is to affect anyone. > - One ~5-10% flaky test doesn't really affect anyone else's work much. This is both incorrect, and not applicable in this case. 1. In this case, there are 4 flaky tests, each with a much higher rate of failure. 2. There are dozens engineers on the project, and if everyone felt entitled to cause 5-10% flakiness on tests, it would not be a project worth working on. 3. Flaky tests slow down EWS a lot, and there really isn't a way around that. When we only have one test with 5% failure rate, that means that 5% of patches take at least 3x time as long to be processed.
Antti Koivisto
Comment 11 2014-01-07 09:13:24 PST
(In reply to comment #10) > This is one specific reason that could validate keeping the tree in bad state while fixing it. It also almost certainly doesn't apply here - the problem is well localized, and should be easy to attack locally. Some ideas: It is easy to declare something easy. How about actual local repro steps? > - run the tests many times in fully parallel mode, forcing heavy CPU use; I tried run-webkit-tests --debug fast/css/ --iterations=1000. > - change the delay from 0 to some positive value. Which delay? > 1. In this case, there are 4 flaky tests, each with a much higher rate of failure. The bug mentions two copies of acid2. What are the other two?
Antti Koivisto
Comment 12 2014-01-07 10:23:31 PST
Actually I have a trivial repro run-webkit-tests --debug fast/css/acid2.html -2 The bug is (at least almost) wk2 only.
Antti Koivisto
Comment 13 2014-01-07 10:33:24 PST
Almost. Two WK1 bots have flaked once over last 5 days if I read this correctly but it is the same false result.
Alexey Proskuryakov
Comment 14 2014-01-07 10:40:16 PST
It is slightly easier to reproduce with WebKit2, but WebKit1 is also affected. I believe you now saw that on flakiness dashboard (but there are also failures on http versions of the tests, so it's more than "once"). On a Mac Pro, I could reproduce the WebKit1 failure very reliably by running this command in two terminal windows at once to create high CPU load. Likely easier on a notebook: run-webkit-tests fast/css/acid2-pixel.html --repeat-each 10000 -f So yes, there are many tasks in WebKit development that are much harder than reproducing a failure on an isolated test, no need to be snarky about it. > > - change the delay from 0 to some positive value. > > Which delay? Your patch made object load failing asynchronous, i.e. use a zero delay timer. You could change the timer to a non-zero one to simulate stress conditions. This is a fairly common technique. > > 1. In this case, there are 4 flaky tests, each with a much higher rate of failure. > > The bug mentions two copies of acid2. What are the other two? Those are http versions of the same tests. You could see that by looking at <http://trac.webkit.org/r161371> that's mentioned in comment 8. Another way to find put what's going on is to check the flakiness dashboard for "acid2": <http://webkit-test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=acid2&revision=153686>.
Antti Koivisto
Comment 15 2014-01-07 11:00:16 PST
> Your patch made object load failing asynchronous, i.e. use a zero delay timer. You could change the timer to a non-zero one to simulate stress conditions. This is a fairly common technique. That's a good idea and this indeed makes it repro consistently with WK1: --- Source/WebCore/dom/Document.cpp (revision 161419) +++ Source/WebCore/dom/Document.cpp (working copy) @@ -1682,7 +1682,7 @@ // FIXME: Why on earth is this here? This is clearly misplaced. invalidateAccessKeyMap(); - m_styleRecalcTimer.startOneShot(0); + m_styleRecalcTimer.startOneShot(0.1); InspectorInstrumentation::didScheduleStyleRecalculation(this); } I thought you were talking about some DRT option, sorry about that.
Antti Koivisto
Comment 16 2014-01-07 12:34:36 PST
Here is the problem. We get WebFrameLoaderClient::dispatchDidFinishLoad (because at that point we have nothing to load) and that triggers plugin fallback load via DRT. The same call also dumps the render tree before the new load completes. (lldb) bt * thread #1: tid = 0x45b45b, 0x00000001099a5160 WebCore`WebCore::HTMLPlugInImageElement::startLoadingImage(this=0x00007f95b14a3610) + 16 at HTMLPlugInImageElement.cpp:334, queue = 'com.apple.main-thread, stop reason = breakpoint 25.1 frame #0: 0x00000001099a5160 WebCore`WebCore::HTMLPlugInImageElement::startLoadingImage(this=0x00007f95b14a3610) + 16 at HTMLPlugInImageElement.cpp:334 frame #1: 0x00000001099a4e30 WebCore`WebCore::HTMLPlugInImageElement::startLoadingImageCallback(node=0x00007f95b14a3610, =0) + 32 at HTMLPlugInImageElement.cpp:341 frame #2: 0x00000001091e754d WebCore`WebCore::ContainerNode::dispatchPostAttachCallbacks() + 189 at ContainerNode.cpp:813 frame #3: 0x00000001091e7400 WebCore`WebCore::ContainerNode::resumePostAttachCallbacks(document=0x00007f95b1862000) + 64 at ContainerNode.cpp:780 frame #4: 0x000000010916eb08 WebCore`WebCore::PostAttachCallbackDisabler::~PostAttachCallbackDisabler(this=0x00007fff58c0cdf0) + 24 at Element.h:825 frame #5: 0x000000010916eae5 WebCore`WebCore::PostAttachCallbackDisabler::~PostAttachCallbackDisabler(this=0x00007fff58c0cdf0) + 21 at Element.h:824 frame #6: 0x000000010948cd0e WebCore`WebCore::Document::recalcStyle(this=0x00007f95b1862000, change=NoChange) + 574 at Document.cpp:1774 frame #7: 0x000000010948946f WebCore`WebCore::Document::updateStyleIfNeeded(this=0x00007f95b1862000) + 431 at Document.cpp:1809 frame #8: 0x000000010978e3af WebCore`WebCore::FrameView::updateLayoutAndStyleIfNeededRecursive(this=0x00007f95b1520960) + 47 at FrameView.cpp:3722 frame #9: 0x000000010897bb00 WebKit`-[WebHTMLView(self=0x00007f95b1522370, _cmd=0x00007fff919a94f6) _web_updateLayoutAndStyleIfNeededRecursive] + 112 at WebHTMLView.mm:5970 frame #10: 0x0000000108964842 WebKit`-[WebHTMLView(self=0x00007f95b1522370, _cmd=0x00007fff8efce8a3) viewWillDraw] + 98 at WebHTMLView.mm:1420 frame #11: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126 frame #12: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126 frame #13: 0x00007fff8e6e7774 AppKit`-[NSScrollView viewWillDraw] + 51 frame #14: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126 frame #15: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126 frame #16: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126 frame #17: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126 frame #18: 0x00007fff8e6e5f14 AppKit`-[NSView _sendViewWillDrawInRect:clipRootView:] + 1423 frame #19: 0x00007fff8e6c83a7 AppKit`-[NSView displayIfNeeded] + 1021 frame #20: 0x000000010701c35b DumpRenderTree`-[FrameLoadDelegate webView:didFinishLoadForFrame:](self=0x00007f95b1416b20, _cmd=0x00007fff8effa614, sender=0x00007f95b143cd90, frame=0x00007f95b15007d0) + 443 at FrameLoadDelegate.mm:269 frame #21: 0x0000000108902d8e WebKit`objc_object* wtfCallIMP<objc_object*, WebView*, objc_object*>(implementation=0x000000010701c1a0, target=0x00007f95b1416b20, selector=0x00007fff8effa614, arguments=0x00007f95b143cd90, arguments=0x00007f95b15007d0)(objc_object*, objc_selector*, ...), objc_object*, objc_selector, WebView*, objc_object*) + 62 at ObjcRuntimeExtras.h:44 frame #22: 0x0000000108900e60 WebKit`CallDelegate(implementation=0x000000010701c1a0, self=0x00007f95b143cd90, delegate=0x00007f95b1416b20, selector=0x00007fff8effa614, object=0x00007f95b15007d0)(objc_object*, objc_selector*, ...), WebView*, objc_object*, objc_selector, objc_object*) + 80 at WebDelegateImplementationCaching.mm:595 frame #23: 0x0000000108900dfd WebKit`CallFrameLoadDelegate(implementation=0x000000010701c1a0, self=0x00007f95b143cd90, selector=0x00007fff8effa614, object=0x00007f95b15007d0)(objc_object*, objc_selector*, ...), WebView*, objc_selector, objc_object*) + 77 at WebDelegateImplementationCaching.mm:1148 frame #24: 0x000000010892976c WebKit`WebFrameLoaderClient::dispatchDidFinishLoad(this=0x00007f95b143c100) + 300 at WebFrameLoaderClient.mm:859 frame #25: 0x000000010975648f WebCore`WebCore::FrameLoader::checkLoadCompleteForThisFrame(this=0x00007f95b143c828) + 1711 at FrameLoader.cpp:2302 frame #26: 0x000000010974f004 WebCore`WebCore::FrameLoader::checkLoadComplete(this=0x00007f95b143c828) + 324 at FrameLoader.cpp:2474
Antti Koivisto
Comment 17 2014-01-07 13:59:21 PST
Alexey Proskuryakov
Comment 18 2014-01-07 15:37:33 PST
EWS couldn't fully process this patch yet because of flakiness in media tests, but looks like it introduces a failure: Regressions: Unexpected text-only failures (1) fast/overflow/overflow-height-float-not-removed-crash3.html [ Failure ]
Build Bot
Comment 19 2014-01-07 15:38:05 PST
Comment on attachment 220546 [details] patch Attachment 220546 [details] did not pass mac-ews (mac): Output: http://webkit-queues.appspot.com/results/5753116477620224 New failing tests: fast/overflow/overflow-height-float-not-removed-crash3.html
Build Bot
Comment 20 2014-01-07 15:38:09 PST
Created attachment 220565 [details] Archive of layout-test-results from webkit-ews-06 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-06 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Alexey Proskuryakov
Comment 21 2014-01-07 15:39:27 PST
Good timing :) It wasted a lot of time due to flakiness though: Fail 0 minutes ago Patch does not pass tests [results] 10 minutes ago Patch does not pass tests [results] 18 minutes ago Patch does not pass tests [results] 1 hour, 8 minutes ago Patch does not pass tests [results] 1 hour, 17 minutes ago
Build Bot
Comment 22 2014-01-07 19:51:59 PST
Comment on attachment 220546 [details] patch Attachment 220546 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.appspot.com/results/5413760239927296 New failing tests: fast/overflow/overflow-height-float-not-removed-crash3.html
Build Bot
Comment 23 2014-01-07 19:52:02 PST
Created attachment 220584 [details] Archive of layout-test-results from webkit-ews-15 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-15 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Antti Koivisto
Comment 24 2014-01-07 23:29:43 PST
Note You need to log in before you can comment on or make changes to this bug.