Bug 126432 - REGRESSION (r161195): Acid2 regression tests frequently fail
Summary: REGRESSION (r161195): Acid2 regression tests frequently fail
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P1 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2014-01-03 09:42 PST by Alexey Proskuryakov
Modified: 2014-01-07 23:29 PST (History)
9 users (show)

See Also:


Attachments
patch (3.00 KB, patch)
2014-01-07 13:59 PST, Antti Koivisto
andersca: review+
buildbot: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from webkit-ews-06 for mac-mountainlion (540.42 KB, application/zip)
2014-01-07 15:38 PST, Build Bot
no flags Details
Archive of layout-test-results from webkit-ews-15 for mac-mountainlion-wk2 (454.94 KB, application/zip)
2014-01-07 19:52 PST, Build Bot
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Alexey Proskuryakov 2014-01-03 09:42:44 PST
fast/css/acid2-pixel.html and fast/css/acid2.html started to frequently fail on January 1st. Looks like OBJECT content fails to load:

@@ -58,9 +58,9 @@
 layer at (84,2710) size 144x24
   RenderBlock (positioned) {DIV} at (48,72) size 144x24 [bgcolor=#FF0000]
     RenderBlock {DIV} at (0,0) size 144x0
-      RenderInline {OBJECT} at (0,0) size 131x14
-        RenderInline {OBJECT} at (0,0) size 131x14
-          RenderImage {OBJECT} at (13,0) size 131x24 [border: none (12px solid #000000) none]
+      RenderInline {OBJECT} at (0,0) size 35x14
+        RenderInline {OBJECT} at (0,0) size 35x14
+          RenderImage {OBJECT} at (109,24) size 35x0 [border: none (12px solid #000000) none]
     RenderBlock (floating) {DIV} at (0,0) size 144x24 [border: none (12px solid #FF0000) none (12px solid #000000)]
     RenderBlock {DIV} at (0,0) size 144x24 [border: none (24px solid #FFFF00)]
 layer at (84,2782) size 144x24

http://build.webkit.org/results/Apple%20Mavericks%20Debug%20WK2%20(Tests)/r161259%20(1313)/results.html
Comment 1 Alexey Proskuryakov 2014-01-03 09:42:59 PST
<rdar://problem/15744279>
Comment 2 Andreas Kling 2014-01-03 10:44:40 PST
My guess would be <http://trac.webkit.org/changeset/161208>
Comment 3 Andreas Kling 2014-01-03 10:46:05 PST
(In reply to comment #2)
> My guess would be <http://trac.webkit.org/changeset/161208>

Hrm. Or maybe <http://trac.webkit.org/changeset/161195>

Antti?
Comment 4 Alexey Proskuryakov 2014-01-03 10:50:08 PST
Unless I'm fooled by time zones, r161208 was after this started, so r161195 seems more likely.
Comment 5 Antti Koivisto 2014-01-04 09:19:26 PST
Surely r161195 as it made plugin loads fail asynchronously in all cases. Acid2 render tree snapshot is probably just taken before things have truly settled.
Comment 6 Alexey Proskuryakov 2014-01-06 09:32:10 PST
Antti, what is the next step here? Marking Acid2 as flaky would be a silly thing to do, as this basically means that we ignore its results unless it starts to crash.

This has been known for several days now, with no action. Rollout time?
Comment 7 Antti Koivisto 2014-01-06 10:01:08 PST
(In reply to comment #6)
> Antti, what is the next step here? Marking Acid2 as flaky would be a silly thing to do, as this basically means that we ignore its results unless it starts to crash.
> 
> This has been known for several days now, with no action. Rollout time?

Probably just need to change the snapshot timings either by waitUntilDone or changes to the test framework. I doubt synchronous failures are important for web compatibility. I'll take a look tomorrow when I'm back to work. Please don't roll out anything.
Comment 8 Alexey Proskuryakov 2014-01-06 14:10:38 PST
> Probably just need to change the snapshot timings either by waitUntilDone or changes to the test framework. I doubt synchronous failures are important for web compatibility. I'll take a look tomorrow when I'm back to work.

Thank you for letting me know that you are not going to work on this until tomorrow. I marked the tests as flaky in <http://trac.webkit.org/r161371>.

It would have helped if you could state it earlier that you did not intend to do anything about the regressions for several days.

> Please don't roll out anything.

Why?

There needs to be a specific reason to keep brokenness in the tree. This situation is an example of why immediate rollout is almost always best - I should have done that three days ago, so that we wouldn't have been left in a broken state for so long.
Comment 9 Antti Koivisto 2014-01-07 08:18:58 PST
(In reply to comment #8)
> Why?
> 
> There needs to be a specific reason to keep brokenness in the tree. This situation is an example of why immediate rollout is almost always best - I should have done that three days ago, so that we wouldn't have been left in a broken state for so long.

- The problem does not reproduce locally so I'll have to try landing fixes for bots to test. Repeatedly relanding the original creates churn and confusing history. It may break other peoples work-in-progress patches.
- One ~5-10% flaky test doesn't really affect anyone else's work much.
Comment 10 Alexey Proskuryakov 2014-01-07 08:44:38 PST
> > There needs to be a specific reason to keep brokenness in the tree.

> - The problem does not reproduce locally so I'll have to try landing fixes for bots to test. 

This is one specific reason that could validate keeping the tree in bad state while fixing it. It also almost certainly doesn't apply here - the problem is well localized, and should be easy to attack locally. Some ideas:

- run the tests many times in fully parallel mode, forcing heavy CPU use;

- change the delay from 0 to some positive value.

> Repeatedly remanding the original creates churn and confusing history. It may break other peoples work-in-progress patches.

These general reasons are valid, but quite weak. Any and all changes landed can break others' work in progress patches, and that's not a reason to halt all work. Also, the faster you roll out, the less likely it is to affect anyone.

> - One ~5-10% flaky test doesn't really affect anyone else's work much.

This is both incorrect, and not applicable in this case.

1. In this case, there are 4 flaky tests, each with a much higher rate of failure.

2. There are dozens engineers on the project, and if everyone felt entitled to cause 5-10% flakiness on tests, it would not be a project worth working on.

3. Flaky tests slow down EWS a lot, and there really isn't a way around that. When we only have one test with 5% failure rate, that means that 5% of patches take at least 3x time as long to be processed.
Comment 11 Antti Koivisto 2014-01-07 09:13:24 PST
(In reply to comment #10)
> This is one specific reason that could validate keeping the tree in bad state while fixing it. It also almost certainly doesn't apply here - the problem is well localized, and should be easy to attack locally. Some ideas:

It is easy to declare something easy. How about actual local repro steps?

> - run the tests many times in fully parallel mode, forcing heavy CPU use;

I tried run-webkit-tests --debug fast/css/ --iterations=1000.

> - change the delay from 0 to some positive value.

Which delay?

> 1. In this case, there are 4 flaky tests, each with a much higher rate of failure.

The bug mentions two copies of acid2. What are the other two?
Comment 12 Antti Koivisto 2014-01-07 10:23:31 PST
Actually I have a trivial repro

run-webkit-tests --debug fast/css/acid2.html -2

The bug is (at least almost) wk2 only.
Comment 13 Antti Koivisto 2014-01-07 10:33:24 PST
Almost. Two WK1 bots have flaked once over last 5 days if I read this correctly but it is the same false result.
Comment 14 Alexey Proskuryakov 2014-01-07 10:40:16 PST
It is slightly easier to reproduce with WebKit2, but WebKit1 is also affected. I believe you now saw that on flakiness dashboard (but there are also failures on http versions of the tests, so it's more than "once").

On a Mac Pro, I could reproduce the WebKit1 failure very reliably by running this command in two terminal windows at once to create high CPU load. Likely easier on a notebook:

run-webkit-tests fast/css/acid2-pixel.html --repeat-each 10000 -f

So yes, there are many tasks in WebKit development that are much harder than reproducing a failure on an isolated test, no need to be snarky about it.

> > - change the delay from 0 to some positive value.
> 
> Which delay?

Your patch made object load failing asynchronous, i.e. use a zero delay timer. You could change the timer to a non-zero one to simulate stress conditions. This is a fairly common technique.

> > 1. In this case, there are 4 flaky tests, each with a much higher rate of failure.
> 
> The bug mentions two copies of acid2. What are the other two?

Those are http versions of the same tests.

You could see that by looking at <http://trac.webkit.org/r161371> that's mentioned in comment 8. Another way to find put what's going on is to check the flakiness dashboard for "acid2": <http://webkit-test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=acid2&revision=153686>.
Comment 15 Antti Koivisto 2014-01-07 11:00:16 PST
> Your patch made object load failing asynchronous, i.e. use a zero delay timer. You could change the timer to a non-zero one to simulate stress conditions. This is a fairly common technique.

That's a good idea and this indeed makes it repro consistently with WK1:

--- Source/WebCore/dom/Document.cpp	(revision 161419)
+++ Source/WebCore/dom/Document.cpp	(working copy)
@@ -1682,7 +1682,7 @@
     // FIXME: Why on earth is this here? This is clearly misplaced.
     invalidateAccessKeyMap();
     
-    m_styleRecalcTimer.startOneShot(0);
+    m_styleRecalcTimer.startOneShot(0.1);
 
     InspectorInstrumentation::didScheduleStyleRecalculation(this);
 }

I thought you were talking about some DRT option, sorry about that.
Comment 16 Antti Koivisto 2014-01-07 12:34:36 PST
Here is the problem. We get WebFrameLoaderClient::dispatchDidFinishLoad (because at that point we have nothing to load) and that triggers plugin fallback load via DRT. The same call also dumps the render tree before the new load completes.

(lldb) bt
* thread #1: tid = 0x45b45b, 0x00000001099a5160 WebCore`WebCore::HTMLPlugInImageElement::startLoadingImage(this=0x00007f95b14a3610) + 16 at HTMLPlugInImageElement.cpp:334, queue = 'com.apple.main-thread, stop reason = breakpoint 25.1
    frame #0: 0x00000001099a5160 WebCore`WebCore::HTMLPlugInImageElement::startLoadingImage(this=0x00007f95b14a3610) + 16 at HTMLPlugInImageElement.cpp:334
    frame #1: 0x00000001099a4e30 WebCore`WebCore::HTMLPlugInImageElement::startLoadingImageCallback(node=0x00007f95b14a3610, =0) + 32 at HTMLPlugInImageElement.cpp:341
    frame #2: 0x00000001091e754d WebCore`WebCore::ContainerNode::dispatchPostAttachCallbacks() + 189 at ContainerNode.cpp:813
    frame #3: 0x00000001091e7400 WebCore`WebCore::ContainerNode::resumePostAttachCallbacks(document=0x00007f95b1862000) + 64 at ContainerNode.cpp:780
    frame #4: 0x000000010916eb08 WebCore`WebCore::PostAttachCallbackDisabler::~PostAttachCallbackDisabler(this=0x00007fff58c0cdf0) + 24 at Element.h:825
    frame #5: 0x000000010916eae5 WebCore`WebCore::PostAttachCallbackDisabler::~PostAttachCallbackDisabler(this=0x00007fff58c0cdf0) + 21 at Element.h:824
    frame #6: 0x000000010948cd0e WebCore`WebCore::Document::recalcStyle(this=0x00007f95b1862000, change=NoChange) + 574 at Document.cpp:1774
    frame #7: 0x000000010948946f WebCore`WebCore::Document::updateStyleIfNeeded(this=0x00007f95b1862000) + 431 at Document.cpp:1809
    frame #8: 0x000000010978e3af WebCore`WebCore::FrameView::updateLayoutAndStyleIfNeededRecursive(this=0x00007f95b1520960) + 47 at FrameView.cpp:3722
    frame #9: 0x000000010897bb00 WebKit`-[WebHTMLView(self=0x00007f95b1522370, _cmd=0x00007fff919a94f6) _web_updateLayoutAndStyleIfNeededRecursive] + 112 at WebHTMLView.mm:5970
    frame #10: 0x0000000108964842 WebKit`-[WebHTMLView(self=0x00007f95b1522370, _cmd=0x00007fff8efce8a3) viewWillDraw] + 98 at WebHTMLView.mm:1420
    frame #11: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126
    frame #12: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126
    frame #13: 0x00007fff8e6e7774 AppKit`-[NSScrollView viewWillDraw] + 51
    frame #14: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126
    frame #15: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126
    frame #16: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126
    frame #17: 0x00007fff8e6e72a1 AppKit`-[NSView viewWillDraw] + 1126
    frame #18: 0x00007fff8e6e5f14 AppKit`-[NSView _sendViewWillDrawInRect:clipRootView:] + 1423
    frame #19: 0x00007fff8e6c83a7 AppKit`-[NSView displayIfNeeded] + 1021
    frame #20: 0x000000010701c35b DumpRenderTree`-[FrameLoadDelegate webView:didFinishLoadForFrame:](self=0x00007f95b1416b20, _cmd=0x00007fff8effa614, sender=0x00007f95b143cd90, frame=0x00007f95b15007d0) + 443 at FrameLoadDelegate.mm:269
    frame #21: 0x0000000108902d8e WebKit`objc_object* wtfCallIMP<objc_object*, WebView*, objc_object*>(implementation=0x000000010701c1a0, target=0x00007f95b1416b20, selector=0x00007fff8effa614, arguments=0x00007f95b143cd90, arguments=0x00007f95b15007d0)(objc_object*, objc_selector*, ...), objc_object*, objc_selector, WebView*, objc_object*) + 62 at ObjcRuntimeExtras.h:44
    frame #22: 0x0000000108900e60 WebKit`CallDelegate(implementation=0x000000010701c1a0, self=0x00007f95b143cd90, delegate=0x00007f95b1416b20, selector=0x00007fff8effa614, object=0x00007f95b15007d0)(objc_object*, objc_selector*, ...), WebView*, objc_object*, objc_selector, objc_object*) + 80 at WebDelegateImplementationCaching.mm:595
    frame #23: 0x0000000108900dfd WebKit`CallFrameLoadDelegate(implementation=0x000000010701c1a0, self=0x00007f95b143cd90, selector=0x00007fff8effa614, object=0x00007f95b15007d0)(objc_object*, objc_selector*, ...), WebView*, objc_selector, objc_object*) + 77 at WebDelegateImplementationCaching.mm:1148
    frame #24: 0x000000010892976c WebKit`WebFrameLoaderClient::dispatchDidFinishLoad(this=0x00007f95b143c100) + 300 at WebFrameLoaderClient.mm:859
    frame #25: 0x000000010975648f WebCore`WebCore::FrameLoader::checkLoadCompleteForThisFrame(this=0x00007f95b143c828) + 1711 at FrameLoader.cpp:2302
    frame #26: 0x000000010974f004 WebCore`WebCore::FrameLoader::checkLoadComplete(this=0x00007f95b143c828) + 324 at FrameLoader.cpp:2474
Comment 17 Antti Koivisto 2014-01-07 13:59:21 PST
Created attachment 220546 [details]
patch
Comment 18 Alexey Proskuryakov 2014-01-07 15:37:33 PST
EWS couldn't fully process this patch yet because of flakiness in media tests, but looks like it introduces a failure:

Regressions: Unexpected text-only failures (1)
  fast/overflow/overflow-height-float-not-removed-crash3.html [ Failure ]
Comment 19 Build Bot 2014-01-07 15:38:05 PST
Comment on attachment 220546 [details]
patch

Attachment 220546 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.appspot.com/results/5753116477620224

New failing tests:
fast/overflow/overflow-height-float-not-removed-crash3.html
Comment 20 Build Bot 2014-01-07 15:38:09 PST
Created attachment 220565 [details]
Archive of layout-test-results from webkit-ews-06 for mac-mountainlion

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: webkit-ews-06  Port: mac-mountainlion  Platform: Mac OS X 10.8.5
Comment 21 Alexey Proskuryakov 2014-01-07 15:39:27 PST
Good timing :)

It wasted a lot of time due to flakiness though:

Fail 0 minutes ago
Patch does not pass tests [results] 10 minutes ago
Patch does not pass tests [results] 18 minutes ago
Patch does not pass tests [results] 1 hour, 8 minutes ago
Patch does not pass tests [results] 1 hour, 17 minutes ago
Comment 22 Build Bot 2014-01-07 19:51:59 PST
Comment on attachment 220546 [details]
patch

Attachment 220546 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/5413760239927296

New failing tests:
fast/overflow/overflow-height-float-not-removed-crash3.html
Comment 23 Build Bot 2014-01-07 19:52:02 PST
Created attachment 220584 [details]
Archive of layout-test-results from webkit-ews-15 for mac-mountainlion-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: webkit-ews-15  Port: mac-mountainlion-wk2  Platform: Mac OS X 10.8.5
Comment 24 Antti Koivisto 2014-01-07 23:29:43 PST
https://trac.webkit.org/r161484