When running $ run-webkit-tests -2 --efl --release fast/block on my Ubuntu 12.04 Installation, on a Dell XPS 3800 with GL configuration as follows: OpenGL vendor string: Intel Open Source Technology Center OpenGL renderer string: Mesa DRI Intel(R) Sandybridge Desktop OpenGL version string: 3.0 Mesa 9.0 OpenGL shading language version string: 1.30 Tests start to stall after about 8 seconds, or 320 test cases. And I start to see "Timed out waiting for repaint" in the test runner dumps of the single runs.
I should mention, this is with subpixel layout enabled. Will try without.
(In reply to comment #1) > I should mention, this is with subpixel layout enabled. Will try without. That doesn't make a difference, current trunk shows the same issues, in Debug as well as Release. That's on an Ubuntu 12.10 64bit system.
Mhm, much easier, just: $ run-webkit-tests -2 --efl --debug fast/block/abspos-child-container-changes-from-relative-to-static.html fails to repaint for me.
When running fast/block/abspos-child-container-changes-from-relative-to-static.html WebKitTestRunner forces a repaint on the WebProcess but we end up in a deadlock and WTR detects a repaint failure in dumpPixelsAndCompareWithExpected. As far as I can see it, this happens because, first CoordinatedLayerTreeHost::forceRepaintAsync is called, which then calls scheduleLayerFlush() - after this timer fires: performScheduledLayerFlush() is called, but here: m_waitingForUIProcess is true and the method returns early, while the UIProcess is waiting for the WebProcess.
Created attachment 182751 [details] Patch
FYI, I could not reproduce the issue on my machine. I tried: Tools/Scripts/run-webkit-tests --debug --efl -2 fast/block/abspos-child-container-changes-from-relative-to-static.html
(In reply to comment #6) > FYI, I could not reproduce the issue on my machine. I tried: > Tools/Scripts/run-webkit-tests --debug --efl -2 fast/block/abspos-child-container-changes-from-relative-to-static.html It was 100% reproducible on my machine - maybe timing differences in Ubuntu 12.10 vs. 12.04.
I think we should use [CoordinatedGraphics] tag instead of [EFL]. Otherwise, the patch makes sense to me.
Created attachment 182755 [details] Patch v2, updated bug title.
Comment on attachment 182755 [details] Patch v2, updated bug title. This is fine with me but I don't get to review patches in WebKit2 anymore. Please find a WebKit2 owner.
Created attachment 182772 [details] Patch v3, grammar in ChangeLog fixed.
(In reply to comment #10) > (From update of attachment 182755 [details]) > This is fine with me but I don't get to review patches in WebKit2 anymore. Please find a WebKit2 owner. Doesn't it fit into the "trivial" category? http://lists.webkit.org/pipermail/webkit-dev/2013-January/023263.html
Comment on attachment 182772 [details] Patch v3, grammar in ChangeLog fixed. Attachment 182772 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/15841996 New failing tests: inspector-protocol/debugger-terminate-dedicated-worker-while-paused.html
(In reply to comment #13) > (From update of attachment 182772 [details]) > Attachment 182772 [details] did not pass chromium-ews (chromium-xvfb): > Output: http://queues.webkit.org/results/15841996 > > New failing tests: > inspector-protocol/debugger-terminate-dedicated-worker-while-paused.html I think that's an artifact, I am changing WebKit2 code that's not run by Chromium tests and the previous two EWS runs showed green.
Comment on attachment 182772 [details] Patch v3, grammar in ChangeLog fixed. Sure, if No'am says okay :)
(In reply to comment #15) > (From update of attachment 182772 [details]) > Sure, if No'am says okay :) Thanks, Benjamin.
Comment on attachment 182772 [details] Patch v3, grammar in ChangeLog fixed. Clearing flags on attachment: 182772 Committed r139781: <http://trac.webkit.org/changeset/139781>
All reviewed patches have been landed. Closing bug.
I still needed http://cgit.freedesktop.org/xorg/xserver/patch/?id=65642ccb78aa2d4c4e17b9ac42e4ef625c4a6e8b to my xvfb package in order to be able to generate pixel results.
(In reply to comment #19) > I still needed http://cgit.freedesktop.org/xorg/xserver/patch/?id=65642ccb78aa2d4c4e17b9ac42e4ef625c4a6e8b to my xvfb package in order to be able to generate pixel results. Has this bug been reported against Ubuntu so that they consider applying the patch to their package?
(In reply to comment #20) > (In reply to comment #19) > > I still needed http://cgit.freedesktop.org/xorg/xserver/patch/?id=65642ccb78aa2d4c4e17b9ac42e4ef625c4a6e8b to my xvfb package in order to be able to generate pixel results. > > Has this bug been reported against Ubuntu so that they consider applying the patch to their package? Yes, just now, https://bugs.launchpad.net/ubuntu/+source/xorg-server/+bug/1100312
(In reply to comment #4) > When running fast/block/abspos-child-container-changes-from-relative-to-static.html WebKitTestRunner forces a repaint on the WebProcess but we end up in a deadlock and WTR detects a repaint failure in dumpPixelsAndCompareWithExpected. > > As far as I can see it, this happens because, first > CoordinatedLayerTreeHost::forceRepaintAsync is called, > which then calls scheduleLayerFlush() - after this timer fires: > performScheduledLayerFlush() is called, but here: > m_waitingForUIProcess is true and the method returns early, while the UIProcess is waiting for the WebProcess. This timeout was happening on other ports like Qt?
I believe this is a wrong fix for a EFL specific problem. UIProcess can expect the callback to arrive at some time, but it should never really wait for it (on CoordinatedGraphics). UIProcess has to keep up with its main loop so things like "viewNeedsDisplay" will be processed and then unlock the WebProcess to do the render of next frame. Once the next frame is rendered and the flush happens on WebProcess side, then the callback will be sent back to UIProcess. This is the only way to guarantee that the rendered frame is the one after the forceRepaint was requested. We've been observing some flakyness on our API tests after this patch and then I've observed TestController::platformRunUntil of EFL: if (timeout == m_noTimeout) { // Never timeout if we are debugging or not meant to timeout. while (!condition) { ecore_main_loop_iterate(); sleep(1); } return; } This sleep seems to be the source of the problem. You should keep up the main loop running in a non-blocking way, like Qt does. Could we rollout this change or make the proper fix soon?
I think you are right. Dominik?
(In reply to comment #24) > I think you are right. > Dominik? Agree that it has a different root cause. I can look into rolling out and doing an EFL fix later on today - thanks Rafael for pointing this out. If you wanna roll out earlier, feel free to go ahead and CC me on the rollout patch.
(In reply to comment #23) > I believe this is a wrong fix for a EFL specific problem. UIProcess can expect the callback to arrive at some time, but it should never really wait for it (on CoordinatedGraphics). UIProcess has to keep up with its main loop so things like "viewNeedsDisplay" will be processed and then unlock the WebProcess to do the render of next frame. Once the next frame is rendered and the flush happens on WebProcess side, then the callback will be sent back to UIProcess. This is the only way to guarantee that the rendered frame is the one after the forceRepaint was requested. > > We've been observing some flakyness on our API tests after this patch and then I've observed TestController::platformRunUntil of EFL: > > if (timeout == m_noTimeout) { > // Never timeout if we are debugging or not meant to timeout. > while (!condition) { > ecore_main_loop_iterate(); > sleep(1); > } > return; > } > > This sleep seems to be the source of the problem. You should keep up the main loop running in a non-blocking way, like Qt does. Could we rollout this change or make the proper fix soon? I'm removing the sleep() instruction in Bug 107442. I would like this fix to land before we roll out.
Re-opened since this is blocked by bug 107443
(In reply to comment #23) > This sleep seems to be the source of the problem. You should keep up the main loop running in a non-blocking way, like Qt does. Could we rollout this change or make the proper fix soon? Very nice catch!
Closing since it should be fixed by Bug 107442 (dependency).