WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
100686
WebKitTestRunner flakily hangs when running pixel tests on Qt
https://bugs.webkit.org/show_bug.cgi?id=100686
Summary
WebKitTestRunner flakily hangs when running pixel tests on Qt
Balazs Kelemen
Reported
2012-10-29 10:11:42 PDT
Well, this is really just a try to catch the issue that makes the pixel tests timing out on the bot sometimes. My theory is that if a forced repaint times out, than it can trigger later when we expect it for a new test and it can confuse our machinery. Actually I could not prove that this is happening becase I could not reproduce the problem locally, but it cannot hurt if we add a check for this.
Attachments
Patch
(2.64 KB, patch)
2012-10-29 10:18 PDT
,
Balazs Kelemen
no flags
Details
Formatted Diff
Diff
Patch
(8.00 KB, patch)
2012-11-06 07:23 PST
,
Balazs Kelemen
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Balazs Kelemen
Comment 1
2012-10-29 10:18:58 PDT
Created
attachment 171276
[details]
Patch
WebKit Review Bot
Comment 2
2012-10-29 10:20:36 PDT
Attachment 171276
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Tools/ChangeLog', u'Tools/WebKitTestRunner..." exit_code: 1 Tools/WebKitTestRunner/qt/TestInvocationQt.cpp:75: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] Total errors found: 1 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
Balazs Kelemen
Comment 3
2012-10-29 10:23:03 PDT
(In reply to
comment #2
)
>
Attachment 171276
[details]
did not pass style-queue: > > Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Tools/ChangeLog', u'Tools/WebKitTestRunner..." exit_code: 1 > Tools/WebKitTestRunner/qt/TestInvocationQt.cpp:75: Weird number of spaces at line-start. Are you using a 4-space indent? [whitespace/indent] [3] > Total errors found: 1 in 2 files > > > If any of these errors are false positives, please file a bug against check-webkit-style.
check-webkit-style is complaining here: fputs("ERROR: unexpected forceRepaintDone callback triggered.\n" " It is likely the result of a timeouted repaint form a previous test.\n", stderr); So it expects the second line to be indented one space less, but I think it would be ugly.
Balazs Kelemen
Comment 4
2012-10-31 02:51:25 PDT
Comment on
attachment 171276
[details]
Patch According to my testing it will not be enough to eliminate the flakiness.
Balazs Kelemen
Comment 5
2012-10-31 03:13:51 PDT
Let's concentrate on the real bug! As one can see from the history of the bot (
http://build.webkit.sed.hu/builders/x86-64%20Linux%20Qt%20Release%20WebKit2%20%28Pixel%20Tests%29
) our pixel tests are sometimes start to fail at a given point and than every test is failing, i.e. we timeout waiting for the callback of forceRepaint. It's very bad since in this case the bot have to be stopped manually, otherwise it can run for a whole day. I don't know the exact reason, but what I can tell is that for some reason we don't get the didrenderframe from the web process although it does send it, so we will never push a flushLayerChange operation in the queue of LayerTreeRenderer and never send rendernextframe again and the web process will wait for it forever without rendering a single frame. As I am not too sure how difficult it would be to find the reason behind not getting the message (note that you need to run all tests about 10 times in average even to reproduce it), currently I am thinking about adding yet another private API for the time being for forcing a flushLayerChange. At the first time we timeout we can use it and I believe it would move us towards for the next test. Now I am going to test whether it works.
Jocelyn Turcotte
Comment 6
2012-10-31 05:59:01 PDT
It's best if we can fix this without blindly pushing fixes that add unneeded complexity until it works. If you can reproduce the problem somehow locally or on a bot without having to push changes, you could try to see if this is a result of Qt triggering a cleanup of the nodes, i.e. if ~ContentsSGNode gets called. There could be issues around there where the web and UI processes would be waiting for each other.
Balazs Kelemen
Comment 7
2012-11-05 01:16:08 PST
(In reply to
comment #6
)
> It's best if we can fix this without blindly pushing fixes that add unneeded complexity until it works. If you can reproduce the problem somehow locally or on a bot without having to push changes, you could try to see if this is a result of Qt triggering a cleanup of the nodes, i.e. if ~ContentsSGNode gets called. >
I can reproduce it, about once a day :/ Actually the only thing I can do is to add fprintf(stderr,...)'s here and there and watch the output at of the test that timed out and some previous one.
> There could be issues around there where the web and UI processes would be waiting for each other.
As I understand, ~ContentsSGNode is handled fine, i.e. we throw away all queued update from the LayerTreeRenderer queue, and when we attached to a new node we send a rendernextframe (unconditionally). Ownerships are also seems ok. So I don't see anything that could stop the rendering and syncing loop. First we do an update upon window->show() that makes us send the first rendernextframe, and every didrenderframe triggers a flushLayerChanges in LayerTreeRenderer that also ends in rendernextframe. Anyway, I'm going to try reproducing digging more info via printf's further.
Balazs Kelemen
Comment 8
2012-11-06 07:23:45 PST
Created
attachment 172580
[details]
Patch
Balazs Kelemen
Comment 9
2012-11-06 07:27:18 PST
(In reply to
comment #8
)
> Created an attachment (id=172580) [details] > Patch
This is also a workaround, but I believe a more acceptable one :)
WebKit Review Bot
Comment 10
2012-11-06 11:27:58 PST
Comment on
attachment 172580
[details]
Patch
Attachment 172580
[details]
did not pass chromium-ews (chromium-xvfb): Output:
http://queues.webkit.org/results/14761048
New failing tests: inspector-protocol/debugger-terminate-dedicated-worker-while-paused.html
Csaba Osztrogonác
Comment 11
2012-11-07 07:52:17 PST
Comment on
attachment 172580
[details]
Patch LGTM, r=me. At least the following tests will pass if this flakiness occur.
Balazs Kelemen
Comment 12
2012-11-07 09:33:36 PST
Comment on
attachment 172580
[details]
Patch Clearing flags on attachment: 172580 Committed
r133768
: <
http://trac.webkit.org/changeset/133768
>
Balazs Kelemen
Comment 13
2012-11-07 09:33:41 PST
All reviewed patches have been landed. Closing bug.
Alexey Proskuryakov
Comment 14
2013-04-29 17:20:18 PDT
Comment on
attachment 172580
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=172580&action=review
> Tools/WebKitTestRunner/TestController.cpp:697 > - TestInvocation::dumpWebProcessUnresponsiveness("Failed to reset to consistent state before the first test"); > + m_currentInvocation->dumpWebProcessUnresponsiveness();
This change was not correct - there is no current invocation at this point yet. So, WebKitTestRunner just crashes when a debugger is attached to WebProcess before the first resetStateToConsistentValues().
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug