WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 144155
fast/frames/flattening/iframe-flattening-resize-event-count.html times out on Yosemite WK2
https://bugs.webkit.org/show_bug.cgi?id=144155
Summary
fast/frames/flattening/iframe-flattening-resize-event-count.html times out on...
Alexey Proskuryakov
Reported
2015-04-24 11:12:48 PDT
fast/frames/flattening/iframe-flattening-resize-event-count.html very frequently times out on Yosemite WK2 bot. It's quite fast when it doesn't time out.
Attachments
add some logging
(2.45 KB, patch)
2015-04-24 11:52 PDT
,
Alexey Proskuryakov
andersca
: review+
Details
Formatted Diff
Diff
disable App Nap
(1.58 KB, patch)
2015-04-27 12:14 PDT
,
Alexey Proskuryakov
thorton
: review+
Details
Formatted Diff
Diff
unhide the window between tests
(1.49 KB, patch)
2015-04-29 00:46 PDT
,
Alexey Proskuryakov
kling
: review+
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
Alexey Proskuryakov
Comment 1
2015-04-24 11:52:31 PDT
Created
attachment 251563
[details]
add some logging
Alexey Proskuryakov
Comment 2
2015-04-24 11:56:52 PDT
Landed in
r183272
. This can't fix the issue though.
Alexey Proskuryakov
Comment 3
2015-04-25 12:25:57 PDT
The logging is interesting! Every time the test fails, it only fires the timer exactly 7 times instead of 100. This makes me suspect that nested timer throttling goes wrong, and stops the timer completely. Note that the new 20-second watchdog timer fires anyway, so the process is not frozen, only the 0-delay timer is. I cannot reproduce locally yet.
Alexey Proskuryakov
Comment 4
2015-04-25 13:21:57 PDT
Added more temporary logging for timers in
r183312
.
Alexey Proskuryakov
Comment 5
2015-04-25 19:25:19 PDT
The logging says: CONSOLE MESSAGE: line 36: Test timed out. Resized 7 times, got 7 events. CONSOLE MESSAGE: line 40: Timers were scheduled at: 37.944230949506164,556.3224649522454,1564.046156941913,2555.3537160158157,3563.0944420117885,4561.856752960011,20164.311336935498 When I test locally, and print the timing data even on success, the output is: CONSOLE MESSAGE: line 31: Timers were scheduled at: 125.79913299850887,128.9805999986129,141.93081700068433,146.540253997955,152.07373299926985,155.34578100050567,162.11933999875328,177.24060999898938,183.70886299817357,190.41601500066463,197.25199999811593,203.73337699857075,210.3919910005061,218.73386399965966,225.48055100196507,250.12134900316596,258.89310300408397,265.5821590014966,272.1203940018313,278.9613470013137,285.7946400035871,292.5488300024881,303.734835004434,310.7598240021616,319.68921900261194,326.54779400036205,335.85378299903823,342.9574070032686,353.57211700465996,360.1231889988412,370.0214600030449,377.03421500191325,387.11607600271236,400.0187960045878,407.9218039987609,416.8596780000371,423.3271519988193,429.7157410037471,435.7981600041967,442.50050999835366,454.3809640017571,461.1590260028606,470.320073000039,477.19982400303707,487.3991450003814,494.21547100064345,503.9252940041479,510.3242820041487,520.7717220037011,527.4011189976591,537.5356279982952,544.4351370024378,553.9757229998941,560.8028060014476,570.4738330023247,577.4468419986079,587.4084050010424,594.3376850045752,603.7952970000333,610.804048999853,620.8505680042435,627.6742940026452,637.7976470030262,644.5893909985898,653.4063369981595,660.498393997841,671.2844819994643,678.3023910029442,687.6440220003133,694.384695001645,703.5507379987394,710.1908699987689,721.123487004661,728.1277980000596,738.0604470017715,745.0514209995163,754.0727090017754,761.0254570026882,771.0709820021293,777.9118559992639,788.0212250020122,794.822378004028,804.5823670036043,811.569422003231,821.8013090008753,828.3847080019768,837.5335410019034,844.372765001026,854.4278589979513,861.4545880045625,871.3724330009427,878.4461210016161,887.7621720021125,894.8069830003078,904.3407959979959,911.3547460001428,921.1714140037657,928.0214230020647,938.36195699987 This means that when the test fails, the 0-delay timers first get throttled to fire once a second, and then stall completely until the 20-second watchdog timer fires. At the same time, WebCore doesn't think that the timers are ever throttled. Given that this happens on Yosemite only, and that WebCore doesn't know about the throttling, it smells like an issue below WebKit. At the same time, it is difficult to see how the 20-second timer could fire at exact time if the issue were below us - all these DOM timers are served by a single shared CF timer.
Alexey Proskuryakov
Comment 6
2015-04-25 20:20:06 PDT
Looks like the only way WebCore would clamp timers to 1 second is when it thinks that the timer causes a repaint of something that's not user visible (it won't do that when there is no repainting at all). And there seems to be no way that we'd clamp to 15 seconds, short of a timer heap bug.
Alexey Proskuryakov
Comment 7
2015-04-25 23:44:35 PDT
Found something interesting in system logs. Every time the test fails, there is this logging while it runs: Apr 25 18:55:31 bot185.apple.com WindowServer[147]: WSGetSurfaceInWindow : Invalid surface 734273305 for window 38598 Apr 25 18:55:32 --- last message repeated 3 times --- Apr 25 18:55:32 bot185.apple.com WindowServer[147]: WSGetSurfaceInWindow : Invalid surface 656102528 for window 38600 Apr 25 18:55:32 --- last message repeated 3 times --- And when the test passes, the same logging occurs slightly later.
Alexey Proskuryakov
Comment 8
2015-04-26 00:17:12 PDT
I can reproduce this logging by running fast/events/popup-allowed-from-gesture-initiated-event.html (and there is another test that also causes it). However, running it in parallel with fast/frames/flattening/iframe-flattening-resize-event-count.html doesn't reproduce the problem, so it's not directly related.
Alexey Proskuryakov
Comment 9
2015-04-27 12:14:59 PDT
Created
attachment 251766
[details]
disable App Nap Talked to Gavin, who suggested that this might be an effect of App Nap. The mechanism is not quite clear, but it certainly seems worth trying.
Alexey Proskuryakov
Comment 10
2015-04-27 13:36:06 PDT
Committed the patch for disabling App Nap in
r183418
, let's see what happens.
Alexey Proskuryakov
Comment 11
2015-04-27 17:34:43 PDT
Disabling App Nap made a difference, but the test is still flaky. We got rid of the super long stall, but the 0-delay timer is still taking 1 second to fire: +CONSOLE MESSAGE: line 40: Timers were scheduled at: 43.86900598183274,548.0330429272726,1556.697802967392,2549.8486949363723,3549.5337379397824,4556.125045986846,5548.013246967457,6556.058387970552,7554.3883389327675,8552.839554962702,9547.597624943592,10548.691916977987,11550.223996979184,12547.692796913907,13549.164323951118,14547.859265003353,15556.478225975297,16549.75908191409,17550.117639941163,18548.79687190987,19555.410598986782
Alexey Proskuryakov
Comment 12
2015-04-27 17:41:14 PDT
Oh, I wonder if 1 second is this: static double hiddenPageAlignmentInterval() { return 1.0; } // 1 second.
Alexey Proskuryakov
Comment 13
2015-04-27 23:34:45 PDT
Confirmed that WebKitTestRunner doesn't bring back the window if a preceding test calls testRunner.setPageVisibility("hidden") without a subsequent testRunner.resetPageVisibility() or testRunner.setPageVisibility("visible"). I couldn't find any direct evidence of this being the culprit in this case yet, but it seems like it has to be.
Darin Adler
Comment 14
2015-04-28 08:13:38 PDT
So does that mean we could fix this by adding code to TestController::resetStateToConsistentValues?
Alexey Proskuryakov
Comment 15
2015-04-28 10:40:19 PDT
Yes, probably just call TestController::setHidden(false). I'd like to look into a few things before that: - which test leaves the window hidden (couldn't find it yet); - how we get away with TestController::setHidden(false) calling -makeKeyAndOrderFront - why doesn't that steal focus? - how DumpRenderTree works (it doesn't seem to ever hide or unhide windows, maybe it doesn't test page visibility properly?)
Alexey Proskuryakov
Comment 16
2015-04-29 00:46:29 PDT
Created
attachment 251931
[details]
unhide the window between tests
> - which test leaves the window hidden (couldn't find it yet);
Still couldn't catch anything like this in action. If this patch doesn't help, my next theory is that the window is actually visible, but a CoreIPC issue akin to
bug 141122
confuses WebProcess into getting out of sync. That would be pretty much impossible to verify without a locally reproducible case. - how we get away with TestController::setHidden(false) calling -makeKeyAndOrderFront - why doesn't that steal focus? A mystery. I patched WKTR to not move the window offscreen, and ran a test that hides/shows the window in a loop. The window does appear and disappear, but it always stays below the currently active window, not stealing focus and not obscuring it. - how DumpRenderTree works (it doesn't seem to ever hide or unhide windows, maybe it doesn't test page visibility properly?) DumpRenderTree calls directly into WebKit to fool it about the current visibility state. [webView _setVisibilityState:WebPageVisibilityStateHidden isInitialState:NO];
Alexey Proskuryakov
Comment 17
2015-04-29 10:44:32 PDT
Committed the patch for unhiding the window in
r183558
, will keep watching.
Alexey Proskuryakov
Comment 18
2015-04-29 20:05:59 PDT
This is still happening, no change. Added more logging in
r183607
, it will now print out visibility state as WebProcess sees it.
Alexey Proskuryakov
Comment 19
2015-04-29 23:50:06 PDT
Strange. CONSOLE MESSAGE: line 40: Timers were scheduled at: 38.54244900867343,634.1223609633744,1630.6184619897977,2633.0562729854137,3635.5311119696125,4630.785787943751,5634.050983935595,6638.653066940606,7635.863823001273,8635.53286192473,9633.291222970001,10638.184183975682,11632.18399696052,12640.454213949852,13631.606793962419,14630.80570299644,15635.705446940847,16635.929689975455,17632.111273007467,18637.691960902885,19631.200731964782 CONSOLE MESSAGE: line 41: document.hidden = false; document.visibilityState = visible
Alexey Proskuryakov
Comment 20
2015-05-02 12:05:31 PDT
testRunner.setPageVisibility is somewhat broken in WebKitTestRunner, creating an opportunity for races. It is asynchronous, and there are tests that just call the function and terminate without waiting for a visibility change event, thus leaving the following test to handle the message in UI process. But once again, there is no indication that this is somehow breaking this particular test - in a failing run on the bot, no preceding tests call setPageVisibility.
Alexey Proskuryakov
Comment 21
2015-05-02 12:20:56 PDT
Added some debug-only logging to WebCore in
r183720
, will roll out as soon as there are results from the bots.
Alexey Proskuryakov
Comment 22
2015-05-03 00:09:51 PDT
Yes, there are spurious calls to setDOMTimerAlignmentInterval. They sometimes happen during this test, which makes it fail, and sometimes during other tests. This doesn't look like a race though. Will see if I can reproduce this slightly more broad condition locally.
Alexey Proskuryakov
Comment 23
2015-05-12 10:32:56 PDT
This is taking way too long to keep the bots red, marked as flaky in
r184205
.
Alexey Proskuryakov
Comment 24
2015-08-06 18:32:14 PDT
The remaining issues are quite likely fixed in
bug 147718
. Will keep an eye on the bots.
Alexey Proskuryakov
Comment 25
2015-08-28 15:58:23 PDT
fast/frames/flattening/iframe-flattening-resize-event-count.html hasn't failed since. So, these fixes plus
bug 147718
took care of it.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug