RESOLVED FIXED Bug 60881
Inspector may close at the start of the next inspector test in DRT
https://bugs.webkit.org/show_bug.cgi?id=60881
Summary Inspector may close at the start of the next inspector test in DRT
Csaba Osztrogonác
Reported 2011-05-16 05:23:21 PDT
After r86499 inspector/styles/styles-url-linkify.html fails on 32 bit debug mode: --- /ramdisk/qt-linux-32-debug/build/layout-test-results/inspector/styles/styles-url-linkify-expected.txt 2011-05-14 16:06:23.691172055 -0700 +++ /ramdisk/qt-linux-32-debug/build/layout-test-results/inspector/styles/styles-url-linkify-actual.txt 2011-05-14 16:06:23.691172055 -0700 @@ -1,19 +1,4 @@ +FAIL: Timed out waiting for notifyDone to be called Tests that URLs are linked to and completed correctly. Bugs 51663, 53171 -Partial URLs completed: -http://example.com/ -http://example.com/moo -https://secure.com/moo -https://secure.com/moo -http://example.com/moo -http://example.com/foo/zoo/moo -http://example.com/foo/boo/moo -http://example.com/moo -http://example.com/foo?a=b -http://example.com/foo?a=b -Link for a URI from CSS document: -inspector/styles/resources/fromcss.png -Link for a URI from iframe inline stylesheet: -inspector/styles/resources/iframed.png -
Attachments
patch (3.44 KB, patch)
2011-06-22 06:19 PDT, Zoltan Herczeg
pfeldman: review+
Csaba Osztrogonác
Comment 1 2011-05-16 05:48:11 PDT
Unfortunately inspector/styles/styles-url-linkify.html doesn't fail on its own, but if you run all tests.
Csaba Osztrogonác
Comment 2 2011-05-16 06:44:27 PDT
I can reproduce the fail with executing the 21 tests run after the last DumpRenderTree restart: inspector/profiler/heap-snapshot.html inspector/protocol/console-agent.html inspector/protocol/runtime-agent.html inspector/styles/get-set-stylesheet-text.html inspector/styles/metrics-box-sizing.html inspector/styles/parse-utf8-bom.html inspector/styles/styles-add-invalid-property.html inspector/styles/styles-cancel-editing.html inspector/styles/styles-commit-editing.html inspector/styles/styles-computed-trace.html inspector/styles/styles-disable-inherited.html inspector/styles/styles-disable-then-change.html inspector/styles/styles-disable-then-delete.html inspector/styles/styles-history.html inspector/styles/styles-iframe.html inspector/styles/styles-new-API.html inspector/styles/styles-source-lines-inline.html inspector/styles/styles-source-lines.html inspector/styles/styles-source-offsets.html inspector/styles/styles-update-from-js.html inspector/styles/styles-url-linkify.html --> FAIL
Csaba Osztrogonác
Comment 3 2011-05-16 07:18:01 PDT
After r86499 fast/js/array-sort-modifying-tostring.html fails on Qt WK2 bot. It doesn't fail on its own, but fail if you run all fast/js tests. --- /tmp/layout-test-results/fast/js/array-sort-modifying-tostring-expected.txt 2011-05-16 14:17:02.487090658 +0000 +++ /tmp/layout-test-results/fast/js/array-sort-modifying-tostring-actual.txt 2011-05-16 14:17:02.487090658 +0000 @@ -1,11 +1 @@ -Test of array with toString() override that truncates array. - -On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE". - - -PASS Array length is unchanged. -PASS Array values are correct. -PASS successfullyParsed is true - -TEST COMPLETE - +Timed out waiting for final message from web process
Csaba Osztrogonác
Comment 4 2011-05-18 00:05:11 PDT
Are Nokia interested to fix this regression? Or you specifically want to keep this regression in trunk forever? :(
Zoltan Herczeg
Comment 5 2011-05-18 00:13:19 PDT
I have investigated this bug. When JIT is enabled, two tests fails in the inspector in the same way: the testing itself does not start (function test() does never called, but the timeout mechanism is installed, which yields the error message). When JIT is disabled, there is a crash in inspector/styles/metrics-box-sizing.html. I think these issues are connected (likely a GC related issue).
Zoltan Herczeg
Comment 6 2011-05-18 00:36:03 PDT
The crash happens in ASSERT_GC_OBJECT_LOOKS_VALID(cell); JSCell.cpp:226
Zoltan Herczeg
Comment 7 2011-05-20 06:36:44 PDT
Here is what I found so far (after 3 days of debugging): In inspector-test.js, layoutTestController.evaluateInWebInspector starts some kind of asyncron execution of function runTestInFrontend(), which does not happen for styles-disable-then-delete.html and styles-url-linkify.html in some unknown circumstances. The question now: why does it fail to start the script execution?
Zoltan Herczeg
Comment 8 2011-05-20 07:11:27 PDT
Source/WebKit/qt/WebCoreSupport/InspectorClientQt.cpp bool InspectorClientQt::sendMessageToFrontend(const String& message) { #if ENABLE(INSPECTOR) if (m_inspectedWebPage->d->inspector->d->remoteFrontend) { // Code not enters here. } if (!m_frontendWebPage) // <--- EARLY RETURN HERE! return false; Page* frontendPage = QWebPagePrivate::core(m_frontendWebPage); return doDispatchMessageOnFrontendPage(frontendPage, message); #else return false; #endif } If I run it alone, the m_frontendWebPage does exists. When I run it with run-layout-tests, it does not exists, and there is an early return.
Zoltan Herczeg
Comment 9 2011-05-23 04:25:26 PDT
Now I have a theory what cause this error, and why it is a flakey test: After every tests: 1) LayoutTestControllerQt.cpp:382 calls closeWebInspector() 2) which calls close() in InspectorController.cpp 3) which calls disconnectFromBackend() in WebCore/generated/InspectorFrontend.cpp 4) which calls WebInspector.dispatchMessageFromBackend in WebCore/inspector/front-end/inspector.js (this is JavaScript now!) 5) which calls setTimeout() with "Inspector.disconnectFromBackend" argument The issue here is the setTimeout() behaviour, which may happen AFTER the next test is started by DumpRenderTree. Hence, the next test starts with closing the WebInspector. From that point, the test will fail with timeout. WebInspector experts, please help me to solve this issue.
Zoltan Herczeg
Comment 10 2011-05-23 06:26:36 PDT
inspector-test.js : function closeInspectorAndNotifyDone() { if (window._watchDogTimer) clearTimeout(window._watchDogTimer); layoutTestController.closeWebInspector(); // A helper function should be called here which would check // whether the inspector exists, and call notifyDone only if not. setTimeout(function() { layoutTestController.notifyDone(); }, 0); }
Ilya Tikhonovsky
Comment 11 2011-05-27 03:17:04 PDT
(In reply to comment #10) > inspector-test.js : > > function closeInspectorAndNotifyDone() > { > if (window._watchDogTimer) > clearTimeout(window._watchDogTimer); > > layoutTestController.closeWebInspector(); > > // A helper function should be called here which would check > // whether the inspector exists, and call notifyDone only if not. > setTimeout(function() { > layoutTestController.notifyDone(); > }, 0); > } Seems to me it just slow. The same test on chromium-debug usually requires 15 seconds. We are trying to speed up our tests, but it is very tricky process because we are loading our front-end which has 50kloc of js. http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=styles-url-linkify.html&group=%40ToT%20-%20chromium.org
Zoltan Herczeg
Comment 12 2011-05-27 04:08:18 PDT
> Seems to me it just slow. The same test on chromium-debug usually requires 15 seconds. We are trying to speed up our tests, but it is very tricky process because we are loading our front-end which has 50kloc of js. > > http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=styles-url-linkify.html&group=%40ToT%20-%20chromium.org I feel the story is not that simple. Both layoutTestController.closeWebInspector() and and layoutTestController.notifyDone() start an async operation, which adds a new event to an event queue. The event queue is allowed to execute the events in any order, hence the layoutTestController.notifyDone() can be performed earlier than layoutTestController.closeWebInspector(). When layoutTestController.notifyDone() is executed, the next test is started by DRT. There is a substring comparison in the begining of the main loop of the DRT, which checks whether the test is an inspector test, and if so, it starts the inspector (even though the previous one was not even closed! -> leaking an inspector here). Right after the test starts, events in the queue are processed again. The next event closes the WebInspector. Thus, the test fail with timeout. Does this make sense?
Zoltan Herczeg
Comment 13 2011-06-08 09:36:55 PDT
I think it is very important to fix this bug. Today it caused new flakey tests! I tried to raise the visibility of the bug, and contacted the inspector developers. Basically the issue is: setTimeout(A, 0); setTimeout(B, 0) -> in this case, nothing guarantees, that A runs first, B later. It is likely, but not necessary. The setTimeout only guarantees, that the function will eventually be called. Both "layoutTestController.closeWebInspector()" and "setTimeout(function() { layoutTestController.notifyDone(); }, 0)" starts a setTimeout async call. If layoutTestController.notifyDone() comes first, the next test is starting to run. It opens, an inspector (without closing the previous one, etc.) However, the first processed message will be closeInspector(). Later the runInInspector or something fails, since there is no inspector. Since it has no return value, you don't even know that the call is failed. After that, the test stops, and wait for a watchdog timeout event. Short, but stupid fix: setTimeout(function() { layoutTestController.notifyDone(); }, 100 /* much bigger than 0 */); It will slow down the inspector tests, but will likely works. More clever idea: Add a new bool function to DRT, which returns true, if the inspector exists. Before layoutTestController.notifyDone(), check this value, and if it is true, start another setTimeout() call. This is difficult solution, since affects all inspectors, but I feel this is the correct one. However, I don't want to start working on it until someone says this is the best solution. Any better idea is welcome.
Robert Hogan
Comment 14 2011-06-09 11:37:02 PDT
If I'm reading your comments correctly Zoltan, the fix may be to ensure that EventSender sends events in the order they were received. This is a problem with EventSender that affects other tests too, particularly Drag n Drop, and cases where EventSender.skipForward() (or whatever it's called) is used.
Zoltan Herczeg
Comment 15 2011-06-09 11:41:36 PDT
(In reply to comment #14) > If I'm reading your comments correctly Zoltan, the fix may be to ensure that EventSender sends events in the order they were received. This is a problem with EventSender that affects other tests too, particularly Drag n Drop, and cases where EventSender.skipForward() (or whatever it's called) is used. http://www.whatwg.org/specs/web-apps/current-work/multipage/timers.html#dom-windowtimers-settimeout says: 8. Wait until any invocations of this algorithm started before this one whose timeout is equal to or less than this one's have completed. It seems this behaviour might be expected by the standard as well, although I am not sure what "Living Standard" means, and how trustworthy the link above.
Csaba Osztrogonác
Comment 16 2011-06-09 19:35:08 PDT
I skipped all inspector/profiler tests on Qt: http://trac.webkit.org/changeset/88514
Csaba Osztrogonác
Comment 17 2011-06-09 19:52:52 PDT
Nowadays Qt buildbot is absolutely useless because of this crazy bug. :-((( I tried to find a subset of inspector tests and disable them to make buildbot green again, but I didn't manage after a day digging. I had to disable all inspector tests until fix: http://trac.webkit.org/changeset/88516
Zoltan Herczeg
Comment 18 2011-06-14 06:58:42 PDT
I spent another day for debugging this. It seems the issue is little more complex than I thought. It seems there is another way of calling closeWebInspector() in styles-disable-then-change.html , which affects the next test: styles-disable-then-delete.html . This different closeWebInspector() hits the m_nestingLevel >= maxTimerNestingLevel condition in WebCore/page/DOMTimer.cpp, which increase the minimum 1 ms timeout to 10 ms. However, the notifyDone does not called from here, and the normal code path does not reach this treshold, so the timeout is still 1 ms. 10 ms is way bigger than 1 ms, and the timer fires after the next test is started. Debugging the inspector is really difficult because it is half JavaScript, half C++, and full of setTimeout async calls.
Ilya Tikhonovsky
Comment 19 2011-06-14 07:52:42 PDT
got a pair of tests where second is constantly failing inspector/profiler/detailed-heapshots-dominators-shown-node-count-preserved-when-sorting.html inspector/styles/styles-commit-editing.html
Zoltan Herczeg
Comment 20 2011-06-21 07:13:50 PDT
InspectorBackendStub.js : dispatcher[functionName].apply(dispatcher, params); calls jsInspectorFrontendHostPrototypeFunctionDisconnectFromBackend However, it seems DOM.hideNodeHighlight starts the timer. Would it be possible, that the message content changes when the new test starts?
Zoltan Herczeg
Comment 21 2011-06-22 05:28:56 PDT
Finally, I solved this stupid bug. Heh, it nearly turned out to be the first bug that I cannot solve, but it isn't. Here is the culprit: inspector.js var messagesToDispatch = []; WebInspector.dispatch = function(message) { messagesToDispatch.push(message); setTimeout(function() { InspectorBackend.dispatch(messagesToDispatch.shift()); }, 0); } "messagesToDispatch" is basically a FIFO queue, which means the messages are processed in order, regardless of how the unnamed sub-function is called. Naturally, the lastly appended message in this queue is the closeWebInspector for each test case. However, depending on the test case the setTimeout depth can be increased to 5 to some messages, which increases the minimum timeout to 10 ms to that particular call. This call may be processed after(!) the next test is started, which means the last item in the queue is processed there. Possible fix: check messagesToDispatch.length() == 0 before reporting the test case finish.
Zoltan Herczeg
Comment 22 2011-06-22 06:19:09 PDT
Created attachment 98160 [details] patch Finally! A fix!
Pavel Feldman
Comment 23 2011-06-22 06:41:58 PDT
Comment on attachment 98160 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=98160&action=review > LayoutTests/http/tests/inspector/inspector-test.js:18 > + if (!WebInspector.dispatchQueueIsEmpty()) { This looks sane. Before we land this, could you try calling InspectorTest.runAfterPendingDispatches(<didEvaluateForTestInFrontend>) instead? We maintain dispatch counter and can tell at any moment whether all of them are served.
Zoltan Herczeg
Comment 24 2011-06-23 02:21:03 PDT
> This looks sane. Before we land this, could you try calling InspectorTest.runAfterPendingDispatches(<didEvaluateForTestInFrontend>) instead? We maintain dispatch counter and can tell at any moment whether all of them are served. No, I still get "FAIL: Timed out waiting for notifyDone to be called" for styles-url-linkify.html
Zoltan Herczeg
Comment 25 2011-06-23 05:04:10 PDT
Note You need to log in before you can comment on or make changes to this bug.