60881 – Inspector may close at the start of the next inspector test in DRT

RESOLVED FIXED60881

Inspector may close at the start of the next inspector test in DRT

https://bugs.webkit.org/show_bug.cgi?id=60881

Summary Inspector may close at the start of the next inspector test in DRT

Csaba Osztrogonác

Reported 2011-05-16 05:23:21 PDT

After r86499 inspector/styles/styles-url-linkify.html fails on 32 bit debug mode: --- /ramdisk/qt-linux-32-debug/build/layout-test-results/inspector/styles/styles-url-linkify-expected.txt 2011-05-14 16:06:23.691172055 -0700 +++ /ramdisk/qt-linux-32-debug/build/layout-test-results/inspector/styles/styles-url-linkify-actual.txt 2011-05-14 16:06:23.691172055 -0700 @@ -1,19 +1,4 @@ +FAIL: Timed out waiting for notifyDone to be called Tests that URLs are linked to and completed correctly. Bugs 51663, 53171 -Partial URLs completed: -http://example.com/ -http://example.com/moo -https://secure.com/moo -https://secure.com/moo -http://example.com/moo -http://example.com/foo/zoo/moo -http://example.com/foo/boo/moo -http://example.com/moo -http://example.com/foo?a=b -http://example.com/foo?a=b -Link for a URI from CSS document: -inspector/styles/resources/fromcss.png -Link for a URI from iframe inline stylesheet: -inspector/styles/resources/iframed.png -

Attachments
patch (3.44 KB, patch) 2011-06-22 06:19 PDT, Zoltan Herczeg	pfeldman: review+	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Csaba Osztrogonác

Comment 1 2011-05-16 05:48:11 PDT

Unfortunately inspector/styles/styles-url-linkify.html doesn't fail on its own, but if you run all tests.

Csaba Osztrogonác

Comment 2 2011-05-16 06:44:27 PDT

I can reproduce the fail with executing the 21 tests run after the last DumpRenderTree restart: inspector/profiler/heap-snapshot.html inspector/protocol/console-agent.html inspector/protocol/runtime-agent.html inspector/styles/get-set-stylesheet-text.html inspector/styles/metrics-box-sizing.html inspector/styles/parse-utf8-bom.html inspector/styles/styles-add-invalid-property.html inspector/styles/styles-cancel-editing.html inspector/styles/styles-commit-editing.html inspector/styles/styles-computed-trace.html inspector/styles/styles-disable-inherited.html inspector/styles/styles-disable-then-change.html inspector/styles/styles-disable-then-delete.html inspector/styles/styles-history.html inspector/styles/styles-iframe.html inspector/styles/styles-new-API.html inspector/styles/styles-source-lines-inline.html inspector/styles/styles-source-lines.html inspector/styles/styles-source-offsets.html inspector/styles/styles-update-from-js.html inspector/styles/styles-url-linkify.html --> FAIL

Csaba Osztrogonác

Comment 3 2011-05-16 07:18:01 PDT

After r86499 fast/js/array-sort-modifying-tostring.html fails on Qt WK2 bot. It doesn't fail on its own, but fail if you run all fast/js tests. --- /tmp/layout-test-results/fast/js/array-sort-modifying-tostring-expected.txt 2011-05-16 14:17:02.487090658 +0000 +++ /tmp/layout-test-results/fast/js/array-sort-modifying-tostring-actual.txt 2011-05-16 14:17:02.487090658 +0000 @@ -1,11 +1 @@ -Test of array with toString() override that truncates array. - -On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE". - - -PASS Array length is unchanged. -PASS Array values are correct. -PASS successfullyParsed is true - -TEST COMPLETE - +Timed out waiting for final message from web process

Csaba Osztrogonác

Comment 4 2011-05-18 00:05:11 PDT

Are Nokia interested to fix this regression? Or you specifically want to keep this regression in trunk forever? :(

Zoltan Herczeg

Comment 5 2011-05-18 00:13:19 PDT

I have investigated this bug. When JIT is enabled, two tests fails in the inspector in the same way: the testing itself does not start (function test() does never called, but the timeout mechanism is installed, which yields the error message). When JIT is disabled, there is a crash in inspector/styles/metrics-box-sizing.html. I think these issues are connected (likely a GC related issue).

Zoltan Herczeg

Comment 6 2011-05-18 00:36:03 PDT

The crash happens in ASSERT_GC_OBJECT_LOOKS_VALID(cell); JSCell.cpp:226

Zoltan Herczeg

Comment 7 2011-05-20 06:36:44 PDT

Here is what I found so far (after 3 days of debugging): In inspector-test.js, layoutTestController.evaluateInWebInspector starts some kind of asyncron execution of function runTestInFrontend(), which does not happen for styles-disable-then-delete.html and styles-url-linkify.html in some unknown circumstances. The question now: why does it fail to start the script execution?

Zoltan Herczeg

Comment 8 2011-05-20 07:11:27 PDT

Source/WebKit/qt/WebCoreSupport/InspectorClientQt.cpp bool InspectorClientQt::sendMessageToFrontend(const String& message) { #if ENABLE(INSPECTOR) if (m_inspectedWebPage->d->inspector->d->remoteFrontend) { // Code not enters here. } if (!m_frontendWebPage) // <--- EARLY RETURN HERE! return false; Page* frontendPage = QWebPagePrivate::core(m_frontendWebPage); return doDispatchMessageOnFrontendPage(frontendPage, message); #else return false; #endif } If I run it alone, the m_frontendWebPage does exists. When I run it with run-layout-tests, it does not exists, and there is an early return.

Zoltan Herczeg

Comment 9 2011-05-23 04:25:26 PDT

Now I have a theory what cause this error, and why it is a flakey test: After every tests: 1) LayoutTestControllerQt.cpp:382 calls closeWebInspector() 2) which calls close() in InspectorController.cpp 3) which calls disconnectFromBackend() in WebCore/generated/InspectorFrontend.cpp 4) which calls WebInspector.dispatchMessageFromBackend in WebCore/inspector/front-end/inspector.js (this is JavaScript now!) 5) which calls setTimeout() with "Inspector.disconnectFromBackend" argument The issue here is the setTimeout() behaviour, which may happen AFTER the next test is started by DumpRenderTree. Hence, the next test starts with closing the WebInspector. From that point, the test will fail with timeout. WebInspector experts, please help me to solve this issue.

Zoltan Herczeg

Comment 10 2011-05-23 06:26:36 PDT

inspector-test.js : function closeInspectorAndNotifyDone() { if (window._watchDogTimer) clearTimeout(window._watchDogTimer); layoutTestController.closeWebInspector(); // A helper function should be called here which would check // whether the inspector exists, and call notifyDone only if not. setTimeout(function() { layoutTestController.notifyDone(); }, 0); }

Ilya Tikhonovsky

Comment 11 2011-05-27 03:17:04 PDT

(In reply to comment #10) > inspector-test.js : > > function closeInspectorAndNotifyDone() > { > if (window._watchDogTimer) > clearTimeout(window._watchDogTimer); > > layoutTestController.closeWebInspector(); > > // A helper function should be called here which would check > // whether the inspector exists, and call notifyDone only if not. > setTimeout(function() { > layoutTestController.notifyDone(); > }, 0); > } Seems to me it just slow. The same test on chromium-debug usually requires 15 seconds. We are trying to speed up our tests, but it is very tricky process because we are loading our front-end which has 50kloc of js. http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=styles-url-linkify.html&group=%40ToT%20-%20chromium.org

Zoltan Herczeg

Comment 12 2011-05-27 04:08:18 PDT

> Seems to me it just slow. The same test on chromium-debug usually requires 15 seconds. We are trying to speed up our tests, but it is very tricky process because we are loading our front-end which has 50kloc of js. > > http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=styles-url-linkify.html&group=%40ToT%20-%20chromium.org I feel the story is not that simple. Both layoutTestController.closeWebInspector() and and layoutTestController.notifyDone() start an async operation, which adds a new event to an event queue. The event queue is allowed to execute the events in any order, hence the layoutTestController.notifyDone() can be performed earlier than layoutTestController.closeWebInspector(). When layoutTestController.notifyDone() is executed, the next test is started by DRT. There is a substring comparison in the begining of the main loop of the DRT, which checks whether the test is an inspector test, and if so, it starts the inspector (even though the previous one was not even closed! -> leaking an inspector here). Right after the test starts, events in the queue are processed again. The next event closes the WebInspector. Thus, the test fail with timeout. Does this make sense?

Zoltan Herczeg

Comment 13 2011-06-08 09:36:55 PDT

I think it is very important to fix this bug. Today it caused new flakey tests! I tried to raise the visibility of the bug, and contacted the inspector developers. Basically the issue is: setTimeout(A, 0); setTimeout(B, 0) -> in this case, nothing guarantees, that A runs first, B later. It is likely, but not necessary. The setTimeout only guarantees, that the function will eventually be called. Both "layoutTestController.closeWebInspector()" and "setTimeout(function() { layoutTestController.notifyDone(); }, 0)" starts a setTimeout async call. If layoutTestController.notifyDone() comes first, the next test is starting to run. It opens, an inspector (without closing the previous one, etc.) However, the first processed message will be closeInspector(). Later the runInInspector or something fails, since there is no inspector. Since it has no return value, you don't even know that the call is failed. After that, the test stops, and wait for a watchdog timeout event. Short, but stupid fix: setTimeout(function() { layoutTestController.notifyDone(); }, 100 /* much bigger than 0 */); It will slow down the inspector tests, but will likely works. More clever idea: Add a new bool function to DRT, which returns true, if the inspector exists. Before layoutTestController.notifyDone(), check this value, and if it is true, start another setTimeout() call. This is difficult solution, since affects all inspectors, but I feel this is the correct one. However, I don't want to start working on it until someone says this is the best solution. Any better idea is welcome.

Robert Hogan

Comment 14 2011-06-09 11:37:02 PDT

If I'm reading your comments correctly Zoltan, the fix may be to ensure that EventSender sends events in the order they were received. This is a problem with EventSender that affects other tests too, particularly Drag n Drop, and cases where EventSender.skipForward() (or whatever it's called) is used.

Zoltan Herczeg

Comment 15 2011-06-09 11:41:36 PDT

(In reply to comment #14) > If I'm reading your comments correctly Zoltan, the fix may be to ensure that EventSender sends events in the order they were received. This is a problem with EventSender that affects other tests too, particularly Drag n Drop, and cases where EventSender.skipForward() (or whatever it's called) is used. http://www.whatwg.org/specs/web-apps/current-work/multipage/timers.html#dom-windowtimers-settimeout says: 8. Wait until any invocations of this algorithm started before this one whose timeout is equal to or less than this one's have completed. It seems this behaviour might be expected by the standard as well, although I am not sure what "Living Standard" means, and how trustworthy the link above.

Csaba Osztrogonác

Comment 16 2011-06-09 19:35:08 PDT

I skipped all inspector/profiler tests on Qt: http://trac.webkit.org/changeset/88514

Csaba Osztrogonác

Comment 17 2011-06-09 19:52:52 PDT

Nowadays Qt buildbot is absolutely useless because of this crazy bug. :-((( I tried to find a subset of inspector tests and disable them to make buildbot green again, but I didn't manage after a day digging. I had to disable all inspector tests until fix: http://trac.webkit.org/changeset/88516

Zoltan Herczeg

Comment 18 2011-06-14 06:58:42 PDT

I spent another day for debugging this. It seems the issue is little more complex than I thought. It seems there is another way of calling closeWebInspector() in styles-disable-then-change.html , which affects the next test: styles-disable-then-delete.html . This different closeWebInspector() hits the m_nestingLevel >= maxTimerNestingLevel condition in WebCore/page/DOMTimer.cpp, which increase the minimum 1 ms timeout to 10 ms. However, the notifyDone does not called from here, and the normal code path does not reach this treshold, so the timeout is still 1 ms. 10 ms is way bigger than 1 ms, and the timer fires after the next test is started. Debugging the inspector is really difficult because it is half JavaScript, half C++, and full of setTimeout async calls.

Ilya Tikhonovsky

Comment 19 2011-06-14 07:52:42 PDT

got a pair of tests where second is constantly failing inspector/profiler/detailed-heapshots-dominators-shown-node-count-preserved-when-sorting.html inspector/styles/styles-commit-editing.html

Zoltan Herczeg

Comment 20 2011-06-21 07:13:50 PDT

InspectorBackendStub.js : dispatcher[functionName].apply(dispatcher, params); calls jsInspectorFrontendHostPrototypeFunctionDisconnectFromBackend However, it seems DOM.hideNodeHighlight starts the timer. Would it be possible, that the message content changes when the new test starts?

Zoltan Herczeg

Comment 21 2011-06-22 05:28:56 PDT

Finally, I solved this stupid bug. Heh, it nearly turned out to be the first bug that I cannot solve, but it isn't. Here is the culprit: inspector.js var messagesToDispatch = []; WebInspector.dispatch = function(message) { messagesToDispatch.push(message); setTimeout(function() { InspectorBackend.dispatch(messagesToDispatch.shift()); }, 0); } "messagesToDispatch" is basically a FIFO queue, which means the messages are processed in order, regardless of how the unnamed sub-function is called. Naturally, the lastly appended message in this queue is the closeWebInspector for each test case. However, depending on the test case the setTimeout depth can be increased to 5 to some messages, which increases the minimum timeout to 10 ms to that particular call. This call may be processed after(!) the next test is started, which means the last item in the queue is processed there. Possible fix: check messagesToDispatch.length() == 0 before reporting the test case finish.

Zoltan Herczeg

Comment 22 2011-06-22 06:19:09 PDT

Created attachment 98160 [details] patch Finally! A fix!

Pavel Feldman

Comment 23 2011-06-22 06:41:58 PDT

Comment on attachment 98160 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=98160&action=review > LayoutTests/http/tests/inspector/inspector-test.js:18 > + if (!WebInspector.dispatchQueueIsEmpty()) { This looks sane. Before we land this, could you try calling InspectorTest.runAfterPendingDispatches(<didEvaluateForTestInFrontend>) instead? We maintain dispatch counter and can tell at any moment whether all of them are served.

Zoltan Herczeg

Comment 24 2011-06-23 02:21:03 PDT

> This looks sane. Before we land this, could you try calling InspectorTest.runAfterPendingDispatches(<didEvaluateForTestInFrontend>) instead? We maintain dispatch counter and can tell at any moment whether all of them are served. No, I still get "FAIL: Timed out waiting for notifyDone to be called" for styles-url-linkify.html

Zoltan Herczeg

Comment 25 2011-06-23 05:04:10 PDT

Landed in http://trac.webkit.org/changeset/89556

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P1

Severity Blocker

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS All

Product WebKit

Component Tools / Tests

Assignee

Nobody

Reported

2011-05-16 05:23 PDT

Modified

2011-06-23 05:04 PDT History

CC List

11 users Show

URL

Keywords Qt, QtTriaged

Depends on

Blocks

60802 62278

Dependencies

tree graph