Bug 16853

Summary: http/tests/loading has different results when run with run-webkit-tests -1
Product: WebKit Reporter: Holger Freyther <zecke>
Component: New BugsAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, bdakin, ddkilzer
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Mac   
OS: OS X 10.4   
Bug Depends on:    
Bug Blocks: 15765    
Attachments:
Description Flags
Run tests in loading/ as if they were invoked with -1
none
Change Qt's DRT to throw the view away before running a loading/ test
none
Run tests in loading/ as if they were invoked with -1 unless a patched DRT exists
darin: review+
Do not output willCloseFrame and didClearWindowObject darin: review+

Description Holger Freyther 2008-01-12 08:02:27 PST
The Gmail assert on load issue from bug #15765 made me aware of this. The tests in http/tests/loading enable the dumpFrameLoadCallbacks feature. This means that the result of one test depends on which test ran before. Some tests have a line that a certain frame will be closed that was created in the previous test case. This makes adding new tests to the directory more difficult.
Comment 1 Holger Freyther 2008-01-12 08:04:35 PST
Created attachment 18407 [details]
Run tests in loading/ as if they were invoked with -1

Run loading/ tests "singly".
Comment 2 Holger Freyther 2008-01-12 08:05:52 PST
This issue blocks bug #1576.
Comment 3 Darin Adler 2008-01-12 10:26:12 PST
Comment on attachment 18407 [details]
Run tests in loading/ as if they were invoked with -1

This is going to make these tests run a lot slower. Is there any other solution?
Comment 4 David Kilzer (:ddkilzer) 2008-01-12 10:41:01 PST
(In reply to comment #2)
> This issue blocks bug #1576.

Bug 15765

Comment 5 Alexey Proskuryakov 2008-01-12 14:25:59 PST
I think we can probably fix this by modifying the guilty tests - no need to make DRT 100% fool-proof if we can work around its deficiencies in tests themselves.

Perhaps a simple window.stop() would help here.
Comment 6 Holger Freyther 2008-01-12 17:00:33 PST
Yes, we talk about one second more and gain determinism. Closing the DRT at every test will happen for tests in http/tests/loading and webarchive/loading together these are  

ls -w1 LayoutTests/http/tests/loading/*.html LayoutTests/webarchive/loading/*.html | wc -l
7
 
Comment 7 Holger Freyther 2008-01-12 17:04:15 PST
(In reply to comment #5)
> I think we can probably fix this by modifying the guilty tests - no need to
> make DRT 100% fool-proof if we can work around its deficiencies in tests
> themselves.

We get run-webkit-tests 100% fool-proof regarding this test and pay one second.


> Perhaps a simple window.stop() would help here.
Yes we will have to make sure to replace the mainFrame with something like about:blank before going on to the next test. The downside is that 2/3 of the new -expected.txt will come from the tear down of the test which is kind of ugly as well. 

Comment 8 Alexey Proskuryakov 2008-01-13 11:32:33 PST
This problem is not limited to http/tests/loading and webarchive/loading - I've just fixed something similar in fast/loader <http://trac.webkit.org/projects/webkit/changeset/29340>. I think fast/dom may have some tricky loading cases, too.

> The downside is that 2/3 of the new -expected.txt will come from the tear down of
> the test which is kind of ugly as well. 

Indeed. Perhaps it would be possible to disable delegate call logging when we no longer need it?
Comment 9 Holger Freyther 2008-01-13 12:36:37 PST
(In reply to comment #8)
> This problem is not limited to http/tests/loading and webarchive/loading - I've
> just fixed something similar in fast/loader
> <http://trac.webkit.org/projects/webkit/changeset/29340>. I think fast/dom may
> have some tricky loading cases, too.

You are wrong, at least my problem is limited to the dumping of FrameLoaderClient callbacks and this applies only to paths containing "loading/", which applies only to the above two directories and tests calling layoutTestController.dumpFrameLoadCallbacks (which is one test in a loading/ directory).

What I didn't do was to compare the results of "run-webkit-tests -1" to "run-webkit-tests", I only compared http/tests/loading because I want to add a test case there and refuse to pick a file name that will hide this issue (which will break on filesystems giving a listing in random order anyway, e.g. reiser4).


> Indeed. Perhaps it would be possible to disable delegate call logging when we
> no longer need it?

I sincerly believe that my proposed fix is the proper one. It doesn't artificially change the tests, it keeps them simple and clear, and is removing dead code from DRT. The downside is that we spend some more time on these 7 tests.

On the other hand we can try to change the tests and DRT. This will make the tests itself more complex, e.g. adding a tear down phase, will add more complexity to DRT. Which includes one command to make the DRT blind. DRT has to wait until the tear down was completed, has to verify that all subframes are gone and this has to work for Mac,Win,Gtk+ and Qt. I don't think the added complexity to add a test-case is helping us in anyway.
Comment 10 Alexey Proskuryakov 2008-01-13 12:49:06 PST
(In reply to comment #9)
> You are wrong, at least my problem is limited to the dumping of
> FrameLoaderClient callbacks and this applies only to paths containing
> "loading/",

Please do check the referenced patch - it's basically the same problem. Something in a test was changing DRT state in a way that affected following tests. We've seen similar problems (not always related to loading) many times before, there's really not much special about FrameLoaderClient callbacks that warrants changing tracks now.

Of course, always running the whole test suite with -1 is not an option.
Comment 11 Holger Freyther 2008-01-13 16:23:51 PST
(In reply to comment #10)
> Please do check the referenced patch - it's basically the same problem.
> Something in a test was changing DRT state in a way that affected following
> tests. We've seen similar problems (not always related to loading) many times
> before, there's really not much special about FrameLoaderClient callbacks that
> warrants changing tracks now.
> 
> Of course, always running the whole test suite with -1 is not an option.
> 

Again my short term goal is to get bug #15765 fixed and this bug is blocking it.

We have two options (at least):
   - Change the title and make it a general (e.g. "run-webkit-tests -1 gives other results than run-webkit-tests"). This means bug #15765 will not be fixed any time soon.

   - Run "run-webkit-tests -1" and try to find classes of failures, e.g. dumpFrameLoadCallbacks would be one class and file one bug report for each class as "run-webkit-tests -1" is not an option (with having one meta bug).

Sure we have to keep the big picture of DRT and run-webkit-tests interaction in mind but I think instead of hand waving and questioning the whole interaction let us fix the issues case by case. This would be done case by case.
Comment 12 Holger Freyther 2008-01-13 17:08:49 PST
Alternatively I can add a bit of complexity to the Qt version of the DRT to workaround this issue, this needs someone to do something similar for mac/windows then. I hope to post a patch tomorrow.
Comment 13 Holger Freyther 2008-01-13 17:38:12 PST
Created attachment 18429 [details]
Change Qt's DRT to throw the view away before running a loading/ test

This is a proof of concept. When throwing the View aways before we run a test from the loading directory we get deterministic results. Regarding the speed it is a lot faster than restarting the DRT. For mac/win someone could try if the speed can be imrpvoed further when about:blank is loaded instead of the view recreation.
Comment 14 David Kilzer (:ddkilzer) 2008-01-13 21:52:55 PST
(In reply to comment #9)
> I sincerly believe that my proposed fix is the proper one. It doesn't
> artificially change the tests, it keeps them simple and clear, and is removing
> dead code from DRT. The downside is that we spend some more time on these 7
> tests.

Generally speaking, tests should be independent.  They should not depend on a previous test, and they should not affect a later test.  If either of these two conditions occur, either the test is broken or the test tool is broken, and one or both should be fixed.

It should also be possible to run tests in a completely random order and still have them all pass.  

At minimum, if a work-around is committed to run these 7 tests singly, then a follow-up bug should be filed to fix the tests and/or DRT so that they're independent again.  Otherwise, the issue(s) should be fixed before the patch lands.

Comment 15 David Kilzer (:ddkilzer) 2008-01-13 22:03:12 PST
(In reply to comment #14)
> It should also be possible to run tests in a completely random order and still
> have them all pass.  

Filed Bug 16869.

Comment 16 Holger Freyther 2008-01-14 09:38:43 PST
Created attachment 18438 [details]
Run tests in loading/ as if they were invoked with -1 unless a patched DRT exists

This combines the previous patch with attachtment from yesterday:

-Run the tests signly unless a patched DRT exists. With the attachment from yesterday only the Qt DRT is capable of it. I see this like the TemporaryLinkStubs someone will make the Mac, Win and Gtk+ work but the issue should not be blocked due this.
Comment 17 Darin Adler 2008-01-14 10:05:57 PST
Comment on attachment 18438 [details]
Run tests in loading/ as if they were invoked with -1 unless a patched DRT exists

r=me, even though I am worried about slowing down regression tests.

+        postProcessOneTest($old_base, 1) unless isQt();

I think the fact that isQt is a list of DumpRenderTree implementations that support a new mode is quite unclear here. Maybe there's a better way of writing this?

+    if ($test =~ /loading\//) {

This seems like it will fire for any directory with a suffix of "loading" -- maybe the expression should be more specific?

+        postProcessOneTest($old_base, 1) unless isQt();

Would be nicer to say "true" rather than "1" here, and even nicer to have a name rather than just a magic value -- maybe a different function name for this?

+    my ($base,$forceClose) = @_;

Missing space after comma.

+    if (($count + 1) % $testsPerDumpTool == 0 || $count == $#tests || $forceClose == 1) {

No need for "== 1" here. I think it makes things more confusing.
Comment 18 Alexey Proskuryakov 2008-01-14 10:10:24 PST
Having more and more directories with obscure special behavior makes me quite sad - it's already pretty hard to keep track of those. Especially given that there is nothing special about these particular directories in this regard.
Comment 19 Beth Dakin 2008-01-18 16:14:26 PST
I landed http://bugs.webkit.org/show_bug.cgi?id=15765 a bit hastily without noticing this bug. I suppose that makes this patch a bit more pressing unless we want to roll out http://bugs.webkit.org/show_bug.cgi?id=15765 for now.
Comment 20 Beth Dakin 2008-01-18 16:16:45 PST
Sorry! I was very eager to get Gmail working again after finding a bad regression with Kevin.
Comment 21 Beth Dakin 2008-01-18 17:01:27 PST
I papered over the problem that I created by checking in new (incorrect) results with http://trac.webkit.org/projects/webkit/changeset/29659

This is by NO MEANS to suggest that we should not address the real problem, or to suggest that the real problem does not have real priority. It's just that this problem has probably existed for a while and should not block Gmail working in Debug builds. I also did not want to leave the tests failing for the time being, because that could slow productivity in other areas of work while this issue is hashed out.
Comment 22 Holger Freyther 2008-01-18 17:42:51 PST
I have talked with Ap and Darin on irc and we agreed to consider the following:

The failing things in the test result is the output of "willCloseFrame" for subframes and maybe "windowObjectsCleared". For the current test cases willCloseFrame/windowObjectsCleared is of no interest so we are going to remove the output.

For the recently added gmail-assert test we do actually close a frame on load but we have other ways to find out if the iframe got closed.

I plan to work on it on Sunday/Monday.
Comment 23 Holger Freyther 2008-01-27 10:22:40 PST
Created attachment 18722 [details]
Do not output willCloseFrame and didClearWindowObject

As proposed on IRC do not output willCloseFrame and didClearWindowObject. The loading tests will now have the same result regardless of how we run the tests (--random, --reverse, -1).
Comment 24 Darin Adler 2008-01-29 12:57:32 PST
Comment on attachment 18722 [details]
Do not output willCloseFrame and didClearWindowObject

Seems like an OK solution for the short term.

Long term it would be nice to resolve the underlying problem too and maybe log these again.

r=me
Comment 25 Holger Freyther 2008-02-05 12:51:42 PST
Landed the patch in r30014. Should we leave this bug open until the DRT can easily load about:blank and wait for it to be loaded and we can revert this patch?
Comment 26 Mark Rowe (bdash) 2008-02-24 22:41:34 PST
No, a separate bug should be used to track any other necessary fixes and this should be closed.