fast/preloader/document-write.html started to fail very very frequently last week, I suspect <http://trac.webkit.org/changeset/166144>. The diff seems to always be the same: @@ -1,4 +1,3 @@ -document-write-plaintext.js has MIME type text/javascript script1.js has MIME type text/javascript This test requires DumpRenderTree to see the log of what resources are loaded.
Seems likely the test has a race condition in it. Let me take a look at how it works.
Is the evidence pointing to r166144 strong? I ask because after reading and re-reading that patch it seems highly unlikely that change would affect a test that does not involve <object> or <embed> elements. I think that rolling out that change because of this test failure would be a big mistake. The failure does not reflect what this test is trying to test. The point of this test is whether the preload scanner kicks off the load of script1.js or not, and clearly that is still happening. It’s just that the test is fragile for some reason. Antti, do you have any idea what could be making this test sometimes not log the load of document-write-plaintext.js?
This test now fails about 50% of the time on some of the bots, and the earliest detected failure was on r166144:r166145. Preceding changes seem much less relevant, so yes, I would say that the evidence is fairly strong. Complete history: http://webkit-test-results.appspot.com/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=fast%2Fpreloader%2Fdocument-write.html If we can't roll out this change, and if cannot fix it, then we should definitely disable or remove the test. It's pure noise now, only confirming that it doesn't crash.
(In reply to comment #3) > If we can't roll out this change We can, it’s just not necessarily worth it. > if cannot fix it No clue on the cause of the different behavior yet. > we should definitely disable or remove the test. Maybe. > It's pure noise now, only confirming that it doesn't crash. Well, the fact that script1.js appears in the output meaningfully indicates that what the test was designed to test is still working; preloading is occurring. It’s unfortunate that the test machinery classifies that as a failure. Whether document-write-plaintext.js shows up in the output or not is not relevant to what the test is trying to test. So the test would still have some value if we could find a way to ignore the irrelevant part. And also it would be interesting to know why the load of document-write-plaintext.js does not always log. Maybe there’s some previous test that loads it and it is cached?
document-write-plaintext.js is also used in fast/preloader/document-write-2.html (which comes before fast/preloader/document-write.html, counterintuitively). I tried to reproduce this issue locally to see if making the resource name unique would resolve it, but I couldn't make it happen. So, I just marked the test as flaky for now in <http://trac.webkit.org/r166548>, to avoid increasing confusion. Also, I suspect that the test is inherently flaky, because preloading could happen before JS execution starts, and thus before we call testRunner.dumpResourceResponseMIMETypes(). Assigning to Darin per his request.
(In reply to comment #5) > document-write-plaintext.js is also used in fast/preloader/document-write-2.html (which comes before fast/preloader/document-write.html, counterintuitively). > > I tried to reproduce this issue locally to see if making the resource name unique would resolve it, but I couldn't make it happen. It might be good to just land the change. That sound like a very potential fix. > Also, I suspect that the test is inherently flaky, because preloading could happen before JS execution starts, and thus before we call testRunner.dumpResourceResponseMIMETypes(). The script tags for the test resources are themselves inserted by document.write. Preload scanner can't see them before the script runs, in strict document order. It would be good to come up with a more direct mechanism for testing preload scanner and resource loads in general.
Due to PLT issues 166144 was rolled out in http://trac.webkit.org/changeset/166680
Looks good after the rollout. Removed expectation in <http://trac.webkit.org/r166699>.
The offending code got re-landed in <http://trac.webkit.org/r166853>, and the test is flaky again. Added a test expectation in <http://trac.webkit.org/r167015>.
What is the next step on this? The change was intended to be a refactoring with no side effects, but it broke a test. We have more object-related regressions in the tree, from other changes. See also bug 132297, and <http://webkit-test-results.appspot.com/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=plugins%2Fembed-attributes-style.html> for which there is no bug at the moment.
(In reply to comment #10) > What is the next step on this? The change was intended to be a refactoring with no side effects, but it broke a test. The next step is that I will fix this failure when I have a chance. > We have more object-related regressions in the tree, from other changes. See also bug 132297 Is that object-related?
> Is that object-related? Yes, Tim's analysis in that bug appears to imply so. The test content is all in an <object>, and layout state is not properly synchronized with main document. This is probably best to discuss in bug 132297. It's unfortunate that we have the additional flakiness from r166853, but I have no evidence that it's directly related to that bug.
Any tips on how to reproduce the failure locally? I am running the test over and over again without seeing any failures.
Created attachment 230971 [details] (huge) command to reproduce locally This is quite hard to reproduce locally indeed. The only way I could reproduce was by replicating the exact sequence of tests as run by a single instance of DumpRenderTree on a bot. This can undoubtedly be simplified through bisection, but running only the last few tests is insufficient. The attached command does two iterations of that. On first iteration, fast/preloader/document-write.html fails. On second iteration, fast/preloader/document-write-2.html and fast/preloader/document-write.html both fail.
I'm bisecting now.
Here is the minimal sequence to reproduce: run-webkit-tests --child-processes 1 --no-retry-failures fast/dom/clone-node-load-event-crash.html fast/preloader/document-write-2.html fast/preloader/document-write.html This fails too: run-webkit-tests --child-processes 1 --no-retry-failures fast/dom/clone-node-load-event-crash.html fast/preloader/document-write-2.html fast/preloader/document-write-2.html
An additional step, possibly obvious: comment out the entry for fast/preloader/document-write.html in platform/mac/TestExpectations, or run-webkit-tests will say "All 3 tests ran as expected."
Thanks. I had edited TestExpectations, but this recipe will be very helpful.
The problem here is simply cached resources from previous tests. Something needs to clear these caches for tests that rely on resources not being cached. I have started tackling this and I have a first cut patch.
Created attachment 231166 [details] Patch
I attached my first cut at a patch. There are a few things I am uncertain about: 1) Should emptying the caches be something the tests do explicitly separately instead of as a side effect of the “serialize loads” test function? 2) Is it valuable to empty the disk caches too? The real issue here is with in-memory caches. 3) Is there some additional consideration with the network process? Probably not, because what is actively causing a problem here is in-web-process memory caching. 4) Does the problem I fixed in +[WebCache empty] with the “just change the caching model twice” technique for emptying caches reflect a bug that affects developers using public WebKit API? Should we fix that?
I'm super curious why fast/dom/clone-node-load-event-crash.html is necessary to reproduce this.
(In reply to comment #22) > I'm super curious why fast/dom/clone-node-load-event-crash.html is necessary to reproduce this. Sadly, I’m out of time to work on this today, so I won't have a chance to answer that question today.
Comment on attachment 231166 [details] Patch This is a workaround, not the real fix.
Created attachment 231226 [details] Patch
Attachment 231226 [details] did not pass style-queue: ERROR: Source/WebCore/style/StyleResolveTree.cpp:992: Missing space before { [whitespace/braces] [5] Total errors found: 1 in 3 files If any of these errors are false positives, please file a bug against check-webkit-style.
Comment on attachment 231226 [details] Patch This is the real fix.
Committed r168580: <http://trac.webkit.org/changeset/168580>