fast/workers/worker-cloneport.html is timing out on Leopard Commit Bot
Running tests from /Users/eseidel/Projects/CommitQueue/LayoutTests
Testing 12449 test cases.
fast/workers/worker-cloneport.html -> timed out
Sampling process 89159 for 10 seconds with 10 milliseconds of run time between samples
Sampling completed, processing symbols...
Sample analysis of process 89159 written to file /Users/eseidel/Library/Logs/DumpRenderTree/HangReport.txt
Exiting early after 1 failures. 8774 tests run.
343.58s total testing time
8773 test cases (99%) succeeded
1 test case (<1%) timed out
5 test cases (<1%) had stderr output
Unabled to successfully build and test
I've not seen this time out on the bots recently, but it seems to be consistently failing for the commit-bot. This is a new failure from today as far as I can tell.
Qt has also seen crashes related to this test. bug 33653.
I'm forcing the commit-queue bot to do a clean build.
We may just have to skip this test for the time being since:
1. This is definitely a recent regression.
2. There are currently 18 patches blocked on this bug.
Created attachment 50142 [details]
Hang report from Leopard Commit Bot
I can't really tell what's going on from that hang report. Almost the entire time is spent in:
Ha! This is timing out on the bots too!
is the first timeout I could find on the bots.
Created attachment 50143 [details]
Skip the test until this bug can be looked at
Committed r55604: <http://trac.webkit.org/changeset/55604>
I forgot to pass --no-close to "webkit-patch land", sorry.
Now also in Snow Leopard: http://build.webkit.org/builders/SnowLeopard%20Intel%20Release%20(Tests)/builds/6321/steps/layout-test/logs/stdio
Should we revert r55593?
I'm really not sure that it's caused by r55593. But we could certainly revert it and see if that fixes things.
I don't see how r55593 could've made this test fail, since it doesn't change any worker-related code, and the test doesn't seem to have any DB-related code. If you need somebody to take a look at that test, I could do it on Monday. But please don't rollback r55593 only because of this test -- sometimes the test is bad, not the patch (see fast/frames/sandboxed-iframe-storage.html in r55593).
This test does a bunch of work (sends thousands of messages to a port that is being cloned).
Looking at the hang report, it *looks* like work is still being done, as expected - so I'm wondering if maybe a patch just introduced a performance regression that's causing this test to time out now. I don't know if the test is necessarily broken, just too slow to complete now?
(In reply to comment #13)
> This test does a bunch of work (sends thousands of messages to a port that is
> being cloned).
> Looking at the hang report, it *looks* like work is still being done, as
> expected - so I'm wondering if maybe a patch just introduced a performance
> regression that's causing this test to time out now. I don't know if the test
> is necessarily broken, just too slow to complete now?
It seems that I'm able to repro this locally. I'l investigate (at the very least I can walk back to see if r55593 is really at fault).
OK, this test times out repeatedly when I'm synced to 55593 (about 30% of the time).
When I'm synced to 55592, I was able to run it 100 times without any timeouts. So it must be r55593.
Assigning to Dumi to investigate further - we probably want to roll r55593 back in the meantime.
(In reply to comment #15)
> Assigning to Dumi to investigate further - we probably want to roll r55593 back
> in the meantime.
OK, the problem is the change to Document.postTask() in r55593:
Unfortunately, there doesn't seem to be any documentation either in the file or in the ChangeLog describing why we changed the behavior of postTask(), but when I revert that change the test passes again.
Adam/Dumi - any insight here as to why we changed the Document.postTask() behavior? If all tests pass without that change, I'll revert just that file and you guys can investigate re-introducing that change later.
I confirm reverting just a Document::postTask change fixes the test back. Dumi is the best to answer, but basically the reason for postTask change was that it did not preserve the order in which tasks were posted, if they were posted from different threads.
Dumi contacted me offline and I was worried about this because it would change timing in some worker cases and break them, but couldn't see what could be actually broken. We should revert r55593 and investigate. It's unknown what else it'll break, and now we have a nice repro to work with. I will revert it in a few hours and reinstate the test in a few hours if nobody comes up with a fix.
(In reply to comment #17)
> I confirm reverting just a Document::postTask change fixes the test back. Dumi
> is the best to answer, but basically the reason for postTask change was that it
> did not preserve the order in which tasks were posted, if they were posted from
> different threads.
In any case, reverting postTask() breaks open-database-creation-callback.html, so it's not an option to just revert that one method - we have to revert the entire CL.
Also on Tiger: http://build.webkit.org/builders/Tiger%20Intel%20Release/builds/9405/steps/layout-test/logs/stdio
Revert of r55593 landed: http://trac.webkit.org/changeset/55635