http/tests/xmlhttprequest/basic-auth.html timed out on the Leopard bot: http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48942%20(5588)/results.html I don't have frequency numbers for you yet, but I'll add notes to this bug each time I see it. Not sure who to CC here, but I think ap has hacked on this code and might know.
Another one this evening: http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48942%20(5588)/results.html
Just timed out on the Snow Leopard Release bot as well: http://build.webkit.org/results/SnowLeopard%20Intel%20Release%20(Tests)/r53047%20(4086)/results.html We seem to see a lot of trouble with these auth tests on all bots. I suspect we may have a deeper problem here.
Timed out on Snow Leopard Release just now: http://build.webkit.org/results/SnowLeopard%20Intel%20Release%20(Tests)/r53660%20(4646)/results.html
http://trac.webkit.org/browser/trunk/LayoutTests/http/tests/xmlhttprequest/basic-auth.html
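For context, the test exercises the optional user/password arguments to XMLHttpRequest.open() against a server resource guarded by HTTP Basic authentication. A minimal sketch of that shape is below; the real test's resource path, credentials, and output details differ, so treat the names here as illustrative only:

    <script>
    // Standard WebKit layout-test harness hooks of that era.
    if (window.layoutTestController)
        layoutTestController.dumpAsText();

    // Hypothetical protected resource and credentials, for illustration only.
    var req = new XMLHttpRequest();
    req.open("GET", "resources/basic-auth/protected.txt", false, "user", "pass");
    req.send(null);

    // If the connection hangs here (e.g. the auth handshake stalls),
    // the harness never finishes and the bots report a timeout.
    document.write(req.status == 200 ? "PASS" : "FAIL: status " + req.status);
    </script>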
Created attachment 47163: Patch
There are various other possibly related test failures:
https://bugs.webkit.org/show_bug.cgi?id=33357
https://bugs.webkit.org/show_bug.cgi?id=33301
https://bugs.webkit.org/show_bug.cgi?id=32961
https://bugs.webkit.org/show_bug.cgi?id=30669
https://bugs.webkit.org/show_bug.cgi?id=30726
https://bugs.webkit.org/show_bug.cgi?id=29939
Comment on attachment 47163 (Patch)

I don't want to skip this. We previously had authentication tests crash, and that helped catch a bad regression - no sense in losing regression testing for auth code.
The auth tests clearly are flakey, as demonstrated in the numerous bugs above. What should we do if we don't skip them?
As long as there is no expectations mechanism that lets us keep testing for crashes and further regressions, all we can do is suffer the flakiness, assuming no one is going to try and fix it.
I agree that a test expectations mechanism is the right way to go. Sadly, we don't have one yet, and leaving the bots red until some future time when we do is silly.

It's also silly for one test to hold 11,000 tests hostage, which is exactly what flakey tests do. They reduce the usefulness of the rest of the tests by making it harder for someone to know whether their patch is correct or not. People stop trusting the tests and the bots.

By saying "but what if it catches a crasher later", you're arguing that we should exchange a certain current value for an unlikely future value. The current value is green bots catching real regressions quickly, because people know that red means they broke something. The unlikely future value is that this test would be the only one to catch some crasher. We could put numbers on such an estimate, but I assure you the future value is not worth the current cost.

I agree skipping tests is less than ideal, but it's the only tool we currently have to keep the bots green. That, or rollouts. Since we can't roll out the change that broke this test, or realistically the one that added it, the best solution we have is to skip it, with plans to mark it flakey instead when we have the technology. :)
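For the record, skipping on the Mac ports means adding the test's path to the relevant platform Skipped list (e.g. LayoutTests/platform/mac/Skipped); a sketch of such an entry, with illustrative comment text:

    # Flakey: intermittently times out on the bots; see this bug.
    http/tests/xmlhttprequest/basic-auth.html

Once an expectations mechanism along the lines of Chromium's test_expectations.txt is available, the test could instead be annotated as flakey rather than skipped, with an entry roughly like "BUGWKNNNNN : http/tests/xmlhttprequest/basic-auth.html = PASS TIMEOUT" - the bug-number placeholder and exact syntax here are assumptions, not the final format.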
Failed on Snow Leopard: http://build.webkit.org/results/SnowLeopard%20Intel%20Release%20%28Tests%29/r57141%20%287752%29/http/tests/xmlhttprequest/basic-auth-pretty-diff.html
Eight months later and this is still failing very frequently on the bots. Sadness.
This must have been skipped or something. We haven't seen it fail in 4 months. Closing.