Bug 29322 - REGRESSION: http/tests/xmlhttprequest/cross-origin-authorization.html is failing/crashing intermittently
Summary: REGRESSION: http/tests/xmlhttprequest/cross-origin-authorization.html is fail...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC OS X 10.5
: P1 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks: 29505
  Show dependency treegraph
 
Reported: 2009-09-16 18:35 PDT by Eric Seidel (no email)
Modified: 2009-09-29 10:34 PDT (History)
5 users (show)

See Also:


Attachments
diff of failure (1.17 KB, text/plain)
2009-09-16 18:45 PDT, Eric Seidel (no email)
no flags Details
Proposed fix to the test to make them more reliable. (5.79 KB, patch)
2009-09-18 15:18 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff
Proposal to skip tests (to keep the bots green) until the bugs can be fixed (1.42 KB, patch)
2009-09-23 19:12 PDT, Eric Seidel (no email)
ap: review-
Details | Formatted Diff | Diff
crash log from crash seen while trying to land bug 29612 (55.96 KB, text/plain)
2009-09-28 16:07 PDT, Eric Seidel (no email)
no flags Details
proposed fix (16.54 KB, patch)
2009-09-28 18:23 PDT, Alexey Proskuryakov
beidson: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2009-09-16 18:35:34 PDT
http/tests/xmlhttprequest/cross-origin-authorization.html is failing intermittently

It failed for the commit-queue when trying to land bug 

and has failed on the bots today as well:
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48438%20(5043)/results.html
Comment 1 Eric Seidel (no email) 2009-09-16 18:35:54 PDT
Sorry, I meant "when trying to land bug 29319" :)
Comment 2 Eric Seidel (no email) 2009-09-16 18:45:08 PDT
Created attachment 39675 [details]
diff of failure
Comment 3 Eric Seidel (no email) 2009-09-16 18:59:05 PDT
(I've marked this as a regression because I've only seen this start to fail in the last few days.  It's possible it's been failing longer, but we don't have a good way to know yet.
Comment 4 Eric Seidel (no email) 2009-09-16 19:15:12 PDT
Speak of the devil.  The most recent build again:
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48448%20(5052)/results.html
Comment 6 Alexey Proskuryakov 2009-09-16 19:29:39 PDT
CC'ing Brady, because he did some major work in this area recently.
Comment 7 Brady Eidson 2009-09-16 19:37:30 PDT
Only Leopard?
Comment 8 Brady Eidson 2009-09-16 19:38:14 PDT
And do we know when the first failure was?  When I last checked in work on this, I watched the bots closely for awhile and saw no Leopard failures.
Comment 9 Eric Seidel (no email) 2009-09-16 19:41:10 PDT
The first time the commit-queue saw this failure was yesterday:
https://bugs.webkit.org/show_bug.cgi?id=27846#c5

It's possible that the buildbots saw the failure before that though.
Comment 10 Darin Adler 2009-09-18 10:46:02 PDT
I saw this test time out on my Snow Leopard machine just now.
Comment 11 Brady Eidson 2009-09-18 15:18:39 PDT
Created attachment 39798 [details]
Proposed fix to the test to make them more reliable.

This change might make them more reliable, and doesn't seem to make things worse - they all continue to pass on my machine.
Comment 12 Alexey Proskuryakov 2009-09-18 15:19:27 PDT
Comment on attachment 39798 [details]
Proposed fix to the test to make them more reliable.

r=me!
Comment 13 Eric Seidel (no email) 2009-09-18 15:20:35 PDT
Yay!

Saw this again just now:
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48536%20(5140)/results.html

--- layout-test-results/http/tests/xmlhttprequest/cross-origin-authorization-expected.txt	2009-09-18 15:11:01.000000000 -0700
+++ layout-test-results/http/tests/xmlhttprequest/cross-origin-authorization-actual.txt	2009-09-18 15:11:01.000000000 -0700
@@ -6,11 +6,11 @@
 SCRIPT SRC='...' Should succeed, since authorization is sent for cross-origin subresource loads.
 PASS: Loaded, user test
 Cross-origin XMLHttpRequest (sync), testing stored authorization.
-PASS
+FAIL: 401 Authorization required
 Cross-origin XMLHttpRequest (sync), testing cookies.
 PASS
 Cross-origin XMLHttpRequest (async), testing stored authorization.
-PASS
+FAIL: Received error event.
 Cross-origin XMLHttpRequest (async), testing cookies.
 PASS
 Cross-origin XMLHttpRequest (sync), testing authorization with explicitly provided credentials that should be ignored.

Hopefully your fix will do the trick!
Comment 14 Brady Eidson 2009-09-18 15:27:08 PDT
Committed revision 48541.

Not closing the bug, because I've never been able to reproduce this and it's intermittent for those affected, so we'll have to let it bake a little...
Comment 15 Eric Seidel (no email) 2009-09-19 11:11:01 PDT
http/tests/xmlhttprequest/cross-origin-no-authorization.html -> failed
just failed too while trying to land bug 29345.  I assume these are related?  I'm happy to file a separate bug.
Comment 16 Brady Eidson 2009-09-19 22:25:29 PDT
Definitely related, but it might use a different .php script that I didn't cleanup - filing a seperate bug will make sure someone takes a closer look  :)
Comment 17 Alexey Proskuryakov 2009-09-20 12:09:07 PDT
Is there something that can be filed as a CFNetwork bug? "Connection: close" clearly shouldn't be a prerequisite for authentication working properly.
Comment 18 Eric Seidel (no email) 2009-09-21 18:13:53 PDT
Comment on attachment 39798 [details]
Proposed fix to the test to make them more reliable.

This was landed as r48541 as noted above.  Clearing the r+ so that it doesn't show up in the list-of-stuff-to-be-committed. :)  Yay for non-flakey tests!
Comment 19 Eric Seidel (no email) 2009-09-22 10:01:53 PDT
It looks like cross-origin-authorization.php might be the script missing the close, if Brady's theory is correct.  We saw this again on the Leopard Debug bot just now:
http://build.webkit.org/results/Leopard%20Intel%20Debug%20(Tests)/r48636%20(5177)/results.html
Comment 20 Eric Seidel (no email) 2009-09-22 15:35:24 PDT
Leopard Release bot just now:
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48652%20(5247)/results.html

I'm going to test to see if:
run-webkit-tests --iterations 100 http/tests/xmlhttprequest/cross-origin-authorization.htm
will reproduce this locally on my leopard box.
Comment 21 Eric Seidel (no email) 2009-09-22 15:46:25 PDT
run-webkit-tests --iterations 100
http/tests/xmlhttprequest/cross-origin-authorization.html
was not failing for me.

run-webkit-tests --iterations 100 http/tests/xmlhttprequest
Saw two failures, related to cookies:
http/tests/xmlhttprequest/cookies.html -> failed
http/tests/xmlhttprequest/cross-origin-cookie-storage.html -> failed
I've not investigated.  I expect the cookie things are an issue of test design, unrelated to this bug.
Comment 22 Eric Seidel (no email) 2009-09-23 16:43:18 PDT
I would like to suggest that we skip http/tests/xmlhttprequest/cross-origin-authorization.html on leopard to keep the builders green.  Thoughts?

Failed again just now:
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48685%20(5280)/results.html
Comment 23 Alexey Proskuryakov 2009-09-23 17:28:43 PDT
It seems that at this point, the flakiness primarily affects the commit queue. Could it have its own extended skip list? That's how a human committer would operate.
Comment 24 Eric Seidel (no email) 2009-09-23 17:30:38 PDT
Flakiness affects us all. :) But yes, we could have a separate skipped list for the commit-queue.

I'm not sure what we gain by running flakey tests and having them occasionally cause the tree to fail.  It just makes the build bots harder to read/use.
Comment 25 Eric Seidel (no email) 2009-09-23 18:44:31 PDT
Another solution would be to use test-expectation files like Chromium does and add both PASS and FAIL as an expectation for this test.

http://src.chromium.org/viewvc/chrome/trunk/src/webkit/tools/layout_tests/test_expectations.txt

Sadly, it failed yet again just now:
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48693%20(5287)/results.html

I think for now I'll write up a patch to skip the test on Leopard, and if you folks agree we can go ahead an make it so.
Comment 26 Alexey Proskuryakov 2009-09-23 18:52:45 PDT
I disagree with swiping potential security issues under the carpet. They are supposed to inflict pain until fully investigated.
Comment 27 Eric Seidel (no email) 2009-09-23 19:06:38 PDT
(In reply to comment #26)
> I disagree with swiping potential security issues under the carpet. They are
> supposed to inflict pain until fully investigated.

I think if we have trouble forgetting to fix security issues, then we have bigger problems. :)  In this case, I think tracking bugs is what the bug tracker is for.  Telling us if our current build is OK is what the layout tests is for.  Flakey tests make the Layout Tests less effective and thus should be turned into bugs and skipped.  We should fix the bugs in the tracker(s) and rely on our Layout Tests to tell us if our builds work or not. :)
Comment 28 Eric Seidel (no email) 2009-09-23 19:12:26 PDT
Created attachment 40034 [details]
Proposal to skip tests (to keep the bots green) until the bugs can be fixed
Comment 29 Eric Seidel (no email) 2009-09-23 23:13:54 PDT
Per Alexey's request, we've taken the discussion to webkit-dev.
Comment 30 Eric Seidel (no email) 2009-09-24 13:42:33 PDT
Failed again just now. :(
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48728%20(5319)/results.html
Comment 31 Eric Seidel (no email) 2009-09-24 13:54:14 PDT
And again. :(
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48729%20(5322)/results.html
I think this one is quickly winning the "most flakey test" award. :)
Comment 32 Alexey Proskuryakov 2009-09-24 16:31:09 PDT
As far as I can tell, this test wasn't flaky when landed. When did this problem begin to happen?
Comment 33 Eric Seidel (no email) 2009-09-24 16:43:39 PDT
The earliest record the commit-queue has of this failure is from 11AM on 9/15/09:
https://bugs.webkit.org/show_bug.cgi?id=27846#c5

I don't think the buildbot logs go back that far, but I could be wrong.
Comment 34 Eric Seidel (no email) 2009-09-24 16:48:09 PDT
The commit-queue box was upgrade to 10.5.8 on 8/12/09, so I don't believe that's related.
Comment 35 Eric Seidel (no email) 2009-09-26 00:54:56 PDT
The commit-bot saw this test CRASH instead of fail this evening:
https://bugs.webkit.org/show_bug.cgi?id=29612#c27
That's kinda concerning. :(
Comment 36 Eric Seidel (no email) 2009-09-26 00:55:25 PDT
I can lookup the crash log for you on Monday when I'm back in the office and post it here.
Comment 37 Eric Seidel (no email) 2009-09-28 12:55:50 PDT
Claimed another victim.  bug 28054.
Comment 38 Eric Seidel (no email) 2009-09-28 16:07:45 PDT
Created attachment 40268 [details]
crash log from crash seen while trying to land bug 29612
Comment 39 Eric Seidel (no email) 2009-09-28 16:10:42 PDT
Comment on attachment 40268 [details]
crash log from crash seen while trying to land bug 29612

        if (!delegate->m_initialCredential.isEmpty()) {
            // Apply basic auth header
            String unencoded = delegate->m_initialCredential.user() + ":" + delegate->m_initialCredential.password();
            CString encoded = unencoded.utf8().base64Encode();
            String authHeader = String::format("Basic %s", encoded.data());
            requestWithInitialCredentials.addHTTPHeaderField("Authorization", authHeader);
        }

Is the crashing code.  That could be related to the failures seen in this test. It's possible that this test is flakey due to memory issues?

I'm also happy to file a separate bug for the crash if desired.
Comment 40 Eric Seidel (no email) 2009-09-28 16:57:59 PDT
Entertainingly, two related failures occurred just now:
http://build.webkit.org/results/Leopard%20Intel%20Release%20(Tests)/r48845%20(5501)/results.html
Alexey said he's debugging this as we speak, so here's hoping for successful resolution!
Comment 41 Alexey Proskuryakov 2009-09-28 17:18:54 PDT
Comment on attachment 40034 [details]
Proposal to skip tests (to keep the bots green) until the bugs can be fixed

I think I know what causes this; let's not skip the test.
Comment 42 Eric Seidel (no email) 2009-09-28 17:20:06 PDT
Thank you Alexey for investigating!
Comment 43 Eric Seidel (no email) 2009-09-28 17:59:16 PDT
Documenting frequency:  Another commit-bot failure: https://bugs.webkit.org/show_bug.cgi?id=29784#c4
Comment 44 Alexey Proskuryakov 2009-09-28 18:19:35 PDT
<rdar://problem/7259965>
Comment 45 Alexey Proskuryakov 2009-09-28 18:23:47 PDT
Created attachment 40276 [details]
proposed fix
Comment 46 Oliver Hunt 2009-09-28 22:23:03 PDT
Comment on attachment 40276 [details]
proposed fix

Why are there two copies of encodeBasicAuthorization added? surely one shared
implementation would be superior?
Comment 47 Brady Eidson 2009-09-29 08:22:02 PDT
Comment on attachment 40276 [details]
proposed fix

Yah...  I'm not gonna make it a deal breaker like Olliej, but if you wanted a shared implementation, no reason that encodeBasicAuthorization couldn't be moved to the Credential class, or at least to Credential.h as a free function.

Since you're always encoding the name and password from the same credential, it's logically an operation on the Credential anyways.
Comment 48 Alexey Proskuryakov 2009-09-29 09:24:32 PDT
Committed revision 48879.

I don't think that specifics of credential encoding for HTTP should be part of Credential class - its job is to hold the data, not to transfer it over HTTP. We'll probably want to factor out the code when implementing the same functionality for digest auth. 

The Basic case is just a few straightforward lines anyway, probably not worth having a separate file for this.
Comment 49 Eric Seidel (no email) 2009-09-29 10:34:08 PDT
Thank you for taking this on Alexey.  Here's hoping for green bots!