Bug 8770 - XMLHttpRequest should strip CR/LF characters from the URL
Alias: None
Product: WebKit
Classification: Unclassified
Component: XML
Version: 420+
Hardware: Macintosh OS X 10.4
Importance: P2 Normal
Assignee: Alexey Proskuryakov
Depends on:
Reported: 2006-05-07 06:07 PDT by Alexey Proskuryakov
Modified: 2008-01-03 00:21 PST

test case (needs tcpdump) (366 bytes, text/html)
2006-05-07 06:10 PDT, Alexey Proskuryakov
no flags
proposed fix (2.00 KB, patch)
2006-05-11 12:58 PDT, Alexey Proskuryakov
darin: review+
Test case showing bug in host and scheme. (371 bytes, text/html)
2007-12-19 09:54 PST, Brett Wilson (Google)
no flags

Description Alexey Proskuryakov 2006-05-07 06:07:54 PDT
I've just debugged a problem with a Web forum that didn't work in Safari because a CR/LF sequence managed to get into the request URL. Firefox (and, presumably, WinIE) strip CRLF characters rather than percent-encode them.

Should investigate what other characters need to be stripped, and whether this applies to URLs other than those used in XMLHttpRequest.
Comment 1 Alexey Proskuryakov 2006-05-07 06:10:10 PDT
Created attachment 8142
test case (needs tcpdump)

Request from Safari:
GET /?%0D%0A HTTP/1.1
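
For illustration only, here is a minimal Python sketch of the two behaviors being compared. The request target "/?\r\n" is an assumed example chosen to match the request line above, and the variable names are hypothetical. This is not WebKit code.

```python
from urllib.parse import quote

# Assumed example: a request target that picked up a trailing CR/LF pair.
raw_target = "/?\r\n"

# Percent-encoding keeps the control characters in the request (as %0D%0A),
# which is the shape of the request line reported for Safari.
encoded = quote(raw_target, safe="/?")

# Stripping drops the characters entirely, which is what the report
# says Firefox does.
stripped = raw_target.replace("\r", "").replace("\n", "")

print(encoded)
print(stripped)
```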
Comment 2 Alexey Proskuryakov 2006-05-11 12:58:13 PDT
Created attachment 8251
proposed fix

Yes, both Firefox and WinIE strip CR, LF and TAB, and this happens for all URLs, not just XMLHttpRequest (I've tried IFRAME SRC, window.location and META HTTP-EQUIV Refresh). No other characters in the range 0x01...0x20 are stripped (as tested with Firefox).
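
The observed behavior can be sketched in Python as follows. This is only a model of what the comment describes, not the attached patch; the function name strip_url_whitespace and the example URLs are assumptions made for illustration.

```python
# Hypothetical helper modelling the behavior described above:
# remove CR, LF and TAB wherever they occur, and nothing else.
_STRIPPED = str.maketrans("", "", "\r\n\t")


def strip_url_whitespace(url: str) -> str:
    """Return url with every CR, LF and TAB character removed."""
    return url.translate(_STRIPPED)


# CR/LF/TAB are dropped; a space (0x20) and other control characters
# such as 0x01 are left for later stages to handle.
print(repr(strip_url_whitespace("http://example.com/a\r\nb\tc")))
print(repr(strip_url_whitespace("http://example.com/a b\x01c")))
```

Other characters are deliberately left alone here, so percent-encoding them (or not) remains a separate step.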

No idea why they do this; it doesn't really look like a security measure. My only wild guess is that this behavior originates with Gopher selector syntax :-)
Comment 3 Darin Adler 2006-05-11 18:06:53 PDT
Comment on attachment 8251
proposed fix

Comment 4 Brett Wilson (Google) 2007-12-19 09:46:09 PST
This bug is still very much open. The proposed fix only works for paths and reference fragments. If CR/LF/TAB appear in the host or scheme, KURL gets very confused. In the scheme, it won't even recognize it as an absolute URL, and in the host, not only will it fail, but in this case, it won't remove characters appearing later in the path.
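
One way to picture the expected result is to strip the characters from the whole string before any parsing, so that the scheme and host are covered along with the path and fragment. The Python sketch below uses the standard urlsplit function as a stand-in parser. It does not describe how KURL works, and the helper name and the example URL are assumptions.

```python
from urllib.parse import urlsplit

_STRIPPED = str.maketrans("", "", "\r\n\t")


def parse_after_stripping(url: str):
    """Hypothetical helper: drop CR/LF/TAB from the full string, then parse."""
    return urlsplit(url.translate(_STRIPPED))


# Assumed example with stray characters in the scheme, host, path and fragment.
parts = parse_after_stripping("ht\ntp://www.exa\rmple.com/pa\tth#fr\nag")
print(parts.scheme, parts.hostname, parts.path, parts.fragment)
```

Because the stripping happens before the parser sees the string, it does not matter which component the characters landed in.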

I will attach a testcase.
Comment 5 Brett Wilson (Google) 2007-12-19 09:54:09 PST
Created attachment 17991
Test case showing bug in host and scheme.

All three of the links should go to Apple. Firefox and IE agree about all of them; WebKit fails on all of them.
Comment 6 Alexey Proskuryakov 2007-12-19 11:09:24 PST
Could you please file a new bug for that? To avoid confusion, we generally don't reopen bugs if the fix was incomplete - only if it was completely wrong, and had to be rolled out.
Comment 7 Mark Rowe (bdash) 2007-12-22 01:29:44 PST
For reference, the fix for this was landed back in r14320. Brett, it would be great if you could open a new bug about the issues you mentioned.
Comment 8 Darin Adler 2008-01-03 00:21:33 PST
I'm re-closing this bug. Brett, please open a new one.