I've just debugged a problem with a Web forum that didn't work in Safari because a CR/LF sequence managed to get into the request URL. Firefox (and, presumably, WinIE) strips CR and LF characters rather than percent-encoding them.
We should investigate which other characters need to be stripped, and whether this applies to URLs other than those used in XMLHttpRequest.
Created attachment 8142 [details]
test case (needs tcpdump)
Request from Safari:
GET /?%0D%0A HTTP/1.1
Created attachment 8251 [details]
Yes, both Firefox and WinIE strip CR, LF and TAB, and this happens for all URLs, not just XMLHttpRequest (I've tried IFRAME SRC, window.location and META HTTP-EQUIV Refresh). No other characters in the range 0x01 to 0x20 are stripped (as tested with Firefox).
No idea why they do this; it doesn't really look like a security measure. My only wild guess is that this behavior originates with Gopher selector syntax :-)
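To make the difference concrete, here is a minimal Python sketch of the two behaviors discussed above: stripping CR, LF and TAB (what Firefox and WinIE were observed to do) versus percent-encoding them (what produced the `%0D%0A` in the Safari request). The function name `strip_url_whitespace` is hypothetical and chosen for illustration only; this is not code from any of the browsers.

```python
from urllib.parse import quote

# The only characters observed to be stripped: CR, LF and TAB.
STRIPPED_CHARS = "\r\n\t"


def strip_url_whitespace(url: str) -> str:
    """Remove every CR, LF and TAB from url; leave all other characters alone."""
    return url.translate({ord(c): None for c in STRIPPED_CHARS})


# Stripping drops the CR/LF pair entirely.
print(strip_url_whitespace("/?\r\n"))

# Percent-encoding keeps the characters in the request, in escaped form.
print("/?" + quote("\r\n"))
```

Other control characters and the space (0x20) pass through `strip_url_whitespace` unchanged, matching the observation that nothing else in that range is stripped.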
Comment on attachment 8251 [details]
This bug is still very much open. The proposed fix only works for paths and reference fragments. If CR/LF/TAB appear in the host or scheme, KURL gets very confused. When they appear in the scheme, KURL doesn't even recognize the URL as absolute. When they appear in the host, not only does parsing fail, but characters appearing later in the path are not removed either.
I will attach a testcase.
Created attachment 17991 [details]
Test case showing bug in host and scheme.
All three of the links should go to Apple. Firefox and IE agree about all of them, WebKit fails on all of them.
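As a rough illustration of the behavior the other browsers appear to implement, the sketch below strips CR, LF and TAB from the whole URL string before any parsing, so the scheme, host and path are all handled the same way. The helper `strip_then_parse` and the three sample links are assumptions made up for this example; they are not the contents of the attached test case, and this is not how KURL is implemented.

```python
from urllib.parse import urlsplit


def strip_then_parse(url: str):
    """Drop CR, LF and TAB everywhere in the string, then split it into parts."""
    cleaned = url.translate({ord(c): None for c in "\r\n\t"})
    return urlsplit(cleaned)


# Hypothetical links with a stray character in the scheme, host and path.
links = [
    "ht\ntp://www.apple.com/",
    "http://www.ap\tple.com/",
    "http://www.apple.com/in\r\ndex.html",
]

for link in links:
    parts = strip_then_parse(link)
    print(parts.scheme, parts.hostname, parts.path)
```

Stripping up front means no individual URL component needs its own special case, which is the property the host and scheme links are meant to exercise.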
Could you please file a new bug for that? To avoid confusion, we generally don't reopen bugs if the fix was incomplete - only if it was completely wrong, and had to be rolled out.
For reference, the fix for this was landed back in r14320. Marvin, it would be great if you could open a new bug about the issues you mentioned.
I'm re-closing this bug. Marv, please open a new one.