Bug 63267 (RESOLVED CONFIGURATION CHANGED)
WebSockets constructor erroneously unescapes forward slashes in URLs
https://bugs.webkit.org/show_bug.cgi?id=63267
Brad Wright
Reported 2011-06-23 11:31:06 PDT
When using the following constructor:

    ws = new WebSocket('ws://localhost/socket.io/blah%2Ffoo%2Fbar/websocket');

the server logs the following raw request:

    GET /socket.io/blah/foo/bar/websocket HTTP/1.1

This happens in the current WebKit nightly (Version 5.0.5 (6533.21.1, r89577)) and in current Safari. It indicates that the "%2F" escaped characters are being incorrectly unescaped before being sent to the server.

When I do the same in development Chrome, the server logs this raw request:

    GET /socket.io/blah%2Ffoo%2Fbar/websocket HTTP/1.1

which is what I would expect. I have verified this both from the Developer console and from within a web page.
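To illustrate why the difference between the two logged requests matters to a server, here is a minimal sketch that compares their path segments. The `segments` helper is hypothetical, and `decodeURIComponent` is only an assumed stand-in for whatever unescaping the browser performed.

```javascript
// Path from the request line the reporter expected to see.
const expectedPath = '/socket.io/blah%2Ffoo%2Fbar/websocket';

// Assumption: decodeURIComponent approximates the unescaping that was observed.
const observedPath = decodeURIComponent(expectedPath);

// Hypothetical helper: split a path into its non-empty segments.
function segments(path) {
  return path.split('/').filter((segment) => segment.length > 0);
}

// With %2F preserved, "blah%2Ffoo%2Fbar" is a single segment.
console.log(segments(expectedPath));

// After unescaping, the same data is spread over three segments.
console.log(segments(observedPath));
```

A server that routes on path segments would therefore see a different resource in each case, which is why the escaped and unescaped forms are not interchangeable.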
Alexey Proskuryakov
Comment 1 2011-06-24 00:22:27 PDT
I don't know if this is a bug, but it's almost certainly not limited to WebSocket. All network paths are likely getting the same treatment.
Brad Wright
Comment 2 2011-06-24 05:37:37 PDT
Doing the same request via the address bar gives me the following request on the server: GET /socket.io/blah%2Ffoo%2Fbar/websocket HTTP/1.1 which looks right (and different to the WebSockets request).
Yuta Kitamura
Comment 3 2011-06-26 21:01:32 PDT
If I read RFC 3986 correctly, this sounds like a bug: reserved characters should not be decoded. WebSocket uses KURL::path() to obtain the path component of the given URL, and KURL::path() unescapes percent-encoded characters, which causes this issue. Chromium is not affected because KURLGoogle::path() does not unescape. So the real problem seems to be in KURL::path(). Who knows KURL well?
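The RFC 3986 point can be sketched as follows: only percent-encoded octets that correspond to unreserved characters (ALPHA, DIGIT, "-", ".", "_", "~") are safe to decode, while reserved characters such as "/" (%2F) stay encoded. This is an illustrative sketch, not the KURL or KURLGoogle code; `decodeUnreservedOnly` is a hypothetical name.

```javascript
// Unreserved characters per RFC 3986 section 2.3.
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

// Hypothetical helper: decode only escapes of unreserved characters.
// Every other escape is kept, with its hex digits upper-cased.
function decodeUnreservedOnly(path) {
  return path.replace(/%([0-9A-Fa-f]{2})/g, (match, hex) => {
    const ch = String.fromCharCode(parseInt(hex, 16));
    return UNRESERVED.test(ch) ? ch : match.toUpperCase();
  });
}

// %7E is "~" (unreserved) and is decoded; %2F is "/" (reserved) and is kept.
console.log(decodeUnreservedOnly('/socket.io/blah%2Ffoo%7Ebar/websocket'));
```

Under this rule the path from the report is returned unchanged, because %2F encodes a reserved character.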
Mike West
Comment 4 2011-07-03 02:10:02 PDT
Darin Fisher wrote the code in question, and Darin Adler reviewed it (https://bugs.webkit.org/show_bug.cgi?id=23546). They'd probably be good people to ask about the difference in escaping behavior, as it's apparently intentional. I'm adding both to CC.
Darin Fisher (:fishd, Google)
Comment 5 2011-07-06 12:44:58 PDT
Brett Wilson (brettw@chromium.org) is actually the original author of KURLGoogle.cpp. I just helped upstream it.

It is almost never a good idea for client software to unescape URLs. The only exception I know of is when you want to display a URL to the user. It is, in my opinion, a bug that KURL unescapes URL segments. You should always echo what you get back to the server and let the server deal with the unescaping.

Mozilla used to unescape locally in some cases, and in almost every case it resulted in a security bug. Eventually, Mozilla moved to a model of never unescaping unless it was for UI purposes. Chrome was designed with a similar principle.

I fully support eliminating the unescaping code from KURL, although I'm not certain that it will not have unwanted side effects for applications embedding WebKit. They may need to do their own unescaping at the UI level, which they may have been counting on WebKit doing for them.
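The principle described here (echo the URL to the server as received, unescape only for display) might look like the following minimal sketch. It uses the standard `URL` class as a stand-in for a browser's URL object; `requestTarget` and `displayPath` are hypothetical names, not WebKit or Chromium APIs.

```javascript
// Hypothetical helper: the path and query to put on the request line.
// Nothing is unescaped here; the server gets what the caller wrote.
function requestTarget(urlString) {
  const url = new URL(urlString);
  return url.pathname + url.search;
}

// Hypothetical helper: a human-readable path for UI purposes only.
// The result must never be sent back to the server.
function displayPath(urlString) {
  const { pathname } = new URL(urlString);
  try {
    return decodeURIComponent(pathname);
  } catch (error) {
    // Malformed escapes (for example %FF) are shown as they are.
    return pathname;
  }
}

const wsUrl = 'ws://localhost/socket.io/blah%2Ffoo%2Fbar/websocket';
console.log('GET ' + requestTarget(wsUrl) + ' HTTP/1.1');
console.log(displayPath(wsUrl));
```

Keeping the two functions separate makes it harder for a display-only string to leak into network code.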
Alexey Proskuryakov
Comment 6 2016-09-08 13:57:13 PDT
Anne van Kesteren
Comment 7 2023-05-12 09:25:21 PDT
This was resolved with the new URL parser.