From <https://bugzilla.mozilla.org/show_bug.cgi?id=261929>. WinIE 6 and Opera by default encode the path part of the URL as UTF-8, and use the page encoding only for the query part. A proposed standard on Internationalized Resource Identifiers <http://www.w3.org/International/iri-edit/> says that UTF-8 should be unconditionally used for all parts, and IE 7 beta preview2 reportedly works this way, intentionally or not. Safari uses the page encoding even for the path part, matching Firefox (see the Mozilla bug mentioned above). Besides the W3C test from the bug URL, the following page has been mentioned as an example: <http://www.cdpkorea.com/zboard4/zboard.php?id=pdsboard&page=1&page_num=20&select_arrange=headnum&desc=&sn=off&ss=on&sc=on&keyword=&no=43865&category> (there should be four photos, not four replacement images).
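For anyone who wants to see the difference in bytes, here is a rough Python sketch of the behaviors described above. The helper `encode_url`, the sample path and query, and the ISO-8859-1 page encoding are all made up for illustration; none of this comes from any browser's code.

```python
from urllib.parse import quote

def encode_url(path, query, page_encoding, utf8_path=True):
    """Percent-encode a path and a query, each with its own charset.

    Hypothetical helper: utf8_path=True mimics the WinIE 6 / Opera default
    (UTF-8 path, page-encoded query); utf8_path=False mimics using the
    page encoding for the path as well.
    """
    path_encoding = "utf-8" if utf8_path else page_encoding
    encoded_path = quote(path, safe="/", encoding=path_encoding)
    encoded_query = quote(query, safe="=&", encoding=page_encoding)
    return encoded_path + "?" + encoded_query

# Assumed example: a link on a page served as ISO-8859-1.
ie_style = encode_url("/café", "q=café", "iso-8859-1")
page_style = encode_url("/café", "q=café", "iso-8859-1", utf8_path=False)
# IRI draft behavior: UTF-8 everywhere, regardless of page encoding.
iri_style = encode_url("/café", "q=café", "utf-8")

print(ie_style)    # /caf%C3%A9?q=caf%E9
print(page_style)  # /caf%E9?q=caf%E9
print(iri_style)   # /caf%C3%A9?q=caf%C3%A9
```

Only the path differs between the first two results, which is why a server that stores file names in UTF-8 serves the images to one browser and not the other.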
Created attachment 9001: proposed fix. The major browsers disagree on many details of non-ASCII URI handling, and both Firefox 3 and WinIE 7 include major changes to it. This patch makes a single modification that seems undisputed, and includes a test that verifies the status quo.
Comment on attachment 9001 (proposed fix): Please disregard the empty utf8-window-location.html in the patch.
Comment on attachment 9001 (proposed fix): I wonder what the real-world impact of this is going to be. It's interesting to hear what the various browsers do, but I also wonder what the various websites do. Do we know of any websites where the old Safari would work and the new one would fail? r=me
Committed revision 15010. (In reply to comment #3) The Mozilla bug mentions one page that needs this change, and has the following comment: "In the past, I saw many web sites asking their visitors to turn off 'Always send URLs in UTF-8' in MSIE. These days, I rarely see it." Sites that would regress are those running older Unices, with file systems not in UTF-8 (and without an Apache module recoding file paths). Since WinIE and Opera default to UTF-8 for paths, such sites are apparently rare. I do not know of any examples. Actually, I'm surprised that we haven't had bug reports with this issue as the root cause (or are they all in Radar?). Myself, I have seen people building in-house .asp pages with the Windows Cyrillic charset and Russian file names; those won't work in current Firefox and Safari releases (I think; I never had a chance to actually test that).
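To illustrate the kind of site that could regress, here is a small Python sketch of a server whose file names are stored in windows-1251 and which does no recoding. The file name, the `server_lookup_name` function, and the lookup logic are assumptions for illustration, not taken from any real server.

```python
from urllib.parse import quote, unquote_to_bytes

# Assumed setup: file names on disk are windows-1251, and so is the page.
name = "файл.asp"
fs_encoding = "windows-1251"

# What a browser sends for the path under each policy.
sent_utf8 = quote(name, encoding="utf-8")       # UTF-8 path
sent_page = quote(name, encoding=fs_encoding)   # page-encoded path

def server_lookup_name(request_path):
    """Hypothetical server step: undo percent-encoding, then treat the
    raw bytes as a file name in the file system's own charset."""
    return unquote_to_bytes(request_path).decode(fs_encoding, errors="replace")

seen_utf8 = server_lookup_name(sent_utf8)
seen_page = server_lookup_name(sent_page)

print(seen_page == name)  # True: the page-encoded path finds the file
print(seen_utf8 == name)  # False: the UTF-8 bytes are misread as windows-1251
```

With a UTF-8 file system the two outcomes are reversed, which matches the observation that the sites at risk are those whose file names are not stored in UTF-8.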