Summary: | More IDNs containing Unicode combining marks should probably be displayed as punycode | ||||||
---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Mathias Bynens <mathias> | ||||
Component: | Platform | Assignee: | Nobody <webkit-unassigned> | ||||
Status: | NEW --- | ||||||
Severity: | Normal | CC: | annevk, ap, mathias, mcatanzaro, michael | ||||
Priority: | P2 | ||||||
Version: | 528+ (Nightly build) | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Attachments: |
|
Description
Mathias Bynens
2014-01-08 01:05:56 PST
The original title described something we certainly can't do - there are legitimate domain names with combining marks (after all, the whole purpose of IDN is to allow domain names in languages other than English). Re-titling to track a more specific subset.

The specific example of xn--luaaaaaaaaaaaaaaaaaaaaaaaaaaaa8465w seems like something that is allowed by IDN 2003, but isn't allowed by IDN 2008. Not sure how to check if it's allowed by Unicode Technical Standard #46 (yes, it's a mess of incompatible standards).

The other specific example of xn--apple-xvd ("apple" with an arrow) is different, as even IDN 2008 allows it, as far as I can tell.

(In reply to comment #1)
> The original title described something we certainly can't do - there are legitimate domain names with combining marks

For the record, the original title was: “IDNs containing Unicode combining marks should be displayed in Punycoded form”. I disagree that’s something you “certainly can’t do”; every other browser seems to do it.

> […] after all, the whole purpose of IDN is to allow domain names in languages other than English […]

The domain names would still work when entered in their raw Unicode form. apple͢.com in the address bar would still resolve to xn--apple-xvd.com. This bug is purely about the way the domains are displayed in the address bar.

See http://wiki.whatwg.org/wiki/URL#UI for some details around this.

> I disagree that’s something you “certainly can’t do”; every other browser seems to do it.

Do what? Are you saying that every other browser displays "http://www.café.fr" as punycode?

> See http://wiki.whatwg.org/wiki/URL#UI for some details around this.

Thank you, this is very useful information.
To avoid potential link rot, here are the links the wiki currently contains:
http://www.chromium.org/developers/design-documents/idn-in-google-chrome (also includes summary for other browsers)
https://wiki.mozilla.org/IDN_Display_Algorithm
http://www.alvestrand.no/pipermail/idna-update/2011-December/date.html (has lots of background discussion)

Created attachment 255350 [details]
personal public info
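
The Punycode forms quoted earlier in this thread can be checked with Python's built-in `punycode` codec. The sketch below decodes an `xn--` label and reports whether a combining mark is still present after NFC normalization. The helper names `decode_label` and `has_combining_mark` are hypothetical, and the check is only an illustration of the distinction between "apple͢" and "café". It is not the display algorithm of WebKit or any other browser.

```python
import unicodedata


def decode_label(label: str) -> str:
    """Decode one ACE label (xn--...) to Unicode; other labels pass through."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


def has_combining_mark(label: str) -> bool:
    """Hypothetical check: True if a combining mark survives NFC normalization."""
    text = unicodedata.normalize("NFC", decode_label(label))
    # General categories Mn, Mc and Me are the Unicode "mark" categories.
    return any(unicodedata.category(ch).startswith("M") for ch in text)


# "apple" followed by U+0362 (the arrow): no precomposed form exists,
# so the combining mark remains after normalization.
print(has_combining_mark("xn--apple-xvd"))  # True

# "cafe" + U+0301 composes to the single code point U+00E9 under NFC,
# so no combining mark remains.
print(has_combining_mark("cafe\u0301"))  # False
```

A label such as "café" passes this check because its accent has a precomposed form, which is one reason a blanket "any combining mark" rule and a narrower rule give different answers.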
Summary: The .de NIC (denic.de) will implement IDNA2008 from 2010-11-16 onwards, in particular allowing ß (\u00df) in domain names. Hence, the automatic translation of ß to ss may result in looking up the wrong domain name, allowing for spoofing attacks. (DENIC will run a sunrise period (2010-10-26 to 2010-11-15) during which holders of domains with ss will be allowed to register the respective ß domain in advance.)
http://www.denic.de/en/domains/internationalized-domain-names/sharp-s.html

ß and ss are not interchangeable in German. Writing ss instead of ß is just a makeshift. Germans expect ß to just work if umlauts work (which they have for a while).

Steps to Reproduce:
1. Start Safari on OS X Mountain Lion or on iOS (the same happens on El Capitan and iOS 9.x)
2. Open the domain "http://www.heß.de" (a family name).

Expected Results: Safari will change the "ß" character to "ss" and open "http://www.hess.de" which is a completely different family name (last name). For example: "Michael Heß" and "Peter Hess".

Firefox Nightly version 46.0a1 (and newer) now handles this correctly. Here is some information:
https://hg.mozilla.org/mozilla-central/rev/ac843b130537
https://hg.mozilla.org/mozilla-central/rev/dd3d6c83f354
https://hg.mozilla.org/mozilla-central/rev/f23234d57557

And the Firefox bug thread for this issue:
https://bugzilla.mozilla.org/show_bug.cgi?id=479520

You can check the following websites in Firefox Nightly version 46.0a1 (2016-01-24) to understand:
http://www.roessner.de
http://www.rössner.de (is linked to www.roessner.de because it's the same owner, but it's also a separate name in Germany).
http://www.rößner.de

For example:
Bill Roessner
Jack Rössner
Mike Rößner

Browsers already keep "roessner" and "rössner" separate, because these are different names. Now it's time to do the same for the "ß" character.
For now I have to use the URL http://www.xn--rner-vna1l.de in Safari if I want to go to http://www.rößner.de

Thanks & kind regards,
Michael

(In reply to comment #7)
> Expected Results:
> Safari will change the "ß" character to "ss" and open "http://www.hess.de"
> which is a completely different family name (last name). For example:
> "Michael Heß" and "Peter Hess".
>
>
> FireFox Nightly Version 46.0a1 (and newer) did it now correct.

I think you should file a different bug for this. Your issue is orthogonal to this one, almost the opposite, as this bug requires less internationalization. |
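
The ß to ss mapping reported in comment #7 can be reproduced outside Safari with Python's built-in `idna` codec, which implements IDNA 2003 (RFC 3490). The second half is a minimal sketch of encoding a label without that mapping. The helper `to_ace_keep_sharp_s` is hypothetical and skips the normalization and validity checks that a real IDNA 2008 implementation performs, so treat it as an illustration only.

```python
# IDNA 2003: the Nameprep step maps "ß" to "ss" before the lookup,
# so the request goes to a different domain name.
idna2003 = "www.heß.de".encode("idna")
print(idna2003)  # b'www.hess.de'


def to_ace_keep_sharp_s(label: str) -> str:
    """Hypothetical helper: Punycode-encode one label without any mapping."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


# Without the mapping, "ß" survives and the label gets its own ACE form,
# matching the workaround URL quoted in comment #7.
print(to_ace_keep_sharp_s("rößner"))  # xn--rner-vna1l
```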