Bug 126627 - More IDNs containing Unicode combining marks should probably be displayed as punycode
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Platform
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-01-08 01:05 PST by Mathias Bynens
Modified: 2016-12-15 16:27 PST

See Also:


Attachments
personal public info (70.32 KB, text/plain)
2015-06-22 07:12 PDT, bugmenot

Description Mathias Bynens 2014-01-08 01:05:56 PST
For security reasons, internationalized domain names containing Unicode combining marks should be displayed in Punycoded form in Safari’s address bar.

Someone could register xn--apple-xvd.com and it would display in Safari’s address bar as apple͢.com, which enables all kinds of phishing attacks.

See <http://blog.dinaburg.org/2014/01/stupid-idn-tricks-unicode-combining.html>.
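To see what the reported label actually decodes to, here is a minimal Python sketch. It uses only the standard library's `punycode` codec and `unicodedata`; the helper names `decode_label` and `combining_marks` are made up for illustration, and this is not a description of how WebKit decides what to display.

```python
import unicodedata

ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label


def decode_label(label: str) -> str:
    """Return the Unicode form of a single domain label (hypothetical helper)."""
    if label.lower().startswith(ACE_PREFIX):
        # Strip the prefix and decode the remainder with the punycode codec.
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def combining_marks(label: str) -> list:
    """List the characters in the decoded label whose general category is M*."""
    return [ch for ch in decode_label(label)
            if unicodedata.category(ch).startswith("M")]


if __name__ == "__main__":
    marks = combining_marks("xn--apple-xvd")
    print([f"U+{ord(ch):04X}" for ch in marks])
```

For `xn--apple-xvd` the decoded label is "apple" followed by a single mark, U+0362, which is the arrow drawn under the end of the word.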
Comment 1 Alexey Proskuryakov 2014-01-08 09:57:35 PST
The original title described something we certainly can't do - there are legitimate domain names with combining marks (after all, the whole purpose of IDN is to allow domain names in languages other than English). Re-titling to track a more specific subset.

The specific example of xn--luaaaaaaaaaaaaaaaaaaaaaaaaaaaa8465w seems like something that is allowed by IDNA2003, but isn't allowed by IDNA2008. Not sure how to check if it's allowed by Unicode Technical Standard #46 (yes, it's a mess of incompatible standards).

The other specific example of xn--apple-xvd ("apple" with an arrow) is different, as even IDNA2008 allows it, as far as I can tell.
Comment 2 Mathias Bynens 2014-01-09 03:15:24 PST
(In reply to comment #1)
> The original title described something we certainly can't do - there are legitimate domain names with combining marks

For the record, the original title was: “IDNs containing Unicode combining marks should be displayed in Punycoded form”.

I disagree that’s something you “certainly can’t do” — every other browser seems to do it.

> […] after all, the whole purpose of IDN is to allow domain names in languages other than English […]

The domain names would still work when entered in their raw Unicode form. apple͢.com in the address bar would still resolve to xn--apple-xvd.com. This bug is purely about the way the domains are displayed in the address bar.
Comment 3 Anne van Kesteren 2014-01-09 05:43:49 PST
See http://wiki.whatwg.org/wiki/URL#UI for some details around this.
Comment 4 Alexey Proskuryakov 2014-01-09 09:35:34 PST
> I disagree that’s something you “certainly can’t do” — every other browser seems to do it.

Do what? Are you saying that every other browser displays "http://www.café.fr" as punycode?
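The distinction behind this question can be sketched in Python, under the assumption that "contains a combining mark" means "still contains a code point of general category M after NFC normalization". The helper name `has_combining_mark` is hypothetical, and real browser policies are more involved than this check.

```python
import unicodedata


def has_combining_mark(label: str) -> bool:
    """True if the NFC-normalized label still contains a mark (category M*)."""
    normalized = unicodedata.normalize("NFC", label)
    return any(unicodedata.category(ch).startswith("M") for ch in normalized)


if __name__ == "__main__":
    samples = {
        "café (precomposed)": "caf\u00e9",
        "café (e + combining acute)": "cafe\u0301",
        "apple + U+0362": "apple\u0362",
        "Hindi word": "\u0939\u093f\u0928\u094d\u0926\u0940",
    }
    for name, label in samples.items():
        print(name, has_combining_mark(label))
```

Under this assumption "café" is not flagged in either spelling, because NFC composes the accent into a single letter. The spoofed "apple" label is flagged, but so is an ordinary Hindi word, which is why a blanket rule on combining marks would also hit legitimate names.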
Comment 5 Alexey Proskuryakov 2014-01-09 09:37:36 PST
> See http://wiki.whatwg.org/wiki/URL#UI for some details around this.

Thank you, this is very useful information. To avoid potential link rot, the current content of the wiki is these links:

http://www.chromium.org/developers/design-documents/idn-in-google-chrome (also includes summary for other browsers)
https://wiki.mozilla.org/IDN_Display_Algorithm
http://www.alvestrand.no/pipermail/idna-update/2011-December/date.html (has lots of background discussion)
Comment 6 bugmenot 2015-06-22 07:12:17 PDT
Created attachment 255350
personal public info
Comment 7 Michael 2016-04-10 03:55:58 PDT
Summary:
The .de NIC (denic.de) will implement IDNA2008 from 2010-11-16 onwards,
especially allowing for ß (\u00df) in domain names. Hence, the automatic
translation of ß to ss may result in looking up the wrong domain name, allowing
for spoofing attacks.
(DENIC will run a sunrise period (2010-10-26 to 2010-11-15) during which
holders of domains with ss will be allowed to register the respective ß domain
in advance.)

http://www.denic.de/en/domains/internationalized-domain-names/sharp-s.html

ß and ss are not interchangeable in German; writing ss instead of ß is just a makeshift. Germans expect ß to just work wherever
umlauts work (and umlauts have worked for a while).

Steps to Reproduce:
1. Start Safari on OS X Mountain Lion or on iOS (the same on El Capitan and iOS 9.x)
2. Open the domain "http://www.heß.de" (a family name).


Expected Results:
Safari will change the "ß" character to "ss" and open "http://www.hess.de" which is a completely different family name (last name). For example: "Michael Heß" and "Peter Hess".
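The folding described here can be reproduced with Python's built-in `idna` codec, which implements the older IDNA2003 rules (RFC 3490) rather than IDNA2008. This only illustrates the IDNA2003 mapping and says nothing about Safari's actual code path; the function name `to_ascii_idna2003` is an assumption made for this sketch.

```python
def to_ascii_idna2003(hostname: str) -> str:
    """Convert a hostname to its ASCII form using Python's IDNA2003 codec."""
    # The codec applies nameprep to each label, which maps "ß" to "ss".
    return hostname.encode("idna").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_idna2003("www.heß.de"))
    print(to_ascii_idna2003("www.hess.de"))
```

Both hostnames come out identical under IDNA2003, so the two family names can no longer be told apart once the mapping has been applied.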


Firefox Nightly 46.0a1 (and newer) now handles this correctly.

Here is some more information:
https://hg.mozilla.org/mozilla-central/rev/ac843b130537
https://hg.mozilla.org/mozilla-central/rev/dd3d6c83f354
https://hg.mozilla.org/mozilla-central/rev/f23234d57557

And the Firefox bug thread for this issue:
https://bugzilla.mozilla.org/show_bug.cgi?id=479520


You can check the following websites in Firefox Nightly 46.0a1 (2016-01-24) to see the difference:

http://www.roessner.de

http://www.rössner.de
(is linked to www.roessner.de because it's the same owner, but it's also a separate name in Germany).

http://www.rößner.de

For example:
Bill Roessner
Jack Rössner
Mike Rößner

Browsers already handle "roessner" and "rössner" correctly and treat them as separate, because they are different names.

Now it's time to do the same for the "ß" character.


For now I have to use this URL in Safari:

http://www.xn--rner-vna1l.de

if I want to go to:

http://www.rößner.de
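The workaround hostname above is simply the Punycode form of the label with "ß" left intact. A rough Python sketch, assuming no mapping or validation at all (a real IDNA2008 or UTS #46 implementation would also lowercase, normalize and validate each label); `raw_ace_label` and `raw_ace_hostname` are hypothetical names:

```python
def raw_ace_label(label: str) -> str:
    """Punycode-encode one label without any IDNA mapping step (illustrative only)."""
    if label.isascii():
        return label  # plain ASCII labels are left untouched
    return "xn--" + label.encode("punycode").decode("ascii")


def raw_ace_hostname(hostname: str) -> str:
    """Apply raw_ace_label to every dot-separated label of a hostname."""
    return ".".join(raw_ace_label(label) for label in hostname.split("."))


if __name__ == "__main__":
    print(raw_ace_hostname("www.rößner.de"))
```

Because no "ß" to "ss" folding takes place, the result keeps the original spelling and matches the `xn--rner-vna1l` label quoted above.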


thanks & kind regards

Michael
Comment 8 Michael Catanzaro 2016-12-15 16:27:59 PST
(In reply to comment #7)
> Expected Results:
> Safari will change the "ß" character to "ss" and open "http://www.hess.de"
> which is a completely different family name (last name). For example:
> "Michael Heß" and "Peter Hess".
> 
> 
> Firefox Nightly 46.0a1 (and newer) now handles this correctly.

I think you should file a different bug for this. Your issue is orthogonal to this one, and almost the opposite, since this bug asks for less internationalization.