WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
144194
Support IDN2008 instead of IDN2003
https://bugs.webkit.org/show_bug.cgi?id=144194
Summary
Support IDN2008 instead of IDN2003
Darin Adler
Reported
2015-04-25 11:27:03 PDT
Support IDN2008 instead of IDN2003
Attachments
Patch
(6.03 KB, patch)
2015-04-25 11:28 PDT
,
Darin Adler
no flags
Details
Formatted Diff
Diff
Patch
(7.59 KB, patch)
2016-11-16 10:05 PST
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews101 for mac-yosemite
(961.27 KB, application/zip)
2016-11-16 11:04 PST
,
Build Bot
no flags
Details
Archive of layout-test-results from ews104 for mac-yosemite-wk2
(956.98 KB, application/zip)
2016-11-16 11:09 PST
,
Build Bot
no flags
Details
Archive of layout-test-results from ews113 for mac-yosemite
(1.63 MB, application/zip)
2016-11-16 11:13 PST
,
Build Bot
no flags
Details
Archive of layout-test-results from ews125 for ios-simulator-wk2
(9.37 MB, application/zip)
2016-11-16 11:20 PST
,
Build Bot
no flags
Details
Patch
(19.17 KB, patch)
2016-11-16 11:26 PST
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(21.19 KB, patch)
2016-11-18 11:50 PST
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Patch
(26.41 KB, patch)
2016-11-18 14:34 PST
,
Alex Christensen
no flags
Details
Formatted Diff
Diff
Show Obsolete
(8)
View All
Add attachment
proposed patch, testcase, etc.
Darin Adler
Comment 1
2015-04-25 11:28:23 PDT
Created
attachment 251641
[details]
Patch
Darin Adler
Comment 2
2015-04-25 11:31:04 PDT
Here’s what’s wrong with the patch I uploaded: 1) I haven’t been able to verify that this correctly causes any websites to be loaded differently. I tried testing with <
http://fußball.de
>, but that just redirects to <
http://fussball.de
>. 2) There are no test cases yet! 3) UIDNA_DEFAULT may not be the correct flags to pass to uidna_openUTS46. We need to decide whether to pass one or more of UIDNA_USE_STD3_RULES, UIDNA_CHECK_BIDI, UIDNA_CHECK_CONTEXTJ, UIDNA_NONTRANSITIONAL_TO_ASCII, and UIDNA_NONTRANSITIONAL_TO_UNICODE.
Darin Adler
Comment 3
2015-04-25 11:31:54 PDT
<
rdar://problem/8809208
>
Darin Adler
Comment 4
2015-10-18 15:24:04 PDT
Alexey, any idea what we should do next here?
Alexey Proskuryakov
Comment 5
2015-10-18 22:40:41 PDT
I like the next steps alluded to in
comment 2
. If I were doing this, I'd try to find any real world examples of why the change is a net positive, given that we'd have to start displaying
http://💩.la
as punycode. Reading through UTS#46 may provide enough insight.
Michael
Comment 6
2016-04-10 03:47:31 PDT
Summary: The .de NIC (denic.de) will implement IDNA2008 from 2010-11-16 onwards, especially allowing for ß (\u00df) in domain names. Hence, the automatic translation of ß to ss may result in looking up the wrong domain name, allowing for spoofing attacks. (DENIC will run a sunrise period (2010-10-26 to 2010-11-15) during which holders of domains with ss will be allowed top register the respective ß domain in advance.)
http://www.denic.de/en/domains/internationalized-domain-names/sharp-s.html
ß and ss are not exchangable in German. ss instead of ß is just a makeshift. Germans expect ß to usually just work if umlauts work (which already do for a while). Steps to Reproduce: 1. Start Safari on OS X Mountain Lion or on iOS 2. Open the domain "
http://www.heß.de
" (a family name). Expected Results: Safari will change the "ß" character to "ss" and open "
http://www.hess.de
" which is a completely different family name (last name). For example: "Michael Heß" and "Peter Hess". FireFox Nightly Version 46.0a1 did it now correct. Here are some informations:
https://hg.mozilla.org/mozilla-central/rev/ac843b130537
https://hg.mozilla.org/mozilla-central/rev/dd3d6c83f354
https://hg.mozilla.org/mozilla-central/rev/f23234d57557
And the FireFox Bug Thread for this issue:
https://bugzilla.mozilla.org/show_bug.cgi?id=479520
You can check the following Websites in FireFox Nightly Version 46.0a1 (2016-01-24) to understand:
http://www.roessner.de
http://www.rössner.de
(is linked to www.roessner.de because it's the same owner, but it's also a separate name in Germany).
http://www.rößner.de
For example: Bill Roessner Jack Rössner Mike Rößner The browsers do it right with "roessner" and "rössner" and separate it because these are different names. Now, it's time to do it also right with the "ß" character. For now i have to use the URL in Safari:
http://www.xn--rner-vna1l.de
if i want to go to:
http://www.rößner.de
thanks & kind regards Michael
Alex Christensen
Comment 7
2016-11-16 10:05:33 PST
Created
attachment 294948
[details]
Patch
Alex Christensen
Comment 8
2016-11-16 10:57:55 PST
IDN2003 and UTS#46 are functionally quite similar for URL parsing. The biggest difference I could find was their treatment of U+2132 in
http://www.unicode.org/reports/tr46/#IDNAComparison
As for the uidna_openUTS46 flags: Chromium uses UIDNA_CHECK_BIDI <
https://cs.chromium.org/chromium/src/components/url_formatter/url_formatter.cc?q=UIDNA_CHECK_BIDI&sq=package:chromium&dr=C&l=510
> Firefox uses UIDNA_CHECK_BIDI, UIDNA_CHECK_CONTEXTJ, and effectively UIDNA_NONTRANSITIONAL_TO_UNICODE <
https://dxr.mozilla.org/mozilla-central/source/netwerk/dns/nsIDNService.cpp#139
> If I understand the spec correctly, it basically says to just use UIDNA_DEFAULT.
https://url.spec.whatwg.org/#concept-domain-to-ascii
maybe it should be changed? Anne?
Build Bot
Comment 9
2016-11-16 11:04:46 PST
Comment on
attachment 294948
[details]
Patch
Attachment 294948
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.webkit.org/results/2526540
New failing tests: fast/url/invalid-idn.html fast/encoding/idn-security.html fast/url/idna2008.html fast/url/idna2003.html
Build Bot
Comment 10
2016-11-16 11:04:49 PST
Created
attachment 294952
[details]
Archive of layout-test-results from ews101 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews101 Port: mac-yosemite Platform: Mac OS X 10.10.5
Build Bot
Comment 11
2016-11-16 11:09:45 PST
Comment on
attachment 294948
[details]
Patch
Attachment 294948
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.webkit.org/results/2526548
New failing tests: fast/url/invalid-idn.html fast/url/idna2008.html fast/url/idna2003.html
Build Bot
Comment 12
2016-11-16 11:09:49 PST
Created
attachment 294954
[details]
Archive of layout-test-results from ews104 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5
Build Bot
Comment 13
2016-11-16 11:12:56 PST
Comment on
attachment 294948
[details]
Patch
Attachment 294948
[details]
did not pass mac-debug-ews (mac): Output:
http://webkit-queues.webkit.org/results/2526543
New failing tests: fast/url/invalid-idn.html fast/encoding/idn-security.html fast/url/idna2008.html fast/url/idna2003.html
Build Bot
Comment 14
2016-11-16 11:13:01 PST
Created
attachment 294956
[details]
Archive of layout-test-results from ews113 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews113 Port: mac-yosemite Platform: Mac OS X 10.10.5
Build Bot
Comment 15
2016-11-16 11:20:17 PST
Comment on
attachment 294948
[details]
Patch
Attachment 294948
[details]
did not pass ios-sim-ews (ios-simulator-wk2): Output:
http://webkit-queues.webkit.org/results/2526549
New failing tests: fast/url/idna2008.html fast/url/invalid-idn.html fast/url/idna2003.html
Build Bot
Comment 16
2016-11-16 11:20:22 PST
Created
attachment 294957
[details]
Archive of layout-test-results from ews125 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.11.6
Alex Christensen
Comment 17
2016-11-16 11:26:01 PST
Created
attachment 294958
[details]
Patch
Alexey Proskuryakov
Comment 18
2016-11-16 11:45:06 PST
Comment on
attachment 294958
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=294958&action=review
> LayoutTests/fast/encoding/idn-security-expected.txt:89 > -PASS testIDNRoundTrip(0x337) is 'punycode' > +FAIL testIDNRoundTrip(0x337) should be punycode. Was %u0337.
Looks like there are many changes from PASS to FAIL here. Why is that good?
Alex Christensen
Comment 19
2016-11-16 12:07:44 PST
It is an intentional deviation from what we used to do, and it increases our compliance with the spec and compatibility with other browsers.
Alexey Proskuryakov
Comment 20
2016-11-16 12:15:37 PST
A better way to reflect that in a test is to change the expected value, instead of changing the result to FAIL. Alex, can you address Darin's questions in
comment 2
? Does the patch fix the examples in
comment 6
?
Alex Christensen
Comment 21
2016-11-16 12:21:09 PST
The examples in
comment 6
are not fixed in my proposed patch because my proposed patch matches the spec and Chrome, which use UTS#46, which I believe is what is intended by "Transitional_Processing" in the spec. Maybe we should just switch to IDN2008 without UTS#46, but I'd like Anne's input.
Alex Christensen
Comment 22
2016-11-16 14:08:48 PST
I think we should just switch do IDN2008 to prevent these German homograph attacks.
Michael
Comment 23
2016-11-16 14:50:11 PST
Please don't give up and help us Germans with "ß" in our surnames! We are waiting since 2010 that the Browsers interpret our Domains right. Exactly today 6 years before i register my domain and my dream is, that i can finally enter the domain:
http://www.rößner.de
instead of
http://www.xn--rner-vna1l.de
in the following Browsers: - Safari - Chrome - Opera - MS Edge Mozilla FireFox do it right. Thank you so much & Kind Regards Michael
Anne van Kesteren
Comment 24
2016-11-17 05:41:26 PST
I don't see how you can get around using UTS46. I don't think IDNA2008 without UTS46 really has a processing model. E.g., lowercasing "A" would be undefined. Transitional_Processing of UTS46 matches IDNA2003 pretty closely and is currently what the URL Standard requires since that's what the majority of implementations do. Nontransitional_Processing would be closer to IDNA2008 behavior, while still doing things like lowercasing "A" to "a". I used to be somewhat opposed to switching away from IDNA2003-like behavior since it's not backwards compatible and can cause issues, but Firefox manages to ship it without complaints (
https://bugzilla.mozilla.org/show_bug.cgi?id=1218179
is the bug where the switch was flipped), so maybe it's not that bad.
https://github.com/whatwg/url/issues/110
and
https://github.com/whatwg/url/issues/53
track IDNA issues in the URL Standard and UTS46.
Michael
Comment 25
2016-11-17 07:28:24 PST
(In reply to
comment #24
)
> but Firefox > manages to ship it without complaints > (
https://bugzilla.mozilla.org/show_bug.cgi?id=1218179
is the bug where the > switch was flipped), so maybe it's not that bad.
Thats the point. Maybe the developers and or the community should now take the chance to turn on the way of progress. Give it a try. Please. And the discussed risks will disappear and vanish more and more. Kind Regards Michael
Alex Christensen
Comment 26
2016-11-18 11:50:06 PST
Created
attachment 295171
[details]
Patch
Darin Adler
Comment 27
2016-11-18 12:47:33 PST
Comment on
attachment 295171
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=295171&action=review
Code change seems OK to me, but I am concerned that we have insufficient test coverage.
> LayoutTests/ChangeLog:10 > + * fast/encoding/idn-security-expected.txt: > + * fast/url/idna2003-expected.txt: > + * fast/url/idna2008-expected.txt:
Why are any of these tests still expecting behavior different from what our new code is doing? If we have a new understanding of what is correct behavior, I suggest we update the tests, not just the expected results. If we still think some of our behavior is incorrect, then in that case it would be appropriate to have some test results that say FAIL, but I don’t think it’s OK to have tests that say FAIL even though that failure represents our best understanding of correct behavior, since these are tests that we wrote ourselves for the WebKit project. Is it valuable to still have three different tests, or should we consolidate them? Do we need additional test coverage beyond what is here and what you added to the URLParser.cpp test?
Alex Christensen
Comment 28
2016-11-18 13:13:56 PST
I'll update idn-security.html because it looks like we wrote it starting in
r23963
. idn2003.html and idn2008.html should be unchanged. They test conformance with idn2003 and idn2008 respectively, and those standards don't change. We intentionally deviate from both of them because of UTS #46. I'll also note that some of my tests were based on
https://tools.ietf.org/html/rfc5893
Alex Christensen
Comment 29
2016-11-18 14:34:02 PST
Created
attachment 295195
[details]
Patch
Alex Christensen
Comment 30
2016-11-18 14:47:58 PST
http://trac.webkit.org/changeset/208902
Alex Christensen
Comment 31
2016-11-30 10:19:40 PST
http://trac.webkit.org/changeset/208905
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug