Bug 16559 - should unescape hostname first, then perform IDNA
Summary: should unescape hostname first, then perform IDNA
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading
Version: 523.x (Safari 3)
Hardware: All All
Importance: P2 Normal
Assignee: Nobody
Keywords: InRadar, ReviewedForRadar
Depends on:
Reported: 2007-12-21 10:43 PST by Erik van der Poel
Modified: 2008-03-17 15:25 PDT
CC List: 1 user

See Also:

HTML snippet with %-escaped dot in host name (42 bytes, text/html)
2007-12-21 16:32 PST, Erik van der Poel

Description Erik van der Poel 2007-12-21 10:43:24 PST
Found another issue in Safari 3, where the unescaping and
punycoding are done in the wrong order. Here's an example:

<a href="http://&#x5341;%2ecom/">link</a>

When you click on that link, Safari ends up sending the following
(wrong) domain name to DNS: xn--.com-9b5j

This is because Safari first runs the whole host name through IDNA to
get Punycode and *then* unescapes the %2e to get the dot. It should
first unescape the %2e to get the dot, then separate the host name
into labels, and then run IDNA on the first label only, since that is
the only non-ASCII label. The result should be xn--kkr.com. Both MSIE 7
and Opera 9 get this right, but Firefox gets it wrong, in a different way.
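
The two orderings can be reproduced outside the browser. What follows is
only a rough sketch in Python, not Safari's code: the function names are
made up for illustration, and the Punycode step is applied directly to
each label, skipping the nameprep stage that full IDNA would perform.

```python
from urllib.parse import unquote


def to_ascii_label(label):
    """Punycode-encode one label if it is non-ASCII (nameprep omitted)."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def unescape_then_idna(host):
    """Expected order: unescape, split into labels, encode each label."""
    labels = unquote(host).split(".")
    return ".".join(to_ascii_label(label) for label in labels)


def idna_then_unescape(host):
    """Order described in this bug: encode the raw host, then unescape."""
    labels = host.split(".")  # "%2e" is not yet a dot, so this is one label
    encoded = ".".join(to_ascii_label(label) for label in labels)
    return unquote(encoded)


host = "\u5341%2ecom"  # host part of the link above
print(unescape_then_idna(host))  # xn--kkr.com
print(idna_then_unescape(host))  # xn--.com-9b5j
```

In the second function the escaped dot is still "%2e" when the host is
split, so "%2ecom" is carried into the Punycode output as literal ASCII
and only becomes ".com" afterwards.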

This bug may be more serious than the following, since it affects how
a hostname is divided into labels at each dot.


I suspect that both of these bugs would be fixed by a single check-in.
Comment 1 Mark Rowe (bdash) 2007-12-21 15:31:32 PST
Can you put the HTML snippet you mentioned into an attachment on the bug? Bugzilla has a habit of mangling non-ASCII characters, which makes it hard to decipher your original snippet.
Comment 2 Erik van der Poel 2007-12-21 16:32:02 PST
Created attachment 18047
HTML snippet with %-escaped dot in host name
Comment 3 Mark Rowe (bdash) 2008-03-17 15:25:55 PDT