Bug 16559 - should unescape hostname first, then perform IDNA
Status: NEW
Product: WebKit
Component: Page Loading
Version: 523.x (Safari 3)
Platform: All All
Importance: P2 Normal
Assigned To:
Keywords: InRadar, ReviewedForRadar
Reported: 2007-12-21 10:43 PST by
Modified: 2008-03-17 15:25 PST

HTML snippet with %-escaped dot in host name (42 bytes, text/html)
2007-12-21 16:32 PST, Erik van der Poel

Description From 2007-12-21 10:43:24 PST
Found another issue in Safari 3, where the unescaping and
punycoding are done in the wrong order. Here's an example:

<a href="http://&#x5341;%2ecom/">link</a>

When you click on that link, Safari ends up sending the following
(wrong) domain name to DNS: xn--.com-9b5j

This is because Safari is first running the host name through IDNA to
get Punycode and *then* unescaping the %2e to get the dot. It should
first unescape the %2e to get the dot, and then separate the host name
into labels, and then run IDNA on the 1st label only, since it is
non-ASCII. The result should be xn--kkr.com. Both MSIE 7 and Opera 9
get this right, but Firefox gets it wrong, in a different way.
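The ordering described above can be sketched in Python. This is only an illustration of the two orderings, not WebKit's code: the function names `to_ascii_host` and `wrong_order` are made up for this example, and Python's raw `punycode` codec stands in for full IDNA ToASCII (nameprep and label validation are skipped, which is an assumption that happens to be harmless for this input).

```python
from urllib.parse import unquote


def to_ascii_host(raw_host: str) -> str:
    """Hypothetical helper: unescape first, then split, then Punycode per label."""
    # Step 1: undo %-escapes so that "%2e" becomes a real dot.
    host = unquote(raw_host)
    # Step 2: split into labels at each dot.
    labels = host.split(".")
    # Step 3: Punycode only the labels that contain non-ASCII characters.
    # (Assumption: nameprep is omitted; real IDNA would normalize first.)
    encoded = []
    for label in labels:
        if label.isascii():
            encoded.append(label)
        else:
            encoded.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(encoded)


def wrong_order(raw_host: str) -> str:
    """Hypothetical helper reproducing the reported behaviour:
    Punycode the still-escaped host as one label, then unescape."""
    encoded = "xn--" + raw_host.encode("punycode").decode("ascii")
    return unquote(encoded)


# The host from the snippet, after HTML entity decoding (&#x5341; is U+5341).
host = "\u5341%2ecom"
print(to_ascii_host(host))  # xn--kkr.com
print(wrong_order(host))    # xn--.com-9b5j
```

In the second function the "%2ecom" text is treated as part of the same label as U+5341, which is why the dot ends up inside the Punycode output instead of separating two labels.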

This bug may be more serious than the following, since it affects how
a hostname is divided into labels at each dot.


I suspect that both of these bugs would be fixed by a single check-in.
------- Comment #1 From 2007-12-21 15:31:32 PST -------
Can you put the HTML snippet you mentioned into an attachment on the bug?  Bugzilla has a habit of mangling non-ASCII characters which makes it hard to decipher your original snippet.
------- Comment #2 From 2007-12-21 16:32:02 PST -------
Created an attachment (id=18047)
HTML snippet with %-escaped dot in host name
------- Comment #3 From 2008-03-17 15:25:55 PST -------