Bug 23382

Summary: innerHTML incorrectly converts '>' characters that are not part of HTML tags
Product: WebKit
Reporter: Joongi Kim <me+dev>
Component: DOM
Assignee: Nobody <webkit-unassigned>
Status: RESOLVED WORKSFORME    
Severity: Normal
CC: ap
Priority: P2    
Version: 525.x (Safari 3.2)   
Hardware: Mac (Intel)   
OS: OS X 10.5   
URL: http://dev.textcube.org/ticket/1061
Attachments:
innerHTML bug example (flags: none)

Description Joongi Kim 2009-01-16 05:43:01 PST
Sample Document:

<html>
<head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
</head>
<body>
<div id="test">
asdf&lt;asdf&gt;
</div>
</body>
</html>

Expected result of innerHTML property of #test element:
"asdf&lt;asdf&gt;"

Actual result of innerHTML property of #test element:
"asdf&lt;asdf>"
Comment 1 Joongi Kim 2009-01-16 05:49:22 PST
Created attachment 26791
innerHTML bug example

The attachment shows how WebKit mishandles entity conversion in innerHTML.
Comment 2 Joongi Kim 2009-01-16 05:50:53 PST
Comment on attachment 26791
innerHTML bug example

Expected result (from Mozilla Firefox 3.0.5):

asdf&lt;asdf&gt;<br>
asdf&lt;asdf&gt;<br>
asdf<asdf&gt;><br>
asdf<asdf><br>
</asdf></asdf&gt;>

Actual Result:
asdf&lt;asdf><br>
asdf&lt;asdf><br>
asdf<asdf&gt;><br>
asdf<asdf><br>
</asdf></asdf&gt;>
Comment 3 Joongi Kim 2009-01-16 05:55:23 PST
This problem also occurs in Google Chrome 1.0.154.43.
Comment 4 Mark Rowe (bdash) 2009-01-16 22:45:18 PST
I see the following with TOT:

asdf&lt;asdf&gt;<br>
asdf&lt;asdf&gt;<br>
asdf<asdf&gt;><br>
asdf<asdf><br>
</asdf></asdf&gt;>

That matches what Firefox gives.
Comment 5 Alexey Proskuryakov 2009-01-17 04:04:10 PST
Please also note that the previous WebKit behavior was formally correct, too: there is no fundamental reason to convert > to &gt;; it's just something other browsers do.
Comment 6 Joongi Kim 2009-01-24 07:05:26 PST
Yes, there's no problem with NOT converting the closing '>' to '&gt;', but some users of Textcube, which I develop, complain about this behaviour. (Neither they nor we knew before that this behaviour is specific to WebKit.) Also, I'm not sure whether this behaviour affects our backup/restore format, TTXML, which may contain content processed by conversion routines between plain HTML and our own markup syntax that use innerHTML. We hope there are no side effects from this.
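
To illustrate the point from comment 5 and the round-trip concern above, here is a minimal JavaScript sketch. The helper names (escapeTextMinimal, escapeTextWithGt, unescapeText) are hypothetical and are not WebKit or Textcube code; they only mimic, for a single text node, the two output styles quoted in this report. The sketch assumes only the three entities involved here and ignores attributes, comments, and other markup.

```javascript
// Hypothetical helpers for illustration only; not WebKit or Textcube code.

// Mimics the older WebKit output in this report: only '&' and '<'
// are replaced when a text node is serialized.
function escapeTextMinimal(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;");
}

// Mimics the Firefox 3.0.5 / WebKit TOT output: '>' is replaced as well.
function escapeTextWithGt(text) {
  return escapeTextMinimal(text).replace(/>/g, "&gt;");
}

// Decodes the three entities used above. '&amp;' is decoded last so that
// '&amp;lt;' becomes the literal text '&lt;' rather than '<'.
function unescapeText(markup) {
  return markup
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&");
}

const text = "asdf<asdf>";
const minimal = escapeTextMinimal(text); // "asdf&lt;asdf>"
const withGt = escapeTextWithGt(text);   // "asdf&lt;asdf&gt;"

// The serialized strings differ, but both decode to the same text.
console.log(minimal !== withGt);                              // true
console.log(unescapeText(minimal) === unescapeText(withGt));  // true
```

Under these assumptions, code that re-parses the markup sees the same text either way; only code that compares or stores the serialized innerHTML string literally would notice the difference.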