RESOLVED CONFIGURATION CHANGED 126605
DOM serialization: < and > should not be quoted in HTML attribute values
https://bugs.webkit.org/show_bug.cgi?id=126605
Summary DOM serialization: < and > should not be quoted in HTML attribute values
Victor Costan
Reported 2014-01-07 16:51:23 PST
== What steps will reproduce the problem? 1. Open the document and look at the attribute serialization in the HTML output and XML output. == What is the expected result? The HTML serialization should contain "< >" instead of "&lt; &gt;". == What happens instead? < and > in the attribute value are incorrectly escaped during serialization. In the HTML case, the DOM parsing and serialization specification defers to the HTML specification: http://www.w3.org/html/wg/drafts/html/master/syntax.html#attribute's-serialized-name Steps 1-3 under the "Escaping a string" heading are relevant to attributes. == Please provide any additional information below. Firefox, IE and Chromium do not quote < and >. The fix happened pretty recently in Chromium, so it only shows up in Canary builds. I will prepare a patch for this bug soon.
Attachments
Patch (10.04 KB, patch)
2014-01-07 17:38 PST, Victor Costan
no flags
Patch (10.02 KB, patch)
2014-04-23 20:19 PDT, Victor Costan
no flags
Patch (9.59 KB, patch)
2014-08-17 23:44 PDT, Victor Costan
no flags
Victor Costan
Comment 1 2014-01-07 17:38:27 PST
Victor Costan
Comment 2 2014-01-07 18:21:09 PST
Adam, this is a port of the following Blink CL that you reviewed. https://codereview.chromium.org/117103003/ Alexey, can you please take a look?
Alexey Proskuryakov
Comment 3 2014-01-08 10:14:34 PST
Comment on attachment 220575 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=220575&action=review > Source/WebCore/ChangeLog:10 > + * editing/MarkupAccumulator.h: EntityMaskInHTMLAttributeValue matches DOM spec. Is this actually the only place where XMLSerializer does not match the "specification" for HTML documents? I thought it didn't care whether it's HTML or XML, unlike in Firefox. > LayoutTests/fast/dom/dom-parse-serialize.html:43 > + '<?xml version="1.0"?>\n<!DOCTYPE doc [\n<!ATTLIST d id ID #IMPLIED>\n]>\n<doc>\n <foo xmlns="foobar">One</foo> <x:bar xmlns:x="barfoo">Two</x:bar>\n <d id="id3">Three</d>\n<f id="&amp;&lt;&gt;>">Four&amp;&lt;&gt;</f><empty/><empty></empty></doc>\n'; It's confusing that this change for how HTML is serialized includes a change to an XML test.
Victor Costan
Comment 4 2014-01-08 10:35:23 PST
(In reply to comment #3) > (From update of attachment 220575 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=220575&action=review > > > Source/WebCore/ChangeLog:10 > > + * editing/MarkupAccumulator.h: EntityMaskInHTMLAttributeValue matches DOM spec. > > Is this actually the only place where XMLSerializer does not match the "specification" for HTML documents? I thought it didn't care whether it's HTML or XML, unlike in Firefox. To the best of my understanding, MarkupAccumulator is used indirectly by Element.innerHTML and Element.outerHTML too. HTMLElement::{innerHTML, outerHTML} (WebCore/html/HTMLElement.cpp) -> WebCore::createMarkup (WebCore/editing/markup.cpp) -> WebCore::createMarkupInternal (WebCore/editing/markup.cpp) -> StyledMarkupAccumulator::serializeNodes (WebCore/editing/markup.cpp) -> StyledMarkupAccumulator::traverseNodesForSerialization (WebCore/editing/markup.cpp) -> MarkupAccumulator::appendStartTag (WebCore/editing/MarkupAccumulator.cpp) -> MarkupAccumulator::appendStartMarkup (WebCore/editing/MarkupAccumulator.cpp) -> StyledMarkupAccumulator::appendElement (WebCore/editing/markup.cpp) -> MarkupAccumulator::appendElement (WebCore/editing/MarkupAccumulator.cpp) -> MarkupAccumulator::appendAttribute (WebCore/editing/MarkupAccumulator.cpp) -> MarkupAccumulator::appendAttributeValue (WebCore/editing/MarkupAccumulator.cpp) MarkupAccumulator::appendAttributeValue is where the mask is chosen, depending on the document type. So changing the mask does impact the serialization of HTML documents. Also, the new layout test that I added shows that either I'm right, or there's another code path that I don't see which uses the same mask. "git grep" tells me the mask name isn't referenced anywhere in WebCore, so I'm reasonably certain that I got this right. > > > LayoutTests/fast/dom/dom-parse-serialize.html:43 > > + '<?xml version="1.0"?>\n<!DOCTYPE doc [\n<!ATTLIST d id ID #IMPLIED>\n]>\n<doc>\n <foo xmlns="foobar">One</foo> <x:bar xmlns:x="barfoo">Two</x:bar>\n <d id="id3">Three</d>\n<f id="&amp;&lt;&gt;>">Four&amp;&lt;&gt;</f><empty/><empty></empty></doc>\n'; > > It's confusing that this change for how HTML is serialized includes a change to an XML test. I added that because I wanted to make sure that > can be read from an XML attribute in both its un-escaped and escaped form (&gt;) and it always comes out as &gt; I didn't see this covered in other tests. I thought it's a useful check, from a black-box testing perspective. I can remove it if you disagree.
Victor Costan
Comment 5 2014-04-23 20:19:43 PDT
Victor Costan
Comment 6 2014-04-23 21:32:10 PDT
Chris, can you please take a look at this? It should be similar to a Blink change that you've seen recently. https://codereview.chromium.org/117103003 Also, at this point, all other major browsers converge on this behavior.
Victor Costan
Comment 7 2014-07-06 13:06:16 PDT
The patch still applies cleanly. I would be grateful for a review.
Ryosuke Niwa
Comment 8 2014-08-17 15:06:23 PDT
Comment on attachment 230040 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=230040&action=review > Source/WebCore/editing/MarkupAccumulator.h:55 > + EntityMaskInAttributeValue = EntityAmp | EntityQuot | EntityLt | EntityGt, Since we didn't change flags in EntityMaskInAttributeValue, we should revert the change to this line.
Victor Costan
Comment 9 2014-08-17 23:44:05 PDT
Victor Costan
Comment 10 2014-08-18 00:03:49 PDT
Thank you for the feedback! I uploaded a new patch. Can you please take a look and CQ+ if the patch is acceptable?
Victor Costan
Comment 11 2014-08-22 07:36:41 PDT
Ryoskue: ping? Can you please take another look at this? The efl-wk2 EWS error doesn't appear to be related to the patch. Thank you!
Ahmad Saleem
Comment 12 2022-08-01 14:33:36 PDT
From attached URL JSBin - I get following: *** Safari 15.6 on macOS 12.5 *** Serialized HTML: <div quoteme="< > &amp; &quot; ' &nbsp;"></div> Serialized XML: <div xmlns="http://www.w3.org/1999/xhtml" quoteme="&lt; &gt; &amp; &quot; ' "></div> *** Firefox Nightly 105 *** Serialized HTML: <div quoteme="< > &amp; &quot; ' &nbsp;"></div> Serialized XML: <div xmlns="http://www.w3.org/1999/xhtml" quoteme="&lt; &gt; &amp; &quot; ' "></div> *** Chrome Canary 106 *** Serialized HTML: <div quoteme="< > &amp; &quot; ' &nbsp;"></div> Serialized XML: <div xmlns="http://www.w3.org/1999/xhtml" quoteme="&lt; &gt; &amp; &quot; ' "></div> _____________ Do we need to use anything else? Just wanted to share updated test cases. Thanks!
Ryosuke Niwa
Comment 13 2022-08-01 21:11:48 PDT
This is working as intended now.
Note You need to log in before you can comment on or make changes to this bug.