== What steps will reproduce the problem? 1. Open the document and look at the attribute serialization in the HTML output and XML output. == What is the expected result? The HTML serialization should contain "< >" instead of "< >". == What happens instead? < and > in the attribute value are incorrectly escaped during serialization. In the HTML case, the DOM parsing and serialization specification defers to the HTML specification: http://www.w3.org/html/wg/drafts/html/master/syntax.html#attribute's-serialized-name Steps 1-3 under the "Escaping a string" heading are relevant to attributes. == Please provide any additional information below. Firefox, IE and Chromium do not quote < and >. The fix happened pretty recently in Chromium, so it only shows up in Canary builds. I will prepare a patch for this bug soon.
Created attachment 220575 [details] Patch
Adam, this is a port of the following Blink CL that you reviewed. https://codereview.chromium.org/117103003/ Alexey, can you please take a look?
Comment on attachment 220575 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=220575&action=review > Source/WebCore/ChangeLog:10 > + * editing/MarkupAccumulator.h: EntityMaskInHTMLAttributeValue matches DOM spec. Is this actually the only place where XMLSerializer does not match the "specification" for HTML documents? I thought it didn't care whether it's HTML or XML, unlike in Firefox. > LayoutTests/fast/dom/dom-parse-serialize.html:43 > + '<?xml version="1.0"?>\n<!DOCTYPE doc [\n<!ATTLIST d id ID #IMPLIED>\n]>\n<doc>\n <foo xmlns="foobar">One</foo> <x:bar xmlns:x="barfoo">Two</x:bar>\n <d id="id3">Three</d>\n<f id="&<>>">Four&<></f><empty/><empty></empty></doc>\n'; It's confusing that this change for how HTML is serialized includes a change to an XML test.
(In reply to comment #3) > (From update of attachment 220575 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=220575&action=review > > > Source/WebCore/ChangeLog:10 > > + * editing/MarkupAccumulator.h: EntityMaskInHTMLAttributeValue matches DOM spec. > > Is this actually the only place where XMLSerializer does not match the "specification" for HTML documents? I thought it didn't care whether it's HTML or XML, unlike in Firefox. To the best of my understanding, MarkupAccumulator is used indirectly by Element.innerHTML and Element.outerHTML too. HTMLElement::{innerHTML, outerHTML} (WebCore/html/HTMLElement.cpp) -> WebCore::createMarkup (WebCore/editing/markup.cpp) -> WebCore::createMarkupInternal (WebCore/editing/markup.cpp) -> StyledMarkupAccumulator::serializeNodes (WebCore/editing/markup.cpp) -> StyledMarkupAccumulator::traverseNodesForSerialization (WebCore/editing/markup.cpp) -> MarkupAccumulator::appendStartTag (WebCore/editing/MarkupAccumulator.cpp) -> MarkupAccumulator::appendStartMarkup (WebCore/editing/MarkupAccumulator.cpp) -> StyledMarkupAccumulator::appendElement (WebCore/editing/markup.cpp) -> MarkupAccumulator::appendElement (WebCore/editing/MarkupAccumulator.cpp) -> MarkupAccumulator::appendAttribute (WebCore/editing/MarkupAccumulator.cpp) -> MarkupAccumulator::appendAttributeValue (WebCore/editing/MarkupAccumulator.cpp) MarkupAccumulator::appendAttributeValue is where the mask is chosen, depending on the document type. So changing the mask does impact the serialization of HTML documents. Also, the new layout test that I added shows that either I'm right, or there's another code path that I don't see which uses the same mask. "git grep" tells me the mask name isn't referenced anywhere in WebCore, so I'm reasonably certain that I got this right. > > > LayoutTests/fast/dom/dom-parse-serialize.html:43 > > + '<?xml version="1.0"?>\n<!DOCTYPE doc [\n<!ATTLIST d id ID #IMPLIED>\n]>\n<doc>\n <foo xmlns="foobar">One</foo> <x:bar xmlns:x="barfoo">Two</x:bar>\n <d id="id3">Three</d>\n<f id="&<>>">Four&<></f><empty/><empty></empty></doc>\n'; > > It's confusing that this change for how HTML is serialized includes a change to an XML test. I added that because I wanted to make sure that > can be read from an XML attribute in both its un-escaped and escaped form (>) and it always comes out as > I didn't see this covered in other tests. I thought it's a useful check, from a black-box testing perspective. I can remove it if you disagree.
Created attachment 230040 [details] Patch
Chris, can you please take a look at this? It should be similar to a Blink change that you've seen recently. https://codereview.chromium.org/117103003 Also, at this point, all other major browsers converge on this behavior.
The patch still applies cleanly. I would be grateful for a review.
Comment on attachment 230040 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=230040&action=review > Source/WebCore/editing/MarkupAccumulator.h:55 > + EntityMaskInAttributeValue = EntityAmp | EntityQuot | EntityLt | EntityGt, Since we didn't change flags in EntityMaskInAttributeValue, we should revert the change to this line.
Created attachment 236742 [details] Patch
Thank you for the feedback! I uploaded a new patch. Can you please take a look and CQ+ if the patch is acceptable?
Ryoskue: ping? Can you please take another look at this? The efl-wk2 EWS error doesn't appear to be related to the patch. Thank you!
From attached URL JSBin - I get following: *** Safari 15.6 on macOS 12.5 *** Serialized HTML: <div quoteme="< > & " ' "></div> Serialized XML: <div xmlns="http://www.w3.org/1999/xhtml" quoteme="< > & " ' "></div> *** Firefox Nightly 105 *** Serialized HTML: <div quoteme="< > & " ' "></div> Serialized XML: <div xmlns="http://www.w3.org/1999/xhtml" quoteme="< > & " ' "></div> *** Chrome Canary 106 *** Serialized HTML: <div quoteme="< > & " ' "></div> Serialized XML: <div xmlns="http://www.w3.org/1999/xhtml" quoteme="< > & " ' "></div> _____________ Do we need to use anything else? Just wanted to share updated test cases. Thanks!
This is working as intended now.