Bug 126605 - DOM serialization: < and > should not be quoted in HTML attribute values
Summary: DOM serialization: < and > should not be quoted in HTML attribute values
Status: RESOLVED CONFIGURATION CHANGED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL: http://jsbin.com/IkAnEJEY/1/
Keywords:
Depends on:
Blocks:
 
Reported: 2014-01-07 16:51 PST by Victor Costan
Modified: 2022-08-14 12:30 PDT (History)
8 users (show)

See Also:


Attachments
Patch (10.04 KB, patch)
2014-01-07 17:38 PST, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (10.02 KB, patch)
2014-04-23 20:19 PDT, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (9.59 KB, patch)
2014-08-17 23:44 PDT, Victor Costan
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Victor Costan 2014-01-07 16:51:23 PST
== What steps will reproduce the problem?

1. Open the document and look at the attribute serialization in the HTML output and XML output.

== What is the expected result?

The HTML serialization should contain "< >" instead of "&lt; &gt;".

== What happens instead?

< and > in the attribute value are incorrectly escaped during serialization.

In the HTML case, the DOM parsing and serialization specification defers to the HTML specification: http://www.w3.org/html/wg/drafts/html/master/syntax.html#attribute's-serialized-name

Steps 1-3 under the "Escaping a string" heading are relevant to attributes.

== Please provide any additional information below.

Firefox, IE and Chromium do not quote < and >. The fix happened pretty recently in Chromium, so it only shows up in Canary builds.

I will prepare a patch for this bug soon.
Comment 1 Victor Costan 2014-01-07 17:38:27 PST
Created attachment 220575 [details]
Patch
Comment 2 Victor Costan 2014-01-07 18:21:09 PST
Adam, this is a port of the following Blink CL that you reviewed.
https://codereview.chromium.org/117103003/

Alexey, can you please take a look?
Comment 3 Alexey Proskuryakov 2014-01-08 10:14:34 PST
Comment on attachment 220575 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=220575&action=review

> Source/WebCore/ChangeLog:10
> +        * editing/MarkupAccumulator.h: EntityMaskInHTMLAttributeValue matches DOM spec.

Is this actually the only place where XMLSerializer does not match the "specification" for HTML documents? I thought it didn't care whether it's HTML or XML, unlike in Firefox.

> LayoutTests/fast/dom/dom-parse-serialize.html:43
> +    '<?xml version="1.0"?>\n<!DOCTYPE doc [\n<!ATTLIST d id ID #IMPLIED>\n]>\n<doc>\n  <foo xmlns="foobar">One</foo> <x:bar xmlns:x="barfoo">Two</x:bar>\n  <d id="id3">Three</d>\n<f id="&amp;&lt;&gt;>">Four&amp;&lt;&gt;</f><empty/><empty></empty></doc>\n';

It's confusing that this change for how HTML is serialized includes a change to an XML test.
Comment 4 Victor Costan 2014-01-08 10:35:23 PST
(In reply to comment #3)
> (From update of attachment 220575 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=220575&action=review
> 
> > Source/WebCore/ChangeLog:10
> > +        * editing/MarkupAccumulator.h: EntityMaskInHTMLAttributeValue matches DOM spec.
> 
> Is this actually the only place where XMLSerializer does not match the "specification" for HTML documents? I thought it didn't care whether it's HTML or XML, unlike in Firefox.

To the best of my understanding, MarkupAccumulator is used indirectly by Element.innerHTML and Element.outerHTML too.

HTMLElement::{innerHTML, outerHTML} (WebCore/html/HTMLElement.cpp) ->
    WebCore::createMarkup (WebCore/editing/markup.cpp) ->
    WebCore::createMarkupInternal (WebCore/editing/markup.cpp) ->
    StyledMarkupAccumulator::serializeNodes (WebCore/editing/markup.cpp) ->
    StyledMarkupAccumulator::traverseNodesForSerialization (WebCore/editing/markup.cpp) ->
    MarkupAccumulator::appendStartTag (WebCore/editing/MarkupAccumulator.cpp) -> 
    MarkupAccumulator::appendStartMarkup (WebCore/editing/MarkupAccumulator.cpp) -> 
    StyledMarkupAccumulator::appendElement (WebCore/editing/markup.cpp) -> 
    MarkupAccumulator::appendElement (WebCore/editing/MarkupAccumulator.cpp) -> 
    MarkupAccumulator::appendAttribute (WebCore/editing/MarkupAccumulator.cpp) -> 
    MarkupAccumulator::appendAttributeValue (WebCore/editing/MarkupAccumulator.cpp)

MarkupAccumulator::appendAttributeValue is where the mask is chosen, depending on the document type. So changing the mask does impact the serialization of HTML documents.

Also, the new layout test that I added shows that either I'm right, or there's another code path that I don't see which uses the same mask. "git grep" tells me the mask name isn't referenced anywhere in WebCore, so I'm reasonably certain that I got this right.

> 
> > LayoutTests/fast/dom/dom-parse-serialize.html:43
> > +    '<?xml version="1.0"?>\n<!DOCTYPE doc [\n<!ATTLIST d id ID #IMPLIED>\n]>\n<doc>\n  <foo xmlns="foobar">One</foo> <x:bar xmlns:x="barfoo">Two</x:bar>\n  <d id="id3">Three</d>\n<f id="&amp;&lt;&gt;>">Four&amp;&lt;&gt;</f><empty/><empty></empty></doc>\n';
> 
> It's confusing that this change for how HTML is serialized includes a change to an XML test.

I added that because I wanted to make sure that > can be read from an XML attribute in both its un-escaped and escaped form (&gt;) and it always comes out as &gt; I didn't see this covered in other tests. I thought it's a useful check, from a black-box testing perspective. I can remove it if you disagree.
Comment 5 Victor Costan 2014-04-23 20:19:43 PDT
Created attachment 230040 [details]
Patch
Comment 6 Victor Costan 2014-04-23 21:32:10 PDT
Chris, can you please take a look at this?

It should be similar to a Blink change that you've seen recently.
https://codereview.chromium.org/117103003

Also, at this point, all other major browsers converge on this behavior.
Comment 7 Victor Costan 2014-07-06 13:06:16 PDT
The patch still applies cleanly. I would be grateful for a review.
Comment 8 Ryosuke Niwa 2014-08-17 15:06:23 PDT
Comment on attachment 230040 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=230040&action=review

> Source/WebCore/editing/MarkupAccumulator.h:55
> +    EntityMaskInAttributeValue = EntityAmp | EntityQuot | EntityLt | EntityGt,

Since we didn't change flags in EntityMaskInAttributeValue, we should revert the change to this line.
Comment 9 Victor Costan 2014-08-17 23:44:05 PDT
Created attachment 236742 [details]
Patch
Comment 10 Victor Costan 2014-08-18 00:03:49 PDT
Thank you for the feedback!

I uploaded a new patch.

Can you please take a look and CQ+ if the patch is acceptable?
Comment 11 Victor Costan 2014-08-22 07:36:41 PDT
Ryoskue: ping?

Can you please take another look at this?
The efl-wk2 EWS error doesn't appear to be related to the patch.

Thank you!
Comment 12 Ahmad Saleem 2022-08-01 14:33:36 PDT
From attached URL JSBin - I get following:

*** Safari 15.6 on macOS 12.5 ***

Serialized HTML: <div quoteme="< > &amp; &quot; ' &nbsp;"></div>

Serialized XML: <div xmlns="http://www.w3.org/1999/xhtml" quoteme="&lt; &gt; &amp; &quot; '  "></div>

*** Firefox Nightly 105 ***

Serialized HTML: <div quoteme="< > &amp; &quot; ' &nbsp;"></div>

Serialized XML: <div xmlns="http://www.w3.org/1999/xhtml" quoteme="&lt; &gt; &amp; &quot; '  "></div>

*** Chrome Canary 106 ***

Serialized HTML: <div quoteme="< > &amp; &quot; ' &nbsp;"></div>

Serialized XML: <div xmlns="http://www.w3.org/1999/xhtml" quoteme="&lt; &gt; &amp; &quot; '  "></div>

_____________

Do we need to use anything else? Just wanted to share updated test cases. Thanks!
Comment 13 Ryosuke Niwa 2022-08-01 21:11:48 PDT
This is working as intended now.