http://www.whatwg.org/specs/web-apps/current-work/multipage/the-end.html#serializing-html-fragments specifies that we should append the value of the data IDL attribute literally, without any escaping, for text nodes under style, script, xmp, iframe, noembed, noframes, or plaintext elements. However, WebKit currently escapes <, >, &, and non-breaking space inside noembed, noframes, and plaintext elements.
Let's make some test cases and double-check that the other browsers do it that way.
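A minimal sketch of the rule described above, not WebKit's actual code. The names RAW_TEXT_PARENTS, escape_text, and serialize_text_node are hypothetical, and the element list is only the one quoted in this report.

```python
# Parents whose text children are appended literally, per the list quoted above.
RAW_TEXT_PARENTS = {
    "style", "script", "xmp", "iframe", "noembed", "noframes", "plaintext",
}


def escape_text(data: str) -> str:
    """Escape &, non-breaking space, < and > in a text node's data."""
    return (
        data.replace("&", "&amp;")      # must run first so later entities survive
            .replace("\u00a0", "&nbsp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
    )


def serialize_text_node(parent_tag: str, data: str) -> str:
    """Serialize a text node given the tag name of its parent element."""
    if parent_tag.lower() in RAW_TEXT_PARENTS:
        return data                     # literal, no escaping
    return escape_text(data)
```

Under this sketch, text inside noembed, noframes, or plaintext comes back unchanged, whereas the behavior reported here corresponds to calling escape_text on it.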
I'm not sure about noembed - Firefox 3.6.8 escapes <, >, &, and non-breaking space in it.
Created attachment 66794 [details] demo
Created attachment 66805 [details] noframes example Interestingly, the results of your test don't match what I saw. Firefox serializes noframes differently depending on what element innerHTML is called on - there is no such difference for noembed!
Created attachment 66806 [details] static html demo
(In reply to comment #1) > Lets make some test cases and double check the other browsers do it that way. I thought I had attached the test, but apparently not. I've added a test and a static HTML demo for MSIE. (In reply to comment #2) > I'm not sure about noembed - Firefox 3.6.8 escapes <, >, &, and non-breaking space in it. Right. Firefox doesn't escape noembed and noframes. (In reply to comment #4) > Interestingly, the results of your test don't match what I saw. Firefox serializes noframes differently depending on what element innerHTML is called on - there is no such difference for noembed! The situation is even worse: static HTML and dynamic HTML (appending the node manually) give different results. It seems that both WebKit and MSIE drop the contents of noembed and noframes while parsing the document.
It seems like Mac's lexer isn't reading 0xA0 properly. I'm getting 0xFFFD instead and I can't figure out a way to read 0xA0.
(In reply to comment #7) > It seems like Mac's lexer isn't reading 0xA0 properly. I'm getting 0xFFFD instead and I can't figure out a way to read 0xA0. Ugh... this isn't a Mac-specific issue. With ToT, nbsp isn't read correctly on any platform.
Created attachment 66824 [details] static html demo with UTF-8 nbsp The problem was that the default encoding is set to UTF-8, in which nbsp is encoded as 0xC2 0xA0, but the nbsp in the document was the single byte 0xA0 (ISO/IEC 8859).
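The mismatch can be reproduced outside the browser. This is a small sketch using Python's built-in codecs, with latin-1 standing in for the ISO/IEC 8859 encoding of the original file (an assumption; the report does not say which part of the standard was used).

```python
NBSP = "\u00a0"

# The same character has different byte representations per encoding.
utf8_bytes = NBSP.encode("utf-8")        # two bytes: 0xC2 0xA0
latin1_bytes = NBSP.encode("latin-1")    # one byte: 0xA0

# A lone 0xA0 is not valid UTF-8, so a UTF-8 decoder substitutes U+FFFD.
decoded_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the file was actually written in recovers nbsp.
decoded_as_latin1 = latin1_bytes.decode("latin-1")
```

This matches the symptom in the earlier comment: the byte 0xA0 read as UTF-8 turns into 0xFFFD, while the UTF-8 version of the demo (0xC2 0xA0) decodes to nbsp as expected.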
Here's a quick summary: Firefox 3.6.8 escapes text nodes under noembed and noframes the same way we do. Internet Explorer 8 returns an empty string for noembed and noframes. Neither Firefox nor Internet Explorer escapes text nodes under plaintext.
I am still able to reproduce this issue in Safari 15.5 on macOS 12.4 using the "demo" test case. The other browsers, Chrome Canary 104 and Firefox 103, behave similarly to each other. Thanks!
Is this bug the reason we fail the following WPT test: https://wpt.fyi/results/html/syntax/parsing-html-fragments/tokenizer-modes-001.html?label=experimental&label=master&aligned
(In reply to Ahmad Saleem from comment #12) > Is this bug reason that we fail following WPT tests: > > https://wpt.fyi/results/html/syntax/parsing-html-fragments/tokenizer-modes-001.html?label=experimental&label=master&aligned Something related to this? https://github.com/WebKit/WebKit/blob/a43d4b3fb6b0e3fe6eebd85112c25653949bfd08/Source/WebCore/html/parser/HTMLTokenizer.cpp#L1398
Committed 262285@main (a641fc693f57): <https://commits.webkit.org/262285@main> Reviewed commits have been landed. Closing PR #12108 and removing active labels.
<rdar://problem/107381507>