Bug 99880 - Always parse pasted fragments as HTML even on XHTML pages
Summary: Always parse pasted fragments as HTML even on XHTML pages
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: HTML Editing (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Adam Klein
URL:
Keywords:
Depends on:
Blocks: 99607
  Show dependency treegraph
 
Reported: 2012-10-19 15:27 PDT by Adam Klein
Modified: 2012-10-23 06:06 PDT (History)
8 users (show)

See Also:


Attachments
Patch (10.30 KB, patch)
2012-10-19 15:36 PDT, Adam Klein
no flags Details | Formatted Diff | Diff
Patch (10.42 KB, patch)
2012-10-19 16:09 PDT, Adam Klein
no flags Details | Formatted Diff | Diff
Patch for landing (6.82 KB, patch)
2012-10-19 16:53 PDT, Adam Klein
no flags Details | Formatted Diff | Diff
Test case (191 bytes, application/xhtml+xml)
2012-10-19 18:54 PDT, Adam Klein
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Klein 2012-10-19 15:27:37 PDT
Always parse pasted fragments as HTML even on XHTML pages
Comment 1 Adam Klein 2012-10-19 15:36:29 PDT
Created attachment 169713 [details]
Patch
Comment 2 Elliott Sprehn 2012-10-19 15:43:07 PDT
Comment on attachment 169713 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=169713&action=review

> Source/WebCore/editing/markup.cpp:670
> +        element->removeChild(child.get());

Is there a reason to unwrap the RefPtr here but pass it below?

> Source/WebCore/editing/markup.cpp:681
> +        if (node->hasTagName(htmlTag) || node->hasTagName(headTag) || node->hasTagName(bodyTag)) {

What's the purpose of this method? I can just put <head><head></head></head> into the pasteboard and you'd leave the inner one behind.
Comment 3 Adam Klein 2012-10-19 16:07:18 PDT
Comment on attachment 169713 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=169713&action=review

>> Source/WebCore/editing/markup.cpp:670
>> +        element->removeChild(child.get());
> 
> Is there a reason to unwrap the RefPtr here but pass it below?

removeChild() takes a raw pointer, insertBefore() takes a PassRefPtr. So I'm using .get() as minimally as I can.

>> Source/WebCore/editing/markup.cpp:681
>> +        if (node->hasTagName(htmlTag) || node->hasTagName(headTag) || node->hasTagName(bodyTag)) {
> 
> What's the purpose of this method? I can just put <head><head></head></head> into the pasteboard and you'd leave the inner one behind.

Added a short comment, see if that helps.
Comment 4 Adam Klein 2012-10-19 16:09:33 PDT
Created attachment 169715 [details]
Patch
Comment 5 Ojan Vafai 2012-10-19 16:25:38 PDT
Comment on attachment 169715 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=169715&action=review

> LayoutTests/ChangeLog:17
> +        * platform/mac/editing/pasteboard/paste-xml-expected.txt: Now that we
> +        use the HTML parser, parsing the paste succeeds and so we insert DOM
> +        instead of plain text. Similar rebaselines may be needed on other
> +        platforms.

maybe point to the pretty-diff bug to explain this diff?
Comment 6 Adam Klein 2012-10-19 16:53:13 PDT
Created attachment 169727 [details]
Patch for landing
Comment 7 Adam Klein 2012-10-19 16:56:47 PDT
Patch for landing is greatly simplified, since it turns out the special code to handle <html>, <body>, and <head> is not needed at all (because we use the HTML5 fragment parsing algorithm with <body> as the context element, which means we ignore all three of those tags.
Comment 8 Adam Klein 2012-10-19 16:59:05 PDT
Comment on attachment 169727 [details]
Patch for landing

Waiting to cq+ so I can watch the waterfalls in case rebaselining is needed.
Comment 9 Alexey Proskuryakov 2012-10-19 17:08:09 PDT
+        The Mac port previously worked around this problem by falling back to plain text
+        when parsing failed, but switching to HTML seems like a clear improvement.

Are you aware of any way to reproduce this in Safari on Mac in practice? I briefly tried, but couldn't get in this code path.
Comment 10 Adam Klein 2012-10-19 17:13:28 PDT
(In reply to comment #9)
> +        The Mac port previously worked around this problem by falling back to plain text
> +        when parsing failed, but switching to HTML seems like a clear improvement.
> 
> Are you aware of any way to reproduce this in Safari on Mac in practice? I briefly tried, but couldn't get in this code path.

All you should need is an xhtml page with a contentEditable element in it. Try to copy some html and paste it into that element. Fwiw, The relevant code is in EditorMac.mm, Editor::documentFragment(). Under DRT, editing/pasteboard/paste-xml.xhtml walks through this "try to parse as XML, fail, fall back to text" case.
Comment 11 Alexey Proskuryakov 2012-10-19 17:20:30 PDT
I tried that, and got styled text pasted. Possibly because there were additional flavors on the pasteboard, but this is why I'm asking for steps to reproduce.
Comment 12 Adam Klein 2012-10-19 18:54:58 PDT
Created attachment 169742 [details]
Test case

Note that the ".xhtml" suffix is necessary for Safari to treat this as an XHTML document. When I select the first line and paste it onto the second line, the bolding disappears.
Comment 13 Alexey Proskuryakov 2012-10-19 22:04:07 PDT
Thank you, I can reproduce now. I experimented with a data:application/xhtml+xml URL, and looks like these do not make XHTML documents in Safari.
Comment 14 WebKit Review Bot 2012-10-23 05:37:11 PDT
Comment on attachment 169727 [details]
Patch for landing

Rejecting attachment 169727 [details] from commit-queue.

Failed to run "['/mnt/git/webkit-commit-queue/Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '-..." exit_code: 2

Last 500 characters of output:
ent): Merge conflict in Source/WebKit2/ChangeLog
Failed to merge in the changes.
Patch failed at 0001 [EFL] Make plugin process debugging easier (PLUGIN_PROCESS_COMMAND_PREFIX)

When you have resolved this problem run "git rebase --continue".
If you would prefer to skip this patch, instead run "git rebase --skip".
To restore the original branch and stop rebasing run "git rebase --abort".

rebase refs/remotes/origin/master: command returned error: 1

Died at Tools/Scripts/update-webkit line 164.

Full output: http://queues.webkit.org/results/14518146
Comment 15 WebKit Review Bot 2012-10-23 06:06:43 PDT
Comment on attachment 169727 [details]
Patch for landing

Clearing flags on attachment: 169727

Committed r132211: <http://trac.webkit.org/changeset/132211>
Comment 16 WebKit Review Bot 2012-10-23 06:06:48 PDT
All reviewed patches have been landed.  Closing bug.