Bug 60644

Summary: WebKit inserts base, link, meta, style, and title elements into an editable region when pasting table cells from Excel
Product: WebKit Reporter: Ryosuke Niwa <rniwa>
Component: HTML EditingAssignee: Ryosuke Niwa <rniwa>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, aestes, ap, commit-queue, darin, dcheng, enrica, eric, ian, jparent, leviw, ojan, tony
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: OS X 10.6   
Bug Depends on:    
Bug Blocks: 60620, 60653    
Attachments:
Description Flags
fixes the bug none

Description Ryosuke Niwa 2011-05-11 12:08:12 PDT
WebKit currently inserts elements that should only appear in head such as link, meta, and style elements when pasting HTML format.  We should strip these elements instead to avoid bloating the resultant markup.
Comment 1 Ryosuke Niwa 2011-05-11 15:27:03 PDT
Created attachment 93195 [details]
fixes the bug
Comment 2 Alexey Proskuryakov 2011-05-11 15:32:56 PDT
How do these elements end up in a fragment parsed from the pasteboard? I'm thinking of bug 60155, where style disappears because it's being moved out to <head>.
Comment 3 Ryosuke Niwa 2011-05-11 15:35:36 PDT
(In reply to comment #2)
> How do these elements end up in a fragment parsed from the pasteboard? I'm thinking of bug 60155, where style disappears because it's being moved out to <head>.

Oh I think it's moved to head when we insert the head into the document.
Comment 4 Ryosuke Niwa 2011-05-11 15:40:07 PDT
Oops, I answered pre-maturely without reading the comments on the bug 60155. My guess is that DOCTYPE, html, or head elements in this test are triggering something in the fragment parsing algorithm to switch to non-body mode.
Comment 5 Ojan Vafai 2011-05-11 15:50:49 PDT
What do IE, Firefox, Opera do in these cases?
Comment 6 Eric Seidel (no email) 2011-05-11 15:57:45 PDT
Comment on attachment 93195 [details]
fixes the bug

I thought the HTML5 parser moved head elemetns to th head?  thus they wouldn't appear in the fragment since fragment contnets are only what shows up in the body...
Comment 7 Ryosuke Niwa 2011-05-11 16:14:27 PDT
(In reply to comment #5)
> What do IE, Firefox, Opera do in these cases?

Firefox passes the test. I couldn't test IE other than verifying that IE doesn't paste those elements into the content editable area since IE doesn't support InsertHTML.
Comment 8 Ian 'Hixie' Hickson 2011-05-11 16:17:05 PDT
Once you're in <body>, nothing gets pushed to <head>. That was changed a while ago. Also even when things did get pushed, they wouldn't if you were using innerHTML (fragment parsing).
Comment 9 Ryosuke Niwa 2011-05-11 16:18:21 PDT
At some point, we'll need to share code that implements:
http://dev.w3.org/2006/webapi/clipops/clipops.html#cross-origin-copy-paste-of-source-code
Comment 10 Ryosuke Niwa 2011-05-11 17:16:45 PDT
Looking at:
http://dev.w3.org/2006/webapi/clipops/clipops.html#cross-origin-copy-paste-of-source-code
1. Parse the code on the clipboard to a DOM tree
2. Remove all of the following elements: SCRIPT, APPLET, OBJECT, FORM, INPUT, BUTTON, TEXTAREA, SELECT, OPTION, OPTGROUP and comment nodes. For all mentioned elements except FORM, also remove all child nodes.
3. Remove any elements whose computed style's display property is 'none'
4. Remove all event handler attributes from all elements
5. Remove all data- attributes from all elements
6. If the implementation supports embedding javascript: URLs or other forms of scripting inside CSS instructions, such scripts must be removed.
7. Remove HTML comments.
Serialize the DOM and return the generated string to the script

We'll need a some style resolution mechanism to do step 3.

Eric, can we do step 3 in the fragment parsing algorithm?  Or do we not support CSS style resolution without a WebView?
Comment 11 Ojan Vafai 2011-05-11 21:01:49 PDT
The test-case in the patch is a bit weird and covers the excel case, but I wonder if it breaks other things. Can you try the following test and see what IE/Firefox/Opera do? I put some pseudo-code, but the copy-paste needs to be done manually. 

Specifically, I think that with this patch, the styled stuff will no longer be styled after the cut+paste. I just want to make sure other browsers do this as well.

<html>
<body>
<div id=editable contentEditable>
<style>...</style>
<div>styled stuff here</div>
</div>
<script>
var editable = document.getElementById('editable');
editable.focus();
document.execCommand('selectall', false false);
document.execCommand('cut', false false);
document.execCommand('paste', false false);
</script>
</body>
</html>
Comment 12 Ryosuke Niwa 2011-05-12 09:17:56 PDT
(In reply to comment #11)
> Specifically, I think that with this patch, the styled stuff will no longer be styled after the cut+paste. I just want to make sure other browsers do this as well.

Sure, in this particular case, we'll lose some style.  But if copy & paste are done across different websites, even weirder thing happens:

Consider pasting:
hello
<style> blockquote { border-left: 2px solid black; }  </style>
<blockquote>world</blockquote>

into
<div contenteditable>
</div>
<blockquote>WebKit</blockquote>

If we had copied style element, blockquote OUTSIDE of the contenteditable region suddenly starts to have a border on the left.

Note: we can fix both of these problems we create a fake document, paste the entire content into the fake document, and then re-serialize the contents using StyledMarkupAccumulator and then re-paste it into the current document.
Comment 13 Ryosuke Niwa 2011-05-12 09:33:07 PDT
FYI, IE8 and FF3.6 seem to preserve style element. Nonetheless, I think this patch is an improvement over the current situation where pasted style element can pollute non-editable regions (or editable regions that do not share the highest editable root) until we can properly handle style element.
Comment 14 Alexey Proskuryakov 2011-05-12 09:50:45 PDT
When I paste formatted text into GMail in Safari 5.0.5 on Mac, the formatting is preserved. Will it still work with this patch?

Did you test any other services that let you create formatted documents on the Web?
Comment 15 Ryosuke Niwa 2011-05-12 11:10:12 PDT
(In reply to comment #14)
> When I paste formatted text into GMail in Safari 5.0.5 on Mac, the formatting is preserved. Will it still work with this patch?
> 
> Did you test any other services that let you create formatted documents on the Web?

This shouldn't break whenever you're copying & pasting within WebKit because WebKit will serialize styles that are coming from rules into inline style declaration.
Comment 16 WebKit Commit Bot 2011-05-12 11:22:10 PDT
Comment on attachment 93195 [details]
fixes the bug

Clearing flags on attachment: 93195

Committed r86360: <http://trac.webkit.org/changeset/86360>
Comment 17 WebKit Commit Bot 2011-05-12 11:22:17 PDT
All reviewed patches have been landed.  Closing bug.
Comment 18 Alexey Proskuryakov 2011-05-12 11:32:16 PDT
> This shouldn't break whenever you're copying & pasting within WebKit because WebKit will serialize styles that are coming from rules into inline style declaration.

I was pasting from TextEdit.
Comment 19 WebKit Commit Bot 2011-05-12 11:37:55 PDT
The commit-queue encountered the following flaky tests while processing attachment 93195 [details]:

http/tests/xmlhttprequest/xmlhttprequest-50ms-download-dispatch.html bug 52016 (author: jchaffraix@webkit.org)
The commit-queue is continuing to process your patch.
Comment 20 Ryosuke Niwa 2011-05-12 13:53:09 PDT
(In reply to comment #18)
> > This shouldn't break whenever you're copying & pasting within WebKit because WebKit will serialize styles that are coming from rules into inline style declaration.
> 
> I was pasting from TextEdit.

TextEdit, as far as I know, inserts inline style declaration so this shouldn't be an issue.