Bug 60620 - [Windows] REGRESSION(r65868): Tables from Excel are pasted as plain text
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P1 Major
Assignee: Nobody
URL:
Keywords: InRadar
Depends on: 60644 62112
Blocks: 41115
Reported: 2011-05-11 05:05 PDT by Ryosuke Niwa
Modified: 2011-11-25 15:54 PST

Attachments
HTML content to be pasted in step 2 (2.34 KB, text/plain)
2011-05-11 05:39 PDT, Ryosuke Niwa
test for bisection (742 bytes, text/html)
2011-05-11 06:24 PDT, Ryosuke Niwa

Description Ryosuke Niwa 2011-05-11 05:05:28 PDT
On WebKit TOT, copying & pasting table cells from Microsoft Excel results in the cells being pasted as plain text.  This is because we strip tr, td, and other table elements when parsing the following HTML fragment, which has no enclosing table element:

 <col width=64 span=2 style='width:48pt'>
 <tr height=20 style='height:15.0pt'>
  <td height=20 class=xl65 width=64 style='height:15.0pt;width:48pt'>hello</span></td>
  <td class=xl65 width=64 style='border-left:none;width:48pt'>world</td>
 </tr>
 <tr height=20 style='height:15.0pt'>
  <td height=20 class=xl65 style='height:15.0pt;border-top:none'>webkit</td>
  <td class=xl65 style='border-top:none;border-left:none'>&nbsp;</td>
 </tr>
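
For illustration, the condition that triggers the stripping can be sketched in JavaScript. Under HTML5 fragment parsing with a non-table context element, start tags such as col, tr, and td are ignored in the "in body" insertion mode, so only their text survives. The helper names below (startsWithOrphanTableContent, wrapInTable) are hypothetical and are not WebKit code. Wrapping the fragment in a table element is only one possible workaround, assumed here for the sake of the example, and not necessarily the fix that landed.

```javascript
// Hypothetical helpers for illustration only; not WebKit code.
// Tags that the HTML5 "in body" insertion mode ignores when they
// appear without an enclosing <table>.
const TABLE_ONLY_TAGS = [
  'caption', 'col', 'colgroup', 'tbody', 'td', 'tfoot', 'th', 'thead', 'tr',
];

// Returns true when the first tag in the fragment is table-only content,
// i.e. the fragment does not open with its own <table>.
function startsWithOrphanTableContent(fragment) {
  const match = /^\s*<([a-zA-Z][a-zA-Z0-9]*)/.exec(fragment);
  return match !== null && TABLE_ONLY_TAGS.includes(match[1].toLowerCase());
}

// Wraps such a fragment in <table> so rows and cells can survive parsing.
// Fragments that do not need wrapping are returned unchanged.
function wrapInTable(fragment) {
  return startsWithOrphanTableContent(fragment)
    ? '<table>' + fragment + '</table>'
    : fragment;
}
```

Applied to the fragment above, startsWithOrphanTableContent sees the leading col tag and wrapInTable returns the same markup inside a table element.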

http://crbug.com/19360
Comment 1 Ryosuke Niwa 2011-05-11 05:35:07 PDT
Unfortunately, nightlies in this range crash on start and I cannot test the behavior.  But I suspect that http://trac.webkit.org/changeset/65868 is the cause.
Comment 2 Ryosuke Niwa 2011-05-11 05:36:41 PDT
To reproduce this bug, insert the markup from the description (comment #0) with execCommand('insertHTML'); WebKit should insert the tr's and td's.
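
A minimal sketch of that reduced test case, assuming a page with a contenteditable element. The function names and the EXCEL_FRAGMENT constant are hypothetical, and the fragment is an abbreviated version of the markup in the description (one row, trimmed styles).

```javascript
// Abbreviated fragment from the description: a col and a table row
// with no enclosing <table>.
const EXCEL_FRAGMENT =
  "<col width=64 span=2 style='width:48pt'>" +
  "<tr height=20 style='height:15.0pt'>" +
  "<td height=20 class=xl65 width=64>hello</td>" +
  "<td class=xl65 width=64>world</td>" +
  "</tr>";

// `doc` is the page's Document and `editable` a contenteditable element.
function insertExcelFragment(doc, editable) {
  editable.focus(); // insertHTML acts on the current selection
  return doc.execCommand('insertHTML', false, EXCEL_FRAGMENT);
}

// Counts the cells that survived insertion; 2 are expected for the
// fragment above, and 0 indicates the tags were stripped.
function countPastedCells(editable) {
  return editable.querySelectorAll('td').length;
}
```

In a browser this could be driven as insertExcelFragment(document, div) followed by countPastedCells(div), where div is the editable element.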
Comment 3 Ryosuke Niwa 2011-05-11 05:39:04 PDT
Created attachment 93111
HTML content to be pasted in step 2

To reproduce the original bug, you must follow the lengthy steps below:

Reproduction steps:
1. Download Windows Clipboard Viewer from http://www.peterbuettner.de/develop/tools/clipview/
2. Launch the program and copy & paste the attached content
3. Type "HTML Format" (without the quotation marks) into the box to the right of "Push in Clip"
4. Press "Push in Clip"
5. Open http://www.mozilla.org/editor/midasdemo/ in the WebKit Windows port
6. Paste into the contenteditable region of the page

Expected result:
"hello", "world", and "WebKit" are in table cells

Actual result:
"hello world WebKit" is pasted as plain text.
Comment 4 Adam Roben (:aroben) 2011-05-11 06:11:03 PDT
<rdar://problem/9420024>
Comment 5 Ryosuke Niwa 2011-05-11 06:24:24 PDT
Created attachment 93113
test for bisection
Comment 6 Adam Roben (:aroben) 2011-05-11 07:22:12 PDT
Bisection indicates that r65868 is to blame, as suspected.
Comment 7 Adam Roben (:aroben) 2011-05-11 07:22:38 PDT
r65868 is "Use new HTML5 TreeBuilder for fragment parsing" (bug 44475).
Comment 8 Eric Seidel (no email) 2011-05-11 10:37:53 PDT
What does Firefox do?  Should we not be using fragment parsing for copy/paste?
Comment 9 Ryosuke Niwa 2011-05-11 10:44:00 PDT
(In reply to comment #8)
> What does Firefox do?  Should we not be using fragment parsing for copy/paste?

They seem to have a special parsing algorithm just to deal with CF HTML :(
Comment 10 Tony Chang 2011-05-11 11:00:15 PDT
(In reply to comment #9)
> (In reply to comment #8)
> > What does Firefox do?  Should we not be using fragment parsing for copy/paste?
> 
> They seem to have a special parsing algorithm just to deal with CF HTML :(

Ryosuke and I were looking at the Moz code yesterday.  He's saying there's a parsing algorithm that parses CF_HTML to generate HTML, which I assume gets passed to their HTML5 parser.  WebKit has similar code for parsing CF_HTML into HTML, but it's very simplistic.
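
To make the CF_HTML step concrete, here is a rough sketch of pulling the fragment out of a CF_HTML payload. It relies on the documented layout of the format: a header of Key:Value lines in which StartFragment and EndFragment are byte offsets into the UTF-8 data. The function name extractCfHtmlFragment is hypothetical; this is neither the WebKit nor the Mozilla code, and it takes a JavaScript string for simplicity, whereas real clipboard data arrives as bytes.

```javascript
// Hypothetical helper for illustration only; not WebKit or Mozilla code.
// Returns the HTML between the StartFragment and EndFragment byte
// offsets, or null when the input does not look like CF_HTML.
function extractCfHtmlFragment(cfHtml) {
  const start = /StartFragment:(\d+)/.exec(cfHtml);
  const end = /EndFragment:(\d+)/.exec(cfHtml);
  if (start === null || end === null) {
    return null; // header fields missing
  }
  const from = parseInt(start[1], 10);
  const to = parseInt(end[1], 10);

  // Offsets count UTF-8 bytes, not UTF-16 code units, so slice the
  // encoded form rather than the string itself.
  const bytes = new TextEncoder().encode(cfHtml);
  if (from > to || to > bytes.length) {
    return null; // offsets out of range
  }
  return new TextDecoder('utf-8').decode(bytes.subarray(from, to));
}
```

The extracted string is what would then be handed to the fragment parser, which is where the missing table element becomes a problem.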
Comment 11 Ryosuke Niwa 2011-11-25 15:54:13 PST
This has been fixed on Chromium Windows.