Bug 101120
| Summary: | Copying & pasting tables from Excel results in verbose markup | | |
|---|---|---|---|
| Product: | WebKit | Reporter: | Ryosuke Niwa <rniwa> |
| Component: | HTML Editing | Assignee: | Nobody <webkit-unassigned> |
| Status: | NEW | | |
| Severity: | Normal | CC: | enrica, koivisto, simon.fraser |
| Priority: | P2 | Keywords: | InRadar |
| Version: | 528+ (Nightly build) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
Ryosuke Niwa
Copying a table from an HTML document generated by Excel and pasting it into a contenteditable element (e.g. Gmail) results in very verbose markup, because we end up inlining all styles.
e.g.
<style>
table {
mso-displayed-decimal-separator:"\.";
mso-displayed-thousand-separator:"\,";
}
td {
mso-style-parent:style0;
padding:0px;
mso-ignore:padding;
color:black;
font-size:12.0pt;
font-weight:400;
font-style:normal;
text-decoration:none;
font-family:Candara, sans-serif;
mso-font-charset:0;
mso-number-format:General;
text-align:general;
vertical-align:bottom;
border:none;
mso-background-source:auto;
mso-pattern:auto;
mso-protection:locked visible;
white-space:nowrap;
mso-rotate:0;
}
</style>
<table border=0 cellpadding=0 cellspacing=0 width=916 style='border-collapse:
collapse;table-layout:fixed;width:916pt'>
<tr height=3 style='mso-height-source:userset;height:3.0pt'>
<td height=3 width=6 style='height:3.0pt;width:6pt'></td>
</tr>
</table>
can be transformed into:
<td height="3" width="6" style="padding: 0px; font-size: 12pt; font-family: Candara, sans-serif; vertical-align: bottom; border: none; white-space: nowrap; height: 3pt; width: 6pt; "></td>
<rdar://problem/7044154>
Ryosuke Niwa
We can do better by removing redundant styles such as padding: 0px and border: none, since they match the default styles. We can also remove the width and height content attributes, since they are already specified in CSS:
<td style="font-size: 12pt; font-family: Candara, sans-serif; vertical-align: bottom; white-space: nowrap; height: 3pt; width: 6pt;"></td>
We can also strip the whitespace following each colon, semicolon, and comma to get:
<td style="font-size:12pt;font-family:Candara,sans-serif;vertical-align:bottom;white-space:nowrap;height:3pt;width:6pt;"></td>
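A minimal sketch of these two clean-up steps, written in Python rather than WebKit's C++. The `ASSUMED_DEFAULTS` table and the `minify_inline_style` helper are hypothetical names used only for illustration; a real implementation would presumably ask the style system for default values instead of hard-coding them. The naive splitting on `;` and `:` assumes the declarations contain no quoted strings or `url()` values.

```python
# Hypothetical table of declarations treated as defaults for the cell;
# an assumption for illustration, not taken from WebKit.
ASSUMED_DEFAULTS = {"padding": "0px", "border": "none"}


def minify_inline_style(style: str, defaults: dict = ASSUMED_DEFAULTS) -> str:
    """Drop declarations that match `defaults`, then strip optional whitespace."""
    kept = []
    for declaration in style.split(";"):
        if ":" not in declaration:
            continue  # skip empty fragments such as the one after the last ';'
        name, value = declaration.split(":", 1)
        name = name.strip().lower()
        # Collapse "Candara, sans-serif" to "Candara,sans-serif".
        value = ",".join(part.strip() for part in value.strip().split(","))
        if defaults.get(name) == value:
            continue  # redundant: same as the assumed default
        kept.append(f"{name}:{value}")
    return "".join(item + ";" for item in kept)


verbose = ("padding: 0px; font-size: 12pt; font-family: Candara, sans-serif; "
           "vertical-align: bottom; border: none; white-space: nowrap; "
           "height: 3pt; width: 6pt; ")
print(minify_inline_style(verbose))
```

Run on the pasted style attribute from the description, this should print the compact declaration list shown above.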
Ryosuke Niwa
To remove properties that match the default style (i.e. redundant), we need some way of knowing their default values. Simon suggested that we might be able to extend StyleSelector to give us a style ignoring inline styles.
Antti, do you think this is feasible?
Antti Koivisto
(In reply to comment #2)
> To remove properties that match the default style (i.e. redundant), we need some way of knowing their default values. Simon suggested that we might be able to extend StyleSelector to give us a style ignoring inline styles.
>
> Antti, do you think this is feasible?
Should be easy. Inline style is included here:
http://trac.webkit.org/browser/trunk/Source/WebCore/css/StyleResolver.cpp?rev=133324#L950
Just needs a new RuleMatchingBehavior.
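The idea can be sketched with a toy model, in Python rather than WebKit's C++. This is not the StyleResolver API: the enum member names, `resolve_style`, and `redundant_inline_properties` are all hypothetical, and the baseline declarations are assumed values chosen to mirror the example in the description. The point is only that a matching behavior which skips inline style yields a baseline to compare each inline declaration against.

```python
from enum import Enum, auto


class RuleMatchingBehavior(Enum):
    # Toy stand-in for the enum mentioned above; member names are assumptions.
    MATCH_ALL_RULES = auto()
    MATCH_ALL_RULES_EXCLUDING_INLINE = auto()  # hypothetical new value


def resolve_style(sheet_rules, inline_style, behavior):
    """Merge declarations in cascade order; inline style wins when included."""
    resolved = {}
    for declarations in sheet_rules:
        resolved.update(declarations)
    if behavior is RuleMatchingBehavior.MATCH_ALL_RULES:
        resolved.update(inline_style)
    return resolved


def redundant_inline_properties(sheet_rules, inline_style):
    """Inline declarations whose value the element would get anyway."""
    baseline = resolve_style(
        sheet_rules, inline_style,
        RuleMatchingBehavior.MATCH_ALL_RULES_EXCLUDING_INLINE)
    return sorted(name for name, value in inline_style.items()
                  if baseline.get(name) == value)


# Assumed baseline declarations for the cell, for illustration only.
baseline_rules = [{"padding": "0px", "border": "none", "white-space": "normal"}]
inline = {"padding": "0px", "border": "none", "white-space": "nowrap",
          "height": "3pt"}
print(redundant_inline_properties(baseline_rules, inline))
```

Under these assumed rules, padding and border would be reported as redundant, while white-space and height would be kept because they differ from (or are absent in) the baseline.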