WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 182564
REGRESSION (
r223440
): Copying & pasting a list from Microsoft Word to TinyMCE fails
https://bugs.webkit.org/show_bug.cgi?id=182564
Summary
REGRESSION (r223440): Copying & pasting a list from Microsoft Word to TinyMCE...
Ryosuke Niwa
Reported
2018-02-06 23:53:15 PST
As reported by the developers of TinyMCE, my work to sanitize the pasted content introduced a new regression that copying & pasting a list item from Microsoft Word into TinyMCE would cause the list items to be plain text.
Attachments
WIP
(9.65 KB, patch)
2018-02-06 23:55 PST
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
WIP2
(19.41 KB, patch)
2018-02-08 00:32 PST
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
WIP3
(19.41 KB, patch)
2018-02-08 00:42 PST
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
Fixes the bug
(94.83 KB, patch)
2018-02-08 23:38 PST
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
Updated for ToT
(94.82 KB, patch)
2018-02-08 23:41 PST
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from ews205 for win-future
(11.50 MB, application/zip)
2018-02-09 03:40 PST
,
EWS Watchlist
no flags
Details
Skip the test on Windows
(99.31 KB, patch)
2018-02-09 14:43 PST
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
Patch for landing
(99.33 KB, patch)
2018-02-09 19:45 PST
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
Fixes the off-by-one error
(9.08 KB, patch)
2018-02-12 11:48 PST
,
Ryosuke Niwa
wenson_hsieh
: review+
Details
Formatted Diff
Diff
Fixed off by one error
(9.08 KB, patch)
2018-02-12 12:09 PST
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
Fixes Wordpress, evernote, and Gmail
(27.43 KB, patch)
2018-02-12 17:32 PST
,
Ryosuke Niwa
bfulgham
: review+
Details
Formatted Diff
Diff
Fixes older versions of TinyMCE
(26.97 KB, patch)
2018-02-13 14:44 PST
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
Challenging document
(40.50 KB, application/msword)
2018-02-13 20:13 PST
,
Andrew Herron
no flags
Details
Fixed Evernote & Wordpress.com & MS Word's compat mode
(81.75 KB, patch)
2018-02-13 22:16 PST
,
Ryosuke Niwa
wenson_hsieh
: review+
Details
Formatted Diff
Diff
Fixed Evernote & Wordpress.com & MS Word's compat mode
(82.63 KB, patch)
2018-02-14 09:58 PST
,
Ryosuke Niwa
no flags
Details
Formatted Diff
Diff
Document with heading lists
(32.31 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2018-02-18 21:36 PST
,
Andrew Herron
no flags
Details
Show Obsolete
(12)
View All
Add attachment
proposed patch, testcase, etc.
Ryosuke Niwa
Comment 1
2018-02-06 23:55:05 PST
Created
attachment 333269
[details]
WIP
Ryosuke Niwa
Comment 2
2018-02-06 23:55:30 PST
<
rdar://problem/36035103
>
Ryosuke Niwa
Comment 3
2018-02-08 00:32:55 PST
Created
attachment 333361
[details]
WIP2
Ryosuke Niwa
Comment 4
2018-02-08 00:33:21 PST
The code change is done. Will write tests next.
Ryosuke Niwa
Comment 5
2018-02-08 00:42:06 PST
Created
attachment 333362
[details]
WIP3 Oops, there was a tiny bug there.
Andrew Herron
Comment 6
2018-02-08 22:45:19 PST
My initial impressions are that this gives us most (if not all) of what we need. It would be really nice if we could get a valid HTML document, though, instead of "<html><style></style>content" :) I'll adjust my code next week to pick up the style tag this way and see if there are any problems.
Ryosuke Niwa
Comment 7
2018-02-08 23:38:00 PST
Created
attachment 333448
[details]
Fixes the bug
Ryosuke Niwa
Comment 8
2018-02-08 23:41:29 PST
Created
attachment 333449
[details]
Updated for ToT
Ryosuke Niwa
Comment 9
2018-02-08 23:42:02 PST
(In reply to Andrew Herron from
comment #6
)
> My initial impressions are that this gives us most (if not all) of what we > need. It would be really nice if we could get a valid HTML document, though, > instead of "<html><style></style>content" :) > > I'll adjust my code next week to pick up the style tag this way and see if > there are any problems.
Oh, what kind of markup do you want to see instead? <html><body><style>?
Andrew Herron
Comment 10
2018-02-09 01:51:45 PST
I can technically handle anything, but a proper HTML document would be easiest to work with :) <html><head><style></style></head><body>content</body></html>
EWS Watchlist
Comment 11
2018-02-09 03:40:25 PST
Comment on
attachment 333449
[details]
Updated for ToT
Attachment 333449
[details]
did not pass win-ews (win): Output:
http://webkit-queues.webkit.org/results/6429691
New failing tests: http/tests/security/clipboard/copy-paste-html-across-origin-strips-mso-list.html
EWS Watchlist
Comment 12
2018-02-09 03:40:35 PST
Created
attachment 333475
[details]
Archive of layout-test-results from ews205 for win-future The attached test failures were seen while running run-webkit-tests on the win-ews. Bot: ews205 Port: win-future Platform: CYGWIN_NT-6.1-2.9.0-0.318-5-3-x86_64-64bit
Ryosuke Niwa
Comment 13
2018-02-09 12:17:48 PST
(In reply to Andrew Herron from
comment #10
)
> I can technically handle anything, but a proper HTML document would be > easiest to work with :) > > <html><head><style></style></head><body>content</body></html>
<html><style>content is a perfectly a valid HTML5 document. Having said that, we can make that change.
Ryosuke Niwa
Comment 14
2018-02-09 12:56:53 PST
(In reply to Ryosuke Niwa from
comment #13
)
> (In reply to Andrew Herron from
comment #10
) > > I can technically handle anything, but a proper HTML document would be > > easiest to work with :) > > > > <html><head><style></style></head><body>content</body></html> > > <html><style>content is a perfectly a valid HTML5 document. Having said > that, we can make that change.
Never mind. This is actually tricky because we're using the fragment parsing algorithm which strips away head.
Wenson Hsieh
Comment 15
2018-02-09 13:54:47 PST
Comment on
attachment 333449
[details]
Updated for ToT View in context:
https://bugs.webkit.org/attachment.cgi?id=333449&action=review
Just some preliminary comments — I'll continue looking.
> Source/WebCore/ChangeLog:25 > + that don't enable custom pasteboard data (detected by the content origin being null), and the HTML
that don't => that doesn't
> Source/WebCore/ChangeLog:30 > + PasteHTML.SripsMSOListWhenMissingMSOHTMLElement
Srips => Strips
> Source/WebCore/ChangeLog:58 > + set to Enable whenever the content COULD be the one generated by Microsoft Word.
Not very intuitive, but sensible I suppose :/ Maybe MSOListQuirks should be renamed to something more like `MayUseMSOListQuirks { No, Yes }` to make this more obvious?
> Source/WebCore/ChangeLog:65 > + attributes, one containing computed styles, and another containing mso-list. Luckily, HTML parsering
parsering => parsing
> Source/WebCore/ChangeLog:66 > + rule dictats that the first attribute always wins when more than attributes of the same name appears,
dictats => dictates more than attributes of the same name appears => multiple attributes of the same name appear
> Source/WebCore/editing/markup.cpp:596 > +bool StyledMarkupAccumulator::appendNodeToPreserveMSOList(Node& node, bool& inMSOList)
Maybe set `inMSOList = false` at the beginning of this function so it always has a correct value after this call?
Ryosuke Niwa
Comment 16
2018-02-09 14:04:26 PST
(In reply to Wenson Hsieh from
comment #15
)
> Comment on
attachment 333449
[details]
> Updated for ToT > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=333449&action=review
> > Just some preliminary comments — I'll continue looking. > > > Source/WebCore/ChangeLog:25 > > + that don't enable custom pasteboard data (detected by the content origin being null), and the HTML > > that don't => that doesn't
Fixed.
> > Source/WebCore/ChangeLog:30 > > + PasteHTML.SripsMSOListWhenMissingMSOHTMLElement > > Srips => Strips
Fixed.
> > Source/WebCore/ChangeLog:58 > > + set to Enable whenever the content COULD be the one generated by Microsoft Word. > > Not very intuitive, but sensible I suppose :/ > > Maybe MSOListQuirks should be renamed to something more like > `MayUseMSOListQuirks { No, Yes }` to make this more obvious?
So the idea here is that MSOListQuirks is like a runtime feature. When the feature is enabled, we'd check for html tag, and do the quirks if needed.
> > Source/WebCore/ChangeLog:65 > > + attributes, one containing computed styles, and another containing mso-list. Luckily, HTML parsering > > parsering => parsing
Fixed.
> > > Source/WebCore/ChangeLog:66 > > + rule dictats that the first attribute always wins when more than attributes of the same name appears, > > dictats => dictates
Fixed.
> more than attributes of the same name appears => multiple attributes of the > same name appear
Fixed.
> > > Source/WebCore/editing/markup.cpp:596 > > +bool StyledMarkupAccumulator::appendNodeToPreserveMSOList(Node& node, bool& inMSOList) > > Maybe set `inMSOList = false` at the beginning of this function so it always > has a correct value after this call?
Oh, the whole point of that boolean is it changes over time during the traversal. It gets updated in this function. Made it a member variable instead to make this clear.
Ryosuke Niwa
Comment 17
2018-02-09 14:43:43 PST
Created
attachment 333520
[details]
Skip the test on Windows
Wenson Hsieh
Comment 18
2018-02-09 16:05:27 PST
> > Not very intuitive, but sensible I suppose :/ > > > > Maybe MSOListQuirks should be renamed to something more like > > `MayUseMSOListQuirks { No, Yes }` to make this more obvious? > > So the idea here is that MSOListQuirks is like a runtime feature. When the > feature is enabled, we'd check for html tag, and do the quirks if needed.
Right, I understand that — it's just that the purpose wasn't obvious to me from the name alone (MSOListQuirks) and I was initially confused why there were 2 flags. Though, I suppose the enum values, ::Enable and ::Disable should be enough to make this clear, so it's fine.
> > Maybe set `inMSOList = false` at the beginning of this function so it always > > has a correct value after this call? > > Oh, the whole point of that boolean is it changes over time during the > traversal. It gets updated in this function. Made it a member variable > instead to make this clear.
Oh ok, now I see! Your new approach makes this a lot clearer. 👍🏻
Wenson Hsieh
Comment 19
2018-02-09 17:00:47 PST
Comment on
attachment 333520
[details]
Skip the test on Windows View in context:
https://bugs.webkit.org/attachment.cgi?id=333520&action=review
> Source/WebCore/editing/MarkupAccumulator.cpp:204 > + ASSERT(length <= text.data().length());
Perhaps `start + length <= text.data().length()`?
> Source/WebCore/editing/markup.cpp:487 > + // Unfortunately, TinyMCE doesn't recognize mso-list inlie style if the style attribute contains properties.
inlie => inline
> Source/WebCore/editing/markup.cpp:600 > + Comment& commentNode = downcast<Comment>(node);
Nit - auto&?
Ryosuke Niwa
Comment 20
2018-02-09 17:22:33 PST
(In reply to Wenson Hsieh from
comment #19
)
> Comment on
attachment 333520
[details]
> Skip the test on Windows > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=333520&action=review
> > > Source/WebCore/editing/MarkupAccumulator.cpp:204 > > + ASSERT(length <= text.data().length()); > > Perhaps `start + length <= text.data().length()`?
Oh oops, nice catch. Fixed.
> > Source/WebCore/editing/markup.cpp:487 > > + // Unfortunately, TinyMCE doesn't recognize mso-list inlie style if the style attribute contains properties. > > inlie => inline
Fixed.
> > Source/WebCore/editing/markup.cpp:600 > > + Comment& commentNode = downcast<Comment>(node); > > Nit - auto&?
Fixed.
Ryosuke Niwa
Comment 21
2018-02-09 19:17:31 PST
(In reply to Wenson Hsieh from
comment #18
)
> > > Not very intuitive, but sensible I suppose :/ > > > > > > Maybe MSOListQuirks should be renamed to something more like > > > `MayUseMSOListQuirks { No, Yes }` to make this more obvious? > > > > So the idea here is that MSOListQuirks is like a runtime feature. When the > > feature is enabled, we'd check for html tag, and do the quirks if needed. > > Right, I understand that — it's just that the purpose wasn't obvious to me > from the name alone (MSOListQuirks) and I was initially confused why there > were 2 flags. Though, I suppose the enum values, ::Enable and ::Disable > should be enough to make this clear, so it's fine.
Okay, I'd rename MSOListQuirks::Enabled to MSOListQuirks::CheckIfNeeded as we discussed in person.
Ryosuke Niwa
Comment 22
2018-02-09 19:45:39 PST
Created
attachment 333550
[details]
Patch for landing
Ryosuke Niwa
Comment 23
2018-02-09 19:56:40 PST
Comment on
attachment 333550
[details]
Patch for landing Wait for EWS first.
Ryosuke Niwa
Comment 24
2018-02-09 21:07:58 PST
Committed
r228352
: <
https://trac.webkit.org/changeset/228352
>
Andrew Herron
Comment 25
2018-02-11 20:48:11 PST
(In reply to Ryosuke Niwa from
comment #14
)
> (In reply to Ryosuke Niwa from
comment #13
) > > > > <html><style>content is a perfectly a valid HTML5 document. Having said > > that, we can make that change. > > Never mind. This is actually tricky because we're using the fragment parsing > algorithm which strips away head.
I realise it's valid, however previously we could rely on the raw clipboard being a "proper" document. But as I said it's not a huge issue it just means some adjustment to how I handle the structure before I can validate if this includes everything we need.
Ryosuke Niwa
Comment 26
2018-02-12 11:48:22 PST
Reopening to attach new patch.
Ryosuke Niwa
Comment 27
2018-02-12 11:48:23 PST
Created
attachment 333610
[details]
Fixes the off-by-one error
EWS Watchlist
Comment 28
2018-02-12 11:49:32 PST
Attachment 333610
[details]
did not pass style-queue: ERROR: Tools/ChangeLog:8: Line contains tab character. [whitespace/tab] [5] ERROR: Tools/ChangeLog:9: Line contains tab character. [whitespace/tab] [5] Total errors found: 2 in 5 files If any of these errors are false positives, please file a bug against check-webkit-style.
Wenson Hsieh
Comment 29
2018-02-12 12:02:17 PST
Comment on
attachment 333610
[details]
Fixes the off-by-one error View in context:
https://bugs.webkit.org/attachment.cgi?id=333610&action=review
r=me
> Tools/ChangeLog:8 > + Added assertions to make sure the trailing } is includd in the style.
Nit - it looks like there's a tab character here.
Ryosuke Niwa
Comment 30
2018-02-12 12:07:51 PST
(In reply to Wenson Hsieh from
comment #29
)
> Comment on
attachment 333610
[details]
> Fixes the off-by-one error > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=333610&action=review
> > r=me > > > Tools/ChangeLog:8 > > + Added assertions to make sure the trailing } is includd in the style. > > Nit - it looks like there's a tab character here.
Oops, yeah, will fix before landing.
Ryosuke Niwa
Comment 31
2018-02-12 12:09:47 PST
Created
attachment 333614
[details]
Fixed off by one error
Ryosuke Niwa
Comment 32
2018-02-12 14:07:45 PST
Comment on
attachment 333614
[details]
Fixed off by one error This doesn't quite cut it. The original patch broke Evernote & Wordpress.com :(
Andrew Herron
Comment 33
2018-02-12 16:37:46 PST
(In reply to Wenson Hsieh from
comment #29
)
> Comment on
attachment 333610
[details]
> Fixes the off-by-one error > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=333610&action=review
> > r=me > > > Tools/ChangeLog:8 > > + Added assertions to make sure the trailing } is includd in the style. > > Nit - it looks like there's a tab character here.
You've also missed the e in "included" :)
Ryosuke Niwa
Comment 34
2018-02-12 17:32:01 PST
Created
attachment 333653
[details]
Fixes Wordpress, evernote, and Gmail
Andrew Herron
Comment 35
2018-02-12 23:32:59 PST
I like where this patch is going with extra style definitions included. Those are the main thing I was concerned we might need to fully process lists :) I wonder whether the variations in behaviour across TinyMCE versions you mention in your changelog are due to plugins; only our PowerPaste plugin uses detailed list processing (it's enabled on tinymce.com), the default paste implementation has very basic word import. Treating the style tag as text may be a bug that I can address rather than needing a webkit workaround (although putting it inside a head tag definitely helps). It's most likely because the content is not inside a body tag - in the current release if a body tag isn't found it assumes the entire clipboard is content. This is related to why I asked for the structure change earlier. I'm going to need to release a plugin update to support this change anyway so if the changes here break paste on the current release that's not the end of the world. I have some feedback on the current patch:
> if (m_shouldPreserveMSOList && element.hasTagName(pTag) && element.getAttribute(classAttr).startsWith("MsoListParagraph"))
This won't catch all list items in the content. Just from a quick search I'm seeing paragraphs with class "MsoHeading7" that have mso-list attributes, as well as "MsoNormal". And I'm sure there are others. Having a whitelist for this seems like the sort of thing that could easily break between versions of word. In addition, this "MsoListParagraph" filter isn't applied to the comment nodes in appendNodeToPreserveMSOList so the results are inconsistent. I'm seeing paragraphs without style mso-list but with the "supportsLists" comments inside. I made a custom patch on my machine that relaxes the override checks a bit and I'm getting _really_ good results.
Ryosuke Niwa
Comment 36
2018-02-13 13:03:40 PST
(In reply to Andrew Herron from
comment #35
) >
> > if (m_shouldPreserveMSOList && element.hasTagName(pTag) && element.getAttribute(classAttr).startsWith("MsoListParagraph")) > > This won't catch all list items in the content. Just from a quick search I'm > seeing paragraphs with class "MsoHeading7" that have mso-list attributes, as > well as "MsoNormal". And I'm sure there are others. Having a whitelist for > this seems like the sort of thing that could easily break between versions > of word.
That doesn't seem to affect the end result.
> In addition, this "MsoListParagraph" filter isn't applied to the comment > nodes in appendNodeToPreserveMSOList so the results are inconsistent. I'm > seeing paragraphs without style mso-list but with the "supportsLists" > comments inside.
Those nodes are taken care of by m_inMSOList check.
> I made a custom patch on my machine that relaxes the override checks a bit > and I'm getting _really_ good results.
What are the changes you made?
Brent Fulgham
Comment 37
2018-02-13 13:43:48 PST
Comment on
attachment 333653
[details]
Fixes Wordpress, evernote, and Gmail View in context:
https://bugs.webkit.org/attachment.cgi?id=333653&action=review
r=me
> Tools/ChangeLog:33 > +
Did you mean to have these two separate ChangeLog entries?
Ryosuke Niwa
Comment 38
2018-02-13 14:44:05 PST
Created
attachment 333727
[details]
Fixes older versions of TinyMCE
Andrew Herron
Comment 39
2018-02-13 16:08:40 PST
(In reply to Ryosuke Niwa from
comment #36
)
> (In reply to Andrew Herron from
comment #35
) > > > > > if (m_shouldPreserveMSOList && element.hasTagName(pTag) && element.getAttribute(classAttr).startsWith("MsoListParagraph")) > > > > This won't catch all list items in the content. Just from a quick search I'm > > seeing paragraphs with class "MsoHeading7" that have mso-list attributes, as > > well as "MsoNormal". And I'm sure there are others. Having a whitelist for > > this seems like the sort of thing that could easily break between versions > > of word. > > That doesn't seem to affect the end result.
This issue means mso-list styles are stripped from actual list items, and every time that happens we can't process the item. If the documents you're testing with all appear fine I will try to find some more complex documents that I can share - my test data mostly comes from customers so I need to be careful.
> > In addition, this "MsoListParagraph" filter isn't applied to the comment > > nodes in appendNodeToPreserveMSOList so the results are inconsistent. I'm > > seeing paragraphs without style mso-list but with the "supportsLists" > > comments inside. > > Those nodes are taken care of by m_inMSOList check.
But that isn't used for the style attribute output. I think the two should be linked somehow. As I said, in some documents the HTML I get on paste has paragraphs with the supportsLists comments and related data but no mso-list style on the P tag. That's the problem. If you're ever going to strip mso-list from a paragraph (which per above is a bad idea imo) it seems like the comments should be removed as well. My code deals with these comments, but only when processing a list paragraph which is detected by the mso-list style. Without mso-list, these comments make it into customer content (as a part of fake list-like paragraphs that are just confusing).
> > I made a custom patch on my machine that relaxes the override checks a bit > > and I'm getting _really_ good results. > > What are the changes you made?
I targeted two checks: - Style attribute preservation relaxed to element.getAttribute(classAttr).startsWith("Mso") - The msoListMode check relaxed to originalMarkup.startsWith("<html xmlns") The second change relates to an email I sent you, the check you use to detect MS Word documents is too specific and doesn't catch everything. I realise these changes are unlikely to be suitable for release, but I wanted to verify if more complete data would work with our list import routines. At first glance the answer appears to be yes.
Ryosuke Niwa
Comment 40
2018-02-13 16:38:30 PST
(In reply to Andrew Herron from
comment #39
)
> > - Style attribute preservation relaxed to > element.getAttribute(classAttr).startsWith("Mso") > - The msoListMode check relaxed to originalMarkup.startsWith("<html xmlns") > > The second change relates to an email I sent you, the check you use to > detect MS Word documents is too specific and doesn't catch everything.
These changes are probably too generic. That would match more documents than we'd like. It would be great if you can share specific documents from which copying & pasting would fail with the latest two patches.
Andrew Herron
Comment 41
2018-02-13 20:13:10 PST
Created
attachment 333761
[details]
Challenging document I found a really old Ephox press release in our test data that is an extreme example of what I'm talking about, it's not handled properly even with the changes I made :) There are newer documents with similar issues but they all have private data in them. I can try to generate my own or sanitise one of them if this document isn't suitable. When pasting from Word v16.9: - the <html> tag doesn't have xmlns:o as the first attribute, failing the overall list check - none of the list paragraphs (inside the table) have a class at all, completely failing the "preserve style attribute" check As a result, even with my relaxed checks this produces paragraphs with no "mso-list" style but retaining the "if supportsLists" comments.
Ryosuke Niwa
Comment 42
2018-02-13 22:16:26 PST
Created
attachment 333768
[details]
Fixed Evernote & Wordpress.com & MS Word's compat mode
Ryosuke Niwa
Comment 43
2018-02-13 22:17:54 PST
(In reply to Andrew Herron from
comment #41
)
> Created
attachment 333761
[details]
> Challenging document
Thanks for the document. This was super helpful!
> There are newer documents with similar issues but they all have private data > in them. I can try to generate my own or sanitise one of them if this > document isn't suitable. > > When pasting from Word v16.9: > > - the <html> tag doesn't have xmlns:o as the first attribute, failing the > overall list check
Indeed. This seems to only occur with the compatibility mode too. I've adjusted the check for look for xmlns:o and xmlns:w within <html ~> instead.
> - none of the list paragraphs (inside the table) have a class at all, > completely failing the "preserve style attribute" check
Indeed. I've fixed it to look for mso-list CSS property instead.
Andrew Herron
Comment 44
2018-02-13 22:32:43 PST
Sounds good! I'm building with the patch now. I'll try to confirm whether this appears to fix our issues before I finish work today - once this is merged I can get other developers to help verify the changes across our entire test document set. We will still need a new plugin release to be more flexible about the document structure it accepts, but other than that I think the changes on our end will be minor.
Andrew Herron
Comment 45
2018-02-13 23:21:32 PST
I pasted a few of our test documents with the patch and it looks good.
Wenson Hsieh
Comment 46
2018-02-14 07:42:16 PST
Comment on
attachment 333768
[details]
Fixed Evernote & Wordpress.com & MS Word's compat mode View in context:
https://bugs.webkit.org/attachment.cgi?id=333768&action=review
> Source/WebCore/editing/markup.cpp:433 > + if (m_shouldPreserveMSOList && element.hasTagName(pTag) && element.getAttribute(styleAttr).contains(";mso-list:"))
Nit - maybe add a new helper method for this called something like `shouldPreserveMSOListStyleForElement`, and turn the above into a one-liner?
Ryosuke Niwa
Comment 47
2018-02-14 09:58:35 PST
Created
attachment 333810
[details]
Fixed Evernote & Wordpress.com & MS Word's compat mode
Ryosuke Niwa
Comment 48
2018-02-14 09:59:08 PST
Comment on
attachment 333810
[details]
Fixed Evernote & Wordpress.com & MS Word's compat mode Already been reviewed. Waiting for EWS.
Ryosuke Niwa
Comment 49
2018-02-14 13:07:44 PST
Committed
r228482
: <
https://trac.webkit.org/changeset/228482
>
Andrew Herron
Comment 50
2018-02-15 21:47:32 PST
I've come across a document that when copied to the clipboard has a much simpler style attribute than others we've been testing with. The entire opening paragraph tag is:
> <p class=Ctrl4Opsomming style='mso-list:l0 level1 lfo1'>
As such it fails the ";mso-list:" check this patch uses and the style attribute is overridden. Could it be made to check if the style starts with "mso-list:" as well? :)
Andrew Herron
Comment 51
2018-02-18 21:36:52 PST
Created
attachment 334136
[details]
Document with heading lists I've found a new failing document that I can attach publicly. I remember thinking when I looked at the "element tag name is p" check that there would probably be cases of heading tags with mso-list attributes, but I don't think I ever mentioned it here. One of our test documents does indeed have that :) I've made some adjustments to the shouldPreserve check and verified this retains the correct style attributes both with this document and the private one mentioned in my previous comment:
> bool StyledMarkupAccumulator::shouldPreserveMSOListStyleForElement(const Element& element) > { > return m_inMSOList || (m_shouldPreserveMSOList && (element.getAttribute(styleAttr).contains(";mso-list:") || element.getAttribute(styleAttr).startsWith("mso-list:"))); > }
Although that's again probably too relaxed - it's really only the primary content block tags (P, H1-6) that can be lists. Not _all_ tags. And to be fair TinyMCE doesn't import this document correctly even when given the right styles. But that's a bug I plan to fix so it'd be nice to have heading list styles retained :)
Ryosuke Niwa
Comment 52
2018-02-19 13:23:53 PST
Let's track this bug in
https://bugs.webkit.org/show_bug.cgi?id=182938
.
Andrew Herron
Comment 53
2018-02-19 15:53:58 PST
Should I log a new bug for the heading lists as well? The document I attached was for that, not the "starts with" issue.
Ryosuke Niwa
Comment 54
2018-02-19 17:17:29 PST
(In reply to Andrew Herron from
comment #53
)
> Should I log a new bug for the heading lists as well? The document I > attached was for that, not the "starts with" issue.
Oh, okay. Please do file one.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug