Bug 181616 - Blob conversion and sanitization doesn't work with Microsoft Word for Mac 2011
Summary: Blob conversion and sanitization doesn't work with Microsoft Word for Mac 2011
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: HTML Editing (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P1 Normal
Assignee: Ryosuke Niwa
URL:
Keywords: InRadar
Depends on: 124391 181920
Blocks:
  Show dependency treegraph
 
Reported: 2018-01-12 15:30 PST by Ryosuke Niwa
Modified: 2018-01-22 13:13 PST (History)
5 users (show)

See Also:


Attachments
WIP (19.47 KB, patch)
2018-01-20 02:05 PST, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
Fixes the bug (20.64 KB, patch)
2018-01-21 22:34 PST, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
Fixes the bug (20.64 KB, patch)
2018-01-21 22:47 PST, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
Patch for landing (20.33 KB, patch)
2018-01-22 10:05 PST, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
Patch for landing (20.25 KB, patch)
2018-01-22 12:38 PST, Ryosuke Niwa
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ryosuke Niwa 2018-01-12 15:30:08 PST
The blob conversion & sanitation upon paste doesn't work
when copying & pasting from Microsoft Word for Mac 2011
as reported by the latest comment on the bug 124391.

I can reproduce this issue locally as well.
Comment 1 Radar WebKit Bug Importer 2018-01-12 15:36:06 PST
<rdar://problem/36484908>
Comment 2 Ryosuke Niwa 2018-01-12 15:39:10 PST
P1 due to the security/privacy implications.
Comment 3 Ryosuke Niwa 2018-01-19 23:07:51 PST
Unfortunately, we can't make the image available in this case because Microsoft Word is putting only file URLs in the HTML content, and we can't grant access those files in the web content process.

Supporting this would involve starting a new web content process, parse the HTML to be pasted, then UI process to add sandbox extensions for those files in the web content process in which the content is pasted.
Comment 4 Ryosuke Niwa 2018-01-19 23:21:20 PST
For now, I'm going to address the bug that we're expoing file URLs in the pasted content.

Since Microsoft Word for Mac 2016 has been out for two years, the recommended path for users who are using 2011 version is to upgrade to 2016 version, which does put webarchive into the pasteboard, and image pasting works as expected in the latest version of Safari.
Comment 5 Ryosuke Niwa 2018-01-20 02:05:01 PST
Created attachment 331838 [details]
WIP
Comment 6 Andrew Herron 2018-01-21 16:26:13 PST
That's interesting. I had assumed you solved bug 124391 by reading in the files referenced by the file URLs :)
Comment 7 Ryosuke Niwa 2018-01-21 22:34:43 PST
Created attachment 331895 [details]
Fixes the bug
Comment 8 Ryosuke Niwa 2018-01-21 22:45:32 PST
Comment on attachment 331895 [details]
Fixes the bug

Oops, I uploaded this patch too early.
Comment 9 Ryosuke Niwa 2018-01-21 22:47:01 PST
Created attachment 331897 [details]
Fixes the bug
Comment 10 Wenson Hsieh 2018-01-22 09:52:31 PST
Comment on attachment 331897 [details]
Fixes the bug

View in context: https://bugs.webkit.org/attachment.cgi?id=331897&action=review

> Source/WebCore/ChangeLog:27
> +        (WebCore::WebContentReader::readHTML): Fixed the bug by sanitizing the markup, and stripping away file URLs.

Hm...this is a bit more aggressive than just stripping away file URLs, since we're stripping away all URLs that are not one of { http:, https:, data: } by using shouldReplaceSubresourceURL as the filter.

I understand that maintaining a whitelist of URLs to allow is safer than the reverse approach, though...but let's make it clear in the ChangeLog.

> Source/WebCore/editing/cocoa/WebContentReaderCocoa.mm:533
> +            removeSubresourceURLAttributes(fragment, [] (const URL& url ) {

Nit - extra space after URL& url

> Source/WebCore/editing/cocoa/WebContentReaderCocoa.mm:552
> +            removeSubresourceURLAttributes(fragment, [] (const URL& url ) {

Nit - extra space after URL& url

> Source/WebCore/editing/markup.h:54
> +String sanitizeMarkup(const String&, std::optional<std::function<void(DocumentFragment&)>> fragmentSanitizer = std::nullopt);

Nit - I think we generally prefer to use WTF::Function over std::function.

> Tools/TestWebKitAPI/Tests/WebKitCocoa/PasteHTML.mm:117
> +TEST(PasteHTML, StripsHTTPURLs)

Did you mean "StripsFileURLs"?

> Tools/TestWebKitAPI/Tests/WebKitCocoa/PasteHTML.mm:131
> +TEST(PasteHTML, DoesNotStripHTTPURLsWhenCustomPasteboardDataIsDisabled)

Ditto, looks like this was meant to be "DoesNotStripFileURLsWhenCustomPasteboardDataIsDisabled"
Comment 11 Ryosuke Niwa 2018-01-22 10:05:15 PST
Created attachment 331940 [details]
Patch for landing
Comment 12 Ryosuke Niwa 2018-01-22 10:06:07 PST
(In reply to Wenson Hsieh from comment #10)
> Comment on attachment 331897 [details]
> Fixes the bug
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=331897&action=review
> 
> > Source/WebCore/ChangeLog:27
> > +        (WebCore::WebContentReader::readHTML): Fixed the bug by sanitizing the markup, and stripping away file URLs.
> 
> Hm...this is a bit more aggressive than just stripping away file URLs, since
> we're stripping away all URLs that are not one of { http:, https:, data: }
> by using shouldReplaceSubresourceURL as the filter.

Good point. Fixed the change log.

> I understand that maintaining a whitelist of URLs to allow is safer than the
> reverse approach, though...but let's make it clear in the ChangeLog.
> 
> > Source/WebCore/editing/cocoa/WebContentReaderCocoa.mm:533
> > +            removeSubresourceURLAttributes(fragment, [] (const URL& url ) {
> 
> Nit - extra space after URL& url

Fixed.

> > Source/WebCore/editing/cocoa/WebContentReaderCocoa.mm:552
> > +            removeSubresourceURLAttributes(fragment, [] (const URL& url ) {
> 
> Nit - extra space after URL& url

Fixed.

> > Source/WebCore/editing/markup.h:54
> > +String sanitizeMarkup(const String&, std::optional<std::function<void(DocumentFragment&)>> fragmentSanitizer = std::nullopt);
> 
> Nit - I think we generally prefer to use WTF::Function over std::function.

Really? I thought we wanted to use std::function instead of WTF::Function going forward. Whatever. I'd use WTF::Function.

> > Tools/TestWebKitAPI/Tests/WebKitCocoa/PasteHTML.mm:117
> > +TEST(PasteHTML, StripsHTTPURLs)
> 
> Did you mean "StripsFileURLs"?

Fixed.

> > Tools/TestWebKitAPI/Tests/WebKitCocoa/PasteHTML.mm:131
> > +TEST(PasteHTML, DoesNotStripHTTPURLsWhenCustomPasteboardDataIsDisabled)
> 
> Ditto, looks like this was meant to be
> "DoesNotStripFileURLsWhenCustomPasteboardDataIsDisabled"

Fixed.
Comment 13 Wenson Hsieh 2018-01-22 10:12:50 PST
Comment on attachment 331897 [details]
Fixes the bug

View in context: https://bugs.webkit.org/attachment.cgi?id=331897&action=review

>>> Source/WebCore/editing/markup.h:54
>>> +String sanitizeMarkup(const String&, std::optional<std::function<void(DocumentFragment&)>> fragmentSanitizer = std::nullopt);
>> 
>> Nit - I think we generally prefer to use WTF::Function over std::function.
> 
> Really? I thought we wanted to use std::function instead of WTF::Function going forward. Whatever. I'd use WTF::Function.

Just asked Fil Pizlo — it's WTF::Function.
Comment 14 WebKit Commit Bot 2018-01-22 10:29:51 PST
Comment on attachment 331940 [details]
Patch for landing

Rejecting attachment 331940 [details] from commit-queue.

Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-01', 'build', '--no-clean', '--no-update', '--build-style=release', '--port=mac']" exit_code: 2 cwd: /Volumes/Data/EWS/WebKit

Last 500 characters of output:
urce22.cpp -o /Volumes/Data/EWS/WebKit/WebKitBuild/WebCore.build/Release/WebCore.build/Objects-normal/x86_64/UnifiedSource22.o

** BUILD FAILED **


The following build commands failed:
	CompileC /Volumes/Data/EWS/WebKit/WebKitBuild/WebCore.build/Release/WebCore.build/Objects-normal/x86_64/UnifiedSource8-mm.o /Volumes/Data/EWS/WebKit/WebKitBuild/Release/DerivedSources/WebCore/unified-sources/UnifiedSource8-mm.mm normal x86_64 objective-c++ com.apple.compilers.llvm.clang.1_0.compiler
(1 failure)

Full output: http://webkit-queues.webkit.org/results/6171151
Comment 15 Ryosuke Niwa 2018-01-22 12:38:59 PST
Created attachment 331949 [details]
Patch for landing
Comment 16 WebKit Commit Bot 2018-01-22 13:13:41 PST
Comment on attachment 331949 [details]
Patch for landing

Clearing flags on attachment: 331949

Committed r227351: <https://trac.webkit.org/changeset/227351>
Comment 17 WebKit Commit Bot 2018-01-22 13:13:43 PST
All reviewed patches have been landed.  Closing bug.