RESOLVED FIXED Bug 199975
WebKit should strip away system font names from the pasted content
https://bugs.webkit.org/show_bug.cgi?id=199975
Summary WebKit should strip away system font names from the pasted content
Ryosuke Niwa
Reported 2019-07-19 20:36:30 PDT
WebKit should strip away system font names from the pasted content that Cocoa's HTML writer generates. Otherwise, we end up using the default fallback font like Times New Roman and it would look wrong.
Attachments
Paste side fix (998 bytes, patch)
2019-07-19 20:43 PDT, Ryosuke Niwa
no flags
Fixes the bug (21.14 KB, patch)
2019-07-19 23:25 PDT, Ryosuke Niwa
no flags
Updated per review comment (21.36 KB, patch)
2019-07-22 14:08 PDT, Ryosuke Niwa
no flags
Updated per review comments (23.22 KB, patch)
2019-07-22 15:58 PDT, Ryosuke Niwa
no flags
Patch for landing (23.20 KB, patch)
2019-07-22 19:11 PDT, Ryosuke Niwa
no flags
Archive of layout-test-results from ews100 for mac-highsierra (3.39 MB, application/zip)
2019-07-22 20:12 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews107 for mac-highsierra-wk2 (2.92 MB, application/zip)
2019-07-22 20:21 PDT, EWS Watchlist
no flags
Archive of layout-test-results from ews116 for mac-highsierra (3.24 MB, application/zip)
2019-07-22 20:44 PDT, EWS Watchlist
no flags
Patch for landing (23.31 KB, patch)
2019-07-22 22:54 PDT, Ryosuke Niwa
no flags
Radar WebKit Bug Importer
Comment 1 2019-07-19 20:36:56 PDT
Ryosuke Niwa
Comment 2 2019-07-19 20:43:31 PDT
Created attachment 374539 [details] Paste side fix We can fix this either during copy or upon paste. Fixing this in the paste side is simple to understand. We strip away any .Apple / .SF font names we encounter in ReplaceSelectionCommand. Fixing this in the copy site works even when Cocoa HTML Writer generates HTML because we have copy-paste sanitization code which inserts the original HTML into a dummy document then re-serializes back to HTML before the actual pasting happens. This secondary serialization step uses our copy code. After going back & forth about the two approaches, I think we should go with the copy side fix. It has a few benefits: 1. It would only affect clients that opts-in to copy & paste sanitization. e.g. it won't affect clients of legacy WebKit API and those who opt out of custom pasteboard data. 2. We can preserve font names such as ".SF Blah" that a website inserts as some kind of house keeping purposes (who knows what kind of a website might be doing this). Copy side fix would only affect cross-site pasting and cross-app pasting, which is rare and less likely to be affected by this kind of house keeping code. 3. We won't expose bogus .Apple* or .SF* font names to websites that directly use event.clipboardData.getData during a paste event. Stripping away bogus markup like this is one of the key features / benefit of using copy & paste sanitization code.
Ryosuke Niwa
Comment 3 2019-07-19 23:25:14 PDT
Created attachment 374542 [details] Fixes the bug
Ryosuke Niwa
Comment 4 2019-07-19 23:28:53 PDT
I'm sure there is a better way to detect a system font name... There is some code to that end in platformFontLookupWithFamily and fontNameIsSystemFont in FontCacheCoreText.cpp but it's fairly deep down in font code. Myles, do you have any suggestions on how to expose some of those functions to editing & share code? It doesn't seem like we can really figure out the "bad" state with FontCascade or FontCascadeDescription.
Sam Weinig
Comment 5 2019-07-21 07:08:20 PDT
Comment on attachment 374542 [details] Fixes the bug View in context: https://bugs.webkit.org/attachment.cgi?id=374542&action=review > Source/WebCore/editing/EditingStyle.cpp:1293 > + String stringValue = value.cssText(); > + if (stringValue.startsWith('\'') || stringValue.startsWith('"')) > + stringValue = stringValue.substring(1); > + > + return startsWithLettersIgnoringASCIICase(stringValue, ".sf") || startsWithLettersIgnoringASCIICase(stringValue, ".apple"); Since this logic is a bit platform specific, I think this logic should probably go in one of the Platform layer Font classes. That way, if other platforms have fonts they need to exclude, they can as well.
Myles C. Maxfield
Comment 6 2019-07-21 10:24:02 PDT
Comment on attachment 374542 [details] Fixes the bug View in context: https://bugs.webkit.org/attachment.cgi?id=374542&action=review > Source/WebCore/ChangeLog:10 > + We need to strip away these font names upon paste to avoid these font names falling back to Times New Roman. Why are they falling back to Times New Roman? > Source/WebCore/editing/EditingStyle.cpp:1287 > + if (!hasOneValue) > + return false; This logic doesn't seem to match the function name. Either the function should be renamed to "isForbiddenSystemFont" or this should iterate through the entire list. > Source/WebCore/editing/EditingStyle.cpp:1291 > + if (stringValue.startsWith('\'') || stringValue.startsWith('"')) > + stringValue = stringValue.substring(1); This seems brittle. Instead of generating the cssText() and then trying to parse it here, we can just use the previously-parsed CSSValue itself. >> Source/WebCore/editing/EditingStyle.cpp:1293 >> + return startsWithLettersIgnoringASCIICase(stringValue, ".sf") || startsWithLettersIgnoringASCIICase(stringValue, ".apple"); > > Since this logic is a bit platform specific, I think this logic should probably go in one of the Platform layer Font classes. That way, if other platforms have fonts they need to exclude, they can as well. The Apple policy is that all dot-prefixed names are for internal use only, and should not be leaked into content (e.g. by copy and paste) Also, I'm not sure if you want to forbid "system-ui", which is a system font but wouldn't match this rule. However, "system-ui" is a standardized name, and is implemented in multiple browsers, so maybe it shouldn't be caught here. > Source/WebCore/editing/EditingStyle.cpp:1326 > + m_mutableStyle->removeProperty(CSSPropertyFontFamily); > + fromComputedStyle->removeProperty(CSSPropertyFontFamily); What about font-family: Helvetica, .SomeForbiddenFont;? Are you sure you want to remove the whole property, rather than just the individual item that is forbidden?
Ryosuke Niwa
Comment 7 2019-07-22 12:44:18 PDT
(In reply to Myles C. Maxfield from comment #6) > Comment on attachment 374542 [details] > Fixes the bug > > View in context: > https://bugs.webkit.org/attachment.cgi?id=374542&action=review > > > Source/WebCore/ChangeLog:10 > > + We need to strip away these font names upon paste to avoid these font names falling back to Times New Roman. > > Why are they falling back to Times New Roman? Because that's the default font of WKWebView. But this doesn't matter much. The important feature of a HTML-based editing is so that emails, web sites, etc... which are composed in WebKit to render properly in other browsers. So it's never okay to put bogus font names like these. > > Source/WebCore/editing/EditingStyle.cpp:1287 > > + if (!hasOneValue) > > + return false; > > This logic doesn't seem to match the function name. Either the function > should be renamed to "isForbiddenSystemFont" or this should iterate through > the entire list. > > > Source/WebCore/editing/EditingStyle.cpp:1291 > > + if (stringValue.startsWith('\'') || stringValue.startsWith('"')) > > + stringValue = stringValue.substring(1); > > This seems brittle. Instead of generating the cssText() and then trying to > parse it here, we can just use the previously-parsed CSSValue itself. Sure, I don't think this will really matter though because realistically speaking we're only concerned about the content generated by Cocoa HTML Writer, and nothing else. > >> Source/WebCore/editing/EditingStyle.cpp:1293 > >> + return startsWithLettersIgnoringASCIICase(stringValue, ".sf") || startsWithLettersIgnoringASCIICase(stringValue, ".apple"); > > > > Since this logic is a bit platform specific, I think this logic should probably go in one of the Platform layer Font classes. That way, if other platforms have fonts they need to exclude, they can as well. > > The Apple policy is that all dot-prefixed names are for internal use only, > and should not be leaked into content (e.g. by copy and paste) > > Also, I'm not sure if you want to forbid "system-ui", which is a system font > but wouldn't match this rule. However, "system-ui" is a standardized name, > and is implemented in multiple browsers, so maybe it shouldn't be caught > here. It's probably okay to not strip that since "system-ui" is a standard feature as you say, and no different from specifying a font name that doesn't exist elsewhere. Also, we don't know of any bug report where such a problem exists as of now. > > Source/WebCore/editing/EditingStyle.cpp:1326 > > + m_mutableStyle->removeProperty(CSSPropertyFontFamily); > > + fromComputedStyle->removeProperty(CSSPropertyFontFamily); > > What about font-family: Helvetica, .SomeForbiddenFont;? Are you sure you > want to remove the whole property, rather than just the individual item that > is forbidden? We don't care about those. Really, we're only concerned about the content generated by Cocoa HTML Writer. The chance of code ends up having some side effects in websites that happen to use that kind of font names is so small that it's not worth adding code for it. If we started worrying about every scenario in editing, we wouldn't fix a single bug.
Ryosuke Niwa
Comment 8 2019-07-22 12:49:07 PDT
(In reply to Sam Weinig from comment #5) > Comment on attachment 374542 [details] > Fixes the bug > > View in context: > https://bugs.webkit.org/attachment.cgi?id=374542&action=review > > > Source/WebCore/editing/EditingStyle.cpp:1293 > > + String stringValue = value.cssText(); > > + if (stringValue.startsWith('\'') || stringValue.startsWith('"')) > > + stringValue = stringValue.substring(1); > > + > > + return startsWithLettersIgnoringASCIICase(stringValue, ".sf") || startsWithLettersIgnoringASCIICase(stringValue, ".apple"); > > Since this logic is a bit platform specific, I think this logic should > probably go in one of the Platform layer Font classes. That way, if other > platforms have fonts they need to exclude, they can as well. Indeed that would be good. But which one? There are so many font related classes & files.
Ryosuke Niwa
Comment 9 2019-07-22 14:08:22 PDT
Created attachment 374630 [details] Updated per review comment
Sam Weinig
Comment 10 2019-07-22 14:16:10 PDT
(In reply to Ryosuke Niwa from comment #8) > (In reply to Sam Weinig from comment #5) > > Comment on attachment 374542 [details] > > Fixes the bug > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=374542&action=review > > > > > Source/WebCore/editing/EditingStyle.cpp:1293 > > > + String stringValue = value.cssText(); > > > + if (stringValue.startsWith('\'') || stringValue.startsWith('"')) > > > + stringValue = stringValue.substring(1); > > > + > > > + return startsWithLettersIgnoringASCIICase(stringValue, ".sf") || startsWithLettersIgnoringASCIICase(stringValue, ".apple"); > > > > Since this logic is a bit platform specific, I think this logic should > > probably go in one of the Platform layer Font classes. That way, if other > > platforms have fonts they need to exclude, they can as well. > > Indeed that would be good. But which one? There are so many font related > classes & files. Myles? FontCascadeDescription seems like an ok bet?
Ryosuke Niwa
Comment 11 2019-07-22 14:35:21 PDT
FWIW, I noted in https://bugs.webkit.org/show_bug.cgi?id=199975#c4, FontCacheCoreText.cpp is where system font name check currently lives. FontCascadeDescription is a slightly higher level abstraction so perhaps we want to expose a static function from FontCache class and then re-export it via FontCascadeDescription?
Myles C. Maxfield
Comment 12 2019-07-22 14:54:17 PDT
(In reply to Ryosuke Niwa from comment #11) > FWIW, I noted in https://bugs.webkit.org/show_bug.cgi?id=199975#c4, > FontCacheCoreText.cpp is where system font name check currently lives. > > FontCascadeDescription is a slightly higher level abstraction so perhaps we > want to expose a static function from FontCache class and then re-export it > via FontCascadeDescription? I don't have a philosophical preference. Wherever it seems to architecturally fit is fine with me.
Ryosuke Niwa
Comment 13 2019-07-22 15:16:36 PDT
(In reply to Myles C. Maxfield from comment #12) > (In reply to Ryosuke Niwa from comment #11) > > FWIW, I noted in https://bugs.webkit.org/show_bug.cgi?id=199975#c4, > > FontCacheCoreText.cpp is where system font name check currently lives. > > > > FontCascadeDescription is a slightly higher level abstraction so perhaps we > > want to expose a static function from FontCache class and then re-export it > > via FontCascadeDescription? > > I don't have a philosophical preference. Wherever it seems to > architecturally fit is fine with me. Okay, I'll just add it to FontCache.h for now. We can refactor it later if any.
Ryosuke Niwa
Comment 14 2019-07-22 15:58:48 PDT
Created attachment 374646 [details] Updated per review comments
Darin Adler
Comment 15 2019-07-22 18:48:59 PDT
Comment on attachment 374646 [details] Updated per review comments View in context: https://bugs.webkit.org/attachment.cgi?id=374646&action=review > Source/WebCore/editing/EditingStyle.cpp:1288 > + auto* firstItem = downcast<CSSValueList>(value).item(0); Since we’ve already checked that the length is 1, I don’t think we need to call this "firstItem" because we know it’s the *only* item. > Source/WebCore/editing/EditingStyle.cpp:1290 > + if (!firstItem) > + return false; This can’t happen. We don’t allow null pointers in CSS value lists; the function only returns null if we’re off the end of the list, which can’t happen here. Could even use itemWithoutBoundsCheck here if you like since we already checked the length. In fact, I suggest using a reference for the local variable, rather than a pointer, to avoid bogus null checks below. > Source/WebCore/editing/EditingStyle.cpp:1292 > + String stringValue; This is an unused local variable. > Source/WebCore/editing/EditingStyle.cpp:1294 > + if (!is<CSSPrimitiveValue>(firstItem)) > + return false; Written this way it includes a null check, so is redundant with the null check above. > Source/WebCore/editing/EditingStyle.cpp:1299 > + auto result = downcast<CSSPrimitiveValue>(firstItem)->getStringValue(); > + if (result.hasException()) > + return false; > + auto fontFamily = result.releaseReturnValue(); The getStringValue function is designed for use in JavaScript. The plain old stringValue function is easier to use in C++ code. Since it returns the null string when it's not a string you don’t even have to check, can just use the string. auto fontFamilyName = downcast<CSSPrimitiveValue>(*firstItem).stringValue(); But I think we know that this is a font family, so we don’t need to use the polymorphic stringValue(), instead we could just use this: auto fontFamilyName = downcast<CSSPrimitiveValue>(*firstItem). fontFamily().familyName(); Code would be slightly bigger and slightly faster. Not sure which is better.
Ryosuke Niwa
Comment 16 2019-07-22 18:55:06 PDT
(In reply to Darin Adler from comment #15) > Comment on attachment 374646 [details] > Updated per review comments > > View in context: > https://bugs.webkit.org/attachment.cgi?id=374646&action=review > > > Source/WebCore/editing/EditingStyle.cpp:1288 > > + auto* firstItem = downcast<CSSValueList>(value).item(0); > > Since we’ve already checked that the length is 1, I don’t think we need to > call this "firstItem" because we know it’s the *only* item. Sure, will rename. > > Source/WebCore/editing/EditingStyle.cpp:1290 > > + if (!firstItem) > > + return false; > > This can’t happen. We don’t allow null pointers in CSS value lists; the > function only returns null if we’re off the end of the list, which can’t > happen here. Could even use itemWithoutBoundsCheck here if you like since we > already checked the length. In fact, I suggest using a reference for the > local variable, rather than a pointer, to avoid bogus null checks below. Will do. > > Source/WebCore/editing/EditingStyle.cpp:1292 > > + String stringValue; > > This is an unused local variable. Oops, wil remove. > > Source/WebCore/editing/EditingStyle.cpp:1294 > > + if (!is<CSSPrimitiveValue>(firstItem)) > > + return false; > > Written this way it includes a null check, so is redundant with the null > check above. Will use a reference. > > Source/WebCore/editing/EditingStyle.cpp:1299 > > + auto result = downcast<CSSPrimitiveValue>(firstItem)->getStringValue(); > > + if (result.hasException()) > > + return false; > > + auto fontFamily = result.releaseReturnValue(); > > The getStringValue function is designed for use in JavaScript. The plain old > stringValue function is easier to use in C++ code. Since it returns the null > string when it's not a string you don’t even have to check, can just use the > string. Oh, I didn't see that. > auto fontFamilyName = > downcast<CSSPrimitiveValue>(*firstItem).stringValue(); > > But I think we know that this is a font family, so we don’t need to use the > polymorphic stringValue(), instead we could just use this: > > auto fontFamilyName = downcast<CSSPrimitiveValue>(*firstItem). > fontFamily().familyName(); > > Code would be slightly bigger and slightly faster. Not sure which is better. Okay, will do that.
Ryosuke Niwa
Comment 17 2019-07-22 19:11:25 PDT
Created attachment 374664 [details] Patch for landing
EWS Watchlist
Comment 18 2019-07-22 20:12:49 PDT
Comment on attachment 374664 [details] Patch for landing Attachment 374664 [details] did not pass mac-ews (mac): Output: https://webkit-queues.webkit.org/results/12792484 New failing tests: http/tests/images/hidpi-srcset-copy.html
EWS Watchlist
Comment 19 2019-07-22 20:12:51 PDT
Created attachment 374667 [details] Archive of layout-test-results from ews100 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews100 Port: mac-highsierra Platform: Mac OS X 10.13.6
EWS Watchlist
Comment 20 2019-07-22 20:21:31 PDT
Comment on attachment 374664 [details] Patch for landing Attachment 374664 [details] did not pass mac-wk2-ews (mac-wk2): Output: https://webkit-queues.webkit.org/results/12792492 New failing tests: http/tests/images/hidpi-srcset-copy.html
EWS Watchlist
Comment 21 2019-07-22 20:21:33 PDT
Created attachment 374670 [details] Archive of layout-test-results from ews107 for mac-highsierra-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-highsierra-wk2 Platform: Mac OS X 10.13.6
EWS Watchlist
Comment 22 2019-07-22 20:44:15 PDT
Comment on attachment 374664 [details] Patch for landing Attachment 374664 [details] did not pass mac-debug-ews (mac): Output: https://webkit-queues.webkit.org/results/12792481 New failing tests: http/tests/images/hidpi-srcset-copy.html
EWS Watchlist
Comment 23 2019-07-22 20:44:17 PDT
Created attachment 374671 [details] Archive of layout-test-results from ews116 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews116 Port: mac-highsierra Platform: Mac OS X 10.13.6
Ryosuke Niwa
Comment 24 2019-07-22 22:52:12 PDT
Ah, CSSValueMonospace is not a font family primitive type but rather a value ID.
Ryosuke Niwa
Comment 25 2019-07-22 22:54:05 PDT
Created attachment 374677 [details] Patch for landing
Ryosuke Niwa
Comment 26 2019-07-22 22:54:31 PDT
Comment on attachment 374677 [details] Patch for landing Wait for EWS again.
WebKit Commit Bot
Comment 27 2019-07-23 00:21:36 PDT
Comment on attachment 374677 [details] Patch for landing Clearing flags on attachment: 374677 Committed r247720: <https://trac.webkit.org/changeset/247720>
WebKit Commit Bot
Comment 28 2019-07-23 00:21:38 PDT
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.