Bug 77568

Summary: [Cocoa] Glyph lookup should be language-sensitive (specifically between Yiddish and Hebrew)
Product: WebKit Reporter: Ze'ev <zeevclem>
Component: Layout and RenderingAssignee: Myles C. Maxfield <mmaxfield>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, bashi, benjamin, buildbot, cdumez, cmarcelo, commit-queue, dbates, dino, efidler, ews-watchlist, falken, glenn, Hironori.Fujii, jdaggett, jonlee, jshin, koivisto, mail, mitz, mmaxfield, rniwa, shikisuen, simon.fraser, Stevan_White, thorton, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: 525.x (Safari 3.2)   
Hardware: Mac (Intel)   
OS: OS X 10.7   
See Also: https://bugs.webkit.org/show_bug.cgi?id=179102
Bug Depends on: 10874    
Bug Blocks: 179104, 179105    
Attachments:
Description Flags
Zip file containing the bug illustration files mentioned in the Description.
none
Serbian vs. Eastern Cyrillic
none
Serbian file as displayed by Konqueror 4.7.4
none
Serbian file as displayed by FireFox Nightly
none
Patch
none
Patch
none
Archive of layout-test-results from ews102 for mac-elcapitan
none
Archive of layout-test-results from ews107 for mac-elcapitan-wk2
none
Archive of layout-test-results from ews114 for mac-elcapitan
none
Archive of layout-test-results from ews124 for ios-simulator-wk2
none
Patch
none
Patch
none
Patch
none
Patch
none
Archive of layout-test-results from ews105 for mac-elcapitan-wk2
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
simon.fraser: review+
Doesn't affect font hash
none
Doesn't affect font hash
simon.fraser: review-
Doesn't affect font hash
none
Doesn't affect font hash
simon.fraser: review+
Doesn't affect font hash
none
Doesn't affect font hash none

Ze'ev
Reported 2012-02-01 12:10:27 PST
Created attachment 124985 [details] Zip file containing the bug illustration files mentioned in the Description. Both the Hebrew and Yiddish language use the same Hebrew character set. However, in Yiddish, for the "pasekh tsvey yudn" and "khirik yud" character combinations, the convention is to "raise" the vowel. This is done for Yiddish only, not Hebrew. It is possible on an HTML page to differentiate Yiddish text from Hebrew text using the "lang" attribute and to display the Yiddish glyphs when 'lang="yi"' is specified and to display the Hebrew glyphs when 'lang="he"' is specified; however, WebKit/Safari does not do this. This is not a WebKit-only issue, most browsers have this same bug. Firefox only recently fixed the bug in Firefox 10.0beta (see this Firefox bug, in particular the last few comments in the comment thread: https://bugzilla.mozilla.org/show_bug.cgi?id=694205). In order to demonstrate the problem and to illustrate the correct behaviour, I have attached a zip file with the following: 1. yiddish-hebrew.html : A basic example to illustrate the problem. 2. Firefox.jpg : A screen shot of the correct Hebrew & Yiddish vowel positioning in Firefox 10.0beta. 3. Safari.jpg : A screen shot showing the correct Hebrew & INCORRECT Yiddish vowel positioning in Safari. 4. FreeSans ttf files (4) : The FreeSans font used in the yiddish-hebrew.html example. FreeSans is one of the fonts in GNU FreeFont (http://www.gnu.org/software/freefont/) and the ttf files were built from the current SVN source files. These files should be used to replicate the issue because not all fonts provide support for the different Hebrew/Yiddish glyphs.
Attachments
Zip file containing the bug illustration files mentioned in the Description. (938.25 KB, application/x-zip-compressed)
2012-02-01 12:10 PST, Ze'ev
no flags
Serbian vs. Eastern Cyrillic (805 bytes, text/html)
2012-02-01 13:27 PST, Steve White
no flags
Serbian file as displayed by Konqueror 4.7.4 (32.74 KB, image/png)
2012-02-01 13:29 PST, Steve White
no flags
Serbian file as displayed by FireFox Nightly (47.66 KB, image/png)
2012-02-01 13:31 PST, Steve White
no flags
Patch (6.26 KB, patch)
2017-10-31 12:59 PDT, Myles C. Maxfield
no flags
Patch (7.44 KB, patch)
2017-10-31 13:09 PDT, Myles C. Maxfield
no flags
Archive of layout-test-results from ews102 for mac-elcapitan (1.73 MB, application/zip)
2017-10-31 14:02 PDT, Build Bot
no flags
Archive of layout-test-results from ews107 for mac-elcapitan-wk2 (2.20 MB, application/zip)
2017-10-31 14:15 PDT, Build Bot
no flags
Archive of layout-test-results from ews114 for mac-elcapitan (2.26 MB, application/zip)
2017-10-31 14:23 PDT, Build Bot
no flags
Archive of layout-test-results from ews124 for ios-simulator-wk2 (1.39 MB, application/zip)
2017-10-31 14:30 PDT, Build Bot
no flags
Patch (43.01 KB, patch)
2017-10-31 20:50 PDT, Myles C. Maxfield
no flags
Patch (17.72 KB, patch)
2017-11-01 00:08 PDT, Myles C. Maxfield
no flags
Patch (19.21 KB, patch)
2017-11-01 00:21 PDT, Myles C. Maxfield
no flags
Patch (19.19 KB, patch)
2017-11-01 00:34 PDT, Myles C. Maxfield
no flags
Archive of layout-test-results from ews105 for mac-elcapitan-wk2 (1.23 MB, application/zip)
2017-11-01 02:14 PDT, Build Bot
no flags
Patch (20.01 KB, patch)
2018-01-19 10:21 PST, Myles C. Maxfield
no flags
Patch (20.02 KB, patch)
2018-01-19 11:47 PST, Myles C. Maxfield
no flags
Patch (30.04 KB, patch)
2019-12-18 20:13 PST, Myles C. Maxfield
no flags
Patch (32.18 KB, patch)
2020-01-08 19:37 PST, Myles C. Maxfield
no flags
Patch (33.37 KB, patch)
2020-01-08 19:55 PST, Myles C. Maxfield
no flags
Patch (34.38 KB, patch)
2020-01-10 22:07 PST, Myles C. Maxfield
no flags
Patch (43.05 KB, patch)
2020-01-11 22:59 PST, Myles C. Maxfield
no flags
Patch (44.67 KB, patch)
2020-01-11 23:10 PST, Myles C. Maxfield
no flags
Patch (48.11 KB, patch)
2020-01-11 23:38 PST, Myles C. Maxfield
no flags
Patch (49.35 KB, patch)
2020-01-11 23:55 PST, Myles C. Maxfield
no flags
Patch (51.24 KB, patch)
2020-01-12 00:08 PST, Myles C. Maxfield
no flags
Patch (52.33 KB, patch)
2020-01-12 09:57 PST, Myles C. Maxfield
no flags
Patch (53.58 KB, patch)
2020-01-12 10:30 PST, Myles C. Maxfield
no flags
Patch (54.40 KB, patch)
2020-01-12 12:36 PST, Myles C. Maxfield
simon.fraser: review+
Doesn't affect font hash (32.10 KB, patch)
2020-01-13 17:40 PST, Myles C. Maxfield
no flags
Doesn't affect font hash (32.13 KB, patch)
2020-01-13 18:38 PST, Myles C. Maxfield
simon.fraser: review-
Doesn't affect font hash (32.12 KB, patch)
2020-01-13 18:39 PST, Myles C. Maxfield
no flags
Doesn't affect font hash (29.09 KB, patch)
2020-01-13 19:08 PST, Myles C. Maxfield
simon.fraser: review+
Doesn't affect font hash (29.16 KB, patch)
2020-01-13 22:05 PST, Myles C. Maxfield
no flags
Doesn't affect font hash (29.41 KB, patch)
2020-01-14 12:35 PST, Myles C. Maxfield
no flags
Steve White
Comment 1 2012-02-01 13:17:59 PST
Ze'ev was using a recent build of FreeSerif, from the sources available via SVN at https://savannah.gnu.org/svn/?group=freefont
Steve White
Comment 2 2012-02-01 13:27:17 PST
Created attachment 125002 [details] Serbian vs. Eastern Cyrillic Uses 'lang' attributes to select font features specific to a language. This one uses any recent version of the DejaVu fonts.
Steve White
Comment 3 2012-02-01 13:29:08 PST
Created attachment 125003 [details] Serbian file as displayed by Konqueror 4.7.4 Note that the Russian and Serbian lines are identical. That's not what is intended.
Steve White
Comment 4 2012-02-01 13:31:33 PST
Created attachment 125005 [details] Serbian file as displayed by FireFox Nightly FireFox nightly 12.0a1 (2012-01-31) (This is a quite recent improvement in FireFox.) Note that the Cursive letters are all different, and the 'de' letter is different in the normal face. This is desirable.
Steve White
Comment 5 2012-02-01 13:35:08 PST
So here is another example of the same problem as Ze'ev describes, except with a more accessible font, and this time Cyrillic. Again the HTML 'lang' specifies a language to pick out font features for the given script. Near as I can tell, all WebKit based browsers show this wrong behavior (although Pango and other font rendering libraries make the distinction quite well.) Cheers!
Matt Falkenhagen
Comment 6 2012-02-02 20:45:32 PST
Does this mean it's not sufficient to just choose the correct font for the language, which the work on bug 10874 has basically provided, but we must also ensure the correct glyphs within the font are chosen for the language?
Steve White
Comment 7 2012-02-03 04:07:50 PST
Hi Matt! Not sure just who you mean by "we" here. And I'm a little bewildered by the suggestion that this would *force* somebody to do something. This is an *enabling* technology. * If "we" means "we web page designers" We may be providing a web font with these special features. We do so by specifying the language with the 'lang' tag. But no, we don't *have to*. On the other hand, if we can rely upon browsers to do the right thing, this *allows* us to use the language-specific features of the font. * If "we" means "we WebKit developers" We simply pass the "lang" tag properly to the font rendering layer. It will do the rest. And our product will look good compared to the competition. But no, we don't have to excel, or even keep up with the competition. As to "correct font": yes, separate fonts can be used to display Hebrew and Yiddish, even though they differ only in the placement of a few letters. Likewise with various versions of Cyrillic, and even Latin. It's a waste, in several ways. This longstanding OpenType technology *enables* a simple way to make fine typographic distinctions between languages. If you're not a native reader of those languages, it seems a little delicate. But to native readers, it can make a big difference. Let me emphasize that this is easy. Just properly pass the 'lang' attribute down to the underlying font rendering libraries.
Kenichi Ishibashi
Comment 8 2012-02-03 04:51:29 PST
Hi Steve, (In reply to comment #7) > This longstanding OpenType technology *enables* a simple way to make fine typographic distinctions between languages. If you're not a native reader of those languages, it seems a little delicate. But to native readers, it can make a big difference. I agree that it could be a big difference. > Let me emphasize that this is easy. Just properly pass the 'lang' attribute down to the underlying font rendering libraries. What is the exact mean when you say "passing 'lang' attribute to the underling libraries"? WebKit uses different font rendering libraries for each port. For example, Mac Safari/Chrome use CoreText, Chrome Win uses Uniscribe, and Chrome Linux uses (old) HarfBuzz. Which APIs we can use to solve the problem? I'm happy to create patches if it's actually easy as you said. If you can tell us what OpenType feature should we use, it would be also helpful.
Matt Falkenhagen
Comment 9 2012-02-03 05:00:11 PST
(In reply to comment #7) Steve, thanks for the explanation. In my comment, I had just meant: "to fix the bug, we [people who want to fix the bug] must do ___."
Alexey Proskuryakov
Comment 10 2012-02-03 08:59:05 PST
The feature is locl, see <http://en.wikipedia.org/wiki/OpenType_feature_tag_list#OpenType_Typographic_Features>. Implementing this is likely non-trivial because of glyph cache, which will need to be made aware of this. The Mozilla bug Ze'ev mentioned did not have code changes, but it has links to actual bugs with code changes. Also, I don't know whether CoreText even supports this feature.
Steve White
Comment 11 2012-02-03 14:52:53 PST
Hi Kenichi, You asked What is the exact mean when you say "passing 'lang' attribute to the underling libraries Of course that depends on the underlying libraries. But the 'lang' attribute is just a string from a set of standard language codes. The only complication I can imagine here is that some APIs may be picky about the particular languge code standard they accept (but I don't know that.) The DOM-level *intent* is to mark the text content of the entity as being of a different language than that of enclosing entities. Just how that translates into calls to the underlying formatting software -- you would surely know better than I. In principle it should be easy though (barring the ususual premature optimizations etc). You also asked "what OpenType feature should we use" All OpenType features are ultimately activated by a list of script( lang1, lang2, ... ) Where one of the languages may be the 'default' language. The script is detected from the Unicode range of the input characters. The language is (in the present case) derived from 'lang' tags. The feature is activated if the Unicode of the input characters matches the script and the specified language matches one of the languages in the list. So the answer is "this applies to all OpenType feature lookups". In the Cyrillic example I provided, DejaVu Serif uses the 'locl' feature (this one feature is specifically intended to substitute individual glyphs based on language). In the Hebrew example Ze'ev gave, FreeSans uses 'mark' and 'mkmk' positioning that depend upon language. I'm currently working on another font in which various Indic lookups ('pres', 'blwf', etc) depend upon language. As to the specifics of API calls you need to make: you would know better than I. I can only tell you: they are certainly present on each platform. Let me know if I can provide further examples. I have, e.g. some examples of pango-view making different output based upon language specification. Cheers!
Alexey Proskuryakov
Comment 12 2013-06-01 18:49:19 PDT
*** Bug 117109 has been marked as a duplicate of this bug. ***
Alexey Proskuryakov
Comment 13 2013-08-05 09:45:22 PDT
Shiki Suen
Comment 14 2016-03-06 11:45:34 PST
I tested WebKit-SVN-r197635 nightly build and have found it supports at least 'palt' feature.
Myles C. Maxfield
Comment 15 2016-03-09 16:56:15 PST
With CoreText, this is probably done with kCTLanguageAttributeName instead of font features.
Myles C. Maxfield
Comment 16 2016-03-09 17:19:46 PST
This is implemented in the text shaping step, so that is the part which should be updated to implement this fix. This needs to happen in both the simple text code path and the complex text code path.
Myles C. Maxfield
Comment 17 2016-03-09 17:21:02 PST
Background info: there are some font features which are enabled by default. These features may (or may not!) consult with a language in order to perform their shaping.
Myles C. Maxfield
Comment 18 2017-10-31 12:59:09 PDT
Myles C. Maxfield
Comment 19 2017-10-31 13:09:26 PDT
mitz
Comment 20 2017-10-31 13:12:58 PDT
Comment on attachment 325480 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325480&action=review > Source/WebCore/ChangeLog:21 > + (WebCore::FontCascade::characterRangeCodePath): Any language which has language-specific shaping (such as > + Hebrew) needs to use the complex path. Sounds like this patch is making a bad tradeoff: it is slowing down rendering of all Hebrew text just because of something that happens on fewer than one in ten thousand webpages. Is that so?
Myles C. Maxfield
Comment 21 2017-10-31 13:24:53 PDT
(In reply to mitz from comment #20) > Comment on attachment 325480 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=325480&action=review > > > Source/WebCore/ChangeLog:21 > > + (WebCore::FontCascade::characterRangeCodePath): Any language which has language-specific shaping (such as > > + Hebrew) needs to use the complex path. > > Sounds like this patch is making a bad tradeoff: it is slowing down > rendering of all Hebrew text just because of something that happens on fewer > than one in ten thousand webpages. Is that so? The character scan check must happen before we know which font will be used to render the character, and we need to know which font will be used to render the character before we know whether or not there could be language-specific alternates. It seems that the only way to solve this problem is to either 1) render Hebrew / Yiddish text incorrectly, or 2) completely eliminate one of our two text codepaths.
Myles C. Maxfield
Comment 22 2017-10-31 13:26:03 PDT
(In reply to Myles C. Maxfield from comment #21) > (In reply to mitz from comment #20) > > Comment on attachment 325480 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=325480&action=review > > > > > Source/WebCore/ChangeLog:21 > > > + (WebCore::FontCascade::characterRangeCodePath): Any language which has language-specific shaping (such as > > > + Hebrew) needs to use the complex path. > > > > Sounds like this patch is making a bad tradeoff: it is slowing down > > rendering of all Hebrew text just because of something that happens on fewer > > than one in ten thousand webpages. Is that so? > > The character scan check must happen before we know which font will be used > to render the character, and we need to know which font will be used to > render the character before we know whether or not there could be > language-specific alternates. It seems that the only way to solve this > problem is to either 1) render Hebrew / Yiddish text incorrectly, or 2) > completely eliminate one of our two text codepaths. I guess we could also fix this for Devanagari but not for Hebrew because Devanagari already uses the complex text codepath.
mitz
Comment 23 2017-10-31 13:30:32 PDT
(In reply to Myles C. Maxfield from comment #21) > (In reply to mitz from comment #20) > > Comment on attachment 325480 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=325480&action=review > > > > > Source/WebCore/ChangeLog:21 > > > + (WebCore::FontCascade::characterRangeCodePath): Any language which has language-specific shaping (such as > > > + Hebrew) needs to use the complex path. > > > > Sounds like this patch is making a bad tradeoff: it is slowing down > > rendering of all Hebrew text just because of something that happens on fewer > > than one in ten thousand webpages. Is that so? > > The character scan check must happen before we know which font will be used > to render the character, and we need to know which font will be used to > render the character before we know whether or not there could be > language-specific alternates. It seems that the only way to solve this > problem is to either 1) render Hebrew / Yiddish text incorrectly, or 2) > completely eliminate one of our two text codepaths. I don’t know that there are indeed only two possible choices here, but my understanding is that we shouldn’t make code changes that slow things down.
Build Bot
Comment 24 2017-10-31 14:02:29 PDT
Comment on attachment 325480 [details] Patch Attachment 325480 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/5054208 New failing tests: fast/text/international/pop-up-button-text-alignment-and-direction.html fast/text/international/bidi-LDB-2-formatting-characters.html svg/text/bidi-reorder-value-lists.svg svg/W3C-SVG-1.1-SE/text-intro-09-b.svg fast/ruby/ruby-expansion-cjk-3.html
Build Bot
Comment 25 2017-10-31 14:02:31 PDT
Created attachment 325488 [details] Archive of layout-test-results from ews102 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews102 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 26 2017-10-31 14:15:32 PDT
Comment on attachment 325480 [details] Patch Attachment 325480 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/5054341 New failing tests: fast/text/international/pop-up-button-text-alignment-and-direction.html fast/ruby/ruby-expansion-cjk-3.html fast/text/international/bidi-LDB-2-formatting-characters.html svg/text/bidi-reorder-value-lists.svg svg/W3C-SVG-1.1-SE/text-intro-09-b.svg fast/ruby/ruby-expansion-cjk-2.html
Build Bot
Comment 27 2017-10-31 14:15:34 PDT
Created attachment 325492 [details] Archive of layout-test-results from ews107 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6
Build Bot
Comment 28 2017-10-31 14:22:59 PDT
Comment on attachment 325480 [details] Patch Attachment 325480 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/5054281 New failing tests: fast/text/international/pop-up-button-text-alignment-and-direction.html fast/text/international/bidi-LDB-2-formatting-characters.html svg/text/bidi-reorder-value-lists.svg svg/W3C-SVG-1.1-SE/text-intro-09-b.svg fast/ruby/ruby-expansion-cjk-3.html
Build Bot
Comment 29 2017-10-31 14:23:00 PDT
Created attachment 325494 [details] Archive of layout-test-results from ews114 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Build Bot
Comment 30 2017-10-31 14:30:21 PDT
Comment on attachment 325480 [details] Patch Attachment 325480 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/5054363 New failing tests: fast/text/international/pop-up-button-text-alignment-and-direction.html svg/W3C-SVG-1.1-SE/text-intro-09-b.svg
Build Bot
Comment 31 2017-10-31 14:30:23 PDT
Created attachment 325497 [details] Archive of layout-test-results from ews124 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews124 Port: ios-simulator-wk2 Platform: Mac OS X 10.12.6
Myles C. Maxfield
Comment 32 2017-10-31 14:51:02 PDT
Comment on attachment 325480 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325480&action=review >>>>> Source/WebCore/ChangeLog:21 >>>>> + Hebrew) needs to use the complex path. >>>> >>>> Sounds like this patch is making a bad tradeoff: it is slowing down rendering of all Hebrew text just because of something that happens on fewer than one in ten thousand webpages. Is that so? >>> >>> The character scan check must happen before we know which font will be used to render the character, and we need to know which font will be used to render the character before we know whether or not there could be language-specific alternates. It seems that the only way to solve this problem is to either 1) render Hebrew / Yiddish text incorrectly, or 2) completely eliminate one of our two text codepaths. >> >> I guess we could also fix this for Devanagari but not for Hebrew because Devanagari already uses the complex text codepath. > > I don’t know that there are indeed only two possible choices here, but my understanding is that we shouldn’t make code changes that slow things down. OpenType allows language dependent behavior for most scripts – see, for example, the Writing System Tags appendix for Latin/Greek/Cyrillic: https://www.microsoft.com/typography/OpenTypeDev/standard/intro.htm
Myles C. Maxfield
Comment 33 2017-10-31 20:50:09 PDT
Myles C. Maxfield
Comment 34 2017-10-31 20:56:42 PDT
Comment on attachment 325541 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325541&action=review > Source/WebCore/platform/graphics/FontCascade.cpp:333 > + simpleIterator.finalize(nullptr); All the calls to finalize() are probably not correct, because we should only finalize when we are iterating to the very end of the TextRun.
Myles C. Maxfield
Comment 35 2017-10-31 21:03:53 PDT
Comment on attachment 325541 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325541&action=review I need to split this patch up into a few different pieces: 1) Refactoring GlyphBuffer 2) Using CTFontTransformGlyphsWithLanguage(), but with no handler and no initial advance 3) Adding a handler to CTFontTransformGlyphsWithLanguage() 4) Adding initial advance support to WidthIterator > Source/WebCore/platform/graphics/WidthIterator.cpp:127 > + widthDifference += m_rtlInitialAdvance.width(); This is not correct. Initial advances are paint advances, not layout advances.
Myles C. Maxfield
Comment 36 2017-10-31 21:04:28 PDT
Comment on attachment 325541 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325541&action=review >> Source/WebCore/platform/graphics/FontCascade.cpp:333 >> + simpleIterator.finalize(nullptr); > > All the calls to finalize() are probably not correct, because we should only finalize when we are iterating to the very end of the TextRun. Or are they paint advances? I don't think this API has that distinction at all.
Myles C. Maxfield
Comment 37 2017-11-01 00:08:37 PDT
Myles C. Maxfield
Comment 38 2017-11-01 00:21:51 PDT
Myles C. Maxfield
Comment 39 2017-11-01 00:34:57 PDT
Build Bot
Comment 40 2017-11-01 02:14:33 PDT
Comment on attachment 325552 [details] Patch Attachment 325552 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/5060732 New failing tests: imported/w3c/web-platform-tests/service-workers/cache-storage/serviceworker/cache-match.https.html
Build Bot
Comment 41 2017-11-01 02:14:35 PDT
Created attachment 325557 [details] Archive of layout-test-results from ews105 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews105 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6
Myles C. Maxfield
Comment 42 2017-11-07 12:08:39 PST
Test failure is unrelated.
mitz
Comment 43 2017-11-07 13:02:45 PST
Comment on attachment 325552 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325552&action=review > Source/WebCore/platform/graphics/FontPlatformData.h:250 > + AtomicString m_locale; Is it OK that this member variable doesn’t participate in operator== and in hash()?
Myles C. Maxfield
Comment 44 2017-11-07 18:02:02 PST
Comment on attachment 325552 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325552&action=review >> Source/WebCore/platform/graphics/FontPlatformData.h:250 >> + AtomicString m_locale; > > Is it OK that this member variable doesn’t participate in operator== and in hash()? Nope.
Myles C. Maxfield
Comment 45 2018-01-19 10:21:59 PST
Myles C. Maxfield
Comment 46 2018-01-19 11:47:39 PST
Myles C. Maxfield
Comment 47 2018-05-03 11:18:01 PDT
It's easy to see the problem with initial advances using the test fonts at https://bugs.webkit.org/show_bug.cgi?id=185062
Myles C. Maxfield
Comment 48 2018-10-20 21:43:30 PDT
Comment on attachment 331762 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=331762&action=review > Source/WebCore/platform/graphics/mac/SimpleFontDataCoreText.cpp:48 > + if (!platformData().locale().isEmpty()) > + CFDictionarySetValue(attributesDictionary.get(), kCTLanguageAttributeName, platformData().locale().string().createCFString().get()); This now needs additional attributes: kCTParagraphStyleAttributeName -> { CTParagraphStyleCreate(...), CTParagraphStyleSetCompositionLanguage(..., kCTCompositionLanguageNone) }
Myles C. Maxfield
Comment 49 2019-12-18 19:06:40 PST
Remaining tasks: 1) Delete CharactersTreatedAsSpace and use string indices instead. This probably necessitates using CTFontShapeGlyphs(). 2) Use the initial advance returned 3) Hook up the callback to allow glyphs to be inserted (the glyph backing store to be reallocated)
Myles C. Maxfield
Comment 50 2019-12-18 20:13:42 PST
Myles C. Maxfield
Comment 51 2020-01-08 19:37:32 PST
Myles C. Maxfield
Comment 52 2020-01-08 19:44:49 PST
One possible test: Rendering U+015E using Avenir-Roman should look different if the language is set to "ro"
Myles C. Maxfield
Comment 53 2020-01-08 19:51:24 PST
Another test: Rendering U+00ED using ComicSansMS should look different if the language is set to "lt"
Myles C. Maxfield
Comment 54 2020-01-08 19:55:58 PST
Myles C. Maxfield
Comment 55 2020-01-10 22:07:57 PST
Myles C. Maxfield
Comment 56 2020-01-11 22:59:59 PST
Myles C. Maxfield
Comment 57 2020-01-11 23:10:17 PST
Myles C. Maxfield
Comment 58 2020-01-11 23:38:06 PST
Myles C. Maxfield
Comment 59 2020-01-11 23:55:36 PST
Myles C. Maxfield
Comment 60 2020-01-12 00:08:39 PST
Myles C. Maxfield
Comment 61 2020-01-12 09:57:02 PST
Myles C. Maxfield
Comment 62 2020-01-12 10:30:34 PST
Myles C. Maxfield
Comment 63 2020-01-12 12:36:11 PST
Simon Fraser (smfr)
Comment 64 2020-01-13 13:55:16 PST
Comment on attachment 387487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=387487&action=review > Source/WebCore/platform/graphics/GlyphBuffer.h:172 > + for (size_t i = 0; i < length; ++i) > + m_fonts[location + i] = font; Maybe i = location; I < location + length, and you could precompute location + length as end. Also originalSize - location is computed multiple times. > Source/WebCore/platform/graphics/WidthIterator.cpp:126 > + // This is totally, 100%, furiously, utterly, frustratingly bogus. > + // There is absolutely no guarantee that glyph indices before shaping have any relation at all with glyph indices after shaping. Perhaps turn this anguished comment into an actionable bug and reference it here. > Source/WebCore/platform/graphics/cocoa/FontCocoa.mm:562 > + auto handler = ^(CFRange range, CGGlyph** newGlyphsPointer, CGSize** newAdvancesPointer) { > + range.location = std::min(std::max(range.location, static_cast<CFIndex>(0)), static_cast<CFIndex>(glyphBuffer.size())); > + if (range.length < 0) { > + range.length = std::min(range.location, -range.length); > + range.location = range.location - range.length; > + glyphBuffer.remove(beginningIndex + range.location, range.length); > + } else > + glyphBuffer.makeHole(beginningIndex + range.location, range.length, this); > + *newGlyphsPointer = glyphBuffer.glyphs(beginningIndex); > + *newAdvancesPointer = glyphBuffer.advances(beginningIndex); > + }; > + CTFontTransformGlyphsWithLanguage(m_platformData.ctFont(), glyphBuffer.glyphs(beginningIndex), reinterpret_cast<CGSize*>(glyphBuffer.advances(beginningIndex)), glyphBuffer.size() - beginningIndex, options, platformData().locale().string().createCFString().get(), handler); It seems that this could fall into pathological O(n^2) cases where makeHole or glyphBuffer.remove are called over and over and each time move the rest of the buffers. Is there a more efficient way?
Myles C. Maxfield
Comment 65 2020-01-13 13:58:01 PST
I should see if there's a way to do this without affecting the font hash
Myles C. Maxfield
Comment 66 2020-01-13 16:17:41 PST
Comment on attachment 387487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=387487&action=review >> Source/WebCore/platform/graphics/cocoa/FontCocoa.mm:562 >> + CTFontTransformGlyphsWithLanguage(m_platformData.ctFont(), glyphBuffer.glyphs(beginningIndex), reinterpret_cast<CGSize*>(glyphBuffer.advances(beginningIndex)), glyphBuffer.size() - beginningIndex, options, platformData().locale().string().createCFString().get(), handler); > > It seems that this could fall into pathological O(n^2) cases where makeHole or glyphBuffer.remove are called over and over and each time move the rest of the buffers. Is there a more efficient way? Yes! I asked the same question of the CoreText team. They replied that any solution to that would involve some more complicated data structure to hold the glyph sequence (like a linked list or a rope or something) which would involve redesigning significant parts of the internals of CoreText. They assume that the glyph sequence is tightly packed in a single array throughout their framework. This "pathological" case happens rarely enough that it isn't worth redesigning huge parts of the CoreText framework.
Myles C. Maxfield
Comment 67 2020-01-13 17:40:11 PST
Created attachment 387600 [details] Doesn't affect font hash
Myles C. Maxfield
Comment 68 2020-01-13 18:38:08 PST
Created attachment 387606 [details] Doesn't affect font hash
Myles C. Maxfield
Comment 69 2020-01-13 18:39:56 PST
Created attachment 387607 [details] Doesn't affect font hash
Simon Fraser (smfr)
Comment 70 2020-01-13 18:45:16 PST
Comment on attachment 387606 [details] Doesn't affect font hash View in context: https://bugs.webkit.org/attachment.cgi?id=387606&action=review > Source/WebCore/platform/graphics/Font.h:286 > + struct CFStringAttributesKey { This struct name makes it sound like attributes of a CFString. > Source/WebCore/platform/graphics/Font.h:322 > + bool enableKerning; > + FontOrientation orientation; > + AtomString locale; It bugs me seeing small members before big members (packing). > Source/WebCore/platform/graphics/GlyphBuffer.h:167 > + auto originalSize = size(); Maybe assert that location < size(). > Source/WebCore/platform/graphics/GlyphBuffer.h:170 > + m_fonts.grow(length); This is wrong. grow() takes the new size, so you want size() + length.
Myles C. Maxfield
Comment 71 2020-01-13 19:08:50 PST
Created attachment 387608 [details] Doesn't affect font hash
Myles C. Maxfield
Comment 72 2020-01-13 22:05:26 PST
Created attachment 387616 [details] Doesn't affect font hash
Myles C. Maxfield
Comment 73 2020-01-14 12:35:43 PST
Created attachment 387690 [details] Doesn't affect font hash
WebKit Commit Bot
Comment 74 2020-01-14 14:13:59 PST
Comment on attachment 387690 [details] Doesn't affect font hash Clearing flags on attachment: 387690 Committed r254534: <https://trac.webkit.org/changeset/254534>
Antti Koivisto
Comment 75 2020-01-29 23:06:18 PST
Comment on attachment 387487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=387487&action=review > Source/WebCore/platform/graphics/FontCascade.cpp:421 > + GlyphBuffer glyphBuffer; > Vector<GlyphBufferGlyph, 16> glyphs; > Vector<GlyphBufferAdvance, 16> advances; GlyphBuffer is a large type (somewhere in ~50KB range). While it is stack-allocated here maybe we should still be worried about memory locality in this super hot function? Both of the vectors are now unused.
Antti Koivisto
Comment 76 2020-01-30 00:42:39 PST
Comment on attachment 387487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=387487&action=review >>> Source/WebCore/platform/graphics/cocoa/FontCocoa.mm:562 >>> + CTFontTransformGlyphsWithLanguage(m_platformData.ctFont(), glyphBuffer.glyphs(beginningIndex), reinterpret_cast<CGSize*>(glyphBuffer.advances(beginningIndex)), glyphBuffer.size() - beginningIndex, options, platformData().locale().string().createCFString().get(), handler); >> >> It seems that this could fall into pathological O(n^2) cases where makeHole or glyphBuffer.remove are called over and over and each time move the rest of the buffers. Is there a more efficient way? > > Yes! I asked the same question of the CoreText team. They replied that any solution to that would involve some more complicated data structure to hold the glyph sequence (like a linked list or a rope or something) which would involve redesigning significant parts of the internals of CoreText. They assume that the glyph sequence is tightly packed in a single array throughout their framework. This "pathological" case happens rarely enough that it isn't worth redesigning huge parts of the CoreText framework. We only need this stuff for a limited set of known languages right? Maybe we just use the old code path in common cases if this turns out to be slower?
Myles C. Maxfield
Comment 77 2022-01-21 11:13:37 PST
Comment on attachment 387487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=387487&action=review >>>> Source/WebCore/platform/graphics/cocoa/FontCocoa.mm:562 >>>> + CTFontTransformGlyphsWithLanguage(m_platformData.ctFont(), glyphBuffer.glyphs(beginningIndex), reinterpret_cast<CGSize*>(glyphBuffer.advances(beginningIndex)), glyphBuffer.size() - beginningIndex, options, platformData().locale().string().createCFString().get(), handler); >>> >>> It seems that this could fall into pathological O(n^2) cases where makeHole or glyphBuffer.remove are called over and over and each time move the rest of the buffers. Is there a more efficient way? >> >> Yes! I asked the same question of the CoreText team. They replied that any solution to that would involve some more complicated data structure to hold the glyph sequence (like a linked list or a rope or something) which would involve redesigning significant parts of the internals of CoreText. They assume that the glyph sequence is tightly packed in a single array throughout their framework. This "pathological" case happens rarely enough that it isn't worth redesigning huge parts of the CoreText framework. > > We only need this stuff for a limited set of known languages right? Maybe we just use the old code path in common cases if this turns out to be slower? Yes, we could. Ever since https://trac.webkit.org/changeset/255988/webkit the locale-specific shaping hasn’t been a regression, so at least for now it doesn't seem like it's necessary to redesign this code. If we ever do encounter some content which triggers this pathological behavior, we can redesign this stuff at that point.
Note You need to log in before you can comment on or make changes to this bug.