Bug 77568

Summary:

[Cocoa] Glyph lookup should be language-sensitive (specifically between Yiddish and Hebrew)

Product:

WebKit

Reporter:

Ze'ev <zeevclem>

Component:

Layout and Rendering

Assignee:

Myles C. Maxfield <mmaxfield>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

ap, bashi, benjamin, buildbot, cdumez, cmarcelo, commit-queue, dbates, dino, efidler, ews-watchlist, falken, glenn, Hironori.Fujii, jdaggett, jonlee, jshin, koivisto, mail, mitz, mmaxfield, rniwa, shikisuen, simon.fraser, Stevan_White, thorton, webkit-bug-importer, youennf

Priority:

Keywords:

InRadar

Version:

525.x (Safari 3.2)

Hardware:

Mac (Intel)

OS:

OS X 10.7

See Also:

https://bugs.webkit.org/show_bug.cgi?id=179102

Bug Depends on:

10874

Bug Blocks:

179104, 179105

Attachments:

Description	Flags
Zip file containing the bug illustration files mentioned in the Description.	none
Serbian vs. Eastern Cyrillic	none
Serbian file as displayed by Konqueror 4.7.4	none
Serbian file as displayed by FireFox Nightly	none
Patch	none
Patch	none
Archive of layout-test-results from ews102 for mac-elcapitan	none
Archive of layout-test-results from ews107 for mac-elcapitan-wk2	none
Archive of layout-test-results from ews114 for mac-elcapitan	none
Archive of layout-test-results from ews124 for ios-simulator-wk2	none
Patch	none
Patch	none
Patch	none
Patch	none
Archive of layout-test-results from ews105 for mac-elcapitan-wk2	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch	none
Patch	simon.fraser: review+
Doesn't affect font hash	none
Doesn't affect font hash	simon.fraser: review-
Doesn't affect font hash	none
Doesn't affect font hash	simon.fraser: review+
Doesn't affect font hash	none
Doesn't affect font hash	none

Ze'ev

Reported 2012-02-01 12:10:27 PST

Created attachment 124985 [details] Zip file containing the bug illustration files mentioned in the Description. Both the Hebrew and Yiddish language use the same Hebrew character set. However, in Yiddish, for the "pasekh tsvey yudn" and "khirik yud" character combinations, the convention is to "raise" the vowel. This is done for Yiddish only, not Hebrew. It is possible on an HTML page to differentiate Yiddish text from Hebrew text using the "lang" attribute and to display the Yiddish glyphs when 'lang="yi"' is specified and to display the Hebrew glyphs when 'lang="he"' is specified; however, WebKit/Safari does not do this. This is not a WebKit-only issue, most browsers have this same bug. Firefox only recently fixed the bug in Firefox 10.0beta (see this Firefox bug, in particular the last few comments in the comment thread: https://bugzilla.mozilla.org/show_bug.cgi?id=694205). In order to demonstrate the problem and to illustrate the correct behaviour, I have attached a zip file with the following: 1. yiddish-hebrew.html : A basic example to illustrate the problem. 2. Firefox.jpg : A screen shot of the correct Hebrew & Yiddish vowel positioning in Firefox 10.0beta. 3. Safari.jpg : A screen shot showing the correct Hebrew & INCORRECT Yiddish vowel positioning in Safari. 4. FreeSans ttf files (4) : The FreeSans font used in the yiddish-hebrew.html example. FreeSans is one of the fonts in GNU FreeFont (http://www.gnu.org/software/freefont/) and the ttf files were built from the current SVN source files. These files should be used to replicate the issue because not all fonts provide support for the different Hebrew/Yiddish glyphs.

Attachments
Zip file containing the bug illustration files mentioned in the Description. (938.25 KB, application/x-zip-compressed) 2012-02-01 12:10 PST, Ze'ev	no flags	Details
Serbian vs. Eastern Cyrillic (805 bytes, text/html) 2012-02-01 13:27 PST, Steve White	no flags	Details
Serbian file as displayed by Konqueror 4.7.4 (32.74 KB, image/png) 2012-02-01 13:29 PST, Steve White	no flags	Details
Serbian file as displayed by FireFox Nightly (47.66 KB, image/png) 2012-02-01 13:31 PST, Steve White	no flags	Details
Patch (6.26 KB, patch) 2017-10-31 12:59 PDT, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (7.44 KB, patch) 2017-10-31 13:09 PDT, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Archive of layout-test-results from ews102 for mac-elcapitan (1.73 MB, application/zip) 2017-10-31 14:02 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews107 for mac-elcapitan-wk2 (2.20 MB, application/zip) 2017-10-31 14:15 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews114 for mac-elcapitan (2.26 MB, application/zip) 2017-10-31 14:23 PDT, Build Bot	no flags	Details
Archive of layout-test-results from ews124 for ios-simulator-wk2 (1.39 MB, application/zip) 2017-10-31 14:30 PDT, Build Bot	no flags	Details
Patch (43.01 KB, patch) 2017-10-31 20:50 PDT, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (17.72 KB, patch) 2017-11-01 00:08 PDT, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (19.21 KB, patch) 2017-11-01 00:21 PDT, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (19.19 KB, patch) 2017-11-01 00:34 PDT, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Archive of layout-test-results from ews105 for mac-elcapitan-wk2 (1.23 MB, application/zip) 2017-11-01 02:14 PDT, Build Bot	no flags	Details
Patch (20.01 KB, patch) 2018-01-19 10:21 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (20.02 KB, patch) 2018-01-19 11:47 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (30.04 KB, patch) 2019-12-18 20:13 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (32.18 KB, patch) 2020-01-08 19:37 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (33.37 KB, patch) 2020-01-08 19:55 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (34.38 KB, patch) 2020-01-10 22:07 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (43.05 KB, patch) 2020-01-11 22:59 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (44.67 KB, patch) 2020-01-11 23:10 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (48.11 KB, patch) 2020-01-11 23:38 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (49.35 KB, patch) 2020-01-11 23:55 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (51.24 KB, patch) 2020-01-12 00:08 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (52.33 KB, patch) 2020-01-12 09:57 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (53.58 KB, patch) 2020-01-12 10:30 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (54.40 KB, patch) 2020-01-12 12:36 PST, Myles C. Maxfield	simon.fraser: review+	Details Formatted Diff Diff
Doesn't affect font hash (32.10 KB, patch) 2020-01-13 17:40 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Doesn't affect font hash (32.13 KB, patch) 2020-01-13 18:38 PST, Myles C. Maxfield	simon.fraser: review-	Details Formatted Diff Diff
Doesn't affect font hash (32.12 KB, patch) 2020-01-13 18:39 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Doesn't affect font hash (29.09 KB, patch) 2020-01-13 19:08 PST, Myles C. Maxfield	simon.fraser: review+	Details Formatted Diff Diff
Doesn't affect font hash (29.16 KB, patch) 2020-01-13 22:05 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Doesn't affect font hash (29.41 KB, patch) 2020-01-14 12:35 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Show Obsolete (29) View All Add attachment proposed patch, testcase, etc.

Steve White

Comment 1 2012-02-01 13:17:59 PST

Ze'ev was using a recent build of FreeSerif, from the sources available via SVN at https://savannah.gnu.org/svn/?group=freefont

Steve White

Comment 2 2012-02-01 13:27:17 PST

Created attachment 125002 [details] Serbian vs. Eastern Cyrillic Uses 'lang' attributes to select font features specific to a language. This one uses any recent version of the DejaVu fonts.

Steve White

Comment 3 2012-02-01 13:29:08 PST

Created attachment 125003 [details] Serbian file as displayed by Konqueror 4.7.4 Note that the Russian and Serbian lines are identical. That's not what is intended.

Steve White

Comment 4 2012-02-01 13:31:33 PST

Created attachment 125005 [details] Serbian file as displayed by FireFox Nightly FireFox nightly 12.0a1 (2012-01-31) (This is a quite recent improvement in FireFox.) Note that the Cursive letters are all different, and the 'de' letter is different in the normal face. This is desirable.

Steve White

Comment 5 2012-02-01 13:35:08 PST

So here is another example of the same problem as Ze'ev describes, except with a more accessible font, and this time Cyrillic. Again the HTML 'lang' specifies a language to pick out font features for the given script. Near as I can tell, all WebKit based browsers show this wrong behavior (although Pango and other font rendering libraries make the distinction quite well.) Cheers!

Matt Falkenhagen

Comment 6 2012-02-02 20:45:32 PST

Does this mean it's not sufficient to just choose the correct font for the language, which the work on bug 10874 has basically provided, but we must also ensure the correct glyphs within the font are chosen for the language?

Steve White

Comment 7 2012-02-03 04:07:50 PST

Hi Matt! Not sure just who you mean by "we" here. And I'm a little bewildered by the suggestion that this would *force* somebody to do something. This is an *enabling* technology. * If "we" means "we web page designers" We may be providing a web font with these special features. We do so by specifying the language with the 'lang' tag. But no, we don't *have to*. On the other hand, if we can rely upon browsers to do the right thing, this *allows* us to use the language-specific features of the font. * If "we" means "we WebKit developers" We simply pass the "lang" tag properly to the font rendering layer. It will do the rest. And our product will look good compared to the competition. But no, we don't have to excel, or even keep up with the competition. As to "correct font": yes, separate fonts can be used to display Hebrew and Yiddish, even though they differ only in the placement of a few letters. Likewise with various versions of Cyrillic, and even Latin. It's a waste, in several ways. This longstanding OpenType technology *enables* a simple way to make fine typographic distinctions between languages. If you're not a native reader of those languages, it seems a little delicate. But to native readers, it can make a big difference. Let me emphasize that this is easy. Just properly pass the 'lang' attribute down to the underlying font rendering libraries.

Kenichi Ishibashi

Comment 8 2012-02-03 04:51:29 PST

Hi Steve, (In reply to comment #7) > This longstanding OpenType technology *enables* a simple way to make fine typographic distinctions between languages. If you're not a native reader of those languages, it seems a little delicate. But to native readers, it can make a big difference. I agree that it could be a big difference. > Let me emphasize that this is easy. Just properly pass the 'lang' attribute down to the underlying font rendering libraries. What is the exact mean when you say "passing 'lang' attribute to the underling libraries"? WebKit uses different font rendering libraries for each port. For example, Mac Safari/Chrome use CoreText, Chrome Win uses Uniscribe, and Chrome Linux uses (old) HarfBuzz. Which APIs we can use to solve the problem? I'm happy to create patches if it's actually easy as you said. If you can tell us what OpenType feature should we use, it would be also helpful.

Matt Falkenhagen

Comment 9 2012-02-03 05:00:11 PST

(In reply to comment #7) Steve, thanks for the explanation. In my comment, I had just meant: "to fix the bug, we [people who want to fix the bug] must do ___."

Alexey Proskuryakov

Comment 10 2012-02-03 08:59:05 PST

The feature is locl, see <http://en.wikipedia.org/wiki/OpenType_feature_tag_list#OpenType_Typographic_Features>. Implementing this is likely non-trivial because of glyph cache, which will need to be made aware of this. The Mozilla bug Ze'ev mentioned did not have code changes, but it has links to actual bugs with code changes. Also, I don't know whether CoreText even supports this feature.

Steve White

Comment 11 2012-02-03 14:52:53 PST

Hi Kenichi, You asked What is the exact mean when you say "passing 'lang' attribute to the underling libraries Of course that depends on the underlying libraries. But the 'lang' attribute is just a string from a set of standard language codes. The only complication I can imagine here is that some APIs may be picky about the particular languge code standard they accept (but I don't know that.) The DOM-level *intent* is to mark the text content of the entity as being of a different language than that of enclosing entities. Just how that translates into calls to the underlying formatting software -- you would surely know better than I. In principle it should be easy though (barring the ususual premature optimizations etc). You also asked "what OpenType feature should we use" All OpenType features are ultimately activated by a list of script( lang1, lang2, ... ) Where one of the languages may be the 'default' language. The script is detected from the Unicode range of the input characters. The language is (in the present case) derived from 'lang' tags. The feature is activated if the Unicode of the input characters matches the script and the specified language matches one of the languages in the list. So the answer is "this applies to all OpenType feature lookups". In the Cyrillic example I provided, DejaVu Serif uses the 'locl' feature (this one feature is specifically intended to substitute individual glyphs based on language). In the Hebrew example Ze'ev gave, FreeSans uses 'mark' and 'mkmk' positioning that depend upon language. I'm currently working on another font in which various Indic lookups ('pres', 'blwf', etc) depend upon language. As to the specifics of API calls you need to make: you would know better than I. I can only tell you: they are certainly present on each platform. Let me know if I can provide further examples. I have, e.g. some examples of pango-view making different output based upon language specification. Cheers!

Alexey Proskuryakov

Comment 12 2013-06-01 18:49:19 PDT

*** Bug 117109 has been marked as a duplicate of this bug. ***

Alexey Proskuryakov

Comment 13 2013-08-05 09:45:22 PDT

<rdar://problem/14649193>

Shiki Suen

Comment 14 2016-03-06 11:45:34 PST

I tested WebKit-SVN-r197635 nightly build and have found it supports at least 'palt' feature.

Myles C. Maxfield

Comment 15 2016-03-09 16:56:15 PST

With CoreText, this is probably done with kCTLanguageAttributeName instead of font features.

Myles C. Maxfield

Comment 16 2016-03-09 17:19:46 PST

This is implemented in the text shaping step, so that is the part which should be updated to implement this fix. This needs to happen in both the simple text code path and the complex text code path.

Myles C. Maxfield

Comment 17 2016-03-09 17:21:02 PST

Background info: there are some font features which are enabled by default. These features may (or may not!) consult with a language in order to perform their shaping.

Myles C. Maxfield

Comment 18 2017-10-31 12:59:09 PDT

Created attachment 325477 [details] Patch

Myles C. Maxfield

Comment 19 2017-10-31 13:09:26 PDT

Created attachment 325480 [details] Patch

mitz

Comment 20 2017-10-31 13:12:58 PDT

Comment on attachment 325480 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325480&action=review > Source/WebCore/ChangeLog:21 > + (WebCore::FontCascade::characterRangeCodePath): Any language which has language-specific shaping (such as > + Hebrew) needs to use the complex path. Sounds like this patch is making a bad tradeoff: it is slowing down rendering of all Hebrew text just because of something that happens on fewer than one in ten thousand webpages. Is that so?

Myles C. Maxfield

Comment 21 2017-10-31 13:24:53 PDT

(In reply to mitz from comment #20) > Comment on attachment 325480 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=325480&action=review > > > Source/WebCore/ChangeLog:21 > > + (WebCore::FontCascade::characterRangeCodePath): Any language which has language-specific shaping (such as > > + Hebrew) needs to use the complex path. > > Sounds like this patch is making a bad tradeoff: it is slowing down > rendering of all Hebrew text just because of something that happens on fewer > than one in ten thousand webpages. Is that so? The character scan check must happen before we know which font will be used to render the character, and we need to know which font will be used to render the character before we know whether or not there could be language-specific alternates. It seems that the only way to solve this problem is to either 1) render Hebrew / Yiddish text incorrectly, or 2) completely eliminate one of our two text codepaths.

Myles C. Maxfield

Comment 22 2017-10-31 13:26:03 PDT

(In reply to Myles C. Maxfield from comment #21) > (In reply to mitz from comment #20) > > Comment on attachment 325480 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=325480&action=review > > > > > Source/WebCore/ChangeLog:21 > > > + (WebCore::FontCascade::characterRangeCodePath): Any language which has language-specific shaping (such as > > > + Hebrew) needs to use the complex path. > > > > Sounds like this patch is making a bad tradeoff: it is slowing down > > rendering of all Hebrew text just because of something that happens on fewer > > than one in ten thousand webpages. Is that so? > > The character scan check must happen before we know which font will be used > to render the character, and we need to know which font will be used to > render the character before we know whether or not there could be > language-specific alternates. It seems that the only way to solve this > problem is to either 1) render Hebrew / Yiddish text incorrectly, or 2) > completely eliminate one of our two text codepaths. I guess we could also fix this for Devanagari but not for Hebrew because Devanagari already uses the complex text codepath.

mitz

Comment 23 2017-10-31 13:30:32 PDT

(In reply to Myles C. Maxfield from comment #21) > (In reply to mitz from comment #20) > > Comment on attachment 325480 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=325480&action=review > > > > > Source/WebCore/ChangeLog:21 > > > + (WebCore::FontCascade::characterRangeCodePath): Any language which has language-specific shaping (such as > > > + Hebrew) needs to use the complex path. > > > > Sounds like this patch is making a bad tradeoff: it is slowing down > > rendering of all Hebrew text just because of something that happens on fewer > > than one in ten thousand webpages. Is that so? > > The character scan check must happen before we know which font will be used > to render the character, and we need to know which font will be used to > render the character before we know whether or not there could be > language-specific alternates. It seems that the only way to solve this > problem is to either 1) render Hebrew / Yiddish text incorrectly, or 2) > completely eliminate one of our two text codepaths. I don’t know that there are indeed only two possible choices here, but my understanding is that we shouldn’t make code changes that slow things down.

Build Bot

Comment 24 2017-10-31 14:02:29 PDT

Comment on attachment 325480 [details] Patch Attachment 325480 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/5054208 New failing tests: fast/text/international/pop-up-button-text-alignment-and-direction.html fast/text/international/bidi-LDB-2-formatting-characters.html svg/text/bidi-reorder-value-lists.svg svg/W3C-SVG-1.1-SE/text-intro-09-b.svg fast/ruby/ruby-expansion-cjk-3.html

Build Bot

Comment 25 2017-10-31 14:02:31 PDT

Created attachment 325488 [details] Archive of layout-test-results from ews102 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews102 Port: mac-elcapitan Platform: Mac OS X 10.11.6

Build Bot

Comment 26 2017-10-31 14:15:32 PDT

Comment on attachment 325480 [details] Patch Attachment 325480 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/5054341 New failing tests: fast/text/international/pop-up-button-text-alignment-and-direction.html fast/ruby/ruby-expansion-cjk-3.html fast/text/international/bidi-LDB-2-formatting-characters.html svg/text/bidi-reorder-value-lists.svg svg/W3C-SVG-1.1-SE/text-intro-09-b.svg fast/ruby/ruby-expansion-cjk-2.html

Build Bot

Comment 27 2017-10-31 14:15:34 PDT

Created attachment 325492 [details] Archive of layout-test-results from ews107 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6

Build Bot

Comment 28 2017-10-31 14:22:59 PDT

Comment on attachment 325480 [details] Patch Attachment 325480 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/5054281 New failing tests: fast/text/international/pop-up-button-text-alignment-and-direction.html fast/text/international/bidi-LDB-2-formatting-characters.html svg/text/bidi-reorder-value-lists.svg svg/W3C-SVG-1.1-SE/text-intro-09-b.svg fast/ruby/ruby-expansion-cjk-3.html

Build Bot

Comment 29 2017-10-31 14:23:00 PDT

Created attachment 325494 [details] Archive of layout-test-results from ews114 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-elcapitan Platform: Mac OS X 10.11.6

Build Bot

Comment 30 2017-10-31 14:30:21 PDT

Comment on attachment 325480 [details] Patch Attachment 325480 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/5054363 New failing tests: fast/text/international/pop-up-button-text-alignment-and-direction.html svg/W3C-SVG-1.1-SE/text-intro-09-b.svg

Build Bot

Comment 31 2017-10-31 14:30:23 PDT

Created attachment 325497 [details] Archive of layout-test-results from ews124 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews124 Port: ios-simulator-wk2 Platform: Mac OS X 10.12.6

Myles C. Maxfield

Comment 32 2017-10-31 14:51:02 PDT

Comment on attachment 325480 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325480&action=review >>>>> Source/WebCore/ChangeLog:21 >>>>> + Hebrew) needs to use the complex path. >>>> >>>> Sounds like this patch is making a bad tradeoff: it is slowing down rendering of all Hebrew text just because of something that happens on fewer than one in ten thousand webpages. Is that so? >>> >>> The character scan check must happen before we know which font will be used to render the character, and we need to know which font will be used to render the character before we know whether or not there could be language-specific alternates. It seems that the only way to solve this problem is to either 1) render Hebrew / Yiddish text incorrectly, or 2) completely eliminate one of our two text codepaths. >> >> I guess we could also fix this for Devanagari but not for Hebrew because Devanagari already uses the complex text codepath. > > I don’t know that there are indeed only two possible choices here, but my understanding is that we shouldn’t make code changes that slow things down. OpenType allows language dependent behavior for most scripts – see, for example, the Writing System Tags appendix for Latin/Greek/Cyrillic: https://www.microsoft.com/typography/OpenTypeDev/standard/intro.htm

Myles C. Maxfield

Comment 33 2017-10-31 20:50:09 PDT

Created attachment 325541 [details] Patch

Myles C. Maxfield

Comment 34 2017-10-31 20:56:42 PDT

Comment on attachment 325541 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325541&action=review > Source/WebCore/platform/graphics/FontCascade.cpp:333 > + simpleIterator.finalize(nullptr); All the calls to finalize() are probably not correct, because we should only finalize when we are iterating to the very end of the TextRun.

Myles C. Maxfield

Comment 35 2017-10-31 21:03:53 PDT

Comment on attachment 325541 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325541&action=review I need to split this patch up into a few different pieces: 1) Refactoring GlyphBuffer 2) Using CTFontTransformGlyphsWithLanguage(), but with no handler and no initial advance 3) Adding a handler to CTFontTransformGlyphsWithLanguage() 4) Adding initial advance support to WidthIterator > Source/WebCore/platform/graphics/WidthIterator.cpp:127 > + widthDifference += m_rtlInitialAdvance.width(); This is not correct. Initial advances are paint advances, not layout advances.

Myles C. Maxfield

Comment 36 2017-10-31 21:04:28 PDT

Comment on attachment 325541 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325541&action=review >> Source/WebCore/platform/graphics/FontCascade.cpp:333 >> + simpleIterator.finalize(nullptr); > > All the calls to finalize() are probably not correct, because we should only finalize when we are iterating to the very end of the TextRun. Or are they paint advances? I don't think this API has that distinction at all.

Myles C. Maxfield

Comment 37 2017-11-01 00:08:37 PDT

Created attachment 325548 [details] Patch

Myles C. Maxfield

Comment 38 2017-11-01 00:21:51 PDT

Created attachment 325550 [details] Patch

Myles C. Maxfield

Comment 39 2017-11-01 00:34:57 PDT

Created attachment 325552 [details] Patch

Build Bot

Comment 40 2017-11-01 02:14:33 PDT

Comment on attachment 325552 [details] Patch Attachment 325552 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/5060732 New failing tests: imported/w3c/web-platform-tests/service-workers/cache-storage/serviceworker/cache-match.https.html

Build Bot

Comment 41 2017-11-01 02:14:35 PDT

Created attachment 325557 [details] Archive of layout-test-results from ews105 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews105 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6

Myles C. Maxfield

Comment 42 2017-11-07 12:08:39 PST

Test failure is unrelated.

mitz

Comment 43 2017-11-07 13:02:45 PST

Comment on attachment 325552 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325552&action=review > Source/WebCore/platform/graphics/FontPlatformData.h:250 > + AtomicString m_locale; Is it OK that this member variable doesn’t participate in operator== and in hash()?

Myles C. Maxfield

Comment 44 2017-11-07 18:02:02 PST

Comment on attachment 325552 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=325552&action=review >> Source/WebCore/platform/graphics/FontPlatformData.h:250 >> + AtomicString m_locale; > > Is it OK that this member variable doesn’t participate in operator== and in hash()? Nope.

Myles C. Maxfield

Comment 45 2018-01-19 10:21:59 PST

Created attachment 331749 [details] Patch

Myles C. Maxfield

Comment 46 2018-01-19 11:47:39 PST

Created attachment 331762 [details] Patch

Myles C. Maxfield

Comment 47 2018-05-03 11:18:01 PDT

It's easy to see the problem with initial advances using the test fonts at https://bugs.webkit.org/show_bug.cgi?id=185062

Myles C. Maxfield

Comment 48 2018-10-20 21:43:30 PDT

Comment on attachment 331762 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=331762&action=review > Source/WebCore/platform/graphics/mac/SimpleFontDataCoreText.cpp:48 > + if (!platformData().locale().isEmpty()) > + CFDictionarySetValue(attributesDictionary.get(), kCTLanguageAttributeName, platformData().locale().string().createCFString().get()); This now needs additional attributes: kCTParagraphStyleAttributeName -> { CTParagraphStyleCreate(...), CTParagraphStyleSetCompositionLanguage(..., kCTCompositionLanguageNone) }

Myles C. Maxfield

Comment 49 2019-12-18 19:06:40 PST

Remaining tasks: 1) Delete CharactersTreatedAsSpace and use string indices instead. This probably necessitates using CTFontShapeGlyphs(). 2) Use the initial advance returned 3) Hook up the callback to allow glyphs to be inserted (the glyph backing store to be reallocated)

Myles C. Maxfield

Comment 50 2019-12-18 20:13:42 PST

Created attachment 386055 [details] Patch

Myles C. Maxfield

Comment 51 2020-01-08 19:37:32 PST

Created attachment 387182 [details] Patch

Myles C. Maxfield

Comment 52 2020-01-08 19:44:49 PST

One possible test: Rendering U+015E using Avenir-Roman should look different if the language is set to "ro"

Myles C. Maxfield

Comment 53 2020-01-08 19:51:24 PST

Another test: Rendering U+00ED using ComicSansMS should look different if the language is set to "lt"

Myles C. Maxfield

Comment 54 2020-01-08 19:55:58 PST

Created attachment 387183 [details] Patch

Myles C. Maxfield

Comment 55 2020-01-10 22:07:57 PST

Created attachment 387424 [details] Patch

Myles C. Maxfield

Comment 56 2020-01-11 22:59:59 PST

Created attachment 387456 [details] Patch

Myles C. Maxfield

Comment 57 2020-01-11 23:10:17 PST

Created attachment 387457 [details] Patch

Myles C. Maxfield

Comment 58 2020-01-11 23:38:06 PST

Created attachment 387458 [details] Patch

Myles C. Maxfield

Comment 59 2020-01-11 23:55:36 PST

Created attachment 387459 [details] Patch

Myles C. Maxfield

Comment 60 2020-01-12 00:08:39 PST

Created attachment 387460 [details] Patch

Myles C. Maxfield

Comment 61 2020-01-12 09:57:02 PST

Created attachment 387471 [details] Patch

Myles C. Maxfield

Comment 62 2020-01-12 10:30:34 PST

Created attachment 387473 [details] Patch

Myles C. Maxfield

Comment 63 2020-01-12 12:36:11 PST

Created attachment 387487 [details] Patch

Simon Fraser (smfr)

Comment 64 2020-01-13 13:55:16 PST

Comment on attachment 387487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=387487&action=review > Source/WebCore/platform/graphics/GlyphBuffer.h:172 > + for (size_t i = 0; i < length; ++i) > + m_fonts[location + i] = font; Maybe i = location; I < location + length, and you could precompute location + length as end. Also originalSize - location is computed multiple times. > Source/WebCore/platform/graphics/WidthIterator.cpp:126 > + // This is totally, 100%, furiously, utterly, frustratingly bogus. > + // There is absolutely no guarantee that glyph indices before shaping have any relation at all with glyph indices after shaping. Perhaps turn this anguished comment into an actionable bug and reference it here. > Source/WebCore/platform/graphics/cocoa/FontCocoa.mm:562 > + auto handler = ^(CFRange range, CGGlyph** newGlyphsPointer, CGSize** newAdvancesPointer) { > + range.location = std::min(std::max(range.location, static_cast<CFIndex>(0)), static_cast<CFIndex>(glyphBuffer.size())); > + if (range.length < 0) { > + range.length = std::min(range.location, -range.length); > + range.location = range.location - range.length; > + glyphBuffer.remove(beginningIndex + range.location, range.length); > + } else > + glyphBuffer.makeHole(beginningIndex + range.location, range.length, this); > + *newGlyphsPointer = glyphBuffer.glyphs(beginningIndex); > + *newAdvancesPointer = glyphBuffer.advances(beginningIndex); > + }; > + CTFontTransformGlyphsWithLanguage(m_platformData.ctFont(), glyphBuffer.glyphs(beginningIndex), reinterpret_cast<CGSize*>(glyphBuffer.advances(beginningIndex)), glyphBuffer.size() - beginningIndex, options, platformData().locale().string().createCFString().get(), handler); It seems that this could fall into pathological O(n^2) cases where makeHole or glyphBuffer.remove are called over and over and each time move the rest of the buffers. Is there a more efficient way?

Myles C. Maxfield

Comment 65 2020-01-13 13:58:01 PST

I should see if there's a way to do this without affecting the font hash

Myles C. Maxfield

Comment 66 2020-01-13 16:17:41 PST

Comment on attachment 387487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=387487&action=review >> Source/WebCore/platform/graphics/cocoa/FontCocoa.mm:562 >> + CTFontTransformGlyphsWithLanguage(m_platformData.ctFont(), glyphBuffer.glyphs(beginningIndex), reinterpret_cast<CGSize*>(glyphBuffer.advances(beginningIndex)), glyphBuffer.size() - beginningIndex, options, platformData().locale().string().createCFString().get(), handler); > > It seems that this could fall into pathological O(n^2) cases where makeHole or glyphBuffer.remove are called over and over and each time move the rest of the buffers. Is there a more efficient way? Yes! I asked the same question of the CoreText team. They replied that any solution to that would involve some more complicated data structure to hold the glyph sequence (like a linked list or a rope or something) which would involve redesigning significant parts of the internals of CoreText. They assume that the glyph sequence is tightly packed in a single array throughout their framework. This "pathological" case happens rarely enough that it isn't worth redesigning huge parts of the CoreText framework.

Myles C. Maxfield

Comment 67 2020-01-13 17:40:11 PST

Created attachment 387600 [details] Doesn't affect font hash

Myles C. Maxfield

Comment 68 2020-01-13 18:38:08 PST

Created attachment 387606 [details] Doesn't affect font hash

Myles C. Maxfield

Comment 69 2020-01-13 18:39:56 PST

Created attachment 387607 [details] Doesn't affect font hash

Simon Fraser (smfr)

Comment 70 2020-01-13 18:45:16 PST

Comment on attachment 387606 [details] Doesn't affect font hash View in context: https://bugs.webkit.org/attachment.cgi?id=387606&action=review > Source/WebCore/platform/graphics/Font.h:286 > + struct CFStringAttributesKey { This struct name makes it sound like attributes of a CFString. > Source/WebCore/platform/graphics/Font.h:322 > + bool enableKerning; > + FontOrientation orientation; > + AtomString locale; It bugs me seeing small members before big members (packing). > Source/WebCore/platform/graphics/GlyphBuffer.h:167 > + auto originalSize = size(); Maybe assert that location < size(). > Source/WebCore/platform/graphics/GlyphBuffer.h:170 > + m_fonts.grow(length); This is wrong. grow() takes the new size, so you want size() + length.

Myles C. Maxfield

Comment 71 2020-01-13 19:08:50 PST

Created attachment 387608 [details] Doesn't affect font hash

Myles C. Maxfield

Comment 72 2020-01-13 22:05:26 PST

Created attachment 387616 [details] Doesn't affect font hash

Myles C. Maxfield

Comment 73 2020-01-14 12:35:43 PST

Created attachment 387690 [details] Doesn't affect font hash

WebKit Commit Bot

Comment 74 2020-01-14 14:13:59 PST

Comment on attachment 387690 [details] Doesn't affect font hash Clearing flags on attachment: 387690 Committed r254534: <https://trac.webkit.org/changeset/254534>

Antti Koivisto

Comment 75 2020-01-29 23:06:18 PST

Comment on attachment 387487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=387487&action=review > Source/WebCore/platform/graphics/FontCascade.cpp:421 > + GlyphBuffer glyphBuffer; > Vector<GlyphBufferGlyph, 16> glyphs; > Vector<GlyphBufferAdvance, 16> advances; GlyphBuffer is a large type (somewhere in ~50KB range). While it is stack-allocated here maybe we should still be worried about memory locality in this super hot function? Both of the vectors are now unused.

Antti Koivisto

Comment 76 2020-01-30 00:42:39 PST

Comment on attachment 387487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=387487&action=review >>> Source/WebCore/platform/graphics/cocoa/FontCocoa.mm:562 >>> + CTFontTransformGlyphsWithLanguage(m_platformData.ctFont(), glyphBuffer.glyphs(beginningIndex), reinterpret_cast<CGSize*>(glyphBuffer.advances(beginningIndex)), glyphBuffer.size() - beginningIndex, options, platformData().locale().string().createCFString().get(), handler); >> >> It seems that this could fall into pathological O(n^2) cases where makeHole or glyphBuffer.remove are called over and over and each time move the rest of the buffers. Is there a more efficient way? > > Yes! I asked the same question of the CoreText team. They replied that any solution to that would involve some more complicated data structure to hold the glyph sequence (like a linked list or a rope or something) which would involve redesigning significant parts of the internals of CoreText. They assume that the glyph sequence is tightly packed in a single array throughout their framework. This "pathological" case happens rarely enough that it isn't worth redesigning huge parts of the CoreText framework. We only need this stuff for a limited set of known languages right? Maybe we just use the old code path in common cases if this turns out to be slower?

Myles C. Maxfield

Comment 77 2022-01-21 11:13:37 PST

Comment on attachment 387487 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=387487&action=review >>>> Source/WebCore/platform/graphics/cocoa/FontCocoa.mm:562 >>>> + CTFontTransformGlyphsWithLanguage(m_platformData.ctFont(), glyphBuffer.glyphs(beginningIndex), reinterpret_cast<CGSize*>(glyphBuffer.advances(beginningIndex)), glyphBuffer.size() - beginningIndex, options, platformData().locale().string().createCFString().get(), handler); >>> >>> It seems that this could fall into pathological O(n^2) cases where makeHole or glyphBuffer.remove are called over and over and each time move the rest of the buffers. Is there a more efficient way? >> >> Yes! I asked the same question of the CoreText team. They replied that any solution to that would involve some more complicated data structure to hold the glyph sequence (like a linked list or a rope or something) which would involve redesigning significant parts of the internals of CoreText. They assume that the glyph sequence is tightly packed in a single array throughout their framework. This "pathological" case happens rarely enough that it isn't worth redesigning huge parts of the CoreText framework. > > We only need this stuff for a limited set of known languages right? Maybe we just use the old code path in common cases if this turns out to be slower? Yes, we could. Ever since https://trac.webkit.org/changeset/255988/webkit the locale-specific shaping hasn’t been a regression, so at least for now it doesn't seem like it's necessary to redesign this code. If we ever do encounter some content which triggers this pathological behavior, we can redesign this stuff at that point.

Note You need to log in before you can comment on or make changes to this bug.