Bug 235053

Summary:

REGRESSION(r281389): using font-variant-ligatures causes Unicode bidi isolation control characters to render

Product:

WebKit

Reporter:

Cameron McCormack (:heycam) <heycam>

Component:

Text

Assignee:

Myles C. Maxfield <mmaxfield>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

clopez, darin, ews-watchlist, mmaxfield, webkit-bug-importer, youennf

Priority:

Keywords:

InRadar

Version:

WebKit Local Build

Hardware:

Unspecified

OS:

Unspecified

Bug Depends on:

215643

Bug Blocks:

Attachments:

Description	Flags
test	none
Patch	none
Patch	none
WIP	none
Patch	none
Patch	darin: review+
Patch for committing	none

Cameron McCormack (:heycam)

Reported 2022-01-10 16:42:48 PST

Created attachment 448813 [details] test See the attached test.

Attachments
test (103 bytes, text/html) 2022-01-10 16:42 PST, Cameron McCormack (:heycam)	no flags	Details
Patch (71.13 KB, patch) 2022-01-16 18:36 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (72.14 KB, patch) 2022-01-16 20:32 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
WIP (987.79 KB, patch) 2022-01-16 21:29 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (72.15 KB, patch) 2022-01-16 23:45 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Patch (72.10 KB, patch) 2022-01-17 01:56 PST, Myles C. Maxfield	darin: review+	Details Formatted Diff Diff
Patch for committing (73.95 KB, patch) 2022-01-17 16:26 PST, Myles C. Maxfield	no flags	Details Formatted Diff Diff
Show Obsolete (4) View All Add attachment proposed patch, testcase, etc.

Myles C. Maxfield

Comment 1 2022-01-10 16:50:48 PST

Reproduces on Big Sur. Doesn't reproduce on Monterey.

Myles C. Maxfield

Comment 2 2022-01-10 16:51:28 PST

Doesn't reproduce on 20G405

Myles C. Maxfield

Comment 3 2022-01-10 16:53:41 PST

Doesn't reproduce on 21A559

Radar WebKit Bug Importer

Comment 4 2022-01-11 16:15:53 PST

<rdar://problem/87425066>

Myles C. Maxfield

Comment 5 2022-01-16 17:18:52 PST

I see what's happening. The call to m_font.fontForCombiningCharacterSequence() in the complex text codepath is now returning nullptr when it used to be returning some font.

Myles C. Maxfield

Comment 6 2022-01-16 17:20:48 PST

We're hitting this block in ComplexTextController::collectComplexTextRunsForCharacters(): if (!font) { // Create a run of missing glyphs from the primary font. m_complexTextRuns.append(ComplexTextRun::create(m_font.primaryFont(), cp, stringLocation, length, 0, length, m_run.ltr())); return; }

Myles C. Maxfield

Comment 7 2022-01-16 18:36:43 PST

Created attachment 449299 [details] Patch

EWS Watchlist

Comment 8 2022-01-16 18:38:10 PST

This patch modifies the imported WPT tests. Please ensure that any changes on the tests (not coming from a WPT import) are exported to WPT. Please see https://trac.webkit.org/wiki/WPTExportProcess

Myles C. Maxfield

Comment 9 2022-01-16 20:32:25 PST

Created attachment 449302 [details] Patch

Myles C. Maxfield

Comment 10 2022-01-16 21:29:44 PST

Created attachment 449303 [details] WIP

Myles C. Maxfield

Comment 11 2022-01-16 23:39:50 PST

Looks like soft hyphen (U+00AD) is rendering

Myles C. Maxfield

Comment 12 2022-01-16 23:45:31 PST

Created attachment 449312 [details] Patch

Myles C. Maxfield

Comment 13 2022-01-17 01:56:14 PST

Created attachment 449318 [details] Patch

Darin Adler

Comment 14 2022-01-17 10:05:12 PST

Comment on attachment 449318 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=449318&action=review > Source/WebCore/platform/graphics/ComplexTextController.cpp:834 > + if (!u_hasBinaryProperty(character, UCHAR_DEFAULT_IGNORABLE_CODE_POINT) && character != objectReplacementCharacter) Is this efficient enough as-is, or do we need a fast path that skips the call to u_hasBinaryProperty for the most common characters? I suppose we did a memory allocation here so it’s not super-performance-sensitive. > Source/WebCore/platform/graphics/WidthIterator.cpp:622 > + || (characterResponsibleForThisGlyph >= deleteCharacter && characterResponsibleForThisGlyph < noBreakSpace) > + || characterResponsibleForThisGlyph == objectReplacementCharacter > + || u_hasBinaryProperty(characterResponsibleForThisGlyph, UCHAR_DEFAULT_IGNORABLE_CODE_POINT)) { I don’t understand the [nullCharacter,space) and [delete,noBreakSpace) parts of this expression; is that an optimization or are those not ignorable? Can we make this clearer with a helper function? If it’s just an optimization that could go into the helper function. Since this repeats the UCHAR_DEFAULT_IGNORABLE_CODE_POINT + objectReplacementCharacter logic in ComplexTextController.cpp, a helper would make both call sites less wordy and give us a place to write a comment or use a name that makes it clear why we need that exception. Even if those other checks are not just an optimization.

Myles C. Maxfield

Comment 15 2022-01-17 16:15:11 PST

Comment on attachment 449318 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=449318&action=review >> Source/WebCore/platform/graphics/ComplexTextController.cpp:834 >> + if (!u_hasBinaryProperty(character, UCHAR_DEFAULT_IGNORABLE_CODE_POINT) && character != objectReplacementCharacter) > > Is this efficient enough as-is, or do we need a fast path that skips the call to u_hasBinaryProperty for the most common characters? I suppose we did a memory allocation here so it’s not super-performance-sensitive. The implementation of u_hasBinaryProperty() just does some bit math, and 2 reads from data tables in https://github.com/unicode-org/icu/blob/main/icu4c/source/common/uchar_props_data.h. From my testing, each call takes 4.65 nanoseconds on my development machine. The machine is a 2.3GHz machine, so if I did the math correctly, that means each call took between 10 and 11 cycles. (Of course, my test program is just a loop that calls the function, so we can assume in this test everything is in cache lines, etc.) But even with that disclaimer, I think these results indicate that it's probably fast enough. >> Source/WebCore/platform/graphics/WidthIterator.cpp:622 >> + || u_hasBinaryProperty(characterResponsibleForThisGlyph, UCHAR_DEFAULT_IGNORABLE_CODE_POINT)) { > > I don’t understand the [nullCharacter,space) and [delete,noBreakSpace) parts of this expression; is that an optimization or are those not ignorable? Can we make this clearer with a helper function? If it’s just an optimization that could go into the helper function. Since this repeats the UCHAR_DEFAULT_IGNORABLE_CODE_POINT + objectReplacementCharacter logic in ComplexTextController.cpp, a helper would make both call sites less wordy and give us a place to write a comment or use a name that makes it clear why we need that exception. Even if those other checks are not just an optimization. Yep - [delete,noBreakSpace) are not ignorable. The first ignorable code point is U+00AD SOFT HYPHEN. The helper function is a good idea. In fact, we already have some similar helper functions: FontCascade::treatAsZeroWidthSpace() and FontCascade::treatAsZeroWidthSpaceInComplexScript(). I'll make a 3rd one for now, and then in a follow-up patch I'll see if I can unify them.

Myles C. Maxfield

Comment 16 2022-01-17 16:26:06 PST

Created attachment 449353 [details] Patch for committing

Myles C. Maxfield

Comment 17 2022-01-17 19:02:30 PST

Committed r288107 (246121@trunk): <https://commits.webkit.org/246121@trunk>

Myles C. Maxfield

Comment 18 2022-01-17 20:07:55 PST

https://github.com/web-platform-tests/wpt/pull/32421

Darin Adler

Comment 19 2022-01-18 13:23:14 PST

Comment on attachment 449353 [details] Patch for committing View in context: https://bugs.webkit.org/attachment.cgi?id=449353&action=review > Source/WebCore/platform/graphics/FontCascade.h:269 > + return (character >= nullCharacter && character < space) > + || (character >= deleteCharacter && character < noBreakSpace) > + || character == objectReplacementCharacter Still unclear if these special cases are needed for correctness or for performance. If the checks are needed for correctness, a comment might make it clearer why we are ignoring characters that are not "default ignorable". If the checks are there as a performance optimization, then either we can delete the checks because the performance of u_hasBinaryProperty is quite fast, or if we do think we need an optimization we didn’t do it well, since characters like space and the letter "a" take the slowest path.

Myles C. Maxfield

Comment 20 2022-01-21 14:19:07 PST

Comment on attachment 449353 [details] Patch for committing View in context: https://bugs.webkit.org/attachment.cgi?id=449353&action=review >> Source/WebCore/platform/graphics/FontCascade.h:269 >> + || character == objectReplacementCharacter > > Still unclear if these special cases are needed for correctness or for performance. > > If the checks are needed for correctness, a comment might make it clearer why we are ignoring characters that are not "default ignorable". > > If the checks are there as a performance optimization, then either we can delete the checks because the performance of u_hasBinaryProperty is quite fast, or if we do think we need an optimization we didn’t do it well, since characters like space and the letter "a" take the slowest path. Right, yes - with clear eyes, I looked this today and realized the comment says "characters in set X should have special processing" and then the implementation of the function returns set Y!! Let me see if I can do some additional research today and clear this up.

Myles C. Maxfield

Comment 21 2022-01-25 17:36:23 PST

Comment on attachment 449353 [details] Patch for committing View in context: https://bugs.webkit.org/attachment.cgi?id=449353&action=review >>> Source/WebCore/platform/graphics/FontCascade.h:269 >>> + || character == objectReplacementCharacter >> >> Still unclear if these special cases are needed for correctness or for performance. >> >> If the checks are needed for correctness, a comment might make it clearer why we are ignoring characters that are not "default ignorable". >> >> If the checks are there as a performance optimization, then either we can delete the checks because the performance of u_hasBinaryProperty is quite fast, or if we do think we need an optimization we didn’t do it well, since characters like space and the letter "a" take the slowest path. > > Right, yes - with clear eyes, I looked this today and realized the comment says "characters in set X should have special processing" and then the implementation of the function returns set Y!! > > Let me see if I can do some additional research today and clear this up. I'm working on this at https://bugs.webkit.org/show_bug.cgi?id=235622. It's still far from being ready, though.

Note You need to log in before you can comment on or make changes to this bug.