Bug 235053

Summary: REGRESSION(r281389): using font-variant-ligatures causes Unicode bidi isolation control characters to render
Product: WebKit Reporter: Cameron McCormack (:heycam) <heycam>
Component: TextAssignee: Myles C. Maxfield <mmaxfield>
Status: RESOLVED FIXED    
Severity: Normal CC: clopez, darin, ews-watchlist, mmaxfield, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Local Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 215643    
Bug Blocks:    
Attachments:
Description Flags
test
none
Patch
none
Patch
none
WIP
none
Patch
none
Patch
darin: review+
Patch for committing none

Cameron McCormack (:heycam)
Reported 2022-01-10 16:42:48 PST
Created attachment 448813 [details] test See the attached test.
Attachments
test (103 bytes, text/html)
2022-01-10 16:42 PST, Cameron McCormack (:heycam)
no flags
Patch (71.13 KB, patch)
2022-01-16 18:36 PST, Myles C. Maxfield
no flags
Patch (72.14 KB, patch)
2022-01-16 20:32 PST, Myles C. Maxfield
no flags
WIP (987.79 KB, patch)
2022-01-16 21:29 PST, Myles C. Maxfield
no flags
Patch (72.15 KB, patch)
2022-01-16 23:45 PST, Myles C. Maxfield
no flags
Patch (72.10 KB, patch)
2022-01-17 01:56 PST, Myles C. Maxfield
darin: review+
Patch for committing (73.95 KB, patch)
2022-01-17 16:26 PST, Myles C. Maxfield
no flags
Myles C. Maxfield
Comment 1 2022-01-10 16:50:48 PST
Reproduces on Big Sur. Doesn't reproduce on Monterey.
Myles C. Maxfield
Comment 2 2022-01-10 16:51:28 PST
Doesn't reproduce on 20G405
Myles C. Maxfield
Comment 3 2022-01-10 16:53:41 PST
Doesn't reproduce on 21A559
Radar WebKit Bug Importer
Comment 4 2022-01-11 16:15:53 PST
Myles C. Maxfield
Comment 5 2022-01-16 17:18:52 PST
I see what's happening. The call to m_font.fontForCombiningCharacterSequence() in the complex text codepath is now returning nullptr when it used to be returning some font.
Myles C. Maxfield
Comment 6 2022-01-16 17:20:48 PST
We're hitting this block in ComplexTextController::collectComplexTextRunsForCharacters(): if (!font) { // Create a run of missing glyphs from the primary font. m_complexTextRuns.append(ComplexTextRun::create(m_font.primaryFont(), cp, stringLocation, length, 0, length, m_run.ltr())); return; }
Myles C. Maxfield
Comment 7 2022-01-16 18:36:43 PST
EWS Watchlist
Comment 8 2022-01-16 18:38:10 PST
This patch modifies the imported WPT tests. Please ensure that any changes on the tests (not coming from a WPT import) are exported to WPT. Please see https://trac.webkit.org/wiki/WPTExportProcess
Myles C. Maxfield
Comment 9 2022-01-16 20:32:25 PST
Myles C. Maxfield
Comment 10 2022-01-16 21:29:44 PST
Myles C. Maxfield
Comment 11 2022-01-16 23:39:50 PST
Looks like soft hyphen (U+00AD) is rendering
Myles C. Maxfield
Comment 12 2022-01-16 23:45:31 PST
Myles C. Maxfield
Comment 13 2022-01-17 01:56:14 PST
Darin Adler
Comment 14 2022-01-17 10:05:12 PST
Comment on attachment 449318 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=449318&action=review > Source/WebCore/platform/graphics/ComplexTextController.cpp:834 > + if (!u_hasBinaryProperty(character, UCHAR_DEFAULT_IGNORABLE_CODE_POINT) && character != objectReplacementCharacter) Is this efficient enough as-is, or do we need a fast path that skips the call to u_hasBinaryProperty for the most common characters? I suppose we did a memory allocation here so it’s not super-performance-sensitive. > Source/WebCore/platform/graphics/WidthIterator.cpp:622 > + || (characterResponsibleForThisGlyph >= deleteCharacter && characterResponsibleForThisGlyph < noBreakSpace) > + || characterResponsibleForThisGlyph == objectReplacementCharacter > + || u_hasBinaryProperty(characterResponsibleForThisGlyph, UCHAR_DEFAULT_IGNORABLE_CODE_POINT)) { I don’t understand the [nullCharacter,space) and [delete,noBreakSpace) parts of this expression; is that an optimization or are those not ignorable? Can we make this clearer with a helper function? If it’s just an optimization that could go into the helper function. Since this repeats the UCHAR_DEFAULT_IGNORABLE_CODE_POINT + objectReplacementCharacter logic in ComplexTextController.cpp, a helper would make both call sites less wordy and give us a place to write a comment or use a name that makes it clear why we need that exception. Even if those other checks are not just an optimization.
Myles C. Maxfield
Comment 15 2022-01-17 16:15:11 PST
Comment on attachment 449318 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=449318&action=review >> Source/WebCore/platform/graphics/ComplexTextController.cpp:834 >> + if (!u_hasBinaryProperty(character, UCHAR_DEFAULT_IGNORABLE_CODE_POINT) && character != objectReplacementCharacter) > > Is this efficient enough as-is, or do we need a fast path that skips the call to u_hasBinaryProperty for the most common characters? I suppose we did a memory allocation here so it’s not super-performance-sensitive. The implementation of u_hasBinaryProperty() just does some bit math, and 2 reads from data tables in https://github.com/unicode-org/icu/blob/main/icu4c/source/common/uchar_props_data.h. From my testing, each call takes 4.65 nanoseconds on my development machine. The machine is a 2.3GHz machine, so if I did the math correctly, that means each call took between 10 and 11 cycles. (Of course, my test program is just a loop that calls the function, so we can assume in this test everything is in cache lines, etc.) But even with that disclaimer, I think these results indicate that it's probably fast enough. >> Source/WebCore/platform/graphics/WidthIterator.cpp:622 >> + || u_hasBinaryProperty(characterResponsibleForThisGlyph, UCHAR_DEFAULT_IGNORABLE_CODE_POINT)) { > > I don’t understand the [nullCharacter,space) and [delete,noBreakSpace) parts of this expression; is that an optimization or are those not ignorable? Can we make this clearer with a helper function? If it’s just an optimization that could go into the helper function. Since this repeats the UCHAR_DEFAULT_IGNORABLE_CODE_POINT + objectReplacementCharacter logic in ComplexTextController.cpp, a helper would make both call sites less wordy and give us a place to write a comment or use a name that makes it clear why we need that exception. Even if those other checks are not just an optimization. Yep - [delete,noBreakSpace) are not ignorable. The first ignorable code point is U+00AD SOFT HYPHEN. The helper function is a good idea. In fact, we already have some similar helper functions: FontCascade::treatAsZeroWidthSpace() and FontCascade::treatAsZeroWidthSpaceInComplexScript(). I'll make a 3rd one for now, and then in a follow-up patch I'll see if I can unify them.
Myles C. Maxfield
Comment 16 2022-01-17 16:26:06 PST
Created attachment 449353 [details] Patch for committing
Myles C. Maxfield
Comment 17 2022-01-17 19:02:30 PST
Myles C. Maxfield
Comment 18 2022-01-17 20:07:55 PST
Darin Adler
Comment 19 2022-01-18 13:23:14 PST
Comment on attachment 449353 [details] Patch for committing View in context: https://bugs.webkit.org/attachment.cgi?id=449353&action=review > Source/WebCore/platform/graphics/FontCascade.h:269 > + return (character >= nullCharacter && character < space) > + || (character >= deleteCharacter && character < noBreakSpace) > + || character == objectReplacementCharacter Still unclear if these special cases are needed for correctness or for performance. If the checks are needed for correctness, a comment might make it clearer why we are ignoring characters that are not "default ignorable". If the checks are there as a performance optimization, then either we can delete the checks because the performance of u_hasBinaryProperty is quite fast, or if we do think we need an optimization we didn’t do it well, since characters like space and the letter "a" take the slowest path.
Myles C. Maxfield
Comment 20 2022-01-21 14:19:07 PST
Comment on attachment 449353 [details] Patch for committing View in context: https://bugs.webkit.org/attachment.cgi?id=449353&action=review >> Source/WebCore/platform/graphics/FontCascade.h:269 >> + || character == objectReplacementCharacter > > Still unclear if these special cases are needed for correctness or for performance. > > If the checks are needed for correctness, a comment might make it clearer why we are ignoring characters that are not "default ignorable". > > If the checks are there as a performance optimization, then either we can delete the checks because the performance of u_hasBinaryProperty is quite fast, or if we do think we need an optimization we didn’t do it well, since characters like space and the letter "a" take the slowest path. Right, yes - with clear eyes, I looked this today and realized the comment says "characters in set X should have special processing" and then the implementation of the function returns set Y!! Let me see if I can do some additional research today and clear this up.
Myles C. Maxfield
Comment 21 2022-01-25 17:36:23 PST
Comment on attachment 449353 [details] Patch for committing View in context: https://bugs.webkit.org/attachment.cgi?id=449353&action=review >>> Source/WebCore/platform/graphics/FontCascade.h:269 >>> + || character == objectReplacementCharacter >> >> Still unclear if these special cases are needed for correctness or for performance. >> >> If the checks are needed for correctness, a comment might make it clearer why we are ignoring characters that are not "default ignorable". >> >> If the checks are there as a performance optimization, then either we can delete the checks because the performance of u_hasBinaryProperty is quite fast, or if we do think we need an optimization we didn’t do it well, since characters like space and the letter "a" take the slowest path. > > Right, yes - with clear eyes, I looked this today and realized the comment says "characters in set X should have special processing" and then the implementation of the function returns set Y!! > > Let me see if I can do some additional research today and clear this up. I'm working on this at https://bugs.webkit.org/show_bug.cgi?id=235622. It's still far from being ready, though.
Note You need to log in before you can comment on or make changes to this bug.