Summary: | REGRESSION(r281389): using font-variant-ligatures causes Unicode bidi isolation control characters to render | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Cameron McCormack (:heycam) <heycam> | ||||||||||||||||
Component: | Text | Assignee: | Myles C. Maxfield <mmaxfield> | ||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||
Severity: | Normal | CC: | clopez, darin, ews-watchlist, mmaxfield, webkit-bug-importer, youennf | ||||||||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||||||||
Version: | WebKit Local Build | ||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||
Bug Depends on: | 215643 | ||||||||||||||||||
Bug Blocks: | |||||||||||||||||||
Attachments: |
|
Reproduces on Big Sur. Doesn't reproduce on Monterey. Doesn't reproduce on 20G405 Doesn't reproduce on 21A559 I see what's happening. The call to m_font.fontForCombiningCharacterSequence() in the complex text codepath is now returning nullptr when it used to be returning some font. We're hitting this block in ComplexTextController::collectComplexTextRunsForCharacters(): if (!font) { // Create a run of missing glyphs from the primary font. m_complexTextRuns.append(ComplexTextRun::create(m_font.primaryFont(), cp, stringLocation, length, 0, length, m_run.ltr())); return; } Created attachment 449299 [details]
Patch
This patch modifies the imported WPT tests. Please ensure that any changes on the tests (not coming from a WPT import) are exported to WPT. Please see https://trac.webkit.org/wiki/WPTExportProcess Created attachment 449302 [details]
Patch
Created attachment 449303 [details]
WIP
Looks like soft hyphen (U+00AD) is rendering Created attachment 449312 [details]
Patch
Created attachment 449318 [details]
Patch
Comment on attachment 449318 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=449318&action=review > Source/WebCore/platform/graphics/ComplexTextController.cpp:834 > + if (!u_hasBinaryProperty(character, UCHAR_DEFAULT_IGNORABLE_CODE_POINT) && character != objectReplacementCharacter) Is this efficient enough as-is, or do we need a fast path that skips the call to u_hasBinaryProperty for the most common characters? I suppose we did a memory allocation here so it’s not super-performance-sensitive. > Source/WebCore/platform/graphics/WidthIterator.cpp:622 > + || (characterResponsibleForThisGlyph >= deleteCharacter && characterResponsibleForThisGlyph < noBreakSpace) > + || characterResponsibleForThisGlyph == objectReplacementCharacter > + || u_hasBinaryProperty(characterResponsibleForThisGlyph, UCHAR_DEFAULT_IGNORABLE_CODE_POINT)) { I don’t understand the [nullCharacter,space) and [delete,noBreakSpace) parts of this expression; is that an optimization or are those not ignorable? Can we make this clearer with a helper function? If it’s just an optimization that could go into the helper function. Since this repeats the UCHAR_DEFAULT_IGNORABLE_CODE_POINT + objectReplacementCharacter logic in ComplexTextController.cpp, a helper would make both call sites less wordy and give us a place to write a comment or use a name that makes it clear why we need that exception. Even if those other checks are not just an optimization. Comment on attachment 449318 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=449318&action=review >> Source/WebCore/platform/graphics/ComplexTextController.cpp:834 >> + if (!u_hasBinaryProperty(character, UCHAR_DEFAULT_IGNORABLE_CODE_POINT) && character != objectReplacementCharacter) > > Is this efficient enough as-is, or do we need a fast path that skips the call to u_hasBinaryProperty for the most common characters? I suppose we did a memory allocation here so it’s not super-performance-sensitive. The implementation of u_hasBinaryProperty() just does some bit math, and 2 reads from data tables in https://github.com/unicode-org/icu/blob/main/icu4c/source/common/uchar_props_data.h. From my testing, each call takes 4.65 nanoseconds on my development machine. The machine is a 2.3GHz machine, so if I did the math correctly, that means each call took between 10 and 11 cycles. (Of course, my test program is just a loop that calls the function, so we can assume in this test everything is in cache lines, etc.) But even with that disclaimer, I think these results indicate that it's probably fast enough. >> Source/WebCore/platform/graphics/WidthIterator.cpp:622 >> + || u_hasBinaryProperty(characterResponsibleForThisGlyph, UCHAR_DEFAULT_IGNORABLE_CODE_POINT)) { > > I don’t understand the [nullCharacter,space) and [delete,noBreakSpace) parts of this expression; is that an optimization or are those not ignorable? Can we make this clearer with a helper function? If it’s just an optimization that could go into the helper function. Since this repeats the UCHAR_DEFAULT_IGNORABLE_CODE_POINT + objectReplacementCharacter logic in ComplexTextController.cpp, a helper would make both call sites less wordy and give us a place to write a comment or use a name that makes it clear why we need that exception. Even if those other checks are not just an optimization. Yep - [delete,noBreakSpace) are not ignorable. The first ignorable code point is U+00AD SOFT HYPHEN. The helper function is a good idea. In fact, we already have some similar helper functions: FontCascade::treatAsZeroWidthSpace() and FontCascade::treatAsZeroWidthSpaceInComplexScript(). I'll make a 3rd one for now, and then in a follow-up patch I'll see if I can unify them. Created attachment 449353 [details]
Patch for committing
Committed r288107 (246121@trunk): <https://commits.webkit.org/246121@trunk> Comment on attachment 449353 [details] Patch for committing View in context: https://bugs.webkit.org/attachment.cgi?id=449353&action=review > Source/WebCore/platform/graphics/FontCascade.h:269 > + return (character >= nullCharacter && character < space) > + || (character >= deleteCharacter && character < noBreakSpace) > + || character == objectReplacementCharacter Still unclear if these special cases are needed for correctness or for performance. If the checks are needed for correctness, a comment might make it clearer why we are ignoring characters that are not "default ignorable". If the checks are there as a performance optimization, then either we can delete the checks because the performance of u_hasBinaryProperty is quite fast, or if we do think we need an optimization we didn’t do it well, since characters like space and the letter "a" take the slowest path. Comment on attachment 449353 [details] Patch for committing View in context: https://bugs.webkit.org/attachment.cgi?id=449353&action=review >> Source/WebCore/platform/graphics/FontCascade.h:269 >> + || character == objectReplacementCharacter > > Still unclear if these special cases are needed for correctness or for performance. > > If the checks are needed for correctness, a comment might make it clearer why we are ignoring characters that are not "default ignorable". > > If the checks are there as a performance optimization, then either we can delete the checks because the performance of u_hasBinaryProperty is quite fast, or if we do think we need an optimization we didn’t do it well, since characters like space and the letter "a" take the slowest path. Right, yes - with clear eyes, I looked this today and realized the comment says "characters in set X should have special processing" and then the implementation of the function returns set Y!! Let me see if I can do some additional research today and clear this up. Comment on attachment 449353 [details] Patch for committing View in context: https://bugs.webkit.org/attachment.cgi?id=449353&action=review >>> Source/WebCore/platform/graphics/FontCascade.h:269 >>> + || character == objectReplacementCharacter >> >> Still unclear if these special cases are needed for correctness or for performance. >> >> If the checks are needed for correctness, a comment might make it clearer why we are ignoring characters that are not "default ignorable". >> >> If the checks are there as a performance optimization, then either we can delete the checks because the performance of u_hasBinaryProperty is quite fast, or if we do think we need an optimization we didn’t do it well, since characters like space and the letter "a" take the slowest path. > > Right, yes - with clear eyes, I looked this today and realized the comment says "characters in set X should have special processing" and then the implementation of the function returns set Y!! > > Let me see if I can do some additional research today and clear this up. I'm working on this at https://bugs.webkit.org/show_bug.cgi?id=235622. It's still far from being ready, though. |
Created attachment 448813 [details] test See the attached test.