Bug 234451 - Use character names instead of hex codes in FontCascade.h
Summary: Use character names instead of hex codes in FontCascade.h
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Myles C. Maxfield
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2021-12-17 16:03 PST by Myles C. Maxfield
Modified: 2021-12-21 01:53 PST (History)
3 users (show)

See Also:


Attachments
Patch (2.63 KB, patch)
2021-12-17 16:07 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Myles C. Maxfield 2021-12-17 16:03:56 PST
Use character names instead of hex codes in FontCascade.h
Comment 1 Myles C. Maxfield 2021-12-17 16:07:32 PST
Created attachment 447489 [details]
Patch
Comment 2 EWS 2021-12-19 17:13:45 PST
Committed r287250 (245408@main): <https://commits.webkit.org/245408@main>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 447489 [details].
Comment 3 Radar WebKit Bug Importer 2021-12-19 17:14:16 PST
<rdar://problem/86703692>
Comment 4 Darin Adler 2021-12-19 17:25:05 PST
Comment on attachment 447489 [details]
Patch

I agree that these are easier to understand when it's ==.

But when using <= and >= I am not so sure.
Comment 5 Myles C. Maxfield 2021-12-20 11:31:36 PST
I think it could go either way. I did them here for consistency, but I’m happy to put them back if people (or you) think it’s better.
Comment 6 Darin Adler 2021-12-20 12:10:39 PST
May seem like overkill, but I would have made named functions for those sets of characters rather than using <= >= inline in that function. Might even put those named functions in CharacterNames.h.
Comment 7 Darin Adler 2021-12-21 01:53:40 PST
Comment on attachment 447489 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=447489&action=review

> Source/WebCore/platform/graphics/FontCascade.h:264
> +    static bool treatAsZeroWidthSpaceInComplexScript(UChar32 c) { return c < space || (c >= deleteCharacter && c < noBreakSpace) || c == softHyphen || c == zeroWidthSpace || (c >= leftToRightMark && c <= rightToLeftMark) || (c >= leftToRightEmbed && c <= rightToLeftOverride) || c == zeroWidthNoBreakSpace || c == objectReplacementCharacter; }

It seems like we should change these:

    c < space
        -> isC0ControlCharacter(c)
    (c >= leftToRightMark && c <= rightToLeftMark) || (c >= leftToRightEmbed && c <= rightToLeftOverride)
        -> isDirectionalFormattingCharacter(c)

But to deserve the name isDirectionalFormattingCharacter, it would include 5 other characters <https://www.unicode.org/reports/tr9/#Directional_Formatting_Characters>. It seems clear that those 5 other characters also should also be treated as zero-width space. Unless maybe it’s not so important to have that be an exhaustive list. Unclear what the importance of including characters here is, and how we test it. Presumably it works around incorrectly constructed fonts where those characters aren’t zero-width?

If we were not using an isDirectionalFormattingCharacter function, then we'd want to do this:

    (c >= leftToRightMark && c <= rightToLeftMark)
        -> c == leftToRightMark || c == rightToLeftMark

But more importantly, what character are covered in the range leftToRightEmbed through rightToLeftOverride is not self-explanatory.