WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
187209
[Cocoa] LastResort in the font family list causes emoji with joiners to be rendered as multiple .notdef characters
https://bugs.webkit.org/show_bug.cgi?id=187209
Summary
[Cocoa] LastResort in the font family list causes emoji with joiners to be re...
Myles C. Maxfield
Reported
2018-06-29 17:20:33 PDT
[Cocoa] LastResort in the font family list causes emoji with joiners to be rendered as multiple .notdef characters
Attachments
Patch
(12.82 KB, patch)
2018-06-29 17:30 PDT
,
Myles C. Maxfield
darin
: review+
Details
Formatted Diff
Diff
Patch for committing
(14.81 KB, patch)
2018-07-01 04:47 PDT
,
Myles C. Maxfield
commit-queue
: commit-queue-
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
Myles C. Maxfield
Comment 1
2018-06-29 17:30:32 PDT
Created
attachment 343984
[details]
Patch
Myles C. Maxfield
Comment 2
2018-06-29 17:31:06 PDT
<
rdar://problem/40920785
>
Darin Adler
Comment 3
2018-06-29 17:49:54 PDT
Comment on
attachment 343984
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=343984&action=review
> Source/WebCore/platform/graphics/Font.cpp:158 > + // FIXME: Consider reordering these so the most common ones are at the front. > + // Doing this could cause the BitVector to fit inside inline storage and therefore > + // be both a performance and a memory progression.
What if there’s a typo in the numbers or a logic mistake in this function where we accidentally used a number twice? How would we notice? What test would detect the mistake?
> Source/WebCore/platform/graphics/Font.cpp:585 > + // This is very similar to static_cast<bool>(glyphForCharacter(character)) > + // except that glyphForCharacter() maps certain code points to ZWS (because they > + // shouldn't show up in the fast text codepath). This function doesn't do that > + // mapping, and instead is as honest as possible about what code points the font > + // supports.
The glyphForCharacter function maps certain code points to zero-width spaces because we never want those code points to render visibly. We want that behavior for both the fast and slow text code paths. I wish the comment was clearer on this point. And I wish I understood how we will get that behavior. The comment does not make clear why "as honest as possible" is needed here. I think the idea is that "even if we don’t want the code points to render visibly we want to correctly understood the code points are 'supported' because we makes some other kind of decision based on that". But I don’t fully understand. The comment also doesn’t make it clear why we decided that this can be the slower function and the other one must be the faster function, rather than the other way around. Seems natural to make the "honest" function the fast one, but I’m sure we have a good reason not to do that.
> Source/WebCore/platform/graphics/cocoa/FontCocoa.mm:612 > + CFIndex count; > + if (U_IS_BMP(character)) { > + codeUnits[0] = character; > + count = 1; > + } else { > + codeUnits[0] = U16_LEAD(character); > + codeUnits[1] = U16_TRAIL(character); > + count = 2; > + }
This code can be written like this: CFIndex count = 0; U16_APPEND_UNSAFE(codeUnits, count, character); I think it’s better that way.
Myles C. Maxfield
Comment 4
2018-07-01 04:44:36 PDT
Comment on
attachment 343984
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=343984&action=review
>> Source/WebCore/platform/graphics/Font.cpp:158 >> + // be both a performance and a memory progression. > > What if there’s a typo in the numbers or a logic mistake in this function where we accidentally used a number twice? How would we notice? What test would detect the mistake?
There might be a way to do it at compile time with tuples and template parameter packs, but I can't figure out how. I'll just add some ASSERT()s. 😕
>> Source/WebCore/platform/graphics/Font.cpp:585 >> + // supports. > > The glyphForCharacter function maps certain code points to zero-width spaces because we never want those code points to render visibly. We want that behavior for both the fast and slow text code paths. I wish the comment was clearer on this point. And I wish I understood how we will get that behavior. > > The comment does not make clear why "as honest as possible" is needed here. I think the idea is that "even if we don’t want the code points to render visibly we want to correctly understood the code points are 'supported' because we makes some other kind of decision based on that". But I don’t fully understand. > > The comment also doesn’t make it clear why we decided that this can be the slower function and the other one must be the faster function, rather than the other way around. Seems natural to make the "honest" function the fast one, but I’m sure we have a good reason not to do that.
Yeah, when I first discovered this, I was surprised like you were to see that we didn't have any custom code for the complex path (and instead the custom code only handles the simple path). Upon further investigation, I found that Core Text handles this for us. Mapping characters to ZWS when we want them to be invisible has two problems: 1) Fonts can totally render visible glyphs for ZWS 2) Performing the substation at the code point level means that the results of mapping the code points to glyphs will be affected, and the shaping rules inside fonts won't expect these new glyphIDs to be used. Shaping won't work correctly. Instead, the better solution to this is to not do any code-point-level substitution, but instead run shaping just like normal, and then after shaping, look through the transformed glyphs to see if any of them came from these special characters. If so, we can literally remove the glyph from the stream, thereby forcing it not to render. The better solution is too risky to do now, but
https://bugs.webkit.org/show_bug.cgi?id=187166
is for it. I'm not sure what you mean about fast vs slow functions; both the honest and dishonest ones should be fast. The dishonest one is cached using GlyphPages, and the honest one is cached using a bit vector.
Myles C. Maxfield
Comment 5
2018-07-01 04:47:44 PDT
Created
attachment 344048
[details]
Patch for committing
WebKit Commit Bot
Comment 6
2018-07-01 05:30:00 PDT
Comment on
attachment 344048
[details]
Patch for committing Rejecting
attachment 344048
[details]
from commit-queue. Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-02', 'validate-changelog', '--check-oops', '--non-interactive', 344048, '--port=mac']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit ChangeLog entry in LayoutTests/ChangeLog contains OOPS!. Full output:
https://webkit-queues.webkit.org/results/8403296
Darin Adler
Comment 7
2018-07-01 13:25:19 PDT
Comment on
attachment 343984
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=343984&action=review
>>> Source/WebCore/platform/graphics/Font.cpp:585 >>> + // supports. >> >> The glyphForCharacter function maps certain code points to zero-width spaces because we never want those code points to render visibly. We want that behavior for both the fast and slow text code paths. I wish the comment was clearer on this point. And I wish I understood how we will get that behavior. >> >> The comment does not make clear why "as honest as possible" is needed here. I think the idea is that "even if we don’t want the code points to render visibly we want to correctly understood the code points are 'supported' because we makes some other kind of decision based on that". But I don’t fully understand. >> >> The comment also doesn’t make it clear why we decided that this can be the slower function and the other one must be the faster function, rather than the other way around. Seems natural to make the "honest" function the fast one, but I’m sure we have a good reason not to do that. > > Yeah, when I first discovered this, I was surprised like you were to see that we didn't have any custom code for the complex path (and instead the custom code only handles the simple path). Upon further investigation, I found that Core Text handles this for us. > > Mapping characters to ZWS when we want them to be invisible has two problems: > 1) Fonts can totally render visible glyphs for ZWS > 2) Performing the substation at the code point level means that the results of mapping the code points to glyphs will be affected, and the shaping rules inside fonts won't expect these new glyphIDs to be used. Shaping won't work correctly. > > Instead, the better solution to this is to not do any code-point-level substitution, but instead run shaping just like normal, and then after shaping, look through the transformed glyphs to see if any of them came from these special characters. If so, we can literally remove the glyph from the stream, thereby forcing it not to render. > > The better solution is too risky to do now, but
https://bugs.webkit.org/show_bug.cgi?id=187166
is for it. > > I'm not sure what you mean about fast vs slow functions; both the honest and dishonest ones should be fast. The dishonest one is cached using GlyphPages, and the honest one is cached using a bit vector.
We could have decided to cache the honest one using glyph pages and the dishonest one with a bit vector. The one with the bit vector will be slower than the one that uses the glyph pages alone.
Myles C. Maxfield
Comment 8
2018-07-01 18:30:35 PDT
(In reply to Darin Adler from
comment #7
)
> Comment on
attachment 343984
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=343984&action=review
> > >>> Source/WebCore/platform/graphics/Font.cpp:585 > >>> + // supports. > >> > >> The glyphForCharacter function maps certain code points to zero-width spaces because we never want those code points to render visibly. We want that behavior for both the fast and slow text code paths. I wish the comment was clearer on this point. And I wish I understood how we will get that behavior. > >> > >> The comment does not make clear why "as honest as possible" is needed here. I think the idea is that "even if we don’t want the code points to render visibly we want to correctly understood the code points are 'supported' because we makes some other kind of decision based on that". But I don’t fully understand. > >> > >> The comment also doesn’t make it clear why we decided that this can be the slower function and the other one must be the faster function, rather than the other way around. Seems natural to make the "honest" function the fast one, but I’m sure we have a good reason not to do that. > > > > Yeah, when I first discovered this, I was surprised like you were to see that we didn't have any custom code for the complex path (and instead the custom code only handles the simple path). Upon further investigation, I found that Core Text handles this for us. > > > > Mapping characters to ZWS when we want them to be invisible has two problems: > > 1) Fonts can totally render visible glyphs for ZWS > > 2) Performing the substation at the code point level means that the results of mapping the code points to glyphs will be affected, and the shaping rules inside fonts won't expect these new glyphIDs to be used. Shaping won't work correctly. > > > > Instead, the better solution to this is to not do any code-point-level substitution, but instead run shaping just like normal, and then after shaping, look through the transformed glyphs to see if any of them came from these special characters. If so, we can literally remove the glyph from the stream, thereby forcing it not to render. > > > > The better solution is too risky to do now, but
https://bugs.webkit.org/show_bug.cgi?id=187166
is for it. > > > > I'm not sure what you mean about fast vs slow functions; both the honest and dishonest ones should be fast. The dishonest one is cached using GlyphPages, and the honest one is cached using a bit vector. > > We could have decided to cache the honest one using glyph pages and the > dishonest one with a bit vector. The one with the bit vector will be slower > than the one that uses the glyph pages alone.
The dishonest one is for the fast text codepath, which is the most common code path used on the Web today. That is the one that uses glyph pages.
Myles C. Maxfield
Comment 9
2018-07-01 18:36:53 PDT
Committed
r233413
: <
https://trac.webkit.org/changeset/233413
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug