Bug 141047 - Additional emoji support
Summary: Additional emoji support
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Layout and Rendering (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Enrica Casucci
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-29 14:01 PST by Enrica Casucci
Modified: 2015-02-03 13:38 PST (History)
9 users (show)

See Also:


Attachments
Patch (9.33 KB, patch)
2015-01-29 15:56 PST, Enrica Casucci
no flags Details | Formatted Diff | Diff
Patch2 (8.31 KB, patch)
2015-01-29 17:31 PST, Enrica Casucci
buildbot: commit-queue-
Details | Formatted Diff | Diff
Archive of layout-test-results from ews100 for mac-mavericks (581.63 KB, application/zip)
2015-01-29 20:58 PST, Build Bot
no flags Details
Archive of layout-test-results from ews106 for mac-mavericks-wk2 (701.73 KB, application/zip)
2015-01-29 21:06 PST, Build Bot
no flags Details
Patch3 (13.82 KB, patch)
2015-02-02 16:50 PST, Enrica Casucci
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Enrica Casucci 2015-01-29 14:01:55 PST
Adding support for emoji variations and emoji groups.

rdar://problem/19045135
Comment 1 Enrica Casucci 2015-01-29 15:56:22 PST
Created attachment 245665 [details]
Patch
Comment 2 WebKit Commit Bot 2015-01-29 15:58:34 PST
Attachment 245665 [details] did not pass style-queue:


ERROR: Source/WebCore/platform/graphics/FontCascade.cpp:712:  Multi line control clauses should use braces.  [whitespace/braces] [4]
Total errors found: 1 in 9 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Enrica Casucci 2015-01-29 17:31:51 PST
Created attachment 245678 [details]
Patch2

After discussing the patch in person with Dan Bernstein, I made some changes.
Comment 4 Build Bot 2015-01-29 20:58:11 PST
Comment on attachment 245678 [details]
Patch2

Attachment 245678 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.appspot.com/results/5327720862973952

New failing tests:
fast/text/format-control.html
Comment 5 Build Bot 2015-01-29 20:58:14 PST
Created attachment 245689 [details]
Archive of layout-test-results from ews100 for mac-mavericks

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews100  Port: mac-mavericks  Platform: Mac OS X 10.9.5
Comment 6 Build Bot 2015-01-29 21:06:27 PST
Comment on attachment 245678 [details]
Patch2

Attachment 245678 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/6411648201916416

New failing tests:
fast/text/format-control.html
Comment 7 Build Bot 2015-01-29 21:06:31 PST
Created attachment 245690 [details]
Archive of layout-test-results from ews106 for mac-mavericks-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews106  Port: mac-mavericks-wk2  Platform: Mac OS X 10.9.5
Comment 8 mitz 2015-01-29 21:55:14 PST
The fast/text/format-control.html failure shows an existing bug being uncovered by the patch: the complex text path lets ligatures form around a zero-width joiner, whereas the fast path doesn’t allow it. Testing in TextEdit suggests that the correct behavior is to not form a ligature, so it’s a bug in the complex path.
Comment 9 Enrica Casucci 2015-02-02 16:50:04 PST
Created attachment 245914 [details]
Patch3

Updated patch after discussing with Dan.
It also fixes line breaking and cursor movement.
Comment 10 WebKit Commit Bot 2015-02-02 16:53:27 PST
Attachment 245914 [details] did not pass style-queue:


ERROR: Source/WebCore/platform/text/TextBreakIterator.cpp:210:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/WebCore/platform/text/TextBreakIterator.cpp:232:  One space before end of line comments  [whitespace/comments] [5]
ERROR: Source/WebCore/platform/text/TextBreakIterator.cpp:251:  One space before end of line comments  [whitespace/comments] [5]
Total errors found: 3 in 10 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 11 Darin Adler 2015-02-03 08:30:59 PST
Comment on attachment 245914 [details]
Patch3

View in context: https://bugs.webkit.org/attachment.cgi?id=245914&action=review

I think this code is probably correct, but there are no comments explaining the behavior, so I think we should do better.

> Source/WebCore/platform/graphics/FontCascade.cpp:643
> +    bool emojiGroupCandidate = false;

Normally we would not name a boolean variable "candidate" since the boolean is not itself a candidate. Maybe sawEmojiGroupCandidate is the right name for this? I’m nots sure I understand the meaning of the boolean (see below).

> Source/WebCore/platform/graphics/FontCascade.cpp:713
> +        if (c == 0x200D && emojiGroupCandidate)
> +            return Complex;

I don’t understand the logic here. Are we saying that a 0x200D immediately following a 0x1F466-0x1F469 makes the string need the Complex code path, or that a 0x200D anywhere after a 0x1F466-0x1F469 makes the string complex? The code implements the latter.

> Source/WebCore/rendering/RenderText.cpp:1441
> +inline bool isEmojiGroupCandidate(UChar32 c)

These functions should be "static inline" rather than just "inline" since we want internal linkage within this file, not external linkage. These should use the word "character" instead of the letter "c".

> Source/WebCore/rendering/RenderText.cpp:1459
>      bool sawRegionalIndicator = false;

It’s strange that sawRegionalIndicator gets set and then is left set to true no matter what intervening characters we see.

> Source/WebCore/rendering/RenderText.cpp:1460
> +    bool sawEmojiGroupCandidate = false;

I think this should be named previousCharacterWasEmojiGroupCandidate. Although I’m surprised that these code points are called “emoji group candidates”. Is that really the correct terminology?

> Source/WebCore/rendering/RenderText.cpp:1477
> +        if (sawEmojiGroupCandidate) {
> +            sawEmojiGroupCandidate = false;
> +            if (character == 0x200D)
> +                continue;
> +            U16_FWD_1_UNSAFE(text, current);
> +            break;
> +        }

I don’t understand why this is the correct thing to do. Maybe a comment is needed.

Could we please use the name zeroWidthJoiner instead of the constant 0x200D?

> Source/WebCore/rendering/RenderText.cpp:1480
> +        if (sawEmojiModifier)
> +            break;

I don’t understand why this is the correct thing to do. Maybe a comment is needed.

> LayoutTests/ChangeLog:9
> +        * editing/deleting/delete-emoji.html: Added.

Why do we need separate results for each platform? The files all look the same to me? Can’t we just put a result in this directory instead of in all four of those other directories?

> LayoutTests/editing/deleting/delete-emoji.html:8
> +Markup.description("This test verifies that emoji groups and emoji with variations are delete correctly");

"are deleted" rather than "are delete"

I’d like to see test cases covering incorrect sequences, not just correct ones.
Comment 12 Enrica Casucci 2015-02-03 11:32:40 PST
> Normally we would not name a boolean variable "candidate" since the boolean
> is not itself a candidate. Maybe sawEmojiGroupCandidate is the right name
> for this? I’m nots sure I understand the meaning of the boolean (see below).

I'll change the variable name,
> 
> > Source/WebCore/platform/graphics/FontCascade.cpp:713
> > +        if (c == 0x200D && emojiGroupCandidate)
> > +            return Complex;
> 
> I don’t understand the logic here. Are we saying that a 0x200D immediately
> following a 0x1F466-0x1F469 makes the string need the Complex code path, or
> that a 0x200D anywhere after a 0x1F466-0x1F469 makes the string complex? The
> code implements the latter.
It is immediately following. I can make this strictly accept only the immediately following.
> 
> > Source/WebCore/rendering/RenderText.cpp:1460
> > +    bool sawEmojiGroupCandidate = false;
> 
> I think this should be named previousCharacterWasEmojiGroupCandidate.
> Although I’m surprised that these code points are called “emoji group
> candidates”. Is that really the correct terminology?
> 
As far as I know yes, it is the correct terminology.
> 
> Could we please use the name zeroWidthJoiner instead of the constant 0x200D?
> 
> > Source/WebCore/rendering/RenderText.cpp:1480
> > +        if (sawEmojiModifier)
> > +            break;
> 
> I don’t understand why this is the correct thing to do. Maybe a comment is
> needed.
> 
> > LayoutTests/ChangeLog:9
> > +        * editing/deleting/delete-emoji.html: Added.
> 
> Why do we need separate results for each platform? The files all look the
> same to me? Can’t we just put a result in this directory instead of in all
> four of those other directories?
> 
> > LayoutTests/editing/deleting/delete-emoji.html:8
> > +Markup.description("This test verifies that emoji groups and emoji with variations are delete correctly");
> 
> "are deleted" rather than "are delete"
> 
> I’d like to see test cases covering incorrect sequences, not just correct
> ones.
will do.
Comment 13 Enrica Casucci 2015-02-03 13:38:30 PST
Committed revision 179567.