WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
65395
Regional indicator symbols that are combined should behave as a single character when editing
https://bugs.webkit.org/show_bug.cgi?id=65395
Summary
Regional indicator symbols that are combined should behave as a single charac...
mitz
Reported
2011-07-29 14:39:43 PDT
When two regional indicator symbols combine to form a single flag, the insertion point or a selection boundary should not be allowed between them, and delete commands should delete the pair as a unit.
Attachments
part 1 - backspace
(2.40 KB, patch)
2011-08-15 10:48 PDT
,
Alexey Proskuryakov
no flags
Details
Formatted Diff
Diff
part 1 - backspace
(5.26 KB, patch)
2011-08-15 14:23 PDT
,
Alexey Proskuryakov
mitz: review+
Details
Formatted Diff
Diff
part 1 - backspace
(2.92 KB, patch)
2011-08-15 14:37 PDT
,
Alexey Proskuryakov
no flags
Details
Formatted Diff
Diff
part 1 - backspace
(5.36 KB, patch)
2011-08-15 14:38 PDT
,
Alexey Proskuryakov
webkit.review.bot
: commit-queue-
Details
Formatted Diff
Diff
part 2 - cursor movement iterator (WIP)
(3.57 KB, patch)
2011-08-17 10:03 PDT
,
Alexey Proskuryakov
no flags
Details
Formatted Diff
Diff
part 2 - cursor movement iterator
(5.77 KB, patch)
2011-08-18 11:10 PDT
,
Alexey Proskuryakov
no flags
Details
Formatted Diff
Diff
Show Obsolete
(4)
View All
Add attachment
proposed patch, testcase, etc.
Radar WebKit Bug Importer
Comment 1
2011-07-29 14:39:56 PDT
<
rdar://problem/9866836
>
Alexey Proskuryakov
Comment 2
2011-07-29 20:57:38 PDT
Why?
mitz
Comment 3
2011-07-29 21:01:12 PDT
Because they were most likely entered as a single symbol, and because the behavior when the insertion point is between the two symbols is unexpected and most likely undesirable.
Alexey Proskuryakov
Comment 4
2011-07-29 21:38:19 PDT
Does the Unicode spec agree?
mitz
Comment 5
2011-07-30 11:17:32 PDT
That’s a good question. I am not sure where and how the Unicode standard address this sort of question. For example, is it the Unicode standard that dictates WebKit’s behavior for decomposed sequences such as e + ´ vs. Hebrew letters with vowel marks such as א + ָ Can you help me look?
mitz
Comment 6
2011-07-30 11:50:25 PDT
I found this note in UAX #29 <
http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries
>: “Font-based information may be required to determine the appropriate unit to use for UI purposes, such as identification of boundaries for first-letter paragraph styling. For example, such a unit could be a ligature formed of two grapheme clusters, such as لا (Arabic lam + alef).” I think it may apply here.
Alexey Proskuryakov
Comment 7
2011-07-30 20:50:00 PDT
In more technical terms, my question was whether these symbols form extended grapheme clusters. In WebCore, these are returned by characterBreakIterator, which just calls ICU. We sometimes allow cursor positioning inside extended grapheme clusters. We should probably never disallow cursor positioning at a place where a character break can occur (e.g. for truncation).
mitz
Comment 8
2011-07-30 20:57:24 PDT
(In reply to
comment #7
)
> In more technical terms, my question was whether these symbols form extended grapheme clusters.
I believe that the answer is “not by default, but possibly by tailoring”.
> In WebCore, these are returned by characterBreakIterator, which just calls ICU. > > We sometimes allow cursor positioning inside extended grapheme clusters. We should probably never disallow cursor positioning at a place where a character break can occur (e.g. for truncation).
It now occurs to me that WebKit should also prohibit line breaks between two regional indicator symbols that combine, even in line wrapping modes that allow breaking between “characters”.
mitz
Comment 9
2011-08-10 08:38:10 PDT
For ICU-based ports, this is probably going to be resolved externally to WebKit via <
http://unicode.org/cldr/trac/ticket/4076
>.
Alexey Proskuryakov
Comment 10
2011-08-15 10:48:41 PDT
Created
attachment 103927
[details]
part 1 - backspace
Alexey Proskuryakov
Comment 11
2011-08-15 10:49:45 PDT
Some EWS bots are expected to fail due to respective platforms not using ICU. I'll add the test to skipped lists when landing.
mitz
Comment 12
2011-08-15 11:44:54 PDT
Comment on
attachment 103927
[details]
part 1 - backspace View in context:
https://bugs.webkit.org/attachment.cgi?id=103927&action=review
> Source/WebCore/ChangeLog:14 > + (WebCore::RenderText::previousOffsetForBackwardDeletion):
No comment?
> Source/WebCore/rendering/RenderText.cpp:1504 > + // Pairs of regional indicators are combined into flag emoji.
s/are/may be/. Most pairs don’t make up a flag.
> Source/WebCore/rendering/RenderText.cpp:1536 > + // We assume that regional indicators came in pairs. Otherwise we'd need to perform unlimited lookahead > + // to determine whether to delete an odd one at the end. > + if (isRegionalIndicator(character)) { > + if (sawRegionalIndicator) > + break; > + sawRegionalIndicator = true; > + continue; > + }
How will this treat a sequence of regional indicator, mark, regional indicator?
Alexey Proskuryakov
Comment 13
2011-08-15 11:56:56 PDT
> No comment?
Yup. Any suggestions that would be helpful for people reading the ChangeLog?
> > Source/WebCore/rendering/RenderText.cpp:1504 > > + // Pairs of regional indicators are combined into flag emoji.
they all do when deleting, and all that can practically occur do in rendering.
> How will this treat a sequence of regional indicator, mark, regional indicator?
It will delete mark, regional indicator (i.e. correctly, I think). Should I add a test for this case?
mitz
Comment 14
2011-08-15 12:02:24 PDT
(In reply to
comment #13
)
> > No comment? > > Yup. Any suggestions that would be helpful for people reading the ChangeLog?
A description of the new behavior.
> > > > Source/WebCore/rendering/RenderText.cpp:1504 > > > + // Pairs of regional indicators are combined into flag emoji. > > they all do when deleting, and all that can practically occur do in rendering.
I think this comment is saying something that’s not true. I’m not sure how it helps. I’m fine with removing it or with making a more truthful statement.
> > > How will this treat a sequence of regional indicator, mark, regional indicator? > > It will delete mark, regional indicator (i.e. correctly, I think). Should I add a test for this case?
Interesting. So the fact that the boolean is never reset to false really never matters? Please add tests.
Alexey Proskuryakov
Comment 15
2011-08-15 12:11:41 PDT
>A description of the new behavior.
I don't understand what you are saying. Do you want something like:
> + (WebCore::RenderText::previousOffsetForBackwardDeletion): Fix this bug.
?
> I think this comment is saying something that’s not true. I’m not sure how it helps. I’m fine with removing it or with making a more truthful statement.
Regional indicators and their extremely weird implementation is certainly not something many people in the world know about. This comment answers both questions and provides keywords for further searching. Unless you can suggest a better comment, I'm going to keep it as is.
> Interesting. So the fact that the boolean is never reset to false really never matters?
We certainly never want to reset it to false - that would mean that we delete a terribly long sequence of characters.
Alexey Proskuryakov
Comment 16
2011-08-15 13:00:10 PDT
Comment on
attachment 103927
[details]
part 1 - backspace I realized that I didn't include my test in the patch. Also, Dan convinced me to fix another edge case that I didn't care about at first.
Alexey Proskuryakov
Comment 17
2011-08-15 14:23:26 PDT
Created
attachment 103951
[details]
part 1 - backspace
WebKit Review Bot
Comment 18
2011-08-15 14:26:51 PDT
Attachment 103951
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/edit..." exit_code: 1 LayoutTests/ChangeLog:1: ChangeLog entry has no bug number [changelog/bugnumber] [5] Total errors found: 1 in 5 files If any of these errors are false positives, please file a bug against check-webkit-style.
Alexey Proskuryakov
Comment 19
2011-08-15 14:37:26 PDT
Created
attachment 103955
[details]
part 1 - backspace Patch preparation fail. Still need EWS results.
Alexey Proskuryakov
Comment 20
2011-08-15 14:38:20 PDT
Created
attachment 103956
[details]
part 1 - backspace More fail...
WebKit Review Bot
Comment 21
2011-08-15 15:08:07 PDT
Comment on
attachment 103956
[details]
part 1 - backspace
Attachment 103956
[details]
did not pass chromium-ews (chromium-xvfb): Output:
http://queues.webkit.org/results/9380913
New failing tests: editing/deleting/regional-indicators.html
Alexey Proskuryakov
Comment 22
2011-08-15 15:39:50 PDT
Comment on
attachment 103956
[details]
part 1 - backspace Hmm, I'm surprised that Qt, Gtk and EFL all pass, but OK. Not sure what could be wrong with Chromium, need to see failing results.
WebKit Review Bot
Comment 23
2011-08-15 16:18:39 PDT
Comment on
attachment 103956
[details]
part 1 - backspace Rejecting
attachment 103956
[details]
from commit-queue. Failed to run "['./Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '--bot-id=ec2-cq-02', '--port..." exit_code: 2 Last 500 characters of output: deleting/regional-indicators.html = TEXT Regressions: Unexpected image mismatch : (5) fast/text/atsui-multiple-renderers.html = IMAGE fast/text/international/danda-space.html = IMAGE fast/text/international/thai-baht-space.html = IMAGE fast/text/international/thai-line-breaks.html = IMAGE platform/chromium-linux/fast/text/international/complex-joining-using-gpos.html = IMAGE Regressions: Unexpected image and text mismatch : (1) svg/custom/svg-fonts-word-spacing.html = IMAGE+TEXT Full output:
http://queues.webkit.org/results/9380937
Alexey Proskuryakov
Comment 24
2011-08-15 16:38:15 PDT
There is something wrong going on with the chromium bot. Committed manually as <
http://trac.webkit.org/r93068
>, and reverted an accidental change in <
http://trac.webkit.org/changeset/93069
>. Keeping the bug open for other fixes.
Adam Barth
Comment 25
2011-08-15 17:31:33 PDT
> There is something wrong going on with the chromium bot.
Nothing is wrong with the bot. I've filed
https://bugs.webkit.org/show_bug.cgi?id=66268
about this test failing on Chromium.
Alexey Proskuryakov
Comment 26
2011-08-17 10:03:32 PDT
Created
attachment 104185
[details]
part 2 - cursor movement iterator (WIP)
Alexey Proskuryakov
Comment 27
2011-08-17 10:11:35 PDT
In the attached patch for cursor movement iterator, I have changed ICU rules in the same way that is being considered for mainstream ICU in <
http://unicode.org/cldr/trac/ticket/4076
>. While this change works with default ICU character break rules, it does not work with our customized rules for cursor movement, and I cannot figure out why. CC'ing Hironori Bono, who wrote our custom rules. The problem is that for a string of several national flag characters, we get a break opportunity inside a flag. For example, with asterisk denoting a single regional indicator symbol (AKA half of an emoji flag), I'm getting these breaks: |****|*|*| While for a sequence of two flags, breaks are correct: |**|**|
Hironori Bono
Comment 28
2011-08-18 04:50:02 PDT
Greetings Alexey, Thank you for noticing it. Even though I have not investigated this issue so deeply, it seems to be a side-effect of the '!!chain;' rule. I will update the rule so it works with regional indicator symbols. Regards, Hironori Bono E-mail:
hbono@chromium.org
Alexey Proskuryakov
Comment 29
2011-08-18 08:56:44 PDT
This turned out to be my mistake - I copied the rules from ICU patch incorrectly. I'll post an updated patch shortly, sorry for the noise!
Alexey Proskuryakov
Comment 30
2011-08-18 11:10:47 PDT
Created
attachment 104367
[details]
part 2 - cursor movement iterator
Alexey Proskuryakov
Comment 31
2011-08-18 11:19:16 PDT
Comment on
attachment 104367
[details]
part 2 - cursor movement iterator Will need failing results on non-ICU platforms...
WebKit Review Bot
Comment 32
2011-08-18 13:33:11 PDT
Comment on
attachment 104367
[details]
part 2 - cursor movement iterator Clearing flags on attachment: 104367 Committed
r93344
: <
http://trac.webkit.org/changeset/93344
>
WebKit Review Bot
Comment 33
2011-08-18 13:33:18 PDT
All reviewed patches have been landed. Closing bug.
Alexey Proskuryakov
Comment 34
2011-08-18 14:52:19 PDT
Landed failing results for Qt in
r93351
, filed
bug 66500
.
Alexey Proskuryakov
Comment 35
2011-08-18 14:56:54 PDT
Bug 66501
/r93353 for Gtk.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug