Bug 53696 - Caret is rendered at an incorrect position at the boundary of Arabic number in a LTR context
: Caret is rendered at an incorrect position at the boundary of Arabic number i...
Status: RESOLVED INVALID
: WebKit
Layout and Rendering
: 528+ (Nightly build)
: PC Mac OS X 10.5
: P2 Normal
Assigned To:
:
:
:
: 49111
  Show dependency treegraph
 
Reported: 2011-02-03 11:03 PST by
Modified: 2011-04-06 10:35 PST (History)


Attachments
demo (1.14 KB, text/html)
2011-02-03 11:03 PST, Ryosuke Niwa
no flags Details
work in progress (3.19 KB, patch)
2011-02-03 17:06 PST, Ryosuke Niwa
no flags Review Patch | Details | Formatted Diff | Diff
work in progress 2 (3.93 KB, patch)
2011-02-04 11:40 PST, Ryosuke Niwa
no flags Review Patch | Details | Formatted Diff | Diff
fix the bug (hopefully correct) (16.66 KB, patch)
2011-02-04 16:29 PST, Ryosuke Niwa
no flags Review Patch | Details | Formatted Diff | Diff
fix the bug (removed eol-style changes) (15.01 KB, patch)
2011-02-04 16:33 PST, Ryosuke Niwa
no flags Review Patch | Details | Formatted Diff | Diff
Rewrote while loops as for loops (15.69 KB, patch)
2011-02-23 17:52 PST, Ryosuke Niwa
no flags Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2011-02-03 11:03:28 PST
Created an attachment (id=81089) [details]
demo

In the attachment, offset 2 is rendered on the right of Arabic numbers but it should be rendered on the left of Arabic numbers (between Arabic text and Arabic numbers).
------- Comment #1 From 2011-02-03 17:06:17 PST -------
Created an attachment (id=81148) [details]
work in progress
------- Comment #2 From 2011-02-03 17:14:12 PST -------
This bug seems to caused by Position::getInlineBoxAndOffset.  One bug was that in tertiary run cases, we were using the opposite offsets.  When we're on the right edge and setting to the left edge of that run, we must be setting to the right edge of the previous leaf child.

The other but is in the case of inlineBox->direction() == primaryDirection.  We weren't considering next/previous leaf children with a higher bidi-level when previous/leaf leaf children were not present.  But I'm less confident about that fix.

Dan, Xiaomei, could you take a look at my work in progress patch and do the sanity check?

I don't know how to test all the edges though.  It seems like I must be adding 10-20 pixel tests because there seems to be no way of obtaining the caret position from JavaScript.
------- Comment #3 From 2011-02-04 11:40:48 PST -------
Created an attachment (id=81256) [details]
work in progress 2
------- Comment #4 From 2011-02-04 16:29:32 PST -------
Created an attachment (id=81317) [details]
fix the bug (hopefully correct)
------- Comment #5 From 2011-02-04 16:33:01 PST -------
Created an attachment (id=81318) [details]
fix the bug (removed eol-style changes)
------- Comment #6 From 2011-02-04 16:52:07 PST -------
(From update of attachment 81318 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=81318&action=review

> Source/WebCore/dom/Position.cpp:1126
> -            if (!nextBox || nextBox->bidiLevel() >= level)
> +            if (nextBox && nextBox->bidiLevel() >= level)
>                  return;
>  
> +            if (!nextBox) {
> +                InlineBox* prevBox = inlineBox;
> +                do {
> +                    if (prevBox->bidiLevel() < level) {
> +                        inlineBox = prevBox;
> +                        caretOffset = inlineBox->caretRightmostOffset();
> +                        break;
> +                    }
> +                    prevBox = prevBox->prevLeafChild();
> +                } while (prevBox);
> +                return;
> +            }
> +

I'm not confident about this change.  I think it makes sense but I might be missing some cases since I'm not that familiar with this code.
------- Comment #7 From 2011-02-22 18:46:44 PST -------
Hi, could someone review my patch or give me some feedback?
------- Comment #8 From 2011-02-23 12:11:36 PST -------
Going to take a look at this today.
------- Comment #9 From 2011-02-23 13:48:38 PST -------
(From update of attachment 81318 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=81318&action=review

> Source/WebCore/dom/Position.cpp:1123
> +                InlineBox* prevBox = inlineBox;
> +                do {
> +                    if (prevBox->bidiLevel() < level) {
> +                        inlineBox = prevBox;
> +                        caretOffset = inlineBox->caretRightmostOffset();
> +                        break;
> +                    }
> +                    prevBox = prevBox->prevLeafChild();
> +                } while (prevBox);

Is there a reason not to write this as a for() loop?
------- Comment #10 From 2011-02-23 13:56:19 PST -------
The current behavior in attachment 81089 [details] matches the behavior of NSTextView in Mac OS X 10.6. Why do you deem it incorrect?
------- Comment #11 From 2011-02-23 15:00:34 PST -------
(In reply to comment #10)
> The current behavior in attachment 81089 [details] [details] matches the behavior of NSTextView in Mac OS X 10.6. Why do you deem it incorrect?

It is not possible to place cursor between the Arabic letters and Arabic numbers in current behavior. Is it normal?
------- Comment #12 From 2011-02-23 15:05:24 PST -------
(In reply to comment #11)
> (In reply to comment #10)
> > The current behavior in attachment 81089 [details] [details] [details] matches the behavior of NSTextView in Mac OS X 10.6. Why do you deem it incorrect?
> 
> It is not possible to place cursor between the Arabic letters and Arabic numbers in current behavior. Is it normal?

I am not sure what exactly “normal” means. Not being able to place the cursor between the letters and the numerals is a difference between WebKit and AppKit’s NSTextView. In NSTextView, the caret renders between the letters and the numerals when it’s at offset 0 (i.e. before the numerals).
------- Comment #13 From 2011-02-23 15:07:28 PST -------
(In reply to comment #12)
> (In reply to comment #11)
> > (In reply to comment #10)
> > > The current behavior in attachment 81089 [details] matches the behavior of NSTextView in Mac OS X 10.6. Why do you deem it incorrect?
> > 
> > It is not possible to place cursor between the Arabic letters and Arabic numbers in current behavior. Is it normal?
> 
> I am not sure what exactly “normal” means. Not being able to place the cursor between the letters and the numerals is a difference between WebKit and AppKit’s NSTextView. In NSTextView, the caret renders between the letters and the numerals when it’s at offset 0 (i.e. before the numerals).

I guess this sounds like a contradiction. I should clarify that comment #10 was about where the cursor is rendered when it’s at offset 2, and not about the entire test case.
------- Comment #14 From 2011-02-23 17:04:53 PST -------
(In reply to comment #10)
> The current behavior in attachment 81089 [details] [details] matches the behavior of NSTextView in Mac OS X 10.6. Why do you deem it incorrect?

The problem is that both offsets 2 and 4 correspond to the rightmost location visually, and there's no logical offset that corresponds to the location between Arabic letters and numbers.

Of offsets 2 and 4, offset 4 is the one that needs to be on the rightmost location because we're in a LTR block and the location is at the end of a RTL run.  Offset 2 must be between the letters and the numbers (i.e. on the left of a LTR run of Arabic numbers) because it's at the end of a LTR run inside a RTL run.
------- Comment #15 From 2011-02-23 17:52:07 PST -------
Created an attachment (id=83593) [details]
Rewrote while loops as for loops
------- Comment #16 From 2011-02-23 17:52:53 PST -------
(In reply to comment #9)
> (From update of attachment 81318 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=81318&action=review
> 
> > Source/WebCore/dom/Position.cpp:1123
> > +                InlineBox* prevBox = inlineBox;
> > +                do {
> > +                    if (prevBox->bidiLevel() < level) {
> > +                        inlineBox = prevBox;
> > +                        caretOffset = inlineBox->caretRightmostOffset();
> > +                        break;
> > +                    }
> > +                    prevBox = prevBox->prevLeafChild();
> > +                } while (prevBox);
> 
> Is there a reason not to write this as a for() loop?

Very good point.  Fixed.
------- Comment #17 From 2011-02-23 17:55:53 PST -------
(In reply to comment #12)
> (In reply to comment #11)
> > It is not possible to place cursor between the Arabic letters and Arabic numbers in current behavior. Is it normal?
> 
> I am not sure what exactly “normal” means. Not being able to place the cursor between the letters and the numerals is a difference between WebKit and AppKit’s NSTextView. In NSTextView, the caret renders between the letters and the numerals when it’s at offset 0 (i.e. before the numerals).

Ah, that's interesting.  NSTextView seems to follow Firefox/Internet Explorer convention then.  For WebKit, we can't use offset 0 to mean the location between the letters and the numbers because the offset 0 corresponds and only corresponds to the leftmost boundary of the line.
------- Comment #18 From 2011-02-23 17:59:18 PST -------
(In reply to comment #17)

>  For WebKit, we can't use offset 0 to mean the location between the letters and the numbers because the offset 0 corresponds and only corresponds to the leftmost boundary of the line.

I don’t understand this explanation (or maybe it’s not an explanation?).
------- Comment #19 From 2011-02-23 18:12:16 PST -------
(In reply to comment #18)
> (In reply to comment #17)
> 
> >  For WebKit, we can't use offset 0 to mean the location between the letters and the numbers because the offset 0 corresponds and only corresponds to the leftmost boundary of the line.
> 
> I don’t understand this explanation (or maybe it’s not an explanation?).

Let me try again.  So in WebKit, we use the convention that the offsets at the boundary of a run follows the direction of the surrounding run.  For example, if we have CBA (RTL text) in a LTR run, then we have offsets as (0)C(2)B(1)A(3) while we have natural (3)C(2)B(1)A(0) if it were to appear in a RTL run.

Now, the text in question is CBAabc.  Since "abc" is inside a RTL run of CBA, offsets at the boundaries follows RTL conventions as in (3)a(1)b(2)c(0).  However, the outer RTL run is in a LTR block, which means offset at the boundary of the RTL run has to follow LTR convention of the outer block hence we have: (0)CBA(3)a(1)b(2)c.  Now because CBA(abc) is a RTL run inside a LTR block, the offsets at its boundary follows LTR convention, resulting in (0)C(5)B(4)A(3)a(1)b(2)c(6).

Does this make sense?  Or am I misunderstanding something?
------- Comment #20 From 2011-02-24 09:15:40 PST -------
(In reply to comment #19)
> (In reply to comment #18)
> > (In reply to comment #17)
> > 
> > >  For WebKit, we can't use offset 0 to mean the location between the letters and the numbers because the offset 0 corresponds and only corresponds to the leftmost boundary of the line.
> > 
> > I don’t understand this explanation (or maybe it’s not an explanation?).
> 
> Let me try again.  So in WebKit, we use the convention that the offsets at the boundary of a run follows the direction of the surrounding run.

Perhaps this is the wrong “convention” to use in this case. What the code currently does was arrived at, I think, through trying to match the native text system. If I or someone else got some cases wrong, then that should be fixed. There is no apparent reason to diverge from the native behavior.
------- Comment #21 From 2011-02-24 16:59:21 PST -------
(In reply to comment #20)
> Perhaps this is the wrong “convention” to use in this case. What the code currently does was arrived at, I think, through trying to match the native text system. If I or someone else got some cases wrong, then that should be fixed. There is no apparent reason to diverge from the native behavior.

Mn... what is the right convention to use in this case then?
------- Comment #22 From 2011-02-24 17:15:39 PST -------
(In reply to comment #21)
> (In reply to comment #20)
> > Perhaps this is the wrong “convention” to use in this case. What the code currently does was arrived at, I think, through trying to match the native text system. If I or someone else got some cases wrong, then that should be fixed. There is no apparent reason to diverge from the native behavior.
> 
> Mn... what is the right convention to use in this case then?

In this case, NSTextView maps offset 0 to the position between the letters and the numerals. I don’t know yet what the general principle is, nor whether this is a bug. I am going to try to find out more.
------- Comment #23 From 2011-02-24 18:10:40 PST -------
(In reply to comment #22)
> (In reply to comment #21)
> > Mn... what is the right convention to use in this case then?
> 
> In this case, NSTextView maps offset 0 to the position between the letters and the numerals. I don’t know yet what the general principle is, nor whether this is a bug. I am going to try to find out more.

Ok, thanks!  Is there some application I can use to test NSTextView behavior myself?  In particular, how do I know which offset corresponds to which insertion point location?
------- Comment #24 From 2011-02-24 19:24:24 PST -------
(In reply to comment #23)
> (In reply to comment #22)
> > (In reply to comment #21)
> > > Mn... what is the right convention to use in this case then?
> > 
> > In this case, NSTextView maps offset 0 to the position between the letters and the numerals. I don’t know yet what the general principle is, nor whether this is a bug. I am going to try to find out more.
> 
> Ok, thanks!  Is there some application I can use to test NSTextView behavior myself? 

Yes, TextEdit. Make sure to explicitly set the paragraph direction, since the default is “automatic” (natural). Also, it’s less confusing if you disable the split caret in International & Text preferences.

> In particular, how do I know which offset corresponds to which insertion point location?

I usually just press Return to insert a newline to see where I was :)
------- Comment #25 From 2011-02-27 22:05:41 PST -------
(In reply to comment #24)
> (In reply to comment #23)
> > In particular, how do I know which offset corresponds to which insertion point location?
> 
> I usually just press Return to insert a newline to see where I was :)

Using this method, I found that the caret is at offset 2 if the text direction is explicitly set to RTL but it's at offset 0 if the text direction is explicitly set to LTR.

Now, if I set the text direction to LTR, and put the caret on the rightmost boundary by clicking on the far right of the text, then the split caret shows up on the upper rightmost corner and the lower leftmost corner.  If I then press left key (assuming selection is on default mode), then the lower caret moves to the boundary between arabic letters and numerals.  And pressing enter here results in inserting \n at offset 2 as well.  And this lower caret is what I was trying to implement.

Now, TextEdit seems to have various insertion point bugs with this particular input.  In particular, putting the insertion point between alphabet and numeral, and then moving to the left results in infinite loop on the single letter ‎ر.  Given that I'm not even sure if we should try to mimic TextEdit or NSTextView.
------- Comment #26 From 2011-03-03 22:20:10 PST -------
Hi Dan,

I tested TextEdit's behavior by your method and it seems that TextEdit follows Firefox / Internet Explorer's offset behavior here: (4)B(3)A(0)1(1)2(2).  But even WebKit follows this convention when inserting new line because the enclosing block in this case is LTR.

One potential problem with this research method is the location at which character is inserted/removed depends on the natural direction of the character and the direction of the enclosing block.  For example, if I change the direction of the document to RTL, then the apparent offsets on TextEdit change to (4)B(3)A(2)1(1)2(0).

Anyway, if you think we should align WebKit's behavior to that of TextEdit, then it seems like we need to reopen and fix https://bugs.webkit.org/show_bug.cgi?id=52176.
------- Comment #27 From 2011-03-07 18:41:23 PST -------
After some discussion with Xiaomei, I'm little more convinced that my current approach (at least offset assignments) is indeed correct.  We considered several possibilites for offsets in CBA123 (arabic numbers followed by arabic letters in a LTR block) here:

First, the following is the offsets assigned by Firefox/IE.
Internet Explorer: (0)(1)(7)C(6)B(5)A(4)1(2)2(3)3
Firefox: (0)(7)C(6)B(5)A(1)(4)1(2)2(3)3

Note that neither assignments seem ideal in this case and both of them have offset 7 even though there are only 6 letters.  Now, let us consider 4 offset assignments:

(0)C(5)B(4)A(1)1(2)2(3)3(6) - The offsets between ABC contradict with those of Firefox/IE.  Also it's odd that even though (1) corresponds to the position before 2 but it appears between A and 1.

(0)C(6)B(5)A(1)1(2)2(3)3(4) - This is a little better in that offsets between CBA match with those of Firefox/IE.

(6)C(5)B(4)A(0)1(1)2(2)3(3) - This is similar to what TextEdit, Firefox, Internet Explorer does but contradicts with the offset assignments in WebKit when CBA appeared in a LTR block: (0)C(2)B(1)A(3).  Also, the offsets between ABC contradict with those of Firefox/IE.

(0)C(5)B(4)A(3)1(1)2(2)3(6) - This is similar to the above but all offsets except (0) and (6) flows in the same order as those of Firefox and Internet Explorer except (1) and we're off by 1.
------- Comment #28 From 2011-03-07 18:58:07 PST -------
I think we need the general principal here so that the visible positions are anticipated, if not able to be standardized.

For example, given a LTR box inside a LTR block.

There are 9 basic scenarios considering the box's surrounding box's directionality:

xLx (x menas NULL box).
xLL
xLR

LLx
LLL
LLR

RLx
RLL
RLR

In each scenario, how the visible positions should be laid out when surrounding boxes' bidi level is same as/less/greater than the box.

Same for a RTL box in LTR block, LTR/RTL box in RTL block.

Then, can the rule be generalized?
------- Comment #29 From 2011-03-31 02:05:30 PST -------
After long attempts to fix this bug, I figured out that our current behavior matches NSTextView.  So this bug is actually invalid.  We need to implement split caret / hinted-caret on Mac / non-Mac in order to allow carets between letters.
------- Comment #30 From 2011-04-06 10:35:34 PST -------
(From update of attachment 83593 [details])
Cleared review? from attachment 83593 [details] so that this bug does not appear in http://webkit.org/pending-review.  If you would like this patch reviewed, please attach it to a new bug (or re-open this bug before marking it for review again).