Bug 57340 - REGRESSION(r74971): Selection doesn't work correctly in BiDi Text
Summary: REGRESSION(r74971): Selection doesn't work correctly in BiDi Text
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P1 Normal
Assignee: Ryosuke Niwa
URL:
Keywords: Regression
Depends on: 64626 65356 67869
Blocks:
  Show dependency treegraph
 
Reported: 2011-03-29 07:34 PDT by Jeremy Moskovich
Modified: 2011-09-26 11:29 PDT (History)
12 users (show)

See Also:


Attachments
Testcase (368 bytes, text/html)
2011-03-29 07:35 PDT, Jeremy Moskovich
no flags Details
Screenshot - trying to select the first letter. (14.27 KB, image/png)
2011-03-29 07:36 PDT, Jeremy Moskovich
no flags Details
Testcase (236 bytes, text/html; charset=ISO-8859-8-I)
2011-03-29 07:38 PDT, Jeremy Moskovich
no flags Details
test case corresponding to comment #6 (277 bytes, text/html)
2011-07-06 12:10 PDT, Xiaomei Ji
no flags Details
work in progress (6.42 KB, patch)
2011-07-29 16:58 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
work in progress 2 (20.55 KB, patch)
2011-07-31 00:04 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
work in progress 3 (17.10 KB, patch)
2011-08-19 18:56 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
work in progress 4 (25.98 KB, patch)
2011-08-23 10:21 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
work in progress 5 (28.68 KB, patch)
2011-09-01 12:54 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
work in progress 6 (23.80 KB, patch)
2011-09-08 20:31 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
initial patch (29.99 KB, patch)
2011-09-09 23:10 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
Updated per Eric's suggestion (29.94 KB, patch)
2011-09-12 17:08 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
Fixed qt build (29.93 KB, patch)
2011-09-12 18:02 PDT, Ryosuke Niwa
no flags Details | Formatted Diff | Diff
fixes the bug (29.91 KB, patch)
2011-09-21 13:22 PDT, Ryosuke Niwa
eric: review+
eric: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jeremy Moskovich 2011-03-29 07:34:53 PDT
In the attached testcase, try selecting the rightmost letter on the first line by clicking to it's right and dragging left.

Expected results:
The first letter should be selected.

Actual results:
The entire run EXCEPT for the first letter is selected.

See attached screenshot.
Comment 1 Jeremy Moskovich 2011-03-29 07:35:19 PDT
Created attachment 87323 [details]
Testcase
Comment 2 Jeremy Moskovich 2011-03-29 07:36:17 PDT
Created attachment 87324 [details]
Screenshot - trying to select the first letter.
Comment 3 Jeremy Moskovich 2011-03-29 07:38:21 PDT
Created attachment 87325 [details]
Testcase
Comment 4 Eric Seidel (no email) 2011-03-29 10:48:25 PDT
Since we have a test case, bisect-builds, or a tiny script written for git bisect should easily be able to find the regressing revision.
Comment 5 Ryosuke Niwa 2011-05-06 12:42:03 PDT
I think this is a dup of the bug 59435.
Comment 6 Xiaomei Ji 2011-07-06 12:08:49 PDT
I am copying comments from issue 59435
https://bugs.webkit.org/show_bug.cgi?id=59435#c9

(In reply to comment #7)
> Created an attachment (id=91394) [details] [details]
> test case
> 
> I attached this simple html with a <p> and <div> (no <br>), 
> but the selection is not correct in the ways that:
> 1. select from left to right, you can not select the rightmost character.
> 2. select from rightmost to left, you can not select the rightmost character.

For example, given the logical text "abcABC" in LTR context, whose display is "abcCBA", when caret is placed rightmost at "abcCBA|", the returned offset from 
http://trac.webkit.org/changeset/74971/trunk/WebCore/rendering/InlineTextBox.cpp is swapped from 0 to 3 due to the box and its containing block are in different directionality. Select left and stops at "abcCB|A", the offset returned is 1, which is correct. The correct selection range should be from offset 0 to 1 and selects 'A', instead, the selection range is from offset 3 to 1 and "CB" is selected.

Similar when select left to right.
Comment 7 Xiaomei Ji 2011-07-06 12:10:36 PDT
Created attachment 99859 [details]
test case corresponding to comment #6
Comment 8 Ryosuke Niwa 2011-07-11 18:00:17 PDT
Fixing this bug would require us implementing some hack around FrameSelection.

To see the fundamental issue around this,
1. Open up TextEdit on Mac
2. Type "‎לשנת מיליון שקלים"
3. Set the writing direction to LTR
4. Enlarge the font size and resize the window so that each word appears on a separate line.

Place the insertion point on the right of the first line.  You can confirm that the insertion point is at offset 4 by inserting some letter or resizing the window.

Now extend the selection to the left by mouse or keyboard.  Observe that ‎"ל" is elected.  This is the first letter in the text and the only way this text can be selected was if selection starts at offset 0 and ends at 1 respectively.  In other words, TextEdit is changing the selection base from offset 4 to offset 0 when the user extends selection.

To implement this behavior in WebKit, we'd have to detect when the selection base is at the line boundary and the line's direction and block's direction differ and modify the base; we need to restore the original base when the user extends the selection back to the original base or extends the selection to a different line.
Comment 9 Ryosuke Niwa 2011-07-29 16:58:50 PDT
Created attachment 102410 [details]
work in progress

Here's work in progress.  I'd have to cleanup the code and push the changes to FrameSelection because we have to restore the original base when scripts collapse selection, etc...
Comment 10 Ryosuke Niwa 2011-07-31 00:04:16 PDT
Created attachment 102456 [details]
work in progress 2

This patch fixes basic cases.  Jeremy's test case (attachment 87325 [details]) still doesn't work due to the bug 65356.
Comment 11 Ryosuke Niwa 2011-08-19 18:56:42 PDT
Created attachment 104608 [details]
work in progress 3

Updated the patch after http://trac.webkit.org/changeset/93221.  Need to investigate why the following 2 tests fail:
editing/selection/select-bidi-run.html
editing/selection/select-from-textfield-outwards.html
Comment 12 Ryosuke Niwa 2011-08-23 10:21:40 PDT
Created attachment 104862 [details]
work in progress 4

I'm trying to make this patch look nicer by pushing code into RenderedPosition, but having a trouble determining a good abstraction layer.
Comment 13 Ryosuke Niwa 2011-09-01 12:54:03 PDT
Created attachment 106014 [details]
work in progress 5
Comment 14 Eric Seidel (no email) 2011-09-01 13:31:16 PDT
Comment on attachment 106014 [details]
work in progress 5

View in context: https://bugs.webkit.org/attachment.cgi?id=106014&action=review

> Source/WebCore/editing/RenderedPosition.cpp:118
> +        if (!boxOnLeft || boxOnLeft->bidiLevel() < box->bidiLevel()) {
> +            return AtLeftBoundary;
> +        }

{ } ?

> Source/WebCore/editing/RenderedPosition.cpp:122
> +    } else {
> +    }

?

> Source/WebCore/editing/RenderedPosition.cpp:123
> +    if (m_offset == m_inlineBox->caretRightmostOffset()) {

Should this just early return !=?

> Source/WebCore/editing/RenderedPosition.cpp:153
> +    do {

I don't think there is any advantage to using do while over while here.

> Source/WebCore/editing/RenderedPosition.cpp:196
> +/*
> +    if (atRightmostOffsetInBox())
> +        return bidiLevel() < bidiLevelOfRun && nextLeafChild() && nextLeafChild()->bidiLevel() >= bidiLevelOfRun;
> +*/

?

> Source/WebCore/editing/RenderedPosition.cpp:213
> +/*
> +    if (atLeftmostOffsetInBox())
> +        return bidiLevel() < bidiLevelOfRun && prevLeafChild() && prevLeafChild()->bidiLevel() > bidiLevel;
> +*/

We don't normally commit commented out code.

Oh, now I see, this isnt' actually up for review.
Comment 15 Ryosuke Niwa 2011-09-01 13:36:02 PDT
(In reply to comment #14)
> Oh, now I see, this isnt' actually up for review.

Yeah, I was working on this last week but I have to work on some editing/forms related regressions this week so I'm uploading it here FYI.  The patch works as is (except few edge cases I have to address) but I want to organize code more cleanly.
Comment 16 Ryosuke Niwa 2011-09-08 20:31:04 PDT
Created attachment 106833 [details]
work in progress 6

I need a few more iterations but I welcome your early design review.
Comment 17 Ryosuke Niwa 2011-09-09 18:39:21 PDT
Unfortunately, I can't get the Jeremy's test case to a satisfactory level but Safari 5 didn't handle it well either so I would not consider it as a future improvement potential.
Comment 18 Ryosuke Niwa 2011-09-09 23:10:56 PDT
Created attachment 106967 [details]
initial patch
Comment 19 WebKit Review Bot 2011-09-10 10:06:26 PDT
Comment on attachment 106967 [details]
initial patch

Attachment 106967 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9633242

New failing tests:
editing/selection/select-bidi-run.html
svg/custom/svg-fonts-word-spacing.html
Comment 20 Eric Seidel (no email) 2011-09-12 13:39:53 PDT
Comment on attachment 106967 [details]
initial patch

View in context: https://bugs.webkit.org/attachment.cgi?id=106967&action=review

> Source/WebCore/editing/FrameSelection.cpp:170
> +    RenderedPosition basePosition(base);
> +    RenderedPosition extentPosition(extent);

Seems it would be better to name these base and extent (since they're so commonly typed) and us visibleBase and visibleExtent for the less common argument names.

> Source/WebCore/editing/RenderedPosition.cpp:112
> +        m_nextLeafChild = m_inlineBox->nextLeafChild();
> +        setCached(CachedNextLeafChild);
> +    }

When do we expire these caches?

> Source/WebCore/editing/RenderedPosition.cpp:169
> +bool RenderedPosition::atLeftBoundaryOfBidiRun(ShouldMatchBidiLevel shouldMatchBidiLevel, unsigned char bidiLevelOfRun) const

I wish there were a way to share code between these two functions.

> Source/WebCore/editing/RenderedPosition.h:93
> +    mutable unsigned m_cacheStatus : 2;

Why this instead of just using -1 or some other value to indicate an uncached pointer?
Comment 21 Ryosuke Niwa 2011-09-12 13:46:02 PDT
Comment on attachment 106967 [details]
initial patch

View in context: https://bugs.webkit.org/attachment.cgi?id=106967&action=review

>> Source/WebCore/editing/RenderedPosition.cpp:112
>> +    }
> 
> When do we expire these caches?

We never do at the moment because RenderedPosition doesn't provide any member functions to modify.

>> Source/WebCore/editing/RenderedPosition.cpp:169
>> +bool RenderedPosition::atLeftBoundaryOfBidiRun(ShouldMatchBidiLevel shouldMatchBidiLevel, unsigned char bidiLevelOfRun) const
> 
> I wish there were a way to share code between these two functions.

Yeah but they call opposite left/right functions so it's hard to do. I might be able to use macro or template to do it but that'll make this code less readable.

>> Source/WebCore/editing/RenderedPosition.h:93
>> +    mutable unsigned m_cacheStatus : 2;
> 
> Why this instead of just using -1 or some other value to indicate an uncached pointer?

Each pointer is cached separately.
Comment 22 Ryosuke Niwa 2011-09-12 17:08:28 PDT
Created attachment 107115 [details]
Updated per Eric's suggestion
Comment 23 Early Warning System Bot 2011-09-12 17:23:29 PDT
Comment on attachment 107115 [details]
Updated per Eric's suggestion

Attachment 107115 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/9640961
Comment 24 Ryosuke Niwa 2011-09-12 18:02:54 PDT
Created attachment 107118 [details]
Fixed qt build
Comment 25 WebKit Review Bot 2011-09-12 18:41:08 PDT
Comment on attachment 107118 [details]
Fixed qt build

Attachment 107118 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9644416

New failing tests:
editing/selection/select-bidi-run.html
Comment 26 Eric Seidel (no email) 2011-09-20 17:21:58 PDT
Mitz and Enrica may wish to be aware of this change.  Reviewing now.
Comment 27 Eric Seidel (no email) 2011-09-20 17:23:09 PDT
(In reply to comment #20)
> (From update of attachment 106967 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=106967&action=review
> 
> > Source/WebCore/editing/FrameSelection.cpp:170
> > +    RenderedPosition basePosition(base);
> > +    RenderedPosition extentPosition(extent);
> 
> Seems it would be better to name these base and extent (since they're so commonly typed) and us visibleBase and visibleExtent for the less common argument names.

Did you disagree with this comment?
Comment 28 Eric Seidel (no email) 2011-09-20 17:32:40 PDT
Comment on attachment 107118 [details]
Fixed qt build

View in context: https://bugs.webkit.org/attachment.cgi?id=107118&action=review

> Source/WebCore/editing/FrameSelection.cpp:170
> +static void adjustEndpointsAtBidiBoundary(VisiblePosition& base, VisiblePosition& extent)
> +{
> +    RenderedPosition basePosition(base);
> +    RenderedPosition extentPosition(extent);

I still would flip this argument naming.  Argument as visibleBase, visibleExtent and locals as base, extent.

> Source/WebCore/editing/RenderedPosition.cpp:112
> +InlineBox* RenderedPosition::prevLeafChild() const
> +{
> +    if (m_prevLeafChild == uncachedInlineBox())
> +        m_prevLeafChild = m_inlineBox->prevLeafChild();
> +    return m_prevLeafChild;
> +}
> +
> +InlineBox* RenderedPosition::nextLeafChild() const
> +{
> +    if (m_nextLeafChild == uncachedInlineBox())
> +        m_nextLeafChild = m_inlineBox->nextLeafChild();
> +    return m_nextLeafChild;
> +}

Why do we need these caches?  Did this show up on a benchmark?  If they didn't, I'd rather not take the security risk associated with weak pointers.

> Source/WebCore/editing/RenderedPosition.cpp:118
> +    return (m_renderer == other.m_renderer && m_inlineBox == other.m_inlineBox && m_offset == other.m_offset)
> +        || (atLeftmostOffsetInBox() && other.atRightmostOffsetInBox() && prevLeafChild() == other.m_inlineBox)
> +        || (atRightmostOffsetInBox() && other.atLeftmostOffsetInBox() && nextLeafChild() == other.m_inlineBox);

This might be more clear written out using locals?  I'm not sure.  as-is this is unreadable to my eyes.

Also, is this some sort of equivalence? WHy the ||?  Should this be a separately named equivalence function? (like how Darin long ago suggested for comparing Position objects)?  == seems wrongly overloaded for this use.

> Source/WebCore/editing/RenderedPosition.cpp:147
> +    ASSERT_NOT_REACHED();
> +    return RenderedPosition();

I wonder if you can just remove this and the compiler will do the right thing?

> Source/WebCore/editing/RenderedPosition.h:89
> +    static InlineBox* uncachedInlineBox() { return reinterpret_cast<InlineBox*>(1); }

You might want to add a comment here about why you chose 1. (cause it's on the null page).

> Source/WebCore/editing/RenderedPosition.h:91
> +    static InlineBox* uncachedInlineBox() { return reinterpret_cast<InlineBox*>(1); }
> +    mutable InlineBox* m_prevLeafChild;
> +    mutable InlineBox* m_nextLeafChild;

I almost wonder if these shouldn't be their own separate struct.  Some sort of SiblingCache.
Comment 29 Ryosuke Niwa 2011-09-20 20:01:47 PDT
Thanks for the review.

(In reply to comment #28)
> (From update of attachment 107118 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=107118&action=review
> 
> > Source/WebCore/editing/FrameSelection.cpp:170
> > +static void adjustEndpointsAtBidiBoundary(VisiblePosition& base, VisiblePosition& extent)
> > +{
> > +    RenderedPosition basePosition(base);
> > +    RenderedPosition extentPosition(extent);
> 
> I still would flip this argument naming.  Argument as visibleBase, visibleExtent and locals as base, extent.

Done.

> > Source/WebCore/editing/RenderedPosition.cpp:112
> > +InlineBox* RenderedPosition::prevLeafChild() const
> > +{
> > +    if (m_prevLeafChild == uncachedInlineBox())
> > +        m_prevLeafChild = m_inlineBox->prevLeafChild();
> > +    return m_prevLeafChild;
> > +}
> > +
> > +InlineBox* RenderedPosition::nextLeafChild() const
> > +{
> > +    if (m_nextLeafChild == uncachedInlineBox())
> > +        m_nextLeafChild = m_inlineBox->nextLeafChild();
> > +    return m_nextLeafChild;
> > +}
> 
> Why do we need these caches?  Did this show up on a benchmark?  If they didn't, I'd rather not take the security risk associated with weak pointers.

Because I use nextLeafChild and prevLeafChild heavily and they're tree traversal (could be O(the number of boxes)).  We could get rid of it for now if you'd like.

> > Source/WebCore/editing/RenderedPosition.cpp:118
> > +    return (m_renderer == other.m_renderer && m_inlineBox == other.m_inlineBox && m_offset == other.m_offset)
> > +        || (atLeftmostOffsetInBox() && other.atRightmostOffsetInBox() && prevLeafChild() == other.m_inlineBox)
> > +        || (atRightmostOffsetInBox() && other.atLeftmostOffsetInBox() && nextLeafChild() == other.m_inlineBox);
> 
> This might be more clear written out using locals?  I'm not sure.  as-is this is unreadable to my eyes.
> 
> Also, is this some sort of equivalence? WHy the ||?  Should this be a separately named equivalence function? (like how Darin long ago suggested for comparing Position objects)?  == seems wrongly overloaded for this use.

Yes. I'm considering two RenderedPositions in two adjacent boxes to be equal.

> > Source/WebCore/editing/RenderedPosition.cpp:147
> > +    ASSERT_NOT_REACHED();
> > +    return RenderedPosition();
> 
> I wonder if you can just remove this and the compiler will do the right thing?

No, gcc will spit a compiler warning/error.

> > Source/WebCore/editing/RenderedPosition.h:89
> > +    static InlineBox* uncachedInlineBox() { return reinterpret_cast<InlineBox*>(1); }
> 
> You might want to add a comment here about why you chose 1. (cause it's on the null page).

Done.

> > Source/WebCore/editing/RenderedPosition.h:91
> > +    static InlineBox* uncachedInlineBox() { return reinterpret_cast<InlineBox*>(1); }
> > +    mutable InlineBox* m_prevLeafChild;
> > +    mutable InlineBox* m_nextLeafChild;
> 
> I almost wonder if these shouldn't be their own separate struct.  Some sort of SiblingCache.

Maybe.
Comment 30 Eric Seidel (no email) 2011-09-20 20:41:27 PDT
I'm strongly against the caches unless we know we need them.  They just add an opportunity for us to have stale pointers around.

We could also have caches which only exist long enough for your algorithm to run.  Unclear what exactly that would mean.  But that would minimize risks of the tree being mutated in a way which could cause use-after-free or other such badness.
Comment 31 Ryosuke Niwa 2011-09-20 20:50:26 PDT
(In reply to comment #30)
> I'm strongly against the caches unless we know we need them.  They just add an opportunity for us to have stale pointers around.
>
> We could also have caches which only exist long enough for your algorithm to run.  Unclear what exactly that would mean.  But that would minimize risks of the tree being mutated in a way which could cause use-after-free or other such badness.

The whole point of RenderedPosition is for it to exist between layouts / style recalc to speed up things for editing/hit testing code.  It's not meant to (and will not) survive DOM mutations or anything that trigger style recalc.
Comment 32 Eric Seidel (no email) 2011-09-20 21:27:36 PDT
If these are meant to be effemeral objects and never stored by anything (like AX), then it's more OK of course.
Comment 33 Ryosuke Niwa 2011-09-20 21:30:58 PDT
(In reply to comment #32)
> If these are meant to be effemeral objects and never stored by anything (like AX), then it's more OK of course.

Yup. RenderedPosition is meant to be used while iterating over paragraphs, etc... without modifying DOM. We're expecting to implement startOfWord, startOfLine, etc... in terms of RenderedPosition and deploy them heavily in accessibility code.
Comment 34 Jeremy Moskovich 2011-09-20 23:34:52 PDT
(In reply to comment #31)
> The whole point of RenderedPosition is for it to exist between layouts / style recalc to speed up things for editing/hit testing code.  It's not meant to (and will not) survive DOM mutations or anything that trigger style recalc.

Is this documented anywhere? If not could you add a code comment to this effect?
One of the things that makes it hard to hack on WebKit is what assumptions you can make about an object's lifetime esp. in relation to other objects.
Comment 35 Ryosuke Niwa 2011-09-20 23:37:35 PDT
(In reply to comment #34)
> Is this documented anywhere? If not could you add a code comment to this effect?
> One of the things that makes it hard to hack on WebKit is what assumptions you can make about an object's lifetime esp. in relation to other objects.

I have some out-dated document at https://docs.google.com/document/d/1vzzi_wQHO0GtLnu-1IYutUKbcpAYlHqZwQlwbfLuowM/edit?authkey=CLOa9cgN&hl=en_US&authkey=CLOa9cgN
Comment 36 Ryosuke Niwa 2011-09-21 13:22:41 PDT
Created attachment 108221 [details]
fixes the bug
Comment 37 Eric Seidel (no email) 2011-09-21 16:52:18 PDT
Comment on attachment 108221 [details]
fixes the bug

View in context: https://bugs.webkit.org/attachment.cgi?id=108221&action=review

This in general seems fine.  the == operator is dangerous, I think.  I'm convinced that the caching approach is a good one, but you really need to add some sort of WARNING at the top of RenderPosition explaining that it is not safe across layouts (or render tree modifications in general).

> Source/WebCore/editing/RenderedPosition.cpp:118
> +    return (m_renderer == other.m_renderer && m_inlineBox == other.m_inlineBox && m_offset == other.m_offset)
> +        || (atLeftmostOffsetInBox() && other.atRightmostOffsetInBox() && prevLeafChild() == other.m_inlineBox)
> +        || (atRightmostOffsetInBox() && other.atLeftmostOffsetInBox() && nextLeafChild() == other.m_inlineBox);

I still think this should be an explicit isEquivalent call, or something other than ==.  Other parts of code (HashTable for example) would expect == to not be this loose.
Comment 38 Ryosuke Niwa 2011-09-21 17:15:44 PDT
Enrica & Dan,

Please let me know by tomorrow morning if you have any other concerns.
Comment 39 WebKit Review Bot 2011-09-21 20:46:14 PDT
Comment on attachment 108221 [details]
fixes the bug

Attachment 108221 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/9796048

New failing tests:
editing/selection/select-bidi-run.html
Comment 40 Ryosuke Niwa 2011-09-26 11:29:28 PDT
Committed r95964: <http://trac.webkit.org/changeset/95964>