Bug 37884 - In HOME/END operations, cursor goes wrong direction in mixed RTL-LTR
Summary: In HOME/END operations, cursor goes wrong direction in mixed RTL-LTR
Status: RESOLVED DUPLICATE of bug 49107
Alias: None
Product: WebKit
Classification: Unclassified
Component: HTML Editing
Version: 528+ (Nightly build)
Hardware: PC Windows 7
Importance: P3 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-04-20 13:04 PDT by Elliot Block
Modified: 2010-11-05 16:01 PDT

See Also:


Attachments
Example of mixed RTL+LTR in an RTL context. (356 bytes, text/html)
2010-04-20 13:04 PDT, Elliot Block

Description Elliot Block 2010-04-20 13:04:39 PDT
Created attachment 53870
Example of mixed RTL+LTR in an RTL context.

Overview:

When you have mixed RTL-LTR text, cursoring uses the single direction of the containing element, rather than the individual directions of the text runs themselves.

(Unicode specifies a bidirectional algorithm for handling this, and the *display* of the text works fine; only the keyboard cursoring is not quite right.)


Steps to Reproduce:


1) Make an editable RTL element with mixed RTL+LTR text in it.

e.g. <input type="text" style="direction: rtl;" size="40" value="&#1513;&#1464;&#1473;&#1500;&#1493;&#1465;&#1501;hello"/>


2) Place the cursor in the Hebrew text, which is correctly displayed at the right-hand side, where the text begins.


3) Press END
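
For reference, the value attribute from step 1 can be decoded to see the logical (in-memory) order of the test string. This is only a sketch using Python's standard library; the variable names are illustrative and are not part of the attached test case.

```python
import html
import unicodedata

# The value attribute from step 1, exactly as written in the report.
raw_value = "&#1513;&#1464;&#1473;&#1500;&#1493;&#1465;&#1501;hello"

# html.unescape resolves the numeric character references into code points.
text = html.unescape(raw_value)

# Pair each character (in logical order) with its Unicode bidi class:
# "R" = strong right-to-left, "NSM" = non-spacing mark, "L" = strong left-to-right.
bidi_classes = [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]

for code_point, bidi_class in bidi_classes:
    print(code_point, bidi_class)
```

The Hebrew letters and their vowel points come first in logical order and "hello" comes last, so the logical end of the line is the position after the "o".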


Actual Result:

- The cursor jumps to the *beginning* of "hello" at the far left side of all the text


Expected Result:

- The cursor jumps to the *end* of "hello", which is in the middle visually, between the RTL and the LTR.
- FF, IE and RTL Windows in general appear to get this right
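
The difference between the actual and expected END positions can be illustrated with a toy model. The sketch below is a simplification under stated assumptions, not WebKit's code: uppercase Latin letters stand in for Hebrew, every character gets embedding level 1 (RTL) or 2 (LTR inside an RTL paragraph), and only reordering rule L2 of the Unicode Bidirectional Algorithm is applied. All function names are hypothetical.

```python
def embedding_levels(text):
    """Toy levels for an RTL paragraph: uppercase = RTL (1), anything else = LTR (2)."""
    return [1 if ch.isupper() else 2 for ch in text]


def visual_order(levels):
    """Return logical indices in left-to-right display order (UAX #9 rule L2)."""
    order = list(range(len(levels)))
    odd_levels = [lvl for lvl in levels if lvl % 2 == 1]
    if not odd_levels:
        return order  # purely LTR line: nothing to reverse
    # From the highest level down to the lowest odd level,
    # reverse every maximal run of characters at that level or higher.
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            j = i
            while j < len(order) and levels[order[j]] >= level:
                j += 1
            if j > i:
                order[i:j] = order[i:j][::-1]
                i = j
            else:
                i += 1
    return order


def caret_x_after(index, levels):
    """Visual x (in character cells from the left) of a caret placed after logical char `index`."""
    slot = visual_order(levels).index(index)
    # "After" is the right edge for even (LTR) levels, the left edge for odd (RTL) levels.
    return slot + 1 if levels[index] % 2 == 0 else slot


logical = "ABCDEhello"  # stand-in for the Hebrew word followed by "hello"
levels = embedding_levels(logical)
displayed = "".join(logical[i] for i in visual_order(levels))

print(displayed)                                # helloEDCBA
print(caret_x_after(len(logical) - 1, levels))  # 5
```

In this model the logical end of the line sits at x = 5, between "hello" and the RTL run, which corresponds to the expected result; the reported behavior corresponds to x = 0, the far left.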


Alternate Example of the Problem:

3) Instead of pressing END, press LEFT to go *forward* through the RTL text.  Keep pressing LEFT to go forward, even when you are in LTR text. 


Actual Result:

- When the cursor gets to the "o" in "hello", it proceeds through the English word *backwards*
i.e. the cursor is at "o", "l", "l", "e", "h"


Expected Result:

- When the cursor gets to the LTR text, it realizes the beginning of the LTR text is over to the left starting with "h", not "o".  The cursor jumps to the left and proceeds through "h" "e" "l" "l" "o" in response to the LEFT key.

- FF doesn't get this right, but Windows, Word, IE, etc. do.
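
The two traversal orders for the LEFT key can be compared with a small sketch. It makes simplifying assumptions that are not taken from WebKit: uppercase Latin letters stand in for Hebrew, characters get embedding level 1 (RTL) or 2 (LTR) in an RTL paragraph, and the display order comes from rule L2 of the Unicode Bidirectional Algorithm only. The helper name is hypothetical.

```python
def visual_order(levels):
    """Return logical indices in left-to-right display order (UAX #9 rule L2)."""
    order = list(range(len(levels)))
    odd_levels = [lvl for lvl in levels if lvl % 2 == 1]
    if not odd_levels:
        return order
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            j = i
            while j < len(order) and levels[order[j]] >= level:
                j += 1
            if j > i:
                order[i:j] = order[i:j][::-1]
                i = j
            else:
                i += 1
    return order


logical = "ABCDEhello"  # stand-in for the Hebrew word followed by "hello"
levels = [1 if ch.isupper() else 2 for ch in logical]

# Logical movement: in an RTL paragraph, LEFT means "next character in memory".
logical_walk = "".join(logical)

# Visual movement: LEFT means "next character to the left on screen",
# starting from the right edge of the line.
visual_walk = "".join(logical[i] for i in reversed(visual_order(levels)))

print(logical_walk)  # ABCDEhello
print(visual_walk)   # ABCDEolleh
```

The visual walk passes "o", "l", "l", "e", "h", as in the actual result, while the logical walk passes "h", "e", "l", "l", "o", as in the expected result.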


Build Date & Platform:
- Win7 Safari 4.0.5 (531.22.7)
- Win7 Chrome 4.1.249.1045


Additional Information:

The mixed-direction characters display correctly, which is great, but the cursoring code doesn't seem to have the same level of knowledge of the bidirectional algorithm.  Please see below for more information:

http://www.w3.org/TR/html401/struct/dirlang.html#h-8.2.1


Thanks!
Comment 1 Xiaomei Ji 2010-09-24 11:50:32 PDT
The HOME/END issue was fixed in https://bugs.webkit.org/show_bug.cgi?id=24168.

From the description in comment #3, apparently the old behavior is not correct (look at the case of "IHG def FED xyz CBA").

The logic of the fix is correct. 
From comment #3, it looks like, for "def FED 123 CBA abc" (similar to the test case uploaded here), the beginning of the first logical node "abc" is at the position before 'c' (between 'c' and dummy_RTL). That is probably why we return the caret there when HOME is pressed.
The same applies to the END key.

I need to investigate further how to get the same behavior as Firefox.
Before that, I would like an expert's opinion on whether Firefox's behavior is correct/desired.

As for the arrow key operation, there are pros and cons to following visual order versus logical order. IE is the only browser that follows logical order; the other browsers follow visual order.
Comment 2 Xiaomei Ji 2010-10-28 12:47:44 PDT
The arrow key movement works as intended, so I am changing this bug to HOME/END related.

I swear (with 2 witnesses) that my Firefox 3.6.12 (even after I reinstalled it) and Firefox 4 beta on Windows work the same as Chrome, that is, wrongly. That is what I compared my fix in https://bugs.webkit.org/show_bug.cgi?id=24168 with.

The expected behavior described in the bug description is correct because the HOME/END keys follow logical order.
Comment 3 Xiaomei Ji 2010-10-28 18:36:09 PDT
I tested in Firefox (in Windows) before, and the behavior was the same as Chromium's current behavior.
I do not know what went wrong (Firefox got updated or I tested in a wrong way?).
I am also puzzled as to why my Firefox (3.6.12 Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12) on Windows Vista still behaves the same as Chromium while other people's Firefox behaves as described in the expected result of this bug description. My bidi.edit.caret_movement_style is set to 2 (the default).

I think the expected behavior described in this bug is the better behavior.
But I would like to get confirmation from Dan, Uri, and Aharon.

I am attaching an email exchange between Uri and me from over a year ago, when I was fixing the HOME/END issue.

==== question from me =========

I read http://www-01.ibm.com/software/globalization/topics/bidiui/index.jsp

It looks like the Home/End keys are logical keys: for both pure RTL text and mixed RTL/LTR text, they should always move the cursor to the logical beginning/end of the line.

But I tried the following in Firefox:
1. Open Gmail
2. Switch Gmail UI language to Hebrew
3. Compose a new email
4. Type the following 4 lines of text:
4.1 one pure English line,
4.2 one pure RTL line,
4.3 one mixed RTL line, typing Hebrew, then English, then Hebrew,
4.4 one mixed RTL line, typing English, then Hebrew, then English.

The Home key always moves the cursor to the very right of the line, and the End key always moves the cursor to the very left of the line, which is visual order (in an RTL environment).

In English Gmail, the Home/End keys do not seem to always follow logical order either.

============ From Uri (Thanks Uri for his always detailed explanation) ======

First, let me clarify that while the IBM spec was the basis of the original implementation in Mozilla, I don't think Mozilla (or anyone else, for that matter) ever followed it 100%, and it certainly does not do so today.

That said, in this case I think Firefox actually does follow the IBM spec. Let's take the case of pressing "Home" in an all-LTR (Latin) line in RTL context (as in your section 4.1). 

Following the instructions on http://www-01.ibm.com/software/globalization/topics/bidiui/conversion.jsp:

The special case of the Home and End positions can be solved by handling it as if there was a dummy character with a level equal to the paragraph embedding level before the line [...]

So we assume the line has a dummy RTL (Hebrew) character, followed by a string of LTR characters. The dummy RTL character has a bidi embedding level of 1, and the characters of the English text (specifically, the first [leftmost] one) have a bidi embedding level of 2.

After Home, End and Newline, the cursor level is set to the paragraph embedding level.

The paragraph embedding level in our example is 1 (basic RTL), so that's what the "cursor level" is set to.
If the cursor level is equal to the level of the previous character, the cursor must be displayed after this character ("after" is on the right for even levels, on the left for odd levels).
So the level of the previous (dummy) character is 1, and the level of the next character is 2. The cursor level, as we said is 1, that is, equal to that of the previous "character", and therefore the cursor should be placed after that dummy character. Now the first dummy RTL character, had it been real, would have appeared on the rightmost edge of the line (if you don't trust me on this I can explain). Therefore, the cursor should be displayed immediately after it, that is, still, at the rightmost edge of the line. 

An attempt for an intuitive explanation why this is "correct": For example, in the case of an all-LTR line in RTL context, the logical beginning of the line could be mapped to either the left or right of the line. The left of the line "makes sense" as the logical beginning if you're going to type in more LTR - the new LTR char you type should appear to the left of the leftmost current LTR char. However, since the overall context is RTL, it's equally plausible (and perhaps more so) that you're going to type an RTL character. If you do so, the RTL character will appear to the right of the rightmost current LTR character. Therefore placing the cursor at the right edge of the line is a fair representation of the logical beginning if you assume RTL is what's going to be typed next.

=======================================================
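
The reasoning in the quoted explanation can be followed step by step in a small sketch. It is an illustration under assumptions, not the IBM specification's or any browser's implementation: the dummy character is laid out as if it were a real character, display order comes from rule L2 of the Unicode Bidirectional Algorithm only, and all function names are hypothetical. Only the two placement rules quoted in this bug are encoded; other cases are left undecided.

```python
def visual_order(levels):
    """Return logical indices in left-to-right display order (UAX #9 rule L2)."""
    order = list(range(len(levels)))
    odd_levels = [lvl for lvl in levels if lvl % 2 == 1]
    if not odd_levels:
        return order
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            j = i
            while j < len(order) and levels[order[j]] >= level:
                j += 1
            if j > i:
                order[i:j] = order[i:j][::-1]
                i = j
            else:
                i += 1
    return order


def caret_edge(cursor_level, prev_level, next_level):
    """Return (neighbour, side) for the caret, or None if neither quoted rule applies."""
    if cursor_level == prev_level:
        # Displayed after the previous character: right for even levels, left for odd.
        return "prev", ("right" if prev_level % 2 == 0 else "left")
    if cursor_level == next_level:
        # Displayed before the next character: left for even levels, right for odd.
        return "next", ("left" if next_level % 2 == 0 else "right")
    return None


# Home on an all-LTR line in an RTL context: a dummy level-1 character precedes "hello".
paragraph_level = 1
line = "hello"
levels = [paragraph_level] + [2] * len(line)  # logical index 0 is the dummy

order = visual_order(levels)
dummy_slot = order.index(0)

cursor_level = paragraph_level  # after Home, the cursor level is the paragraph level
neighbour, side = caret_edge(cursor_level, levels[0], levels[1])

# The caret attaches to the dummy; convert its edge to an x position in character cells.
caret_x = dummy_slot if side == "left" else dummy_slot + 1

print(order)            # [1, 2, 3, 4, 5, 0]
print(neighbour, side)  # prev left
print(caret_x)          # 5
```

The dummy lands in the rightmost slot and the caret sits on its left edge, at x = 5, which is the right edge of "hello". That agrees with the conclusion above that Home places the caret at the rightmost edge of the line.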

Note (this concerns the fix and I might be wrong, so skip it if you are not interested):
Maybe we could first check "If the cursor level is equal to the level of the next character, the cursor must be displayed before this character ("before" is on the left for even levels, on the right for odd levels)." before checking "If the cursor level is equal to the level of the previous character, the cursor must be displayed after this character ("after" is on the right for even levels, on the left for odd levels)." in order to place the caret correctly in strict logical order.
Comment 4 Xiaomei Ji 2010-11-05 16:01:52 PDT
Per suggestion, I opened a new bug (49107) which is solely about HOME/END, and closed this one.

*** This bug has been marked as a duplicate of bug 49107 ***