Bug 5172

Summary: Bidi algorithm not applied correctly across line breaks
Product: WebKit Reporter: mitz
Component: Layout and RenderingAssignee: Dave Hyatt <hyatt>
Status: VERIFIED FIXED    
Severity: Normal CC: eric
Priority: P2    
Version: 420+   
Hardware: Mac   
OS: OS X 10.4   
Bug Depends on: 5532    
Bug Blocks: 5840    
Attachments:
Description Flags
testcase
none
better testcase
none
work in progress
none
Proposed patch
none
Alternative patch
none
Revised patch hyatt: review+

mitz
Reported 2005-09-28 14:07:48 PDT
Summary: WebKit sometimes assigns incorrect bidi embedding levels to text runs immediately before or immediately after a line break, resulting in incorrect line order. To reproduce: Open the testcase in Safari. Expected: the Test column to match the Reference column. Firefox 1.0.4 behaves as expected. Actual: in rows 2, 4 and 10, the columns mismatch. Analysis: When a line ends, it is sometimes impossible to know the embedding level for the last run in the line without looking ahead into the next line. When a line begins, the embedding level of the first run may depend on the last run in the previous line. Currently, WebKit treats each line essentially as a separate paragraph, using the embedding direction for sor and eor.
Attachments
testcase (2.20 KB, text/html)
2005-09-28 14:08 PDT, mitz
no flags
better testcase (2.99 KB, text/html)
2005-09-30 16:21 PDT, mitz
no flags
work in progress (22.76 KB, patch)
2005-11-04 08:24 PST, mitz
no flags
Proposed patch (35.96 KB, patch)
2005-11-05 09:40 PST, mitz
no flags
Alternative patch (36.59 KB, patch)
2005-11-05 10:11 PST, mitz
no flags
Revised patch (38.25 KB, patch)
2005-11-06 16:29 PST, mitz
hyatt: review+
mitz
Comment 1 2005-09-28 14:08:19 PDT
Created attachment 4083 [details] testcase
mitz
Comment 2 2005-09-30 16:07:37 PDT
Using <br/> to force line breaks in the testcase was a bad idea, since IE and NSTextView actually treat line separators like paragraph separators in terms of the bidi algorithm. Firefox treats them as whitespace (which is according to spec). WebKit should probably match IE and the rest of OS X. I should make a testcase where lines wrap naturally.
mitz
Comment 3 2005-09-30 16:21:16 PDT
Created attachment 4121 [details] better testcase
mitz
Comment 4 2005-11-04 08:24:58 PST
Created attachment 4593 [details] work in progress This doesn't address partial relayout (the bidi status and context at the end of each line need to be cached for that to work) and currently treats <br> as whitespace.
mitz
Comment 5 2005-11-05 09:40:01 PST
Created attachment 4607 [details] Proposed patch
mitz
Comment 6 2005-11-05 10:11:47 PST
Created attachment 4608 [details] Alternative patch
Darin Adler
Comment 7 2005-11-05 17:22:09 PST
Comment on attachment 4608 [details] Alternative patch I would suggest using SharedPtr for m_lineBreakContext and for the context field of BidiState. This patch looks greate otherwise, although I think the real proof is in the test cases, r=me.
Dave Hyatt
Comment 8 2005-11-06 13:51:48 PST
+inline bool operator!=(const BidiStatus& status1, const BidiStatus& status2) +{ + return status1.eor != status2.eor || status1.last != status2.last || status1.lastStrong != status2.lastStrong; +} + Couldn't this just be implemented in terms of the == operator? Also, this patch increases the size of all root lines by 4 bytes. Is there any way to avoid this?
Dave Hyatt
Comment 9 2005-11-06 13:53:48 PST
+ if (previousLineBrokeCleanly) { + // A deviation from the Unicode Bidi Algorithm in order to match + // Mac OS X text and WinIE: a hard line break resets bidi state. Did you verify this with both preformatted text newlines as well as brs?
mitz
Comment 10 2005-11-06 14:03:16 PST
(In reply to comment #8) > Couldn't this just be implemented in terms of the == operator? Not when partial relayout takes place: you pick up the context from the last clean line and during relayout of subsequent lines you need to push a new context onto the stack, then you reach the first clean line with a context that's different pointer-wise from the saved context for that line, but it's actually the same. The "Proposed patch" uses pointer comparison (that's the only difference between it and the alternative). It might save a little on the comparisons but cost more in doing unnecessary relayout of all the lines up to the end of the block. > Also, this patch increases the size of all root lines by 4 bytes. Is there any way to avoid this? I counted more than 4 bytes (the BidiContext* and the BidiStatus which is at least 3 bytes). As far as I can tell, all of this is necessary if you want to start relayout in the middle.
mitz
Comment 11 2005-11-06 14:13:08 PST
(In reply to comment #9) > Did you verify this with both preformatted text newlines as well as brs? Oops, WinIE does that with preformatted text newlines but the patch doesn't. I expected previousLineBrokeCleanly to be true in that case as well but it's not :-/
mitz
Comment 12 2005-11-06 14:35:51 PST
(In reply to comment #8) Oops again! In comment #10 I answered a totally different question from what I was asked! Yes, != can be implemented in terms of ==. I just followed the BidiIterator != example.
Dave Hyatt
Comment 13 2005-11-06 15:57:21 PST
Can you not infer the appropriate information based off the cached line break object? Does that not give you enough information to effectively recover the bidi state?
mitz
Comment 14 2005-11-06 16:14:33 PST
(In reply to comment #13) > Can you not infer the appropriate information based off the cached line break object? Does that not give > you enough information to effectively recover the bidi state? No. The stack of embedding levels at the end of the line depends on all the embedding changes prior to that point. Also, the last strong character may have appeared before the last object on the line, and the direction at eor may not be possible to determine based on that object alone.
mitz
Comment 15 2005-11-06 16:29:07 PST
Created attachment 4616 [details] Revised patch Changed to treat newlines in preformatted text like brs, redefined != using ==, and changed to use SharedPointer where Darin has suggested doing so.
Dave Hyatt
Comment 16 2005-11-06 17:22:36 PST
Hmmm, actually I'm a bit surprised I didn't set "previousLineBrokeCleanly" in the preformatted text case. That is probably worth investigating. The variable name certainly sounds like something you'd expect to be set in the preformatted text case as well.
Dave Hyatt
Comment 17 2005-11-08 15:17:06 PST
Comment on attachment 4616 [details] Revised patch r=me, i encourage you to test with all five white-space values and with br and \n (so 10 total combinations).
Eric Seidel (no email)
Comment 18 2005-11-27 13:28:12 PST
*** Bug 5840 has been marked as a duplicate of this bug. ***
Note You need to log in before you can comment on or make changes to this bug.