Bug 14626

Summary: Make bidiReorderCharacters independent of RenderBlock
Product: WebKit Reporter: mitz
Component: TextAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: aroben, hyatt, sam
Priority: P2 Keywords: InRadar
Version: 523.x (Safari 3)   
Hardware: All   
OS: OS X 10.4   
Bug Depends on:    
Bug Blocks: 14684    
Attachments:
Description Flags
[WIP] Move the bidi algorithm implementation into a static function
none
[WIP] Arbitrary checkpoint
none
[WIP] Take bidiReorderCharacters out of RenderBlock
none
[WIP] Take bidiReorderCharacters out of RenderBlock
none
bidiReorderLine perf test
none
Take bidiReorderCharacters out of RenderBlock
none
Take bidiReorderCharacters out of RenderBlock
darin: review-
Take bidiReorderCharacters out of RenderBlock
none
Take bidiReorderCharacters out of RenderBlock darin: review+

Description mitz 2007-07-15 16:27:43 PDT
Refactor bidiReorderLine so that applying the bidi algorithm to a string would not require allocating temporary renderers and RenderStyle, and therefore also would not require a document and a view. This should make bidiReorderCharacters more suitable for use in WebCoreTextRenderer.
Comment 1 mitz 2007-07-15 16:37:27 PDT
Created attachment 15526 [details]
[WIP] Move the bidi algorithm implementation into a static function

This patch turns the RenderBlock member function bidiReorderLine() into a thin wrapper around a new static function, createBidiRunsForLine() whose arguments are start and end bidi iterators and a BidiState. Thus the static function doesn't deal with renderers and render style. The patch also makes the bidi context part of the bidi state.

The next step should be to allow the bidi iterators to be interchanged with iterators that do not have an underlying render tree. I think that might prove tricky to do without hurting performance.
Comment 2 mitz 2007-07-16 15:32:12 PDT
Created attachment 15539 [details]
[WIP] Arbitrary checkpoint

BidiState is now a specialization of BidiResolver.

Aside from the pressing need to reorganize the different parts into different files, the issue I am currently facing is how to manage the list of bidi runs generated by the resolver. At this time, the sFirstBidiRun, sLastBidiRun etc. statics are still (ab)used for both BidiRuns and BidiRunBases (a working title), although I began to change it with the addition of StringBidiResolver::firstRun().

The author's lack of understanding of C++ is showing.
Comment 3 mitz 2007-07-17 10:15:17 PDT
Created attachment 15545 [details]
[WIP] Take bidiReorderCharacters out of RenderBlock
Comment 4 mitz 2007-07-17 11:06:54 PDT
Created attachment 15546 [details]
[WIP] Take bidiReorderCharacters out of RenderBlock

Fixed a test crash.
Comment 5 mitz 2007-07-17 12:59:08 PDT
Created attachment 15549 [details]
bidiReorderLine perf test

The latest version of the patch makes this test take ~0.5% longer.
Comment 6 mitz 2007-07-17 15:07:48 PDT
Inlining BidiState::addRun() helps a little. I couldn't make sense of a Shark comparison between patched and unpatched.

Surprisingly, there is a test case in which the patch improves performance, involving a single line with a huge number of runs.
Comment 7 mitz 2007-07-17 15:25:53 PDT
...and removing this leftover bit

+    virtual ~BidiResolver() { }

results in performance (in the attached test case) comparable to the unpatched version. (I hope there is a legitimate explanation for this).
Comment 8 mitz 2007-07-18 10:14:49 PDT
Created attachment 15562 [details]
Take bidiReorderCharacters out of RenderBlock

This is more or less what I would like to do as a first (and biggest) step. After this separation of the render-tree case and the character buffer case is done, I think you could add bidi-aware string drawing functionality to the graphics layer (and expose it as WebKit SPI as necessary), and -- quite independently -- continue cleaning up bidi.cpp (getting rid of the remaining file statics).

No layout test regressions.

In terms of performance, I used both attachment 15549 [details] and a PLT suite consisting of all fast/text tests save one, and I have reached a point where the numbers are better than TOT, and any further tweaking that I do does not change the numbers significantly.
Comment 9 Sam Weinig 2007-07-18 10:47:30 PDT
Preliminary pass of the code.

in BidiContext.h:

Please don't do this in the header.
+using namespace WTF::Unicode;

Since this is ref-counted, can this inherit from Shared instead of doing its own?
+class BidiContext {

in BidiContext.cpp:

It seems arbitrary to put this in the .cpp file when so much implementation is BidiResolver.h

in BidiResolver.h:

Why not put each of the classes in this file into it's own header?

Please don't do this in the header.
+using namespace WTF::Unicode;

Put a space after the if.
+            if(dir == LeftToRight || dir == ArabicNumber || dir == EuropeanNumber)

Put spaces around the %'s in few places including
+        if (runDir == RightToLeft) {
+            if (level%2) // we have an odd level
Comment 10 mitz 2007-07-18 12:39:24 PDT
Created attachment 15566 [details]
Take bidiReorderCharacters out of RenderBlock

(In reply to comment #9)
> Please don't do this in the header.
> +using namespace WTF::Unicode;

OK.

> Since this is ref-counted, can this inherit from Shared instead of doing its
> own?
> +class BidiContext {

Thanks, now it does.

> in BidiContext.cpp:
> 
> It seems arbitrary to put this in the .cpp file when so much implementation is
> BidiResolver.h

Moved to BidiContext.h.

> in BidiResolver.h:
> 
> Why not put each of the classes in this file into it's own header?

Because they are small and those headers will not be included directly by any file besides BidiResolver.h.

> Please don't do this in the header.
> +using namespace WTF::Unicode;

OK.

> Put a space after the if.
> +            if(dir == LeftToRight || dir == ArabicNumber || dir ==
> EuropeanNumber)

Done for all "if(", but otherwise style cleanup of existing code is outside the scope of this patch.

> Put spaces around the %'s in few places including
> +        if (runDir == RightToLeft) {
> +            if (level%2) // we have an odd level
> 

Ditto.

Renamed BidiRunBase to BidiCharacterRun. Note that in this patch I am not renaming existing bidi.cpp types such as BidiIterator, BidiState and BidiRun, although eventually I would like to (at which time BidiStatus could be renamed BidiState or BidiResolverState).
Comment 11 mitz 2007-07-18 12:40:30 PDT
Comment on attachment 15566 [details]
Take bidiReorderCharacters out of RenderBlock

Note that the ChangeLog is out of date.
Comment 12 Darin Adler 2007-07-18 19:46:59 PDT
Comment on attachment 15566 [details]
Take bidiReorderCharacters out of RenderBlock

+static bool operator==(const BidiContext& c1, const BidiContext& c2)

This should have external linkage. If this function isn't going to be inlined, then it should be in a .cpp file instead of a .h file. Or if it should be inlined, mark it inline and leave it in the header. But remove "static" no matter which way you go.

+    unsigned offset() { return m_offset; }

Mark this const?

+#ifndef BidiResolver_h
+#define BidiResolver_h
+
+#include "BidiContext.h"
+#include <wtf/Assertions.h>
+#include <wtf/PassRefPtr.h>
+#include <wtf/RefPtr.h>
+#include <wtf/unicode/Unicode.h>

There are some rendundant includes here. No need to include something that's already included by BidiContext.h.

+class BidiIterator;
+class BidiRun;
+template <class Iterator, class Run> class BidiResolver;
+typedef BidiResolver<BidiIterator, BidiRun> BidiState;
 class Position;
+class RootInlineBox;

I'd suggest putting the template declaration and the typedef in a separate section, after the plain old class declarations. I've done the same for struct declarations in the past.

-    void bidiReorderLine(const BidiIterator& start, const BidiIterator& end, BidiState& bidi);
+    void bidiReorderLine(const BidiIterator& start, const BidiIterator& end, BidiState&);

You removed the name "bidi" from this, but not from the next declaration.

+struct BidiRun : public BidiCharacterRun {

For struct, inheritance is public by default. So I suggest omitting "public" when a struct inherits.

I suppose we should be renaming "bidi.h" to "BidiRun.h" in a future pass given that's the only declaration in there now. Not sure what to do about "bidi.cpp".

I'm not sure this is a convenient time for this huge improvement, but the possibility of fixing some bidi bugs on Windows seems one great reason to do it. Setting review- because I know you well enough to know you'll want to take another pass on this and write a new ChangeLog.

(For selfish Apple reasons it would be good to get this checked in by Friday morning. After that time we may have to branch if there are any major changes on the trunk.)
Comment 13 mitz 2007-07-19 03:45:47 PDT
Created attachment 15578 [details]
Take bidiReorderCharacters out of RenderBlock

(In reply to comment #12)
> +static bool operator==(const BidiContext& c1, const BidiContext& c2)
> 
> This should have external linkage. If this function isn't going to be inlined,
> then it should be in a .cpp file instead of a .h file. Or if it should be
> inlined, mark it inline and leave it in the header. But remove "static" no
> matter which way you go.

Moved it back to BidiContext.cpp. (Inlining it had a negative impact on my performance numbers, perhaps because doing so negates the inlining of the BidiStatus == operator. There may be room for further optimization here in the future).

I reverted the change to make BidiContext inherit from Shared<BidiContext> as it resulted in a significant performance regression. Again, there may be a well-performing way to do it in the future, but for now I would rather just leave existing code as it was.

> +    unsigned offset() { return m_offset; }
> 
> Mark this const?

Done.

> +#ifndef BidiResolver_h
> +#define BidiResolver_h
> +
> +#include "BidiContext.h"
> +#include <wtf/Assertions.h>
> +#include <wtf/PassRefPtr.h>
> +#include <wtf/RefPtr.h>
> +#include <wtf/unicode/Unicode.h>
> 
> There are some rendundant includes here. No need to include something that's
> already included by BidiContext.h.

Removed 2 of the 5 #includes.

> +class BidiIterator;
> +class BidiRun;
> +template <class Iterator, class Run> class BidiResolver;
> +typedef BidiResolver<BidiIterator, BidiRun> BidiState;
>  class Position;
> +class RootInlineBox;
> 
> I'd suggest putting the template declaration and the typedef in a separate
> section, after the plain old class declarations. I've done the same for struct
> declarations in the past.

Done.

> 
> -    void bidiReorderLine(const BidiIterator& start, const BidiIterator& end,
> BidiState& bidi);
> +    void bidiReorderLine(const BidiIterator& start, const BidiIterator& end,
> BidiState&);
> 
> You removed the name "bidi" from this, but not from the next declaration.

Fixed.

> +struct BidiRun : public BidiCharacterRun {
> 
> For struct, inheritance is public by default. So I suggest omitting "public"
> when a struct inherits.

Done.

> I suppose we should be renaming "bidi.h" to "BidiRun.h" in a future pass given
> that's the only declaration in there now. Not sure what to do about "bidi.cpp".

I think the RenderBlock methods currently in bidi.cpp should end up in RenderBlock.cpp. Then bidi.cpp and bidi.h will be about what's now known as BidiState (which should be renamed; I like the name RenderBidiResolver but I think names beginning with "Render" are reserved for classes that inherit from RenderObject, so maybe BlockBidiResolver).

> I'm not sure this is a convenient time for this huge improvement, but the
> possibility of fixing some bidi bugs on Windows seems one great reason to do
> it.

I should mention that there are two other ways to address (if not completely resolve) the bidi bugs on Windows that are possibly less risky:

1) Since ICU is available on Windows, use its implementation of the bidi algorithm. Alternatively, use some other implementation available on the platform or bring in some other open-source implementation.

2) Do something similar to what -[NSString _web_drawAtPoint:font:textColor:] does, namely implement canUseFastRenderer() and as the "slow renderer" still use WebCore, but temporarily force it onto the complex string drawing path. While you are not really supposed to use the WebCore renderer for mixed-directionality text, in all but some contrived cases (which possibly cannot ever occur in the intended use), the "complex" code path will do the right thing.
Comment 14 mitz 2007-07-20 02:44:54 PDT
Created attachment 15597 [details]
Take bidiReorderCharacters out of RenderBlock

The only difference from the last patch is that I patched PopupMenuWin.cpp, which was using also RenderBlock::bidiReorderCharacters.
Comment 15 Darin Adler 2007-07-20 10:09:33 PDT
Comment on attachment 15597 [details]
Take bidiReorderCharacters out of RenderBlock

r=me
Comment 16 Matt Lilek 2007-07-20 12:13:53 PDT
Landed in r24485.
Comment 17 Mark Rowe (bdash) 2007-08-02 08:41:54 PDT
<rdar://problem/5380587>
Comment 18 mitz 2007-08-18 07:20:35 PDT
*** Bug 12536 has been marked as a duplicate of this bug. ***