At one point, ICU's implementation of the Unicode Bidirectional Algorithm (UBA) was too slow for us to use. Since then, Firefox has switched to ICU: https://bugzilla.mozilla.org/show_bug.cgi?id=1308359
We should re-investigate the performance of ICU's UBA.
<rdar://problem/35229655>
Created attachment 334570 [details] WIP
Created attachment 334584 [details] WIP
Created attachment 334682 [details] WIP
Created attachment 334683 [details] WIP
Created attachment 334728 [details] WIP
Created attachment 334800 [details] WIP
Created attachment 334870 [details] WIP
Created attachment 334875 [details] WIP
Comment on attachment 334875 [details]
WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=334875&action=review

> Source/WebCore/rendering/RenderBlockLineLayout.cpp:1732
> + InlineIterator startPosition = InlineIterator(this, this, 0);

We need to figure out how to incorporate the direction property of the block.
Please perf-test this.
Created attachment 334964 [details] WIP
Created attachment 334965 [details] WIP
Created attachment 334966 [details] WIP
Created attachment 334968 [details] WIP
We also have to make sure that if ICU sees multiple paragraphs in the same call to ubidi_setPara(), that each paragraph gets an appropriate base direction. IIRC ICU will look for the first strong character in each paragraph, which isn't correct for us.
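To illustrate the concern above, here is a minimal sketch in plain JavaScript. It does not call ICU. It only contrasts a per-paragraph "first strong character" guess with a base direction taken from the block. The function names, the newline-only paragraph split, and the tiny strong-character ranges (basic Latin letters, Hebrew letters, Arabic letters) are simplifying assumptions for illustration, not the real bidi class tables.

```javascript
// Simplified stand-in for "first strong character" detection.
// Assumption: only A-Z/a-z count as strong LTR, and only Hebrew letters
// (U+05D0..U+05EA) and Arabic letters (U+0620..U+064A) count as strong RTL.
function firstStrongDirection(text) {
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if ((cp >= 0x05d0 && cp <= 0x05ea) || (cp >= 0x0620 && cp <= 0x064a)) {
      return "rtl";
    }
    if ((cp >= 0x41 && cp <= 0x5a) || (cp >= 0x61 && cp <= 0x7a)) {
      return "ltr";
    }
  }
  return null; // no strong character found
}

// Hypothetical helper: for each paragraph, report what a first-strong
// heuristic would pick versus the direction inherited from the block.
function paragraphDirections(text, blockDirection) {
  return text.split("\n").map((paragraph) => ({
    paragraph,
    firstStrong: firstStrongDirection(paragraph),
    fromBlock: blockDirection,
  }));
}

const report = paragraphDirections("abc \u05D0\n\u05D0 abc", "ltr");
for (const entry of report) {
  console.log(entry.firstStrong, entry.fromBlock);
}
```

In this sketch the second paragraph would be guessed as RTL by the first-strong heuristic even though the block says LTR, which is the mismatch the comment above is worried about.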
Created attachment 334970 [details] WIP
Created attachment 334971 [details] WIP
Created attachment 334974 [details] WIP
Created attachment 334986 [details] WIP
I used the following script to create a test case:

var target = document.getElementById("target");
var characters = ["\u05D0", "A", " ", "\u202A", "\u200E", "\u202D", "\u202B", "\u200F", "\u202E", "\u202C"];
var content = "";
for (let i = 0; i < 10000; ++i) {
    content = content + characters[Math.floor(Math.random() * characters.length)];
}
target.textContent = content;

Then, I copied/pasted the generated content into a raw .html file a bunch of times (so the test is consistent between runs).

Then, I timed the runtime of layoutRunsAndFloats() using mach_absolute_time(). I opened the file 20 times using DRT, and averaged the runtimes across every call to layoutRunsAndFloats(). I then did the same thing with this patch applied. I then repeated that entire sequence once more, just to make sure.

This patch shows a 29% progression.
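An alternative to pasting the generated output into a file is to make the generator itself repeatable. The sketch below is not the script used for the numbers in this bug. It swaps Math.random() for a small seeded PRNG (mulberry32), and the function names and seed value are hypothetical. The character set is the one from the script above.

```javascript
// Seeded PRNG (mulberry32), swapped in here so the same seed always
// yields the same content. Returns a function producing values in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Same character set as the original script: Hebrew alef, "A", space,
// and the bidi controls LRE, LRM, LRO, RLE, RLM, RLO, PDF.
const characters = ["\u05D0", "A", " ", "\u202A", "\u200E", "\u202D",
                    "\u202B", "\u200F", "\u202E", "\u202C"];

// Hypothetical helper: build `length` random characters from the set.
function generateBidiContent(length, seed) {
  const random = mulberry32(seed);
  let content = "";
  for (let i = 0; i < length; ++i) {
    content += characters[Math.floor(random() * characters.length)];
  }
  return content;
}

const content = generateBidiContent(10000, 1234);
console.log(content.length);
```

In a page, the result could be assigned to target.textContent exactly as in the original script.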
If you exclude the start-up calls to layoutRunsAndFloats() (these calls do a bunch of extra work, like width measurements and font lookups, which is cached for successive calls), this patch shows a 33% progression.
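For reference, the arithmetic behind numbers like these can be sketched as follows. This assumes "progression" means the percentage reduction in mean runtime relative to the unpatched build, which is an assumption about the definition, not something stated in this bug. The helper names and the sample timings are made up for illustration.

```javascript
// Arithmetic mean of a non-empty list of timings.
function mean(samples) {
  if (samples.length === 0) throw new Error("no samples");
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

// Assumed definition: percentage reduction in mean runtime.
// Positive means the patched build is faster (a progression).
function progressionPercent(baseline, patched) {
  const before = mean(baseline);
  return (100 * (before - mean(patched))) / before;
}

// Drop the first `warmupCount` calls, which include one-time work
// such as width measurements and font lookups.
function dropWarmup(samples, warmupCount) {
  return samples.slice(warmupCount);
}

// Made-up timings (arbitrary units); the first entry is a start-up call.
const baseline = [30, 10, 10, 10];
const patched = [25, 7, 7, 7];

console.log(progressionPercent(baseline, patched).toFixed(1));
console.log(progressionPercent(dropWarmup(baseline, 1), dropWarmup(patched, 1)).toFixed(1));
```

Because the start-up cost is roughly the same with and without the patch, including it dilutes the measured difference, which is consistent with the larger number reported once those calls are excluded.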
Oh, the test I ran uses a single element with lots of text inside it. We should also test the case of lots of inline elements, each with only a few characters inside it.
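A test case for the many-inline-elements scenario could be generated along these lines. This is only a sketch: the function name, the use of span elements, the element count, and the deterministic character cycling (instead of random selection) are all assumptions chosen so the output is identical on every run.

```javascript
// Same character set as the original generator script.
const bidiCharacters = ["\u05D0", "A", " ", "\u202A", "\u200E", "\u202D",
                        "\u202B", "\u200F", "\u202E", "\u202C"];

// Hypothetical helper: emit `elementCount` inline elements, each holding
// `charsPerElement` characters picked by a fixed, repeatable pattern.
function generateManyInlineElements(elementCount, charsPerElement) {
  const parts = [];
  for (let i = 0; i < elementCount; ++i) {
    let text = "";
    for (let j = 0; j < charsPerElement; ++j) {
      // Deterministic pick; 7 and 3 are arbitrary multipliers to mix the order.
      text += bidiCharacters[(i * 7 + j * 3) % bidiCharacters.length];
    }
    parts.push("<span>" + text + "</span>");
  }
  return parts.join("");
}

const markup = generateManyInlineElements(2500, 4);
console.log(markup.length);
```

The resulting string can be pasted into the body of a static .html file, the same way the single-element test was prepared.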
We should also run performance tests on content that is all one direction (like regular English or Hebrew text with spaces and punctuation).
We should also run some performance tests on Hebrew Wikipedia.
Created attachment 335508 [details] WIP
More microbenchmark performance numbers about this patch:

One long element with tons of bidi control characters: ~40% progression
Many elements, each having only a few characters (but overall many bidi control characters): ~20% progression
Many elements, each of which holds either whitespace or strong LTR characters: about the same (within the noise)
Created attachment 335521 [details] WIP
Created attachment 335524 [details] WIP
Alright, I've improved this latest patch as much as I think I can. The final microbenchmark results are:

Test 1: One long element, tons of bidi control characters: 23% progression
Test 2: Many elements, tons of bidi control characters: About the same (within the noise)
Test 3: Many elements, no control characters, all LTR or whitespace: About the same (within the noise)
Test 4: Many elements, almost all LTR or whitespace, but with (on average) one RTL character per line: 16% regression

I'll attach the .html files for the tests.
Created attachment 335525 [details] Test 1
Created attachment 335527 [details] Test 2
Created attachment 335528 [details] Test 3
Created attachment 335529 [details] Test 4
The results of the standard Page Load Test are within the noise.
Please also test typing and editing Hebrew and Arabic text that contains Arabic numerals as used in Latin script (1, 2, 3, etc.) on a slow phone, to make sure we're not regressing the editing experience.
*** Bug 149170 has been marked as a duplicate of this bug. ***