Bug 178960 - Consider using ICU's bidi algorithm instead of our custom one
Summary: Consider using ICU's bidi algorithm instead of our custom one
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: Text (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Myles C. Maxfield
URL:
Keywords: InRadar
: 149170 (view as bug list)
Depends on:
Blocks:
 
Reported: 2017-10-27 15:06 PDT by Myles C. Maxfield
Modified: 2021-04-07 20:20 PDT (History)
10 users (show)

See Also:


Attachments
WIP (36.18 KB, patch)
2018-02-24 19:12 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (60.32 KB, patch)
2018-02-25 13:56 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (75.17 KB, patch)
2018-02-27 01:46 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (75.29 KB, patch)
2018-02-27 01:52 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (76.40 KB, patch)
2018-02-27 22:29 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (81.90 KB, patch)
2018-03-01 01:16 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (81.89 KB, patch)
2018-03-01 16:47 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (82.36 KB, patch)
2018-03-01 18:02 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (102.17 KB, patch)
2018-03-03 14:30 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (102.02 KB, patch)
2018-03-03 14:32 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (101.86 KB, patch)
2018-03-03 15:02 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (101.91 KB, patch)
2018-03-03 15:16 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (103.19 KB, patch)
2018-03-03 17:16 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (103.13 KB, patch)
2018-03-03 17:21 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (102.96 KB, patch)
2018-03-03 19:35 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (104.36 KB, patch)
2018-03-04 19:06 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (113.89 KB, patch)
2018-03-10 02:15 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (115.17 KB, patch)
2018-03-10 16:05 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
WIP (115.17 KB, patch)
2018-03-10 18:11 PST, Myles C. Maxfield
no flags Details | Formatted Diff | Diff
Test 1 (2.37 MB, text/html)
2018-03-10 18:17 PST, Myles C. Maxfield
no flags Details
Test 2 (3.58 MB, text/html)
2018-03-10 18:18 PST, Myles C. Maxfield
no flags Details
Test 3 (2.17 MB, text/html)
2018-03-10 18:18 PST, Myles C. Maxfield
no flags Details
Test 4 (2.19 MB, text/html)
2018-03-10 18:18 PST, Myles C. Maxfield
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Myles C. Maxfield 2017-10-27 15:06:14 PDT
At one point in time, ICU's UBA was too slow for us to use.

Since then, Firefox has switched to use ICU. https://bugzilla.mozilla.org/show_bug.cgi?id=1308359

We should re-investigate the perf of ICU's UBA.
Comment 1 Radar WebKit Bug Importer 2017-10-27 15:07:05 PDT
<rdar://problem/35229655>
Comment 2 Myles C. Maxfield 2018-02-24 19:12:02 PST
Created attachment 334570 [details]
WIP
Comment 3 Myles C. Maxfield 2018-02-25 13:56:18 PST
Created attachment 334584 [details]
WIP
Comment 4 Myles C. Maxfield 2018-02-27 01:46:45 PST
Created attachment 334682 [details]
WIP
Comment 5 Myles C. Maxfield 2018-02-27 01:52:43 PST
Created attachment 334683 [details]
WIP
Comment 6 Myles C. Maxfield 2018-02-27 22:29:44 PST
Created attachment 334728 [details]
WIP
Comment 7 Myles C. Maxfield 2018-03-01 01:16:23 PST
Created attachment 334800 [details]
WIP
Comment 8 Myles C. Maxfield 2018-03-01 16:47:44 PST
Created attachment 334870 [details]
WIP
Comment 9 Myles C. Maxfield 2018-03-01 18:02:54 PST
Created attachment 334875 [details]
WIP
Comment 10 Myles C. Maxfield 2018-03-01 18:55:45 PST
Comment on attachment 334875 [details]
WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=334875&action=review

> Source/WebCore/rendering/RenderBlockLineLayout.cpp:1732
> +        InlineIterator startPosition = InlineIterator(this, this, 0);

We need to figure out how to incorporate the direction property of the block.
Comment 11 Ryosuke Niwa 2018-03-01 20:06:15 PST
Please perf-test this.
Comment 12 Myles C. Maxfield 2018-03-03 14:30:22 PST
Created attachment 334964 [details]
WIP
Comment 13 Myles C. Maxfield 2018-03-03 14:32:30 PST
Created attachment 334965 [details]
WIP
Comment 14 Myles C. Maxfield 2018-03-03 15:02:41 PST
Created attachment 334966 [details]
WIP
Comment 15 Myles C. Maxfield 2018-03-03 15:16:43 PST
Created attachment 334968 [details]
WIP
Comment 16 Myles C. Maxfield 2018-03-03 15:47:36 PST
We also have to make sure that if ICU sees multiple paragraphs in the same call to ubidi_setPara(), that each paragraph gets an appropriate base direction. IIRC ICU will look for the first strong character in each paragraph, which isn't correct for us.
Comment 17 Myles C. Maxfield 2018-03-03 17:16:25 PST
Created attachment 334970 [details]
WIP
Comment 18 Myles C. Maxfield 2018-03-03 17:21:13 PST
Created attachment 334971 [details]
WIP
Comment 19 Myles C. Maxfield 2018-03-03 19:35:36 PST
Created attachment 334974 [details]
WIP
Comment 20 Myles C. Maxfield 2018-03-04 19:06:02 PST
Created attachment 334986 [details]
WIP
Comment 21 Myles C. Maxfield 2018-03-04 23:47:23 PST
I used the following script to create a test case:

var target = document.getElementById("target");
var characters = ["\u05D0", "A", " ", "\u202A", "\u200E", "\u202D", "\u202B", "\u200F", "\u202E", "\u202C"];
var content = "";
for (let i = 0; i < 10000; ++i) {
    content = content + characters[Math.floor(Math.random() * characters.length)];
}
target.textContent = content;

Then, I copied/pasted the generated content into a raw .html file a bunch of times (so the test is consistent between runs).

Then, I timed the runtime of layoutRunsAndFloats() using mach_absolute_time(). I opened the file 20 times using DRT, and averaged the runtimes of each time layoutRunsAndFloats() was called. I then did the same thing with this patch applied.

I then did that entire sequence again once more, just to make sure.

This patch reports to be a 29% progression.
Comment 22 Myles C. Maxfield 2018-03-04 23:54:22 PST
If you remove the start-up calls of layoutRunsAndFloats() (because these start-up calls will do a bunch of extra work like width measurements and font lookups, which will be cached on successive calls), this patch reports to be a 33% progression.
Comment 23 Myles C. Maxfield 2018-03-05 00:09:56 PST
Oh, the test I ran uses a single element with lots of text inside it. We should also test the case of lots of inline elements, each with only a few characters inside it.
Comment 24 Myles C. Maxfield 2018-03-05 09:14:10 PST
We should also run performance tests on content that is all one direction (like regular English or Hebrew text with spaces and punctuation).
Comment 25 Myles C. Maxfield 2018-03-05 22:33:50 PST
We should also run some performance tests on Hebrew Wikipedia
Comment 26 Myles C. Maxfield 2018-03-10 02:15:06 PST
Created attachment 335508 [details]
WIP
Comment 27 Myles C. Maxfield 2018-03-10 02:18:30 PST
More microbenchmark performance numbers about this patch:

One long element with tons of bidi control characters: ~40% progression
Many elements, each having only a few characters (but overall many bidi control characters): ~20% progression
Many elements, each of which holds either whitespace or strong LTR characters: about the same (within the noise)
Comment 28 Myles C. Maxfield 2018-03-10 16:05:01 PST
Created attachment 335521 [details]
WIP
Comment 29 Myles C. Maxfield 2018-03-10 18:11:09 PST
Created attachment 335524 [details]
WIP
Comment 30 Myles C. Maxfield 2018-03-10 18:16:55 PST
Alright, I've improved this latest patch as much as I think I can. The final microbenchmark results are:

Test 1: One long element, tons of bidi control characters: 23% progression
Test 2: Many elements, tons of bidi control characters: About the same (within the noise)
Test 3: Many elements, no control characters, all LTR or whitespace: About the same (within the noise)
Test 4: Many elements, almost all LTR or whitespace, but with (on average) one RTL character per line: 16% regression

I'll attach the .html files for the tests.
Comment 31 Myles C. Maxfield 2018-03-10 18:17:52 PST
Created attachment 335525 [details]
Test 1
Comment 32 Myles C. Maxfield 2018-03-10 18:18:04 PST
Created attachment 335527 [details]
Test 2
Comment 33 Myles C. Maxfield 2018-03-10 18:18:20 PST
Created attachment 335528 [details]
Test 3
Comment 34 Myles C. Maxfield 2018-03-10 18:18:34 PST
Created attachment 335529 [details]
Test 4
Comment 35 Myles C. Maxfield 2018-03-11 00:38:10 PST
The results of the standard Page Load Test are within the noise.
Comment 36 Ryosuke Niwa 2018-03-12 11:41:05 PDT
Please also test typing & editing Hebrew & Arabic text with Arabic numerals in Latin script (1, 2, 3, etc..) on a slow phone to make sure we're not regressing the editing experience.
Comment 37 Myles C. Maxfield 2018-08-27 11:47:04 PDT
*** Bug 149170 has been marked as a duplicate of this bug. ***