234380 – [LFC][IFC] Simple RTL content may need visual reordering

RESOLVED FIXED 234380

[LFC][IFC] Simple RTL content may need visual reordering

https://bugs.webkit.org/show_bug.cgi?id=234380

Summary [LFC][IFC] Simple RTL content may need visual reordering

alan

Reported 2021-12-15 19:36:41 PST

as it turns out there are certain unicode categories that could make the simple (8bit) content visually reordered.

Attachments
Patch (3.69 KB, patch) 2021-12-15 19:40 PST, alan	no flags	Details Formatted Diff Diff
Patch (2.88 KB, patch) 2021-12-16 07:56 PST, alan	no flags	Details Formatted Diff Diff
Patch (2.42 KB, patch) 2021-12-16 19:16 PST, alan	no flags	Details Formatted Diff Diff
Patch (2.41 KB, patch) 2021-12-17 06:17 PST, alan	no flags	Details Formatted Diff Diff
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

alan

Comment 1 2021-12-15 19:40:09 PST

Created attachment 447318 [details] Patch

Antti Koivisto

Comment 2 2021-12-16 06:20:45 PST

Comment on attachment 447318 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=447318&action=review > Source/WebCore/layout/formattingContexts/inline/InlineItemsBuilder.cpp:279 > + , paragraphContentBuilder.is8Bit() ? StringView(paragraphContentBuilder).upconvertedCharacters() : paragraphContentBuilder.characters16() I think you can just call upconvertedCharacters unconditionally. It handles all cases efficiently. I'd keep this call as a single and use variables if needed.

alan

Comment 3 2021-12-16 07:56:46 PST

Created attachment 447355 [details] Patch

EWS

Comment 4 2021-12-16 09:37:49 PST

Committed r287142 (245326@main): <https://commits.webkit.org/245326@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 447355 [details].

Radar WebKit Bug Importer

Comment 5 2021-12-16 09:38:17 PST

<rdar://problem/86580841>

Darin Adler

Comment 6 2021-12-16 13:24:49 PST

Comment on attachment 447355 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=447355&action=review > Source/WebCore/layout/formattingContexts/inline/InlineItemsBuilder.cpp:279 > + , StringView(paragraphContentBuilder).upconvertedCharacters() Is the lifetime OK here? I think it might not be. As soon as the ubidi_setPara function is done, the temporary buffer containing the unconverted characters will be deallocated. Also worried about the performance implications; the upconvertedCharacters code path is not necessarily the most efficient way to get 16-bit characters for a purpose like this.

alan

Comment 7 2021-12-16 13:37:40 PST

(In reply to Darin Adler from comment #6) > Comment on attachment 447355 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=447355&action=review > > > Source/WebCore/layout/formattingContexts/inline/InlineItemsBuilder.cpp:279 > > + , StringView(paragraphContentBuilder).upconvertedCharacters() > > Is the lifetime OK here? I think it might not be. As soon as the > ubidi_setPara function is done, the temporary buffer containing the > unconverted characters will be deallocated. You are right. According to the documentation "...a pointer to the text that the Bidi algorithm will be performed on. This pointer is stored in the UBiDi object and can be retrieved with ubidi_getText()." Will fix. Thanks! > > Also worried about the performance implications; the upconvertedCharacters > code path is not necessarily the most efficient way to get 16-bit characters > for a purpose like this. Do you have any suggestion? In most cases we already have a 16bit content here, only a very simple, 8bit content with RTL direction needs upconverting.

Darin Adler

Comment 8 2021-12-16 13:42:46 PST

(In reply to zalan from comment #7) > Will fix. Thanks! Can fix it by putting the upconvertedCharacters result into a local variable, but it needs to be of type UpconvertedCharacters (likely done with "auto"), not a const UChar*. > > Also worried about the performance implications; the upconvertedCharacters > > code path is not necessarily the most efficient way to get 16-bit characters > > for a purpose like this. > > Do you have any suggestion? In most cases we already have a 16bit content > here, only a very simple, 8bit content with RTL direction needs upconverting. Oh, I didn’t realize that. If we’ll skip this code path in the common case and not do it for all cases of 8-bit characters, then I withdraw my objection. One example of how this is not necessarily best optimized for heavier uses is that UpconvertedCharacters puts the characters on the stack rather than doing a malloc/free pair if there are <= 32 characters. Explicitly-written code could set a different threshold, which would be valuable if longer strings are common, and the malloc/free cost is a significant part of the total cost.

alan

Comment 9 2021-12-16 13:54:01 PST

(In reply to Darin Adler from comment #8) > (In reply to zalan from comment #7) > > Will fix. Thanks! > > Can fix it by putting the upconvertedCharacters result into a local > variable, but it needs to be of type UpconvertedCharacters (likely done with > "auto"), not a const UChar*. Thanks. > > > > Also worried about the performance implications; the upconvertedCharacters > > > code path is not necessarily the most efficient way to get 16-bit characters > > > for a purpose like this. > > > > Do you have any suggestion? In most cases we already have a 16bit content > > here, only a very simple, 8bit content with RTL direction needs upconverting. > > Oh, I didn’t realize that. If we’ll skip this code path in the common case > and not do it for all cases of 8-bit characters, then I withdraw my > objection. > > One example of how this is not necessarily best optimized for heavier uses > is that UpconvertedCharacters puts the characters on the stack rather than > doing a malloc/free pair if there are <= 32 characters. Explicitly-written > code could set a different threshold, which would be valuable if longer > strings are common, and the malloc/free cost is a significant part of the > total cost. Makes sense. For the common codepath this would certainly need an explicitly-written code (luckily this is not the case here -also, on ToT we always land with a 16bit content here (RTL is not yet enabled)).

alan

Comment 10 2021-12-16 19:16:03 PST

Reopening to attach new patch.

alan

Comment 11 2021-12-16 19:16:05 PST

Created attachment 447411 [details] Patch

Darin Adler

Comment 12 2021-12-16 21:37:13 PST

Comment on attachment 447411 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=447411&action=review > Source/WebCore/layout/formattingContexts/inline/InlineItemsBuilder.cpp:281 > + , bidiContent.get() Should work without the explicit call to get()

alan

Comment 13 2021-12-17 06:17:58 PST

Created attachment 447452 [details] Patch

EWS

Comment 14 2021-12-17 07:19:14 PST

Committed r287184 (245351@main): <https://commits.webkit.org/245351@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 447452 [details].

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component Layout and Rendering

Assignee

alan

Reported

2021-12-15 19:36 PST

Modified

2021-12-17 07:19 PST History

CC List

6 users Show

URL

Keywords InRadar

Depends on

Blocks