RESOLVED FIXED 234380
[LFC][IFC] Simple RTL content may need visual reordering
https://bugs.webkit.org/show_bug.cgi?id=234380
Summary [LFC][IFC] Simple RTL content may need visual reordering
zalan
Reported 2021-12-15 19:36:41 PST
as it turns out there are certain unicode categories that could make the simple (8bit) content visually reordered.
Attachments
Patch (3.69 KB, patch)
2021-12-15 19:40 PST, zalan
no flags
Patch (2.88 KB, patch)
2021-12-16 07:56 PST, zalan
no flags
Patch (2.42 KB, patch)
2021-12-16 19:16 PST, zalan
no flags
Patch (2.41 KB, patch)
2021-12-17 06:17 PST, zalan
no flags
zalan
Comment 1 2021-12-15 19:40:09 PST
Antti Koivisto
Comment 2 2021-12-16 06:20:45 PST
Comment on attachment 447318 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=447318&action=review > Source/WebCore/layout/formattingContexts/inline/InlineItemsBuilder.cpp:279 > + , paragraphContentBuilder.is8Bit() ? StringView(paragraphContentBuilder).upconvertedCharacters() : paragraphContentBuilder.characters16() I think you can just call upconvertedCharacters unconditionally. It handles all cases efficiently. I'd keep this call as a single and use variables if needed.
zalan
Comment 3 2021-12-16 07:56:46 PST
EWS
Comment 4 2021-12-16 09:37:49 PST
Committed r287142 (245326@main): <https://commits.webkit.org/245326@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 447355 [details].
Radar WebKit Bug Importer
Comment 5 2021-12-16 09:38:17 PST
Darin Adler
Comment 6 2021-12-16 13:24:49 PST
Comment on attachment 447355 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=447355&action=review > Source/WebCore/layout/formattingContexts/inline/InlineItemsBuilder.cpp:279 > + , StringView(paragraphContentBuilder).upconvertedCharacters() Is the lifetime OK here? I think it might not be. As soon as the ubidi_setPara function is done, the temporary buffer containing the unconverted characters will be deallocated. Also worried about the performance implications; the upconvertedCharacters code path is not necessarily the most efficient way to get 16-bit characters for a purpose like this.
zalan
Comment 7 2021-12-16 13:37:40 PST
(In reply to Darin Adler from comment #6) > Comment on attachment 447355 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=447355&action=review > > > Source/WebCore/layout/formattingContexts/inline/InlineItemsBuilder.cpp:279 > > + , StringView(paragraphContentBuilder).upconvertedCharacters() > > Is the lifetime OK here? I think it might not be. As soon as the > ubidi_setPara function is done, the temporary buffer containing the > unconverted characters will be deallocated. You are right. According to the documentation "...a pointer to the text that the Bidi algorithm will be performed on. This pointer is stored in the UBiDi object and can be retrieved with ubidi_getText()." Will fix. Thanks! > > Also worried about the performance implications; the upconvertedCharacters > code path is not necessarily the most efficient way to get 16-bit characters > for a purpose like this. Do you have any suggestion? In most cases we already have a 16bit content here, only a very simple, 8bit content with RTL direction needs upconverting.
Darin Adler
Comment 8 2021-12-16 13:42:46 PST
(In reply to zalan from comment #7) > Will fix. Thanks! Can fix it by putting the upconvertedCharacters result into a local variable, but it needs to be of type UpconvertedCharacters (likely done with "auto"), not a const UChar*. > > Also worried about the performance implications; the upconvertedCharacters > > code path is not necessarily the most efficient way to get 16-bit characters > > for a purpose like this. > > Do you have any suggestion? In most cases we already have a 16bit content > here, only a very simple, 8bit content with RTL direction needs upconverting. Oh, I didn’t realize that. If we’ll skip this code path in the common case and not do it for all cases of 8-bit characters, then I withdraw my objection. One example of how this is not necessarily best optimized for heavier uses is that UpconvertedCharacters puts the characters on the stack rather than doing a malloc/free pair if there are <= 32 characters. Explicitly-written code could set a different threshold, which would be valuable if longer strings are common, and the malloc/free cost is a significant part of the total cost.
zalan
Comment 9 2021-12-16 13:54:01 PST
(In reply to Darin Adler from comment #8) > (In reply to zalan from comment #7) > > Will fix. Thanks! > > Can fix it by putting the upconvertedCharacters result into a local > variable, but it needs to be of type UpconvertedCharacters (likely done with > "auto"), not a const UChar*. Thanks. > > > > Also worried about the performance implications; the upconvertedCharacters > > > code path is not necessarily the most efficient way to get 16-bit characters > > > for a purpose like this. > > > > Do you have any suggestion? In most cases we already have a 16bit content > > here, only a very simple, 8bit content with RTL direction needs upconverting. > > Oh, I didn’t realize that. If we’ll skip this code path in the common case > and not do it for all cases of 8-bit characters, then I withdraw my > objection. > > One example of how this is not necessarily best optimized for heavier uses > is that UpconvertedCharacters puts the characters on the stack rather than > doing a malloc/free pair if there are <= 32 characters. Explicitly-written > code could set a different threshold, which would be valuable if longer > strings are common, and the malloc/free cost is a significant part of the > total cost. Makes sense. For the common codepath this would certainly need an explicitly-written code (luckily this is not the case here -also, on ToT we always land with a 16bit content here (RTL is not yet enabled)).
zalan
Comment 10 2021-12-16 19:16:03 PST
Reopening to attach new patch.
zalan
Comment 11 2021-12-16 19:16:05 PST
Darin Adler
Comment 12 2021-12-16 21:37:13 PST
Comment on attachment 447411 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=447411&action=review > Source/WebCore/layout/formattingContexts/inline/InlineItemsBuilder.cpp:281 > + , bidiContent.get() Should work without the explicit call to get()
zalan
Comment 13 2021-12-17 06:17:58 PST
EWS
Comment 14 2021-12-17 07:19:14 PST
Committed r287184 (245351@main): <https://commits.webkit.org/245351@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 447452 [details].
Note You need to log in before you can comment on or make changes to this bug.