RESOLVED FIXED 94886
Add an optimized version of copyLCharsFromUCharSource for ARM
https://bugs.webkit.org/show_bug.cgi?id=94886
Benjamin Poulain
Reported 2012-08-23 18:09:20 PDT
Some more SIMD :)
Attachments
Patch (2.91 KB, patch)
2012-08-23 18:20 PDT, Benjamin Poulain
no flags
Patch (2.90 KB, patch)
2012-08-23 19:24 PDT, Benjamin Poulain
no flags
Patch (3.05 KB, patch)
2012-08-24 12:50 PDT, Benjamin Poulain
barraclough: review+
Benjamin Poulain
Comment 1 2012-08-23 18:20:33 PDT
Benjamin Poulain
Comment 2 2012-08-23 18:22:36 PDT
I could not use intrinsics here because:
- I need the explicit alignment for speed.
- I need the auto increment for speed.
- IIRC the interleaved load intrinsics are not always available anyway.
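For comparison, here is a minimal sketch of what an intrinsics-based narrowing copy could look like. It is not the code from the patch. The names narrowCopySketch and SKETCH_HAS_NEON are hypothetical, and the sketch assumes a little-endian target and a GCC or Clang style compiler. It uses the interleaved load vld2q_u8 to split each 16-bit unit into its low and high bytes, then stores only the low bytes. Written this way, the code has no direct control over the alignment hint or the post-increment addressing mentioned above.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: NEON and little-endian are detected with GCC/Clang predefined macros.
#if (defined(__ARM_NEON) || defined(__ARM_NEON__)) \
    && defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#include <arm_neon.h>
#define SKETCH_HAS_NEON 1
#else
#define SKETCH_HAS_NEON 0
#endif

// Hypothetical helper: copies `length` 16-bit units to 8-bit units,
// keeping only the low byte of each unit.
inline void narrowCopySketch(uint8_t* destination, const uint16_t* source, size_t length)
{
    size_t i = 0;
#if SKETCH_HAS_NEON
    // Process 16 units (32 source bytes) per iteration.
    for (; i + 16 <= length; i += 16) {
        // vld2q_u8 de-interleaves 32 bytes: val[0] receives the even bytes
        // (the low halves on little-endian), val[1] receives the odd bytes.
        uint8x16x2_t halves = vld2q_u8(reinterpret_cast<const uint8_t*>(source + i));
        vst1q_u8(destination + i, halves.val[0]);
    }
#endif
    // Scalar tail, and the whole copy when NEON is unavailable.
    for (; i < length; ++i)
        destination[i] = static_cast<uint8_t>(source[i]);
}
```

On targets without NEON the sketch falls back to the scalar loop, so callers see the same result either way.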
Build Bot
Comment 3 2012-08-23 18:45:41 PDT
WebKit Review Bot
Comment 4 2012-08-23 18:55:13 PDT
Comment on attachment 160303
Patch

Attachment 160303 did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13570906
Benjamin Poulain
Comment 5 2012-08-23 19:24:37 PDT
WebKit Review Bot
Comment 6 2012-08-23 19:46:59 PDT
Comment on attachment 160310
Patch

Attachment 160310 did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13568917
Build Bot
Comment 7 2012-08-23 20:00:33 PDT
Early Warning System Bot
Comment 8 2012-08-23 20:44:51 PDT
Benjamin Poulain
Comment 9 2012-08-23 20:47:48 PDT
Comment on attachment 160310
Patch

I'll fix that tomorrow.
Gyuyoung Kim
Comment 10 2012-08-23 20:51:47 PDT
Early Warning System Bot
Comment 11 2012-08-23 20:58:43 PDT
Peter Beverloo (cr-android ews)
Comment 12 2012-08-23 21:43:04 PDT
Comment on attachment 160310
Patch

Attachment 160310 did not pass cr-android-ews (chromium-android):
Output: http://queues.webkit.org/results/13569880
Benjamin Poulain
Comment 13 2012-08-24 12:50:43 PDT
Benjamin Poulain
Comment 14 2012-08-24 12:51:46 PDT
This new version improves the performance when we cannot use Neon, and is just as fast when the input is big enough.
Gavin Barraclough
Comment 15 2012-10-26 23:12:23 PDT
Comment on attachment 160481
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=160481&action=review

This looks awesome, didn't know about the interleaved unpack, that's really handy.

> Source/WTF/wtf/text/ASCIIFastPath.h:134
> +#elif COMPILER(GCC) && CPU(ARM_NEON) && !(PLATFORM(BIG_ENDIAN) || PLATFORM(MIDDLE_ENDIAN))

Your optimized path skips an ASSERT to check the upper bits are zero; might be worth adding "&& defined(NDEBUG)" so that debug builds get the C-loop with the ASSERT.

> Source/WTF/wtf/text/ASCIIFastPath.h:141
> +    do {

I think WebKit coding style is no parens here.
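The NDEBUG suggestion above can be sketched as follows. This is an illustration under stated assumptions, not the code that landed: SKETCH_CPU_HAS_FAST_PATH, fastNarrowCopySketch and copyNarrowCheckedSketch are hypothetical names, the standard assert stands in for WebKit's ASSERT, and the "fast path" is a plain unchecked loop standing in for the optimized code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical flag standing in for the platform check on the #elif line.
#define SKETCH_CPU_HAS_FAST_PATH 1

// Hypothetical stand-in for the optimized path: it copies the low bytes
// without checking that the upper bits are zero.
inline void fastNarrowCopySketch(uint8_t* destination, const uint16_t* source, size_t length)
{
    for (size_t i = 0; i < length; ++i)
        destination[i] = static_cast<uint8_t>(source[i]);
}

inline void copyNarrowCheckedSketch(uint8_t* destination, const uint16_t* source, size_t length)
{
#if SKETCH_CPU_HAS_FAST_PATH && defined(NDEBUG)
    // Release builds: take the unchecked fast path.
    fastNarrowCopySketch(destination, source, length);
#else
    // Debug builds (or no fast path): plain loop that verifies each
    // source unit fits in 8 bits before narrowing it.
    for (size_t i = 0; i < length; ++i) {
        assert(!(source[i] & 0xFF00));
        destination[i] = static_cast<uint8_t>(source[i]);
    }
#endif
}
```

With this gating, a debug build would trip the assertion if a caller passed a unit above 0xFF, while a release build keeps the faster path.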
Benjamin Poulain
Comment 16 2012-10-31 17:31:43 PDT