Bug 94886 - Add an optimized version of copyLCharsFromUCharSource for ARM
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Template Framework
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Benjamin Poulain
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-08-23 18:09 PDT by Benjamin Poulain
Modified: 2012-10-31 17:31 PDT
CC List: 9 users

See Also:


Attachments
Patch (2.91 KB, patch)
2012-08-23 18:20 PDT, Benjamin Poulain
no flags
Patch (2.90 KB, patch)
2012-08-23 19:24 PDT, Benjamin Poulain
no flags
Patch (3.05 KB, patch)
2012-08-24 12:50 PDT, Benjamin Poulain
barraclough: review+

Description Benjamin Poulain 2012-08-23 18:09:20 PDT
Some more SIMD :)
Comment 1 Benjamin Poulain 2012-08-23 18:20:33 PDT
Created attachment 160303
Patch
Comment 2 Benjamin Poulain 2012-08-23 18:22:36 PDT
I could not use intrinsics here because:
- I need the explicit alignment for speed.
- I need the auto increment for speed.
- IIRC the interleaved load intrinsics are not always available anyway.
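
For readers unfamiliar with the interleaved load mentioned above, here is a minimal sketch of the idea, written with intrinsics purely for readability. It is not the attached patch: per the comment above, the patch avoids intrinsics so it can use explicit alignment and auto-increment addressing. The function name `narrowUCharsToLChars`, the `SKETCH_USE_NEON` macro, and the choice of `vld2q_u8` are assumptions made for illustration. On targets without NEON the sketch falls back to a plain scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical guard: enable the NEON path only on little-endian ARM with NEON.
#if defined(__ARM_NEON) && defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#include <arm_neon.h>
#define SKETCH_USE_NEON 1
#else
#define SKETCH_USE_NEON 0
#endif

// Hypothetical stand-in for copyLCharsFromUCharSource: narrows 16-bit code
// units to 8-bit ones. Precondition: every source unit is <= 0xFF.
inline void narrowUCharsToLChars(uint8_t* destination, const uint16_t* source, size_t length)
{
    size_t i = 0;
#if SKETCH_USE_NEON
    // Each vld2q_u8 reads 32 bytes (16 code units) and de-interleaves them:
    // val[0] receives the even bytes (the low halves on little-endian),
    // val[1] receives the odd bytes (the high halves, expected to be zero).
    const uint8_t* bytes = reinterpret_cast<const uint8_t*>(source);
    for (; i + 16 <= length; i += 16) {
        uint8x16x2_t halves = vld2q_u8(bytes + 2 * i);
        vst1q_u8(destination + i, halves.val[0]);
    }
#endif
    // Scalar tail, and the whole copy on targets without NEON.
    for (; i < length; ++i) {
        assert(!(source[i] & 0xFF00));
        destination[i] = static_cast<uint8_t>(source[i]);
    }
}
```

The vector loop only handles full blocks of 16 code units, so short inputs go entirely through the scalar loop.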
Comment 3 Build Bot 2012-08-23 18:45:41 PDT
Comment on attachment 160303
Patch

Attachment 160303 did not pass mac-ews (mac):
Output: http://queues.webkit.org/results/13562967
Comment 4 WebKit Review Bot 2012-08-23 18:55:13 PDT
Comment on attachment 160303
Patch

Attachment 160303 did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13570906
Comment 5 Benjamin Poulain 2012-08-23 19:24:37 PDT
Created attachment 160310
Patch
Comment 6 WebKit Review Bot 2012-08-23 19:46:59 PDT
Comment on attachment 160310
Patch

Attachment 160310 did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13568917
Comment 7 Build Bot 2012-08-23 20:00:33 PDT
Comment on attachment 160310
Patch

Attachment 160310 did not pass win-ews (win):
Output: http://queues.webkit.org/results/13559960
Comment 8 Early Warning System Bot 2012-08-23 20:44:51 PDT
Comment on attachment 160310
Patch

Attachment 160310 did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/13559975
Comment 9 Benjamin Poulain 2012-08-23 20:47:48 PDT
Comment on attachment 160310
Patch

I'll fix that tomorrow.
Comment 10 Gyuyoung Kim 2012-08-23 20:51:47 PDT
Comment on attachment 160310
Patch

Attachment 160310 did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/13566937
Comment 11 Early Warning System Bot 2012-08-23 20:58:43 PDT
Comment on attachment 160310
Patch

Attachment 160310 did not pass qt-wk2-ews (qt):
Output: http://queues.webkit.org/results/13563945
Comment 12 Peter Beverloo (cr-android ews) 2012-08-23 21:43:04 PDT
Comment on attachment 160310
Patch

Attachment 160310 did not pass cr-android-ews (chromium-android):
Output: http://queues.webkit.org/results/13569880
Comment 13 Benjamin Poulain 2012-08-24 12:50:43 PDT
Created attachment 160481
Patch
Comment 14 Benjamin Poulain 2012-08-24 12:51:46 PDT
This new version improves the performance when we cannot use NEON, and is just as fast when the input is big enough.
Comment 15 Gavin Barraclough 2012-10-26 23:12:23 PDT
Comment on attachment 160481
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=160481&action=review

This looks awesome. I didn't know about the interleaved unpack; that's really handy.

> Source/WTF/wtf/text/ASCIIFastPath.h:134
> +#elif COMPILER(GCC) && CPU(ARM_NEON) && !(PLATFORM(BIG_ENDIAN) || PLATFORM(MIDDLE_ENDIAN))

Your optimized path skips the ASSERT that checks the upper bits are zero; it might be worth adding "&& defined(NDEBUG)" so that debug builds get the C loop with the ASSERT.
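
A rough sketch of how that guard could look, assuming plain compiler macros in place of WebKit's COMPILER() and CPU() macros. The names `SKETCH_HAS_FAST_PATH` and `copyNarrowing` are hypothetical, and the fast branch here is only a placeholder for the real optimized code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical condition: the unchecked fast path is compiled only in
// release builds (NDEBUG defined) on GCC-compatible compilers with NEON.
#if defined(__GNUC__) && defined(__ARM_NEON) && defined(NDEBUG)
#define SKETCH_HAS_FAST_PATH 1
#else
#define SKETCH_HAS_FAST_PATH 0
#endif

inline void copyNarrowing(uint8_t* destination, const uint16_t* source, size_t length)
{
#if SKETCH_HAS_FAST_PATH
    // Placeholder for the optimized path: no per-character check.
    for (size_t i = 0; i < length; ++i)
        destination[i] = static_cast<uint8_t>(source[i]);
#else
    // Debug builds (and other targets) keep the checked C loop, so a
    // character with non-zero upper bits trips the assertion.
    for (size_t i = 0; i < length; ++i) {
        assert(!(source[i] & 0xFF00));
        destination[i] = static_cast<uint8_t>(source[i]);
    }
#endif
}
```

With this arrangement both branches produce the same output for valid input; only the debug-time check differs.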

> Source/WTF/wtf/text/ASCIIFastPath.h:141
> +        do {

I think WebKit coding style is no parens here.
Comment 16 Benjamin Poulain 2012-10-31 17:31:43 PDT
Committed r133100: <http://trac.webkit.org/changeset/133100>