WebKit Bugzilla
RESOLVED FIXED
109335
ARM_NEON Inline Assembly for copyLCharsFromUCharSource() inefficient for aligned destinations
https://bugs.webkit.org/show_bug.cgi?id=109335
Michael Saboff
Reported 2013-02-08 17:11:10 PST
The ARM_NEON-specific code for copyLCharsFromUCharSource() always tries to align the destination, even when it is already aligned. This can be seen for moves > 15 characters in length. The code in question is marked with '*':

    if (length >= (2 * memoryAccessSize) - 1) {
        // Prefix: align dst on 64 bits.
        const uintptr_t memoryAccessMask = memoryAccessSize - 1;
*       do {
*           *destination++ = static_cast<LChar>(*source++);
*       } while (!isAlignedTo<memoryAccessMask>(destination));

        // Vector interleaved unpack, we only store the lower 8 bits.
        const uintptr_t lengthLeft = end - destination;
        const LChar* const simdEnd = end - (lengthLeft % memoryAccessSize);
        do {
            asm("vld2.8 { d0-d1 }, [%[SOURCE]] !\n\t"
                "vst1.8 { d0 }, [%[DESTINATION],:64] !\n\t"
                : [SOURCE]"+r" (source), [DESTINATION]"+r" (destination)
                :
                : "memory", "d0", "d1");
        } while (destination != simdEnd);
    }

The marked do { } while should be changed to a while, so that an already aligned destination skips the scalar prefix entirely.
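For illustration, the proposed change can be sketched in portable C++. This is not the committed patch: the NEON vld2.8/vst1.8 pair is replaced by a plain 8-character inner loop so the sketch runs on any target, and the function name copyLCharsFromUCharSourceSketch, the helper isAligned, the UChar typedef, the scalar tail loop, and the returned prefix count are all assumptions made for the example. With the original do { } while, an aligned destination has 8 characters copied one at a time before the loop condition becomes true again; with a while, the prefix is skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef unsigned char LChar; // 8-bit character
typedef uint16_t UChar;      // 16-bit character (assumed stand-in for WTF's UChar)

// Bytes written per block; mirrors the 64-bit store of "vst1.8 { d0 }".
static const size_t memoryAccessSize = 8;
static const uintptr_t memoryAccessMask = memoryAccessSize - 1;

// Hypothetical helper standing in for isAlignedTo<memoryAccessMask>().
static bool isAligned(const LChar* pointer)
{
    return !(reinterpret_cast<uintptr_t>(pointer) & memoryAccessMask);
}

// Copies the low 8 bits of each source character to destination.
// Returns how many characters the alignment prefix copied; the return
// value exists only so the effect of the change can be observed.
size_t copyLCharsFromUCharSourceSketch(LChar* destination, const UChar* source, size_t length)
{
    const LChar* const end = destination + length;
    size_t prefixCount = 0;

    if (length >= (2 * memoryAccessSize) - 1) {
        // Prefix: align dst on 64 bits. A while loop (not do/while) lets an
        // already aligned destination fall straight through.
        while (!isAligned(destination)) {
            *destination++ = static_cast<LChar>(*source++);
            ++prefixCount;
        }

        // Block copy: portable stand-in for the NEON interleaved unpack.
        const size_t lengthLeft = end - destination;
        const LChar* const simdEnd = end - (lengthLeft % memoryAccessSize);
        while (destination != simdEnd) {
            assert(isAligned(destination)); // the asm's ":64" hint relies on this
            for (size_t i = 0; i < memoryAccessSize; ++i)
                destination[i] = static_cast<LChar>(source[i]);
            destination += memoryAccessSize;
            source += memoryAccessSize;
        }
    }

    // Tail: remaining characters, one at a time.
    while (destination != end)
        *destination++ = static_cast<LChar>(*source++);

    return prefixCount;
}
```

Under these assumptions, a 16-character copy into an 8-byte-aligned buffer is handled entirely by the block loop, while a destination that starts one byte past an aligned address still gets a 7-character scalar prefix.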
Attachments
Patch (1.53 KB, patch), 2013-02-08 17:45 PST, Michael Saboff
fpizlo: review+
Michael Saboff
Comment 1
2013-02-08 17:45:23 PST
Created attachment 187391
Patch

In a synthetic test harness, this is a speedup for short copies that are still longer than 15 characters.
Michael Saboff
Comment 2
2013-02-08 18:00:19 PST
Committed r142336: <http://trac.webkit.org/changeset/142336>