Bug 109335 - ARM_NEON Inline Assembly for copyLCharsFromUCharSource() inefficient for aligned destinations
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: 528+ (Nightly build)
Hardware: Other All
Importance: P2 Normal
Assignee: Michael Saboff
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-02-08 17:11 PST by Michael Saboff
Modified: 2013-02-08 18:00 PST

See Also:


Attachments
Patch (1.53 KB, patch)
2013-02-08 17:45 PST, Michael Saboff
fpizlo: review+

Description Michael Saboff 2013-02-08 17:11:10 PST
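The ARM_NEON-specific code for copyLCharsFromUCharSource() always tries to align the destination, even when it is already aligned. In that case the prefix loop copies a full memoryAccessSize worth of characters one at a time before the vector loop starts. This can be seen for moves > 15 characters in length.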

The code in question is marked with asterisks:

    if (length >= (2 * memoryAccessSize) - 1) {
        // Prefix: align dst on 64 bits.
        const uintptr_t memoryAccessMask = memoryAccessSize - 1;
 *      do {
 *          *destination++ = static_cast<LChar>(*source++);
 *      } while (!isAlignedTo<memoryAccessMask>(destination));

        // Vector interleaved unpack, we only store the lower 8 bits.
        const uintptr_t lengthLeft = end - destination;
        const LChar* const simdEnd = end - (lengthLeft % memoryAccessSize);
        do {
            asm("vld2.8   { d0-d1 }, [%[SOURCE]] !\n\t"
                "vst1.8   { d0 }, [%[DESTINATION],:64] !\n\t"
                : [SOURCE]"+r" (source), [DESTINATION]"+r" (destination)
                :
                : "memory", "d0", "d1");
        } while (destination != simdEnd);
    }

The do { } while should be changed to a while, so that alignment is tested before the first single-character copy.
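
As a rough illustration of the proposed change, here is a portable sketch rather than the committed patch. It assumes an 8-byte memoryAccessSize, replaces the NEON vld2.8 / vst1.8 pair with a plain scalar loop, and uses a hypothetical function name (copyLCharsFromUCharSourceSketch), local LChar/UChar aliases, and a local isAlignedTo helper that may differ from the ones in WTF. It returns the number of characters copied by the prefix loop so the effect of the while form can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using LChar = unsigned char; // 8-bit character (assumed to match WTF's LChar)
using UChar = char16_t;      // 16-bit character (assumption for this sketch)

// Hypothetical helper: true when the pointer bits selected by `mask` are zero.
template<uintptr_t mask>
inline bool isAlignedTo(const void* pointer)
{
    return !(reinterpret_cast<uintptr_t>(pointer) & mask);
}

// Portable sketch of the narrowing copy with the prefix written as a while loop.
// Returns how many characters the alignment prefix copied.
inline size_t copyLCharsFromUCharSourceSketch(LChar* destination, const UChar* source, size_t length)
{
    const size_t memoryAccessSize = 8; // bytes stored per vector step (assumed)
    const uintptr_t memoryAccessMask = memoryAccessSize - 1;
    const LChar* const end = destination + length;
    size_t prefixCount = 0;

    if (length >= (2 * memoryAccessSize) - 1) {
        // Prefix: align dst on 64 bits. Testing first means an already
        // aligned destination skips this loop entirely.
        while (!isAlignedTo<memoryAccessMask>(destination)) {
            *destination++ = static_cast<LChar>(*source++);
            ++prefixCount;
        }

        // Scalar stand-in for the vector loop: 8 characters per iteration.
        // The prefix copies at most 7 characters, so at least 8 remain here.
        const size_t lengthLeft = static_cast<size_t>(end - destination);
        const LChar* const simdEnd = end - (lengthLeft % memoryAccessSize);
        do {
            assert(isAlignedTo<memoryAccessMask>(destination));
            for (size_t i = 0; i < memoryAccessSize; ++i)
                *destination++ = static_cast<LChar>(*source++);
        } while (destination != simdEnd);
    }

    // Tail, and copies too short for the vector path: one character at a time.
    while (destination != end)
        *destination++ = static_cast<LChar>(*source++);

    return prefixCount;
}
```

With the original do { } while prefix, an 8-byte-aligned destination would have copied 8 characters one by one before reaching the vector loop; in this sketch the same call reports a prefix count of 0.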
Comment 1 Michael Saboff 2013-02-08 17:45:23 PST
Created attachment 187391
Patch

In a synthetic test harness, this is a speedup for small copies that are still longer than 15 characters.
Comment 2 Michael Saboff 2013-02-08 18:00:19 PST
Committed r142336: <http://trac.webkit.org/changeset/142336>