Bug 163264 - B3->Air lowering needs the same defenses in effectiveAddr() that it has in tryAppendLea()
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: WebKit Nightly Build
Hardware: All
OS: All
Importance: P2 Normal
Assignee: Filip Pizlo
 
Reported: 2016-10-10 20:47 PDT by Filip Pizlo
Modified: 2016-10-11 15:01 PDT
CC: 5 users

Attachments
the patch (9.02 KB, patch)
2016-10-11 09:49 PDT, Filip Pizlo
mark.lam: review+

Description Filip Pizlo 2016-10-10 20:47:06 PDT
There are two ways we may end up matching left-shifts into address calculations: effectiveAddr() and tryAppendLea(). tryAppendLea() does it right, effectiveAddr() does it wrong.
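For illustration, here is a minimal sketch of the kind of defense tryAppendLea() has and effectiveAddr() was missing. This is not WebKit's actual code and the helper names are hypothetical; the point is the guard: a left shift may only be folded into a BaseIndex scale when its amount is a known constant that maps to a scale the target accepts.

#include <cstdint>
#include <optional>

struct BaseIndexMatch {
    int base;       // stand-in for the base tmp
    int index;      // stand-in for the index tmp
    unsigned scale; // 1, 2, 4, or 8
};

// Hypothetical stand-in for the kind of query Arg::isValidIndexForm() answers.
static bool targetAcceptsScale(unsigned scale)
{
    return scale == 1 || scale == 2 || scale == 4 || scale == 8;
}

// Only match Add(base, Shl(index, amount)) into a BaseIndex when the shift
// amount is a constant in [0, 3]; anything else (e.g. 32, as in the test in
// comment 1) must fall back to computing the address explicitly.
static std::optional<BaseIndexMatch> matchScaledIndex(int base, int index, std::optional<int64_t> shiftAmount)
{
    if (!shiftAmount || *shiftAmount < 0 || *shiftAmount > 3)
        return std::nullopt;
    unsigned scale = 1u << static_cast<unsigned>(*shiftAmount);
    if (!targetAcceptsScale(scale))
        return std::nullopt;
    return BaseIndexMatch { base, index, scale };
}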
Comment 1 Filip Pizlo 2016-10-11 09:26:24 PDT
Yup, this test generates wrong code and crashes:

void testLoadBaseIndexShift32()
{
    Procedure proc;
    BasicBlock* root = proc.addBlock();
    root->appendNew<Value>(
        proc, Return, Origin(),
        root->appendNew<MemoryValue>(
            proc, Load, Int32, Origin(),
            root->appendNew<Value>(
                proc, Add, Origin(),
                root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR0),
                root->appendNew<Value>(
                    proc, Shl, Origin(),
                    root->appendNew<ArgumentRegValue>(proc, Origin(), GPRInfo::argumentGPR1),
                    root->appendNew<Const32Value>(proc, Origin(), 32)))));
    auto code = compile(proc);
    int32_t value = 12341234;
    char* ptr = bitwise_cast<char*>(&value);
    for (unsigned i = 0; i < 10; ++i)
        CHECK_EQ(invoke<int32_t>(*code, ptr - (static_cast<intptr_t>(1) << static_cast<intptr_t>(32)) * i, i), 12341234);
}
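For context, here is a standalone check of the arithmetic the test relies on (my reading of the test, assuming a 64-bit host; this is not part of the patch). The B3 program loads 32 bits from arg0 + (arg1 << 32), and the caller passes arg0 = &value - (1 << 32) * i and arg1 = i, so the effective address is always &value and every iteration should read 12341234. A shift amount of 32 has no BaseIndex encoding (only scales 1, 2, 4, and 8, i.e. shifts 0 through 3, are expressible), which is why folding it into the address is wrong.

#include <cassert>
#include <cstdint>

int main()
{
    int32_t value = 12341234;
    char* ptr = reinterpret_cast<char*>(&value);
    for (unsigned i = 0; i < 10; ++i) {
        // What the test passes to the compiled code.
        char* arg0 = ptr - (static_cast<intptr_t>(1) << 32) * i;
        intptr_t arg1 = i;
        // What a correct lowering must compute: the shift cancels the offset.
        char* effective = arg0 + (arg1 << 32);
        assert(effective == ptr);
        assert(*reinterpret_cast<int32_t*>(effective) == 12341234);
    }
    return 0;
}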
Comment 2 Filip Pizlo 2016-10-11 09:49:51 PDT
Created attachment 291271 [details]
the patch
Comment 3 Mark Lam 2016-10-11 10:30:21 PDT
Comment on attachment 291271 [details]
the patch

r=me if this works for ARM as well.  Otherwise, you might need to make this adjustment conditional on isX86 (if that's not already the case).

I could be mistaken, but I vaguely remember that for 32-bit ARM, I think the shift operand is internally masked with 31 by the CPU before executing the shift.  Hence, it's not possible to get a shift by 32.  I'm not sure of the treatment on ARM64.  Regardless, you should run the test on ARM64 as well to make sure that this does work there (if you haven't already done so).
Comment 4 Filip Pizlo 2016-10-11 10:55:55 PDT
(In reply to comment #3)
> Comment on attachment 291271 [details]
> the patch
> 
> r=me if this works for ARM as well.  Otherwise, you might need to make this
> adjustment conditional on isX86 (if that's not already the case).
> 
> I could be mistaken, but I vaguely remember that for 32-bit ARM, I think the
> shift operand is internally masked with 31 by the CPU before executing the
> shift.  Hence, it's not possible to get a shift by 32.  I'm not sure of the
> treatment on ARM64.  Regardless, you should run the test on ARM64 as well to
> make sure that this does work there (if you haven't already done so).

This is correct on ARM64. The semantics of B3 Shl are to mask the shift amount by 63 for 64-bit shifts and by 31 for 32-bit shifts. Since this code is about emitting BaseIndex instead of Shl, it only has to worry about B3's semantics of Shl (which don't change from architecture to architecture) and the target CPU's semantics for BaseIndex (which vary wildly, hence the isValidIndexForm call).

Incidentally, ARM64, x86, and x86-64 all agree with B3 about Shl. ARMv7 disagrees (it masks shift amounts by 255). But even on ARMv7 this code is right, since it's converting a Shl to a BaseIndex, not to ARMv7's shift instruction. To support ARMv7 we would leave this code alone, keep our same Shl semantics, and give our lowering of Shl on ARMv7 an extra legalization step that masks the shift amount with 31 or 63 to make ARMv7 comply with our expectations.
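To make the semantics being compared concrete, here is an executable model (illustrative only; these are not WebKit functions):

#include <cstdint>

// B3 Shl: the amount is masked to the operand width on every target.
uint32_t b3Shl32(uint32_t value, uint32_t amount) { return value << (amount & 31); }
uint64_t b3Shl64(uint64_t value, uint64_t amount) { return value << (amount & 63); }

// ARMv7 register-controlled LSL uses the bottom 8 bits of the amount, so a
// shift by 32 shifts everything out. A lowering for ARMv7 would therefore
// mask the amount with 31 (or 63) first to preserve B3's semantics.
uint32_t armv7Shl32(uint32_t value, uint32_t amount)
{
    amount &= 255;
    return amount >= 32 ? 0 : value << amount;
}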
Comment 5 Filip Pizlo 2016-10-11 13:42:05 PDT
Landed in https://trac.webkit.org/changeset/207163
Comment 6 Saam Barati 2016-10-11 14:58:32 PDT
Comment on attachment 291271 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=291271&action=review

> Source/JavaScriptCore/b3/testb3.cpp:13679
> +    if (isX86())
> +        checkUsesInstruction(*code, "(%rdi,%rsi,4)");

It would also be nice to check we do good things for ARM64 here? Will this optimization kick in on ARM64?
Comment 7 Filip Pizlo 2016-10-11 15:01:50 PDT
(In reply to comment #6)
> Comment on attachment 291271 [details]
> the patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=291271&action=review
> 
> > Source/JavaScriptCore/b3/testb3.cpp:13679
> > +    if (isX86())
> > +        checkUsesInstruction(*code, "(%rdi,%rsi,4)");
> 
> It would also be nice to check we do good things for ARM64 here? Will this
> optimization kick in on ARM64?

The ARM optimization is more restricted. I think it might kick in here, but I wasn't sure. If/when we start trying to get this right on ARM, we should write ARM-specific tests for that.

Also, the reason for this check is to make sure that the fix for this bug didn't completely break BaseIndex. For that purpose, it's enough to run the check on any architecture on which we regularly run tests.