Bug 170973

Summary: B3StackmapSpecial should handle when stackmap values are not recoverable from a Def'ed arg.
Product: WebKit Reporter: Mark Lam <mark.lam>
Component: JavaScriptCoreAssignee: Mark Lam <mark.lam>
Status: RESOLVED FIXED    
Severity: Normal CC: buildbot, fpizlo, jfbastien, keith_miller, msaboff, rniwa, saam, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
proposed patch.
fpizlo: review+, buildbot: commit-queue-
Archive of layout-test-results from ews113 for mac-elcapitan
none
Archive of layout-test-results from ews100 for mac-elcapitan
none
Archive of layout-test-results from ews106 for mac-elcapitan-wk2
none
patch for landing.
none
x86_64 benchmark result. none

Description Mark Lam 2017-04-18 18:03:19 PDT
In the event of an arithmetic overflows on a binary sub instruction (where the result register is same as one of the operand registers), the CheckSub FTL operation will try to recover the original value in the clobbered result register.  This recover is done by adding the other operand value to the result register.  However, this recovery method only works if the width of the original value in the result register is less or equal to the width of the expected result.  If the width of the original operand value (e.g. a JSInt32) is wider than the result (e.g. a machine Int32), then the sub operation would have zero extended the result and cleared the upper 32-bits of the result register.  Recovery by adding back the other operand will not restore the JSValue tag in the upper word.  This poses a problem if the stackmap value for the operand relies on that same clobbered register.

The fix is to detect this potential scenario (i.e. width of the Def's arg < width of a stackmap value).  If this condition is detected, we'll declare the stackmap value to be LateColdUse to ensure that the register allocator gives it a different register if needed so that it's not dependent on the clobbered register.

<rdar://problem/30318657>
Comment 1 Mark Lam 2017-04-18 18:19:36 PDT
Created attachment 307446 [details]
proposed patch.
Comment 2 Filip Pizlo 2017-04-18 18:31:47 PDT
Comment on attachment 307446 [details]
proposed patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=307446&action=review

Have you measured perf?

r=me with suggested changes and perf results.  Octane is particularly sensitive to CheckSpecial not using late unless it really has to.  If my guess is right, this will be neutral on benchmarks - but that needs to be verified.

> Source/JavaScriptCore/b3/B3PatchpointSpecial.cpp:65
> +        auto argWidth = inst.origin->resultWidth();
> +        optionalDefArgWidth = argWidth;
> +        callback(inst.args[argIndex++], role, inst.origin->resultBank(), argWidth);
>      }
>  
> -    forEachArgImpl(0, argIndex, inst, SameAsRep, std::nullopt, callback);
> +    forEachArgImpl(0, argIndex, inst, SameAsRep, std::nullopt, callback, optionalDefArgWidth);

I think that in Patchpoint, you don't need to do anything like this.  Patchpoint does not have Check's weird recover-value-on-overflow behavior, since it has no notion of overflow.  You should pass nullopt here.

> Source/JavaScriptCore/b3/B3StackmapSpecial.cpp:129
> +            // If the Def'ed arg has a smaller width than the a stackmap value, then we may not
> +            // be able to recover the stackmap value. So, force LateColdUse to preserve the
> +            // original stackmap value across the Special operation.
> +            if (optionalDefArgWidth && *optionalDefArgWidth < child.value()->resultWidth())
> +                role = Arg::LateColdUse;

I think that you want it to be a LateUse if it was a Use and a LateColdUse if it was a ColdUse.  If it's already some kind of late use then you don't need to change it.
Comment 3 Build Bot 2017-04-18 18:55:10 PDT
Comment on attachment 307446 [details]
proposed patch.

Attachment 307446 [details] did not pass jsc-ews (mac):
Output: http://webkit-queues.webkit.org/results/3560267

New failing tests:
wasm.yaml/wasm/function-tests/many-arguments-to-function.js.default-wasm
wasm.yaml/wasm/js-api/wasm-to-wasm.js.default-wasm
wasm.yaml/wasm/spec-tests/left-to-right.wast.js.default-wasm
Comment 4 Build Bot 2017-04-18 19:37:18 PDT
Comment on attachment 307446 [details]
proposed patch.

Attachment 307446 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/3560351

New failing tests:
workers/wasm-hashset-many-2.html
workers/wasm-hashset-many.html
workers/wasm-hashset.html
workers/wasm-long-compile-many.html
Comment 5 Build Bot 2017-04-18 19:37:20 PDT
Created attachment 307451 [details]
Archive of layout-test-results from ews113 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews113  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 6 Build Bot 2017-04-18 19:39:21 PDT
Comment on attachment 307446 [details]
proposed patch.

Attachment 307446 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/3560415

New failing tests:
workers/wasm-hashset-many-2.html
workers/wasm-hashset-many.html
workers/wasm-hashset.html
Comment 7 Build Bot 2017-04-18 19:39:22 PDT
Created attachment 307453 [details]
Archive of layout-test-results from ews100 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews100  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 8 Build Bot 2017-04-18 19:45:17 PDT
Comment on attachment 307446 [details]
proposed patch.

Attachment 307446 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/3560428

New failing tests:
workers/wasm-hashset-many-2.html
workers/wasm-hashset-many.html
workers/wasm-hashset.html
Comment 9 Build Bot 2017-04-18 19:45:18 PDT
Created attachment 307456 [details]
Archive of layout-test-results from ews106 for mac-elcapitan-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews106  Port: mac-elcapitan-wk2  Platform: Mac OS X 10.11.6
Comment 10 Mark Lam 2017-04-19 12:58:15 PDT
(In reply to Filip Pizlo from comment #2)
> Comment on attachment 307446 [details]
> proposed patch.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=307446&action=review
> 
> Have you measured perf?

Checking now.

> r=me with suggested changes and perf results.  Octane is particularly
> sensitive to CheckSpecial not using late unless it really has to.  If my
> guess is right, this will be neutral on benchmarks - but that needs to be
> verified.
> 
> > Source/JavaScriptCore/b3/B3PatchpointSpecial.cpp:65
> > +        auto argWidth = inst.origin->resultWidth();
> > +        optionalDefArgWidth = argWidth;
> > +        callback(inst.args[argIndex++], role, inst.origin->resultBank(), argWidth);
> >      }
> >  
> > -    forEachArgImpl(0, argIndex, inst, SameAsRep, std::nullopt, callback);
> > +    forEachArgImpl(0, argIndex, inst, SameAsRep, std::nullopt, callback, optionalDefArgWidth);
> 
> I think that in Patchpoint, you don't need to do anything like this. 
> Patchpoint does not have Check's weird recover-value-on-overflow behavior,
> since it has no notion of overflow.  You should pass nullopt here.

Fixed.

> > Source/JavaScriptCore/b3/B3StackmapSpecial.cpp:129
> > +            // If the Def'ed arg has a smaller width than the a stackmap value, then we may not
> > +            // be able to recover the stackmap value. So, force LateColdUse to preserve the
> > +            // original stackmap value across the Special operation.
> > +            if (optionalDefArgWidth && *optionalDefArgWidth < child.value()->resultWidth())
> > +                role = Arg::LateColdUse;
> 
> I think that you want it to be a LateUse if it was a Use and a LateColdUse
> if it was a ColdUse.  If it's already some kind of late use then you don't
> need to change it.

Thanks for catching this.  Fixed.
Comment 11 Mark Lam 2017-04-19 13:12:54 PDT
Created attachment 307501 [details]
patch for landing.

Applies fixes from Fil's feedback.  Let's get some EWS testing while I do perf testing.
Comment 12 Mark Lam 2017-04-19 13:42:06 PDT
JSC benchmarks say that perf is neutral.  Now, we wait for EWS bots to finish testing.
Comment 13 Mark Lam 2017-04-19 13:45:32 PDT
Created attachment 307505 [details]
x86_64 benchmark result.
Comment 14 Mark Lam 2017-04-19 14:52:29 PDT
Landed in r215531: <http://trac.webkit.org/r215531>.
Comment 15 Saam Barati 2017-05-10 19:46:04 PDT
Comment on attachment 307501 [details]
patch for landing.

View in context: https://bugs.webkit.org/attachment.cgi?id=307501&action=review

> Source/JavaScriptCore/b3/B3CheckSpecial.cpp:117
> +            if (Arg::isAnyDef(role)) {
> +                ASSERT(!optionalDefArgWidth); // There can only be one Def'ed arg.
> +                optionalDefArgWidth = width;

I think what we want here is Arg::isLateDef.
Fil, Mark, what do you think?
This assertion fires on arm64 for:

arm64: BranchMul32 U:G:32, U:G:32, U:G:32, S:G:32, S:G:32, ZD:G:32 /branch
    ResCond, Tmp, Tmp, Tmp, Tmp, Tmp
Comment 16 Saam Barati 2017-05-10 19:47:31 PDT
(In reply to Saam Barati from comment #15)
> Comment on attachment 307501 [details]
> patch for landing.
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=307501&action=review
> 
> > Source/JavaScriptCore/b3/B3CheckSpecial.cpp:117
> > +            if (Arg::isAnyDef(role)) {
> > +                ASSERT(!optionalDefArgWidth); // There can only be one Def'ed arg.
> > +                optionalDefArgWidth = width;
> 
> I think what we want here is Arg::isLateDef.
> Fil, Mark, what do you think?
> This assertion fires on arm64 for:
> 
> arm64: BranchMul32 U:G:32, U:G:32, U:G:32, S:G:32, S:G:32, ZD:G:32 /branch
>     ResCond, Tmp, Tmp, Tmp, Tmp, Tmp

Note that Scratch are defs:

```
    static bool isAnyDef(Role role)
    {
        switch (role) {
        case Use:
        case ColdUse:
        case UseAddr:
        case LateUse:
        case LateColdUse:
            return false;
        case Def:
        case UseDef:
        case ZDef:
        case UseZDef:
        case EarlyDef:
        case EarlyZDef:
        case Scratch:
            return true;
        }
        ASSERT_NOT_REACHED();
    }
```