Bug 154022

Summary: [JSC] Implement isFinite / isNaN in JS and make DFG ToNumber accept non number values
Product: WebKit Reporter: Yusuke Suzuki <ysuzuki>
Component: JavaScriptCoreAssignee: Yusuke Suzuki <ysuzuki>
Status: RESOLVED FIXED    
Severity: Normal CC: benjamin, buildbot, commit-queue, fpizlo, ggaren, joepeck, keith_miller, mark.lam, msaboff, rniwa, saam
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 159097    
Bug Blocks: 153738, 155240, 156404    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Results
none
Archive of layout-test-results from ews102 for mac-yosemite
none
Archive of layout-test-results from ews106 for mac-yosemite-wk2
none
Archive of layout-test-results from ews115 for mac-yosemite
none
Patch
none
Archive of layout-test-results from ews103 for mac-yosemite
none
Archive of layout-test-results from ews107 for mac-yosemite-wk2
none
Archive of layout-test-results from ews112 for mac-yosemite
none
Patch
none
Patch
none
Patch
none
Patch
none
performance results
none
Patch
none
Patch
none
Patch
none
Patch
none
Archive of layout-test-results from ews115 for mac-yosemite
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Performance results
none
Patch
none
Patch
buildbot: commit-queue-
Archive of layout-test-results from ews101 for mac-yosemite
none
Archive of layout-test-results from ews104 for mac-yosemite-wk2
none
Archive of layout-test-results from ews123 for ios-simulator-wk2
none
Archive of layout-test-results from ews115 for mac-yosemite
none
[PATCH] Additions to Yusuke's Previous Patch
none
[PATCH] isNaN Intrinsics (for comparison)
none
[PATCH] Rebaselined + Combined Tests
none
[RESULTS] Benchmark results
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch keith_miller: review+

Description Yusuke Suzuki 2016-02-08 17:48:45 PST
...
Comment 1 Yusuke Suzuki 2016-02-18 08:43:20 PST
Created attachment 271661 [details]
Patch

WIP
Comment 2 Yusuke Suzuki 2016-02-24 10:43:09 PST
Created attachment 272118 [details]
Patch

WIP part2
Comment 3 Yusuke Suzuki 2016-02-24 10:59:24 PST
Created attachment 272122 [details]
Patch

I'll paste the results of the benchmarks
Comment 4 Yusuke Suzuki 2016-02-24 11:20:32 PST
Created attachment 272125 [details]
Patch

I'll paste the results of the benchmarks
Comment 5 Yusuke Suzuki 2016-02-24 12:20:48 PST
Created attachment 272135 [details]
Results

Benchmark results. Looks neutral.
Comment 6 Yusuke Suzuki 2016-02-24 12:21:10 PST
I'll update the expect file for profiler tests.
Comment 7 Build Bot 2016-02-24 12:22:23 PST
Comment on attachment 272125 [details]
Patch

Attachment 272125 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/878449

New failing tests:
fast/profiler/built-in-function-calls-user-defined-function.html
svg/in-html/sizing/svg-inline.html
fast/profiler/built-in-function-calls-anonymous.html
Comment 8 Build Bot 2016-02-24 12:22:26 PST
Created attachment 272136 [details]
Archive of layout-test-results from ews102 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews102  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 9 Yusuke Suzuki 2016-02-24 12:25:32 PST
Comment on attachment 272125 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=272125&action=review

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3284
> +

I'll insert branchIfNumber fast path here.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:3326
> +

Same as above.
Comment 10 Build Bot 2016-02-24 12:34:29 PST
Comment on attachment 272125 [details]
Patch

Attachment 272125 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/878497

New failing tests:
fast/profiler/built-in-function-calls-user-defined-function.html
svg/in-html/sizing/svg-inline.html
fast/profiler/built-in-function-calls-anonymous.html
Comment 11 Build Bot 2016-02-24 12:34:32 PST
Created attachment 272139 [details]
Archive of layout-test-results from ews106 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews106  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 12 Build Bot 2016-02-24 13:10:48 PST
Comment on attachment 272125 [details]
Patch

Attachment 272125 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/878553

New failing tests:
fast/profiler/built-in-function-calls-user-defined-function.html
svg/in-html/sizing/svg-inline.html
fast/profiler/built-in-function-calls-anonymous.html
Comment 13 Build Bot 2016-02-24 13:10:50 PST
Created attachment 272142 [details]
Archive of layout-test-results from ews115 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews115  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 14 Filip Pizlo 2016-02-24 13:48:39 PST
Comment on attachment 272125 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=272125&action=review

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3284
>> +
> 
> I'll insert branchIfNumber fast path here.

Please do!  I think that will be very important.
Comment 15 Yusuke Suzuki 2016-02-25 09:37:46 PST
Created attachment 272209 [details]
Patch

Updated. The results is forthcoming.
Comment 16 Yusuke Suzuki 2016-02-25 09:42:19 PST
Comment on attachment 272125 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=272125&action=review

>>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3284
>>> +
>> 
>> I'll insert branchIfNumber fast path here.
> 
> Please do!  I think that will be very important.

Yeah! I've added it. Now, I'm building JSC on OSX laptop to measure the performance.
Comment 17 Build Bot 2016-02-25 10:38:55 PST
Comment on attachment 272209 [details]
Patch

Attachment 272209 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/882523

New failing tests:
svg/in-html/sizing/svg-inline.html
Comment 18 Build Bot 2016-02-25 10:38:58 PST
Created attachment 272216 [details]
Archive of layout-test-results from ews103 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews103  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 19 Build Bot 2016-02-25 10:42:20 PST
Comment on attachment 272209 [details]
Patch

Attachment 272209 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/882525

New failing tests:
svg/in-html/sizing/svg-inline.html
Comment 20 Build Bot 2016-02-25 10:42:23 PST
Created attachment 272217 [details]
Archive of layout-test-results from ews107 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews107  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 21 Build Bot 2016-02-25 10:55:59 PST
Comment on attachment 272209 [details]
Patch

Attachment 272209 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/882526

New failing tests:
svg/in-html/sizing/svg-inline.html
Comment 22 Build Bot 2016-02-25 10:56:02 PST
Created attachment 272218 [details]
Archive of layout-test-results from ews112 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews112  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 23 Yusuke Suzuki 2016-02-25 11:07:02 PST
Investigating svg/in-html/sizing/svg-inline.html case...
Comment 24 Yusuke Suzuki 2016-02-25 20:06:19 PST
Created attachment 272285 [details]
Patch

Updated. The results is forthcoming.
Comment 25 Yusuke Suzuki 2016-02-25 20:29:04 PST
Created attachment 272286 [details]
Patch

Updated. The results is forthcoming.
Comment 26 Yusuke Suzuki 2016-02-25 20:40:52 PST
Created attachment 272287 [details]
Patch

Updated. The results is forthcoming.
Comment 27 Yusuke Suzuki 2016-02-25 20:48:23 PST
Created attachment 272288 [details]
Patch

Updated. The results is forthcoming.
Comment 28 Yusuke Suzuki 2016-02-25 21:26:02 PST
Created attachment 272295 [details]
performance results

Looks neutral.
Comment 29 Yusuke Suzuki 2016-02-29 06:24:36 PST
Created attachment 272489 [details]
Patch
Comment 30 Yusuke Suzuki 2016-03-03 22:00:23 PST
Can anyone take a look?
Comment 31 Saam Barati 2016-03-06 17:23:48 PST
Comment on attachment 272489 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=272489&action=review

r=me with comments

It might be good for Filip or someone else to also take a look.

> Source/JavaScriptCore/builtins/GlobalObject.js:2
> + * Copyright (C) 2016 Yusuke Suzuki <utatane.tea@gmail.com>.

This should be 2015-2016

> Source/JavaScriptCore/builtins/GlobalOperations.js:2
> + * Copyright (C) 2015-2016 Yusuke Suzuki <utatane.tea@gmail.com>.

This should just be 2016 I think

> Source/JavaScriptCore/builtins/GlobalOperations.js:51
> +    var maxSafeInteger = 0x1FFFFFFFFFFFFF;

This is a good use case for "const"

> Source/JavaScriptCore/builtins/GlobalOperations.js:61
> +    return object === @undefined || object == null || typeof object === "object";

Doesn't "== null" handle both undefined and null?

> Source/JavaScriptCore/builtins/NumberConstructor.js:40
> +function isNaN(value)

Why are the two different isNaN implementations different?

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3277
> +        JSValueOperand op1(this, node->child1());

I think we can come up with a better name than "op1".
Maybe "argument" or "value" or something else.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:3322
> +        JSValueOperand op1(this, node->child1());

ditto

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:3338
> +            MacroAssembler::Jump notNumber = m_jit.branchIfNotNumber(JSValueRegs(op1GPR), resultGPR);
> +            m_jit.move(op1GPR, resultGPR);
> +            MacroAssembler::Jump done = m_jit.jump();

Isn't there a way for AI to tell us if it's proven something is a number and then you don't even need to do this branch?

> Source/JavaScriptCore/runtime/CommonIdentifiers.h:-316
> -    macro(isFinite) \
> -    macro(isNaN) \

Where else are these defined?
Comment 32 Yusuke Suzuki 2016-03-06 20:05:52 PST
Comment on attachment 272489 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=272489&action=review

Thanks!

>> Source/JavaScriptCore/builtins/GlobalOperations.js:2
>> + * Copyright (C) 2015-2016 Yusuke Suzuki <utatane.tea@gmail.com>.
> 
> This should just be 2016 I think

I'll fix it.

>> Source/JavaScriptCore/builtins/GlobalOperations.js:51
>> +    var maxSafeInteger = 0x1FFFFFFFFFFFFF;
> 
> This is a good use case for "const"

Sounds nice, fixed.

>> Source/JavaScriptCore/builtins/GlobalOperations.js:61
>> +    return object === @undefined || object == null || typeof object === "object";
> 
> Doesn't "== null" handle both undefined and null?

It's OK. I'll change to that.

>> Source/JavaScriptCore/builtins/NumberConstructor.js:40
>> +function isNaN(value)
> 
> Why are the two different isNaN implementations different?

One for global.isNaN and one for Number.isNaN. global.isNaN attempt to convert an argument to a number. But Number.isNaN does not: it returns false if an argument is not a number type.

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3277
>> +        JSValueOperand op1(this, node->child1());
> 
> I think we can come up with a better name than "op1".
> Maybe "argument" or "value" or something else.

I'll change it to `argument`, it's more descriptive.

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:3322
>> +        JSValueOperand op1(this, node->child1());
> 
> ditto

Fixed.

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:3338
>> +            MacroAssembler::Jump done = m_jit.jump();
> 
> Isn't there a way for AI to tell us if it's proven something is a number and then you don't even need to do this branch?

AI and constant folding pass already have the path for that.
The constant folding pass attempts to convert ToNumber to Identity if the argument is proven that it's a number. And later, it invokes AI to convert the node to the constant value.
AI attempt to retrieve the constant value from the argument. And AI also attempt to transfer the proven value from the argument to the result if the argument is a number.

>> Source/JavaScriptCore/runtime/CommonIdentifiers.h:-316
>> -    macro(isNaN) \
> 
> Where else are these defined?

It's defined by generate-js-builtins.py since the builtin JS now has isNaN and isFinite for the function names.
So, these names are defined in BuiltinNames.
Comment 33 Yusuke Suzuki 2016-03-07 04:30:03 PST
Created attachment 273172 [details]
Patch
Comment 34 Yusuke Suzuki 2016-03-07 04:30:40 PST
(In reply to comment #31)
> Comment on attachment 272489 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=272489&action=review
> 
> r=me with comments
> 
> It might be good for Filip or someone else to also take a look.

Yeah. Updated the patch, and I'll request DFG part review :)
Comment 35 Yusuke Suzuki 2016-03-07 05:08:45 PST
Created attachment 273173 [details]
Patch

rebase for FTL change
Comment 36 Filip Pizlo 2016-03-07 08:17:59 PST
Comment on attachment 273173 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273173&action=review

> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:611
> +        case ToNumber: {
> +            SpeculatedType child = node->child1()->prediction();
> +            if (child) {
> +                if (child & ~SpecFullNumber)
> +                    changed |= mergePrediction(mergeSpeculations(child & SpecFullNumber, SpecBytecodeNumber));
> +                else
> +                    changed |= mergePrediction(child & SpecFullNumber);
> +            } else
> +                changed |= mergePrediction(SpecBytecodeNumber);
> +            break;
> +        }

We can't land this since it introduces double pollution.  It's wrong to create prediction propagation rules that blindly predict number.  You need to have some profiling to tell you if you will see just ints, just doubles, or either ints or doubles.  Ideally you will also have profiling to tell you finer-grained things as well, but the minimum is distinguishing ints from doubles.

Taking it piece by piece:

                    changed |= mergePrediction(mergeSpeculations(child & SpecFullNumber, SpecBytecodeNumber));

This is wrong because it forces the output to include double even if the input didn't contain double.  It's making a guess that the effectful part of ToNumber returns doubles.  This means that any code like this will have a really bad time:

o.valueOf = function() { return some integer; }
var x = @toNumber(o);
for (var i = 0; i < ...; ++i) { do things with x; }

Please make sure that if you predict double - i.e. your speculated type includes double in its set - that you have some evidence that you actually need to include double.  Never do it blindly.

                    changed |= mergePrediction(child & SpecFullNumber);

This should just be "changed |= mergePrediction(child)", since you already know that child & ~SpecFullNumber is zero.

            } else
                changed |= mergePrediction(SpecBytecodeNumber);

I don't think this belongs here at all.  Why are you saying that if the child is the empty set then the output is any number?  That doesn't make sense.  If the child is the empty set, then the output can only be the empty set.

What were you trying to accomplish with this last rule?

Anyway, the correct thing to do is to make op_to_number have profiling to tell it what kind of number it returns.  I believe that mlam implemented something like this for other numerical operations.  Please do the same here as a prerequisite to implementing ToNumber in the DFG.
Comment 37 Yusuke Suzuki 2016-03-07 09:47:19 PST
Comment on attachment 273173 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273173&action=review

Thanks!

>> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:611
>> +        }
> 
> We can't land this since it introduces double pollution.  It's wrong to create prediction propagation rules that blindly predict number.  You need to have some profiling to tell you if you will see just ints, just doubles, or either ints or doubles.  Ideally you will also have profiling to tell you finer-grained things as well, but the minimum is distinguishing ints from doubles.
> 
> Taking it piece by piece:
> 
>                     changed |= mergePrediction(mergeSpeculations(child & SpecFullNumber, SpecBytecodeNumber));
> 
> This is wrong because it forces the output to include double even if the input didn't contain double.  It's making a guess that the effectful part of ToNumber returns doubles.  This means that any code like this will have a really bad time:
> 
> o.valueOf = function() { return some integer; }
> var x = @toNumber(o);
> for (var i = 0; i < ...; ++i) { do things with x; }
> 
> Please make sure that if you predict double - i.e. your speculated type includes double in its set - that you have some evidence that you actually need to include double.  Never do it blindly.
> 
>                     changed |= mergePrediction(child & SpecFullNumber);
> 
> This should just be "changed |= mergePrediction(child)", since you already know that child & ~SpecFullNumber is zero.
> 
>             } else
>                 changed |= mergePrediction(SpecBytecodeNumber);
> 
> I don't think this belongs here at all.  Why are you saying that if the child is the empty set then the output is any number?  That doesn't make sense.  If the child is the empty set, then the output can only be the empty set.
> 
> What were you trying to accomplish with this last rule?
> 
> Anyway, the correct thing to do is to make op_to_number have profiling to tell it what kind of number it returns.  I believe that mlam implemented something like this for other numerical operations.  Please do the same here as a prerequisite to implementing ToNumber in the DFG.

"Please make sure that if you predict double - i.e. your speculated type includes double in its set - that you have some evidence that you actually need to include double. Never do it blindly."

Make sense. Thanks.

"What were you trying to accomplish with this last rule?"

ToNumber always returns number types even if any argument comes. This rule represented that anything number is speculated for this ToNumber node.
But yup, this is speculation phase, not AI. So instead of returning possible type sets, as you said, just leaving this node unspeculated is correct.

"Anyway, the correct thing to do is to make op_to_number have profiling to tell it what kind of number it returns. I believe that mlam implemented something like this for other numerical operations. Please do the same here as a prerequisite to implementing ToNumber in the DFG."

OK, I'm now planning to insert value profiling for op_to_number and use heap prediction similar to ArithCeil/ArithFloor/ArithRound.
Comment 38 Yusuke Suzuki 2016-03-07 10:06:52 PST
Also checking ResultProfile...
Comment 39 Filip Pizlo 2016-03-07 10:35:23 PST
(In reply to comment #38)
> Also checking ResultProfile...

Yup, you probably want to use ResultProfile for when you start with op_to_number, and you want to use heap prediction when you start with the number constructor.
Comment 40 Filip Pizlo 2016-03-07 10:58:37 PST
(In reply to comment #37)
> Comment on attachment 273173 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=273173&action=review
> 
> Thanks!
> 
> >> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:611
> >> +        }
> > 
> > We can't land this since it introduces double pollution.  It's wrong to create prediction propagation rules that blindly predict number.  You need to have some profiling to tell you if you will see just ints, just doubles, or either ints or doubles.  Ideally you will also have profiling to tell you finer-grained things as well, but the minimum is distinguishing ints from doubles.
> > 
> > Taking it piece by piece:
> > 
> >                     changed |= mergePrediction(mergeSpeculations(child & SpecFullNumber, SpecBytecodeNumber));
> > 
> > This is wrong because it forces the output to include double even if the input didn't contain double.  It's making a guess that the effectful part of ToNumber returns doubles.  This means that any code like this will have a really bad time:
> > 
> > o.valueOf = function() { return some integer; }
> > var x = @toNumber(o);
> > for (var i = 0; i < ...; ++i) { do things with x; }
> > 
> > Please make sure that if you predict double - i.e. your speculated type includes double in its set - that you have some evidence that you actually need to include double.  Never do it blindly.
> > 
> >                     changed |= mergePrediction(child & SpecFullNumber);
> > 
> > This should just be "changed |= mergePrediction(child)", since you already know that child & ~SpecFullNumber is zero.
> > 
> >             } else
> >                 changed |= mergePrediction(SpecBytecodeNumber);
> > 
> > I don't think this belongs here at all.  Why are you saying that if the child is the empty set then the output is any number?  That doesn't make sense.  If the child is the empty set, then the output can only be the empty set.
> > 
> > What were you trying to accomplish with this last rule?
> > 
> > Anyway, the correct thing to do is to make op_to_number have profiling to tell it what kind of number it returns.  I believe that mlam implemented something like this for other numerical operations.  Please do the same here as a prerequisite to implementing ToNumber in the DFG.
> 
> "Please make sure that if you predict double - i.e. your speculated type
> includes double in its set - that you have some evidence that you actually
> need to include double. Never do it blindly."
> 
> Make sense. Thanks.
> 
> "What were you trying to accomplish with this last rule?"
> 
> ToNumber always returns number types even if any argument comes. This rule
> represented that anything number is speculated for this ToNumber node.
> But yup, this is speculation phase, not AI. So instead of returning possible
> type sets, as you said, just leaving this node unspeculated is correct.

Aha, I can see your thought process.  But note that also in AI, it would be most correct to not set the node's result if the node's inputs are empty.

Any forward data flow static analysis will have to deal with this seemingly contradictory situation where one (or more) of the inputs has "bottom", or the empty set, as its type or abstract value. It's important to handle this correctly, because this case has far-reaching implications. Let's consider what are the possible meanings of the input being bottom. I believe that there are only two possible meanings. For the purpose of this discussion, I'll assume that we're dealing with a forward data flow analysis, AKA an abstract interpreter, and that the thing it's solving for are the types of data flow Nodes. In type terminology, the empty set is called "bottom".

1) The data flow fixpoint hasn't yet had an opportunity to propagate any meaningful information into the node, and this is the only reason why it's bottom. Eventually, that input will have something other than bottom as its type.

2) The data flow fixpoint has proved that the input node will never execute. Because we require that the inputs of a node dominate the node, the fact that the input never executes proves that we will never execute.

I believe that this claim generalizes to any forward data flow analysis, including our abstract interpreter as well as a bunch of other forward data flow analyses: the integer range optimization phase, OSR availability, etc.

Note that in the prediction propagation phase, the "never execute" part doesn't have to do with what will happen in the future. The prediction propagation phase is concerned with inferring what happened in the past. Therefore, when we hit case (2), we are actually proving that the input had never executed in the past and that therefore, the current operation has never executed in the past either. We usually handle such things by forcing OSR exit as soon as we hit something that (a) has never executed and (b) cannot be compiled without some information about its type.

However, it's possible that there is a separate bug in prediction propagation where some node claims to predict bottom even though it has executed in the past. This is a bug - either in prediction propagation or in profiling. When we see this happen, we should fix the real bug instead of working around it by having nodes special-case bottom inputs. So, let's assume that a bottom input either means that we simply haven't finished some propagation and will see non-bottom later or that it proves that the input node has never executed in the past. Furthermore, let's assume that the node that has never executed in the past has already been turned into ForceOSRExit or that it will be turned into ForceOSRExit at some later point in compilation.

Knowing all of this, let's consider what would happen if we had prediction propagation predict something other than bottom if it saw a bottom input.

In case (1): If we predict non-bottom on a bottom input and that input becomes non-bottom later, then we will have polluted our prediction for no good reason. Say that we special-case bottom inputs by merging a SpecBoolean prediction. Say that our non-bottom propagation rule says that if the input is definitely SpecInt, then return SpecInt. But let's say that one of our inputs will be evaluated by prediction propagation at some point in the future, so for now it claims to be bottom. And let's say that this input will predict SpecInt once its prediction rule runs. In this case, we'd *like* to claim that we predict SpecInt. But if we special-case bottom to return SpecBoolean, then we will eventually end up predicting SpecBoolean|SpecInt, which is less precise. Moreover, it's dangerously imprecise: many type inference rules in FixupPhase will avoid optimizing an operation if it's note purely SpecInt.

From this we can see that predicting anything other than bottom when your input(s) are bottom is going to be a disaster unless your bottom prediction is a subset of what you will eventually predict. For this reason, we have some nodes that say things like:

    case Foo:
        setPrediction(SpecThings);
        break;

Let's say Foo has two inputs. Even though this returns non-bottom when either inputs is bottom, this is fine, because it *always* predicts SpecThings. So an input transitioning away from bottom will not cause problems.

In case (2): Here it gets tricky. Let's again say that we have a rule that we will predict SpecInt if the input is SpecInt, and another rule that we will predict SpecBoolean if the input is bottom. Let's say that the input is bottom because the input node never executed. Here's an example of what this might look like:

    var x;
    if (thing that never happens) {
        y = NeverExecutes()
        x = Foo(@y)
    } else {
        x = 42
    }
    x++

Notice that in this case, claiming to return SpecBoolean is totally harmful even though Foo never executes. In this case, it would have been far better if Foo claimed to return bottom when its inputs are bottom.

So, in both cases, it looks like special-casing bottom to return non-bottom is a bad idea.

Note that rule (1) generalizes to cases where the input is fed by multiple control flow paths, and we're in a state where we have predicted what happens along one of them but not the other. For this reason, it's best if the prediction propagation rules obey a kind of monotonicity:

"Given types T and U where T is a subset of U (for example, U = SpecBytecodeNumber and T = SpecInt32), if Foo(@x of type T) predicts type V and Foo(@x of type Y) predicts type W, then V must be a subset of W."

This rule ensures that we don't pollute a node's type/set/value/whatever with inaccurate information for no reason. This rule also forces prediction propagation rules to avoid bottom special cases, or indeed any kind of special cases - since it can be quite hard to write complex logic that still obeys this rule.

> 
> "Anyway, the correct thing to do is to make op_to_number have profiling to
> tell it what kind of number it returns. I believe that mlam implemented
> something like this for other numerical operations. Please do the same here
> as a prerequisite to implementing ToNumber in the DFG."
> 
> OK, I'm now planning to insert value profiling for op_to_number and use heap
> prediction similar to ArithCeil/ArithFloor/ArithRound.
Comment 41 Yusuke Suzuki 2016-03-07 11:37:03 PST
Comment on attachment 273173 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273173&action=review

>>>> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:611
>>>> +        }
>>> 
>>> We can't land this since it introduces double pollution.  It's wrong to create prediction propagation rules that blindly predict number.  You need to have some profiling to tell you if you will see just ints, just doubles, or either ints or doubles.  Ideally you will also have profiling to tell you finer-grained things as well, but the minimum is distinguishing ints from doubles.
>>> 
>>> Taking it piece by piece:
>>> 
>>>                     changed |= mergePrediction(mergeSpeculations(child & SpecFullNumber, SpecBytecodeNumber));
>>> 
>>> This is wrong because it forces the output to include double even if the input didn't contain double.  It's making a guess that the effectful part of ToNumber returns doubles.  This means that any code like this will have a really bad time:
>>> 
>>> o.valueOf = function() { return some integer; }
>>> var x = @toNumber(o);
>>> for (var i = 0; i < ...; ++i) { do things with x; }
>>> 
>>> Please make sure that if you predict double - i.e. your speculated type includes double in its set - that you have some evidence that you actually need to include double.  Never do it blindly.
>>> 
>>>                     changed |= mergePrediction(child & SpecFullNumber);
>>> 
>>> This should just be "changed |= mergePrediction(child)", since you already know that child & ~SpecFullNumber is zero.
>>> 
>>>             } else
>>>                 changed |= mergePrediction(SpecBytecodeNumber);
>>> 
>>> I don't think this belongs here at all.  Why are you saying that if the child is the empty set then the output is any number?  That doesn't make sense.  If the child is the empty set, then the output can only be the empty set.
>>> 
>>> What were you trying to accomplish with this last rule?
>>> 
>>> Anyway, the correct thing to do is to make op_to_number have profiling to tell it what kind of number it returns.  I believe that mlam implemented something like this for other numerical operations.  Please do the same here as a prerequisite to implementing ToNumber in the DFG.
>> 
>> "Please make sure that if you predict double - i.e. your speculated type includes double in its set - that you have some evidence that you actually need to include double. Never do it blindly."
>> 
>> Make sense. Thanks.
>> 
>> "What were you trying to accomplish with this last rule?"
>> 
>> ToNumber always returns number types even if any argument comes. This rule represented that anything number is speculated for this ToNumber node.
>> But yup, this is speculation phase, not AI. So instead of returning possible type sets, as you said, just leaving this node unspeculated is correct.
>> 
>> "Anyway, the correct thing to do is to make op_to_number have profiling to tell it what kind of number it returns. I believe that mlam implemented something like this for other numerical operations. Please do the same here as a prerequisite to implementing ToNumber in the DFG."
>> 
>> OK, I'm now planning to insert value profiling for op_to_number and use heap prediction similar to ArithCeil/ArithFloor/ArithRound.
> 
> Aha, I can see your thought process.  But note that also in AI, it would be most correct to not set the node's result if the node's inputs are empty.
> 
> Any forward data flow static analysis will have to deal with this seemingly contradictory situation where one (or more) of the inputs has "bottom", or the empty set, as its type or abstract value. It's important to handle this correctly, because this case has far-reaching implications. Let's consider what are the possible meanings of the input being bottom. I believe that there are only two possible meanings. For the purpose of this discussion, I'll assume that we're dealing with a forward data flow analysis, AKA an abstract interpreter, and that the thing it's solving for are the types of data flow Nodes. In type terminology, the empty set is called "bottom".
> 
> 1) The data flow fixpoint hasn't yet had an opportunity to propagate any meaningful information into the node, and this is the only reason why it's bottom. Eventually, that input will have something other than bottom as its type.
> 
> 2) The data flow fixpoint has proved that the input node will never execute. Because we require that the inputs of a node dominate the node, the fact that the input never executes proves that we will never execute.
> 
> I believe that this claim generalizes to any forward data flow analysis, including our abstract interpreter as well as a bunch of other forward data flow analyses: the integer range optimization phase, OSR availability, etc.
> 
> Note that in the prediction propagation phase, the "never execute" part doesn't have to do with what will happen in the future. The prediction propagation phase is concerned with inferring what happened in the past. Therefore, when we hit case (2), we are actually proving that the input had never executed in the past and that therefore, the current operation has never executed in the past either. We usually handle such things by forcing OSR exit as soon as we hit something that (a) has never executed and (b) cannot be compiled without some information about its type.
> 
> However, it's possible that there is a separate bug in prediction propagation where some node claims to predict bottom even though it has executed in the past. This is a bug - either in prediction propagation or in profiling. When we see this happen, we should fix the real bug instead of working around it by having nodes special-case bottom inputs. So, let's assume that a bottom input either means that we simply haven't finished some propagation and will see non-bottom later or that it proves that the input node has never executed in the past. Furthermore, let's assume that the node that has never executed in the past has already been turned into ForceOSRExit or that it will be turned into ForceOSRExit at some later point in compilation.
> 
> Knowing all of this, let's consider what would happen if we had prediction propagation predict something other than bottom if it saw a bottom input.
> 
> In case (1): If we predict non-bottom on a bottom input and that input becomes non-bottom later, then we will have polluted our prediction for no good reason. Say that we special-case bottom inputs by merging a SpecBoolean prediction. Say that our non-bottom propagation rule says that if the input is definitely SpecInt, then return SpecInt. But let's say that one of our inputs will be evaluated by prediction propagation at some point in the future, so for now it claims to be bottom. And let's say that this input will predict SpecInt once its prediction rule runs. In this case, we'd *like* to claim that we predict SpecInt. But if we special-case bottom to return SpecBoolean, then we will eventually end up predicting SpecBoolean|SpecInt, which is less precise. Moreover, it's dangerously imprecise: many type inference rules in FixupPhase will avoid optimizing an operation if it's note purely SpecInt.
> 
> From this we can see that predicting anything other than bottom when your input(s) are bottom is going to be a disaster unless your bottom prediction is a subset of what you will eventually predict. For this reason, we have some nodes that say things like:
> 
>     case Foo:
>         setPrediction(SpecThings);
>         break;
> 
> Let's say Foo has two inputs. Even though this returns non-bottom when either inputs is bottom, this is fine, because it *always* predicts SpecThings. So an input transitioning away from bottom will not cause problems.
> 
> In case (2): Here it gets tricky. Let's again say that we have a rule that we will predict SpecInt if the input is SpecInt, and another rule that we will predict SpecBoolean if the input is bottom. Let's say that the input is bottom because the input node never executed. Here's an example of what this might look like:
> 
>     var x;
>     if (thing that never happens) {
>         y = NeverExecutes()
>         x = Foo(@y)
>     } else {
>         x = 42
>     }
>     x++
> 
> Notice that in this case, claiming to return SpecBoolean is totally harmful even though Foo never executes. In this case, it would have been far better if Foo claimed to return bottom when its inputs are bottom.
> 
> So, in both cases, it looks like special-casing bottom to return non-bottom is a bad idea.
> 
> Note that rule (1) generalizes to cases where the input is fed by multiple control flow paths, and we're in a state where we have predicted what happens along one of them but not the other. For this reason, it's best if the prediction propagation rules obey a kind of monotonicity:
> 
> "Given types T and U where T is a subset of U (for example, U = SpecBytecodeNumber and T = SpecInt32), if Foo(@x of type T) predicts type V and Foo(@x of type Y) predicts type W, then V must be a subset of W."
> 
> This rule ensures that we don't pollute a node's type/set/value/whatever with inaccurate information for no reason. This rule also forces prediction propagation rules to avoid bottom special cases, or indeed any kind of special cases - since it can be quite hard to write complex logic that still obeys this rule.

Completely right! I missed that the bottom meaning in this phase, forwarding analysis that runs to the fixed-point.
We start bottoms, eventually extend the results and reach to the fixed-point by iterating until there is no changes.
As you said, producing non-bottom with the bottom input unreasonably pollutes the result.
It is later propagated to the use nodes and makes the graph fixed-point where nodes have unnecessarily generic (and/or unreasonable if nodes produce unpossible types for bottoms) speculations.
Comment 42 Yusuke Suzuki 2016-03-08 09:01:10 PST
Created attachment 273298 [details]
Patch

WIP
Comment 43 Build Bot 2016-03-08 10:12:25 PST
Comment on attachment 273298 [details]
Patch

Attachment 273298 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/942583

Number of test failures exceeded the failure limit.
Comment 44 Build Bot 2016-03-08 10:12:28 PST
Created attachment 273301 [details]
Archive of layout-test-results from ews115 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews115  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 45 Yusuke Suzuki 2016-03-09 06:27:01 PST
After investigating the code, leveraging value profile in both NumberConstructor and op_to_number seems cleaner approach.
If we use value profiling, we can leverage existing OSRExit feedback system in DFG.
Comment 46 Yusuke Suzuki 2016-03-09 07:27:33 PST
Created attachment 273427 [details]
Patch
Comment 47 Yusuke Suzuki 2016-03-09 07:36:56 PST
Comment on attachment 273427 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273427&action=review

> Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:1674
> +        // it performs BasicBlock clipping.

Is my understanding here correct? (This is in ToPrimitive)

> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4660
> +            SpeculatedType prediction = getPredictionWithoutOSRExit();

Let's use prediction provided by the value profiling in op_to_number.
op_to_number may be converted to Identity if the argument is speculated as Int32 / Int52 / number.
So here, I choose getPredictionWithoutOSRExit() instead of getPrediction(), since we have a chance to convert ToNumber to Identity without the heap prediciton information.

What do you think of?

> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:603
> +            if (child) {

Yeah, now this phase handles bottom meaning correctly.

> Source/JavaScriptCore/tests/stress/to-number-intrinsic-int52.js:29
> +    shouldBe(test(object), true);

In this case, our value profiling in op_to_number is updated when OSRExit occurs.
Comment 48 Yusuke Suzuki 2016-03-09 07:48:03 PST
Still considering...
Comment 49 Yusuke Suzuki 2016-03-09 07:55:11 PST
Comment on attachment 273427 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273427&action=review

Fixing...

> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:617
> +                    if (isInt32SpeculationForArithmetic(node->getHeapPrediction()))

Let's not propagate if heap prediction is SpecNone.
Comment 50 Filip Pizlo 2016-03-09 08:00:11 PST
Comment on attachment 273427 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273427&action=review

>> Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:1674
>> +        // it performs BasicBlock clipping.
> 
> Is my understanding here correct? (This is in ToPrimitive)

You shouldn't have to do this. If the input is empty because the code is unreachable then AI will almost always realize this before it even decides to interpret this code.

>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4660
>> +            SpeculatedType prediction = getPredictionWithoutOSRExit();
> 
> Let's use prediction provided by the value profiling in op_to_number.
> op_to_number may be converted to Identity if the argument is speculated as Int32 / Int52 / number.
> So here, I choose getPredictionWithoutOSRExit() instead of getPrediction(), since we have a chance to convert ToNumber to Identity without the heap prediciton information.
> 
> What do you think of?

It seems reasonable to convert this to a number check if the input is empty. But then what do you claim the output to be?

I would rather we not play that game unless there is some benchmark that really needs it.

> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:618
> +                        changed |= mergePrediction(SpecInt32);

I see. This sort of makes sense but it's strange that you need this. 

Would we ever be in a situation where this has executed but has no value profiling?
Comment 51 Yusuke Suzuki 2016-03-09 08:17:15 PST
Created attachment 273429 [details]
Patch
Comment 52 Yusuke Suzuki 2016-03-09 08:35:09 PST
Created attachment 273431 [details]
Patch
Comment 53 Yusuke Suzuki 2016-03-09 08:53:02 PST
Comment on attachment 273427 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=273427&action=review

Thanks! I've updated the patch.

>>> Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:1674
>>> +        // it performs BasicBlock clipping.
>> 
>> Is my understanding here correct? (This is in ToPrimitive)
> 
> You shouldn't have to do this. If the input is empty because the code is unreachable then AI will almost always realize this before it even decides to interpret this code.

OK. Is it better to leave this comment? Or is dropping this (And this existing check in ToPrimitive) nice?

>>> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:4660
>>> +            SpeculatedType prediction = getPredictionWithoutOSRExit();
>> 
>> Let's use prediction provided by the value profiling in op_to_number.
>> op_to_number may be converted to Identity if the argument is speculated as Int32 / Int52 / number.
>> So here, I choose getPredictionWithoutOSRExit() instead of getPrediction(), since we have a chance to convert ToNumber to Identity without the heap prediciton information.
>> 
>> What do you think of?
> 
> It seems reasonable to convert this to a number check if the input is empty. But then what do you claim the output to be?
> 
> I would rather we not play that game unless there is some benchmark that really needs it.

"It seems reasonable to convert this to a number check if the input is empty. But then what do you claim the output to be?"

Currently, in the latest patch, we claim that the output is SpecNone if the input prediction is not a number and the heap prediction is empty.
This is the similar strategy to GetByVal etc.

>> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:618
>> +                        changed |= mergePrediction(SpecInt32);
> 
> I see. This sort of makes sense but it's strange that you need this. 
> 
> Would we ever be in a situation where this has executed but has no value profiling?

I've added the `to-number-intrinsic-object-without-execution.js` in the latest patch.
The code is like,

function test(x, y)
{
    if (x)
        return +y;  // op_to_number => DFG ToNumber node.
    return y;
}
noInline(test);

var object = { valueOf() { return 41; } };
for (var i = 0; i < 1e4; ++i)
    shouldBe(test(false, object), object);
for (var i = 0; i < 1e4; ++i)
    shouldBe(test(true, object), 41)

In this case, ToNumber's heap prediction is None, but test(true, object) will execute this ToNumber node.
Comment 54 Yusuke Suzuki 2016-03-17 12:21:28 PDT
Created attachment 274312 [details]
Patch

WIP, adding perf tests
Comment 55 Yusuke Suzuki 2016-03-20 06:16:04 PDT
Created attachment 274538 [details]
Patch

WIP
Comment 56 Yusuke Suzuki 2016-03-20 16:22:08 PDT
Created attachment 274549 [details]
Patch
Comment 57 Yusuke Suzuki 2016-03-20 16:26:02 PDT
Created attachment 274550 [details]
Performance results

Performance results. The results show that, while the existing benchmarks are performance neutral, the newly added regression cases pose performance benefits.

Highlights.

   many-foreach-calls                                 3.8189+-0.0893     ^      3.5397+-0.1113        ^ definitely 1.0789x faster
   to-number-constructor-number-string-number-string   
                                                      5.6464+-0.0770     ^      4.5756+-0.2072        ^ definitely 1.2340x faster
   to-number-constructor-only-number                  1.1697+-0.4333            0.8915+-0.3496          might be 1.3121x faster
   to-number-constructor-only-string                  2.4897+-0.3875            2.1661+-0.1091          might be 1.1494x faster
   to-number-constructor-string-number-string-number   
                                                      5.0871+-0.8283     ^      3.9309+-0.2013        ^ definitely 1.2941x faster
   to-number-number-string-number-string              4.6152+-0.2719     ^      3.9729+-0.1904        ^ definitely 1.1617x faster
   to-number-only-number                              0.8597+-0.3273            0.8156+-0.3341          might be 1.0542x faster
   to-number-only-string                              2.5112+-0.4037            2.1957+-0.1924          might be 1.1437x faster
   to-number-string-number-string-number              4.4713+-0.7338            3.8277+-0.1535          might be 1.1681x faster

And major cases.

SunSpider:
   3d-cube                                            6.3732+-0.0931     ?      6.4298+-0.2714        ?
   3d-morph                                           6.0883+-0.1106     ?      6.0993+-0.1288        ?
   3d-raytrace                                        7.3366+-0.3756     ?      7.5585+-0.4992        ? might be 1.0302x slower
   access-binary-trees                                2.5618+-0.0535     ?      2.7611+-0.5258        ? might be 1.0778x slower
   access-fannkuch                                    7.2067+-0.0486     ?      7.4743+-0.7703        ? might be 1.0371x slower
   access-nbody                                       3.2580+-0.3149            3.1177+-0.1220          might be 1.0450x faster
   access-nsieve                                      3.5742+-0.3015            3.4047+-0.0683          might be 1.0498x faster
   bitops-3bit-bits-in-byte                           1.5273+-0.0549            1.5141+-0.0298        
   bitops-bits-in-byte                                3.3797+-0.2493            3.2998+-0.1257          might be 1.0242x faster
   bitops-bitwise-and                                 2.2354+-0.0541     ?      2.4384+-0.2693        ? might be 1.0908x slower
   bitops-nsieve-bits                                 3.7802+-0.0338     ?      3.8090+-0.0302        ?
   controlflow-recursive                              2.8688+-0.1499            2.8019+-0.0527          might be 1.0239x faster
   crypto-aes                                         5.4863+-1.2255            4.9929+-0.0867          might be 1.0988x faster
   crypto-md5                                         3.1178+-0.0544            3.0934+-0.0532        
   crypto-sha1                                        2.9456+-0.1755            2.8778+-0.0911          might be 1.0236x faster
   date-format-tofte                                 10.7627+-1.1743           10.3254+-0.4965          might be 1.0423x faster
   date-format-xparb                                  6.6530+-1.0932            6.2252+-0.6539          might be 1.0687x faster
   math-cordic                                        3.4194+-0.0508     ?      3.5102+-0.2366        ? might be 1.0265x slower
   math-partial-sums                                  6.5095+-1.6709            6.0290+-0.1583          might be 1.0797x faster
   math-spectral-norm                                 2.6340+-0.1088     ?      2.8098+-0.6107        ? might be 1.0667x slower
   regexp-dna                                         7.4284+-0.1836     ?      7.6663+-0.8676        ? might be 1.0320x slower
   string-base64                                      6.4280+-3.2477            5.6663+-1.2063          might be 1.1344x faster
   string-fasta                                       7.4894+-1.0250            7.2339+-0.5068          might be 1.0353x faster
   string-tagcloud                                    9.7450+-0.4018     ?      9.7917+-0.5663        ?
   string-unpack-code                                21.2923+-0.5871     ?     21.5615+-0.7747        ? might be 1.0126x slower
   string-validate-input                              5.0477+-0.0731     ?      5.4909+-0.9653        ? might be 1.0878x slower

   <arithmetic>                                       5.7365+-0.1204            5.6917+-0.0738          might be 1.0079x faster

                                                         Baseline                    Mine                                       
LongSpider:
   3d-cube                                         1016.1585+-27.0689    ?   1022.3058+-23.2973       ?
   3d-morph                                         713.1695+-12.1136         708.7007+-12.7297       
   3d-raytrace                                      805.8268+-14.9657         801.5078+-5.8238        
   access-binary-trees                             1101.4223+-26.7799        1085.6840+-12.1603         might be 1.0145x faster
   access-fannkuch                                  306.4267+-11.3278    ?    309.6245+-19.8502       ? might be 1.0104x slower
   access-nbody                                     626.3729+-5.7993     ?    630.9977+-7.4701        ?
   access-nsieve                                    459.5836+-1.2277     ?    468.8323+-11.5473       ? might be 1.0201x slower
   bitops-3bit-bits-in-byte                          43.9318+-1.2346           43.3387+-0.3393          might be 1.0137x faster
   bitops-bits-in-byte                              133.2760+-5.3565          132.1829+-2.9661        
   bitops-nsieve-bits                               428.8856+-13.3582    ?    429.5735+-4.6456        ?
   controlflow-recursive                            554.0968+-12.2939         549.8398+-0.2887        
   crypto-aes                                       807.0908+-25.3401    ?    807.8371+-26.1642       ?
   crypto-md5                                       603.9527+-10.7737         602.0818+-8.9615        
   crypto-sha1                                      803.9685+-14.4063    ?    804.3979+-5.2481        ?
   date-format-tofte                                900.8805+-41.2003    ?    901.9331+-11.2552       ?
   date-format-xparb                                881.2698+-46.6007         881.2690+-56.3496       
   hash-map                                         192.7647+-9.3403          189.2370+-3.9540          might be 1.0186x faster
   math-cordic                                      582.2319+-34.6753         574.4958+-20.0469         might be 1.0135x faster
   math-partial-sums                                564.7260+-11.7855    ?    569.1930+-11.7245       ?
   math-spectral-norm                               907.1774+-11.3226         903.9260+-7.2020        
   string-base64                                    474.8335+-11.7379         472.2028+-5.3264        
   string-fasta                                     435.9358+-13.5105    ?    437.7109+-12.5690       ?
   string-tagcloud                                  206.9197+-1.2552     ?    210.1993+-8.1898        ? might be 1.0158x slower

   <geometric>                                      485.4015+-2.4990          484.9093+-2.3970          might be 1.0010x faster

                                                         Baseline                    Mine                                       
V8Spider:
   crypto                                            48.7447+-3.1533           48.1138+-0.8902          might be 1.0131x faster
   deltablue                                         59.6057+-3.0103     ?     62.7792+-2.4343        ? might be 1.0532x slower
   earley-boyer                                      50.0319+-1.6403           48.8795+-0.6415          might be 1.0236x faster
   raytrace                                          31.5288+-1.2310     ?     31.7156+-1.1773        ?
   regexp                                            72.0871+-1.5823     ?     73.4077+-3.8745        ? might be 1.0183x slower
   richards                                          51.7085+-2.6363           50.5961+-0.4412          might be 1.0220x faster
   splay                                             39.9800+-2.3166     ?     41.8163+-4.1065        ? might be 1.0459x slower

   <geometric>                                       49.0341+-0.5677     ?     49.4781+-1.1016        ? might be 1.0091x slower

                                                         Baseline                    Mine                                       
Octane:
   encrypt                                           0.21933+-0.00796          0.21682+-0.00613         might be 1.0116x faster
   decrypt                                           3.81274+-0.04886    ?     3.81502+-0.06411       ?
   deltablue                                x2       0.19978+-0.00386    ?     0.20089+-0.00356       ?
   earley                                            0.41307+-0.00282          0.41273+-0.01057       
   boyer                                             6.34479+-0.13780    ?     6.38300+-0.13414       ?
   navier-stokes                            x2       5.38274+-0.03017          5.37487+-0.06769       
   raytrace                                 x2       1.30391+-0.06970          1.29456+-0.03112       
   richards                                 x2       0.10441+-0.00172    ?     0.10532+-0.00290       ?
   splay                                    x2       0.44981+-0.00599    ?     0.45071+-0.00529       ?
   regexp                                   x2      24.69333+-0.72302         24.66242+-0.24702       
   pdfjs                                    x2      48.62453+-0.78942    ?    49.32342+-0.73513       ? might be 1.0144x slower
   mandreel                                 x2      53.12757+-0.21540         53.02321+-0.37252       
   gbemu                                    x2      32.18213+-1.69259         31.98962+-0.77623       
   closure                                           0.72071+-0.00977          0.71951+-0.00550       
   jquery                                            9.12171+-0.02957          9.08233+-0.08805       
   box2d                                    x2      11.76712+-0.13221    ?    11.80162+-0.15008       ?
   zlib                                     x2     426.66117+-4.07487        425.22957+-5.73037       
   typescript                               x2     845.44366+-8.82720    ?   847.07080+-11.60012      ?

   <geometric>                                       6.64518+-0.05788    ?     6.64906+-0.02221       ? might be 1.0006x slower

                                                         Baseline                    Mine                                       
Kraken:
   ai-astar                                           96.096+-0.249      ?      98.957+-5.914         ? might be 1.0298x slower
   audio-beat-detection                               58.540+-1.609      ?      59.451+-5.810         ? might be 1.0156x slower
   audio-dft                                         109.577+-1.914      ?     111.496+-3.840         ? might be 1.0175x slower
   audio-fft                                          45.746+-0.413             45.697+-0.408         
   audio-oscillator                                   59.956+-4.295             58.193+-0.728           might be 1.0303x faster
   imaging-darkroom                                   73.247+-0.334      ?      73.707+-0.873         ?
   imaging-desaturate                                 65.599+-1.471      ?      66.509+-4.943         ? might be 1.0139x slower
   imaging-gaussian-blur                              97.823+-17.169            95.568+-14.045          might be 1.0236x faster
   json-parse-financial                               49.054+-0.706      ?      49.335+-0.444         ?
   json-stringify-tinderbox                           28.627+-0.481             28.091+-0.671           might be 1.0191x faster
   stanford-crypto-aes                                47.233+-1.428             46.852+-0.338         
   stanford-crypto-ccm                                44.361+-4.990             44.157+-4.383         
   stanford-crypto-pbkdf2                            115.546+-4.619            112.966+-1.952           might be 1.0228x faster
   stanford-crypto-sha256-iterative                   45.686+-0.175      ?      46.909+-3.962         ? might be 1.0268x slower

   <arithmetic>                                       66.935+-1.706      ?      66.992+-1.698         ? might be 1.0008x slower

AsmBench:
   bigfib.cpp                                       531.5360+-9.9592          529.1728+-5.4392        
   container.cpp                                   3510.8025+-15.3161    ?   3515.0745+-28.9874       ?
   dry.c                                            553.9915+-9.6924     ?    559.1116+-47.1683       ?
   float-mm.c                                       817.0005+-11.0379         812.3982+-10.5081       
   gcc-loops.cpp                                   4903.0374+-48.1377    ?   4910.6392+-15.9543       ?
   hash-map                                         190.9288+-3.5004     ?    191.6537+-6.4353        ?
   n-body.c                                        1053.8492+-9.2325     ?   1062.0502+-26.0760       ?
   quicksort.c                                      463.4501+-8.6122          459.1661+-4.4287        
   towers.c                                         309.2684+-5.6332          306.5305+-2.1822        

   <geometric>                                      789.6669+-3.0369          789.1655+-8.8593          might be 1.0006x faster
Comment 58 Yusuke Suzuki 2016-03-20 16:32:17 PDT
Created attachment 274551 [details]
Patch
Comment 59 Yusuke Suzuki 2016-03-22 09:06:56 PDT
Comment on attachment 274551 [details]
Patch

Still checking getPredictionWithoutOSRExit effect.
Comment 60 Yusuke Suzuki 2016-03-29 09:51:16 PDT
I've got the clean MacBook Pro. It's good for performance measurement :)
Comment 61 Yusuke Suzuki 2016-04-06 08:11:00 PDT
Before this, I'll finish https://bugs.webkit.org/show_bug.cgi?id=155110
Comment 62 Yusuke Suzuki 2016-04-18 11:20:21 PDT
Created attachment 276648 [details]
Patch
Comment 63 Build Bot 2016-04-18 12:05:52 PDT
Comment on attachment 276648 [details]
Patch

Attachment 276648 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/1181587

Number of test failures exceeded the failure limit.
Comment 64 Build Bot 2016-04-18 12:05:57 PDT
Created attachment 276652 [details]
Archive of layout-test-results from ews101 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews101  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 65 Build Bot 2016-04-18 12:06:26 PDT
Comment on attachment 276648 [details]
Patch

Attachment 276648 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/1181583

Number of test failures exceeded the failure limit.
Comment 66 Build Bot 2016-04-18 12:06:31 PDT
Created attachment 276653 [details]
Archive of layout-test-results from ews104 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 67 Build Bot 2016-04-18 12:08:59 PDT
Comment on attachment 276648 [details]
Patch

Attachment 276648 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/1181585

Number of test failures exceeded the failure limit.
Comment 68 Build Bot 2016-04-18 12:09:05 PDT
Created attachment 276654 [details]
Archive of layout-test-results from ews123 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews123  Port: ios-simulator-wk2  Platform: Mac OS X 10.10.5
Comment 69 Darin Adler 2016-04-18 12:26:26 PDT
Comment on attachment 276648 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=276648&action=review

> Source/JavaScriptCore/builtins/NumberConstructor.js:35
> +    // Return false if value is |NaN|.
> +    if (value !== value)
> +        return false;

Can’t we leave out this if statement? Isn’t the expression below already false for |NaN|?
Comment 70 Yusuke Suzuki 2016-04-18 15:08:05 PDT
Comment on attachment 276648 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=276648&action=review

Thanks, I'll measure the performance before r? (Currently, i'm in London, so after returning Japan, I'll take the measurement)

>> Source/JavaScriptCore/builtins/NumberConstructor.js:35
>> +        return false;
> 
> Can’t we leave out this if statement? Isn’t the expression below already false for |NaN|?

`NaN !== (any value)` is always true. So `NaN !== @Infinity && NaN !== -@Infinity` is true.
Comment 71 Build Bot 2016-04-18 15:46:48 PDT
Comment on attachment 276648 [details]
Patch

Attachment 276648 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1182335

Number of test failures exceeded the failure limit.
Comment 72 Build Bot 2016-04-18 15:46:54 PDT
Created attachment 276675 [details]
Archive of layout-test-results from ews115 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews115  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 73 Joseph Pecoraro 2016-06-03 18:48:59 PDT
Comment on attachment 276648 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=276648&action=review

> Source/JavaScriptCore/builtins/NumberConstructor.js:45
> +    if (typeof value !== "number")
> +        return false;

Is this typeof necessary? If it isn't observable, it seems to me the "value !== value" on its own should be enough.
Comment 74 Joseph Pecoraro 2016-06-03 18:59:44 PDT
Comment on attachment 276648 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=276648&action=review

> Source/JavaScriptCore/builtins/GlobalOperations.js:46
> +    // originally Math.min(Math.max(length, 0), maxSafeInteger));
> +    return length > 0 ? (length < maxSafeInteger ? length : @MAX_SAFE_INTEGER) : 0;

Looks like this may be what is causing the tests to fail. maxSafeInteger is undefined, so it should probably be @MAX_SAFE_INTEGER.
Comment 75 Joseph Pecoraro 2016-06-04 01:35:00 PDT
Created attachment 280512 [details]
[PATCH] Additions to Yusuke's Previous Patch

Ben and I were looking into adding Intrinsics for `isNaN` and `Number.isNaN`, and we implemented that with a bunch of tests before coming across this bug, which implements them as builtins AND adds ToNumber DFG support (which we had skipped)!

Given we already implemented the Intrinsics case I thought it would be fun to compare the two approaches. I rebaselined Yusuke's builtins patch and compared with the "isNaN on doubles" regress tests we had already made earlier.

At first the Intrinsics was unexpectedly outperforming builtins. Some debugging showed that the isNaN builtin with @toNumber was not making it to the FTL because CompareStrictEq(Untyped, Untyped) could not be FTLCompiled. When we added support for that, the performance of builtins was much better, on-par and slightly better then the Intrinsics!

>                             baseline                  builtins                                     
>    global-isNaN         42.7479+-1.5150     ^      9.4362+-0.2667        ^ definitely 4.5302x faster
>    Number-isNaN         38.1028+-1.0004     ^     10.2417+-0.3694        ^ definitely 3.7204x faster
>    <geometric>          40.3484+-1.0544     ^      9.8273+-0.2458        ^ definitely 4.1058x faster
>
>
>                            intrinsic                  builtins                                     
>    global-isNaN         10.2375+-0.3228            9.6810+-0.3510          might be 1.0575x faster
>    Number-isNaN         10.0255+-0.2097     ?     10.1832+-0.1975        ? might be 1.0157x slower
>    <geometric>          10.1276+-0.1849            9.9246+-0.1853          might be 1.0205x faster

Ben also mentioned some reasons that builtins would be better then intrinsics, so that is probably the better approach anyways.

This patch has been sitting for a month and a half, so I'll update a rebaselined patch, highlighting just the changes I made. However there is a lot I don't fully understand in the original patch (Heap Predictions and the New Value Profiling) so I should probably hand the patch off to someone else.
Comment 76 Joseph Pecoraro 2016-06-04 01:36:17 PDT
Created attachment 280513 [details]
[PATCH] isNaN Intrinsics (for comparison)
Comment 77 Joseph Pecoraro 2016-06-04 01:39:46 PDT
Created attachment 280514 [details]
[PATCH] Rebaselined + Combined Tests

I'll get updated benchmark results from my laptop soon.
Comment 78 Joseph Pecoraro 2016-06-04 01:42:01 PDT
Comment on attachment 280512 [details]
[PATCH] Additions to Yusuke's Previous Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=280512&action=review

> Source/JavaScriptCore/builtins/GlobalOperations.js:46
> -    return length > 0 ? (length < maxSafeInteger ? length : @MAX_SAFE_INTEGER) : 0;
> +    return length > 0 ? (length < @Number.MAX_SAFE_INTEGER ? length : @Number.MAX_SAFE_INTEGER) : 0;

NOTE: Object.getOwnPropertyDescriptor(Number, "MAX_SAFE_INTEGER")
< {value: 9007199254740991, writable: false, enumerable: false, configurable: false}

So this should be safe to do, but I think previous code just used the 0x1FFFFFFFFFFFFF constant.
Comment 79 WebKit Commit Bot 2016-06-04 01:43:19 PDT
Attachment 280514 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4889:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 1 in 84 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 80 Joseph Pecoraro 2016-06-04 10:17:19 PDT
Created attachment 280518 [details]
[RESULTS] Benchmark results
Comment 81 Yusuke Suzuki 2016-06-05 02:03:03 PDT
Comment on attachment 280512 [details]
[PATCH] Additions to Yusuke's Previous Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=280512&action=review

Great! Sorry, I worked on the different patch and left this for a long time...

>> Source/JavaScriptCore/builtins/GlobalOperations.js:46
>> +    return length > 0 ? (length < @Number.MAX_SAFE_INTEGER ? length : @Number.MAX_SAFE_INTEGER) : 0;
> 
> NOTE: Object.getOwnPropertyDescriptor(Number, "MAX_SAFE_INTEGER")
> < {value: 9007199254740991, writable: false, enumerable: false, configurable: false}
> 
> So this should be safe to do, but I think previous code just used the 0x1FFFFFFFFFFFFF constant.

Oops, thanks. Or you can use @MAX_SAFE_INTEGER bytecode intrinsic by adding MAX_SAFE_INTEGER entry like MAX_STRING_LENGTH in bytecode/BytecodeIntrinsicRegistry.{h,cpp}.
This intrinsic (@MAX_SAFE_INTEGER) is once introduced in @isConcatSeparatable patch, but it seems that this patch is reverted now.

> Source/JavaScriptCore/builtins/NumberConstructor.js:-46
> -

Right! Only the NaN becomes true for `value !== value` in all the values.
Comment 82 Saam Barati 2016-06-11 01:24:47 PDT
Comment on attachment 280514 [details]
[PATCH] Rebaselined + Combined Tests

View in context: https://bugs.webkit.org/attachment.cgi?id=280514&action=review

Quick comments. I'll look in more detail tomorrow or sunday

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3601
> +        flushRegisters();

This seems pessimistic.
Elaborated below

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3607
> +            callOperation(operationToNumber, JSValueRegs(resultTagGPR, resultPayloadGPR), argumentTagGPR, argumentPayloadGPR);

You can flushRegisters() before this.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3616
> +            callOperation(operationToNumber, JSValueRegs(resultTagGPR, resultPayloadGPR), argumentTagGPR, argumentPayloadGPR);

You should silentSpill/silentFill around here

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:3544
> +        flushRegisters();

Ditto here
Comment 83 Yusuke Suzuki 2016-06-15 10:48:13 PDT
Comment on attachment 280514 [details]
[PATCH] Rebaselined + Combined Tests

View in context: https://bugs.webkit.org/attachment.cgi?id=280514&action=review

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3601
>> +        flushRegisters();
> 
> This seems pessimistic.
> Elaborated below

Nice!! I'll change it.

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3607
>> +            callOperation(operationToNumber, JSValueRegs(resultTagGPR, resultPayloadGPR), argumentTagGPR, argumentPayloadGPR);
> 
> You can flushRegisters() before this.

Inserted.

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3616
>> +            callOperation(operationToNumber, JSValueRegs(resultTagGPR, resultPayloadGPR), argumentTagGPR, argumentPayloadGPR);
> 
> You should silentSpill/silentFill around here

I've just inserted silentSpillAllRegisters(result registers that will be written) & silentFillAllRegisters(result registers that will be written).
Comment 84 Yusuke Suzuki 2016-06-16 21:16:29 PDT
Created attachment 281528 [details]
Patch
Comment 85 Yusuke Suzuki 2016-06-16 22:12:14 PDT
Created attachment 281533 [details]
Patch
Comment 86 Yusuke Suzuki 2016-06-16 22:12:43 PDT
clang perf https://gist.github.com/Constellation/bf90cb58e6bfe033a9637f9f6c8be353
gcc perf https://gist.github.com/Constellation/e5668d2751adfb83e45ad966c1b53f97

to-number operations are improved.
clang
   to-number-string-number-string-number              3.4870+-0.1136            3.3479+-0.1014     ?      3.3837+-0.1006          might be 1.0305x faster
   to-number-only-number                              0.7907+-0.0146            0.7864+-0.0174            0.7814+-0.0204          might be 1.0119x faster
   to-number-constructor-string-number-string-number   
   to-number-constructor-only-number                  0.9893+-0.0182     ^      0.8108+-0.0151            0.8106+-0.0193        ^ definitely 1.2206x faster
   to-number-number-string-number-string              3.7534+-0.0442     ^      3.3689+-0.0863     ?      3.3731+-0.0727        ^ definitely 1.1127x faster
   to-number-constructor-number-string-number-string   
   to-number-constructor-only-string                  1.9983+-0.0179     ^      1.9225+-0.0462     ?      1.9903+-0.0561        
   to-number-only-string                              1.9435+-0.0127     ^      1.8583+-0.0341     ?      1.9292+-0.0410
   global-isNaN                                      43.3854+-0.1792     ^     13.4526+-0.1622           13.3526+-0.1088        ^ definitely 3.2492x faster
   Number-isNaN                                      40.1451+-0.3602     ^     13.1905+-0.1708     ?     13.4074+-0.1693        ^ definitely 2.9942x faster

gcc
   to-number-string-number-string-number              3.5788+-0.0595     ^      3.3110+-0.0362     ?      3.3376+-0.0452        ^ definitely 1.0723x faster
   to-number-only-number                              0.8090+-0.0155     ?      0.8149+-0.0167            0.7973+-0.0161          might be 1.0146x faster
   to-number-constructor-string-number-string-number   
   to-number-constructor-only-number                  1.0103+-0.0163     ^      0.8233+-0.0118     ?      0.8307+-0.0132        ^ definitely 1.2162x faster
   to-number-number-string-number-string              3.8718+-0.0120     ^      3.3753+-0.0485     ?      3.4253+-0.0595        ^ definitely 1.1303x faster
   to-number-constructor-number-string-number-string   
   to-number-constructor-only-string                  2.0040+-0.0267     ^      1.8386+-0.0308     ?      1.8807+-0.0390        ^ definitely 1.0655x faster
   to-number-only-string                              1.9409+-0.0241     ^      1.7961+-0.0258     ?      1.8422+-0.0281        ^ definitely 1.0535x faster
   global-isNaN                                      45.9407+-1.6226     ^     13.4784+-0.1317           13.4631+-0.1185        ^ definitely 3.4123x faster
   Number-isNaN                                      40.6952+-0.2095     ^     13.5030+-0.1252     ?     13.5063+-0.1276        ^ definitely 3.0131x faster
Comment 87 Yusuke Suzuki 2016-06-16 23:20:14 PDT
And AsmBench n-body is consistently better.

clang
   n-body.c                                         926.8761+-11.0716    ^    904.8155+-4.1272     ?    907.0384+-4.8709        ^ definitely 1.0219x faster

gcc
   n-body.c                                         925.5804+-5.6073     ^    902.6779+-6.7585     ?    910.3094+-9.7520          might be 1.0168x faster
Comment 88 Yusuke Suzuki 2016-06-17 09:12:52 PDT
And the result on OSX laptop. The performance looks neutral.

https://gist.github.com/Constellation/21c9e7c02e4872ec965559f723d7ab06

Kraken json-xxx is not related since they don't have any to_number bytecode.
LongSpider date-format-tofte also does not have to_number bytecode, so not related.

SunSpider's 3d-raytrace has several to_number bytecodes, but it is too small benchmark. (And LongSpider 3d-raytrace is performance neutral).
But I think it is due to the flakiness, since to_number is only called 14 times during the benchmark (And raytraceScene function is not compiled). (And the tier used for this to_number is up to baseline in this SunSpider/3d-raytrace).
Comment 89 Yusuke Suzuki 2016-06-19 22:22:12 PDT
And consistently, AsmBench/n-body gets 2% improvements.

   n-body.c                                         950.2645+-1.5768     ^    932.0507+-2.8031        ^ definitely 1.0195x faster
Comment 90 Yusuke Suzuki 2016-06-19 22:24:54 PDT
Comment on attachment 281533 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281533&action=review

> Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:5054
> +            set(VirtualRegister(currentInstruction[1].u.operand), addToGraph(ToNumber, OpInfo(0), OpInfo(prediction), value));

Note: When reverting this code to the old one, AsmBench/n-body's improvement disappears. (DFG ToNumber contributes to AsmBench/n-body improvements)
Comment 91 Yusuke Suzuki 2016-06-20 19:56:21 PDT
Created attachment 281702 [details]
Patch

Rebased the patch. It just fixes a minor conflict against the recent changes
Comment 92 WebKit Commit Bot 2016-06-20 19:58:12 PDT
Attachment 281702 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4921:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 1 in 86 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 93 Filip Pizlo 2016-06-21 12:07:20 PDT
Comment on attachment 281702 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281702&action=review

DFG part almost looks good, but I think there's a bug in ConstantFoldingPhase.

> Source/JavaScriptCore/dfg/DFGConstantFoldingPhase.cpp:570
> +                if (m_state.forNode(node->child1()).m_type & ~SpecFullNumber)

You use BytecodeNumber in AbstractInterpreterInlines.  I think that's correct and you should use that here.

> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:468
> +                // If we can speculate based on the child's prediction data, it's fine.
> +                // These cases are later fixed up and converted to Identity.
> +                if (isFullNumberSpeculation(child))
> +                    changed |= mergePrediction(child);
> +                else {
> +                    // Here we rely on the prediction offered by the value profiling.
> +                    // When the profiling does not say anything, we do not propagate any prediction.
> +                    // If the profiled result is not correct, OSRExit will update the profile.
> +                    //
> +                    // If ToNumber is never executed yet, heapPrediction becomes SpecNone.
> +                    // In this case, we propagate it as Int32. But if it is incorrect, BadType OSRExit will fix the speculation.
> +                    SpeculatedType heapPrediction = node->getHeapPrediction();
> +                    if (isInt32SpeculationForArithmetic(heapPrediction))
> +                        changed |= mergePrediction(SpecInt32Only);
> +                    else
> +                        changed |= mergePrediction(SpecBytecodeNumber);

I think it's better to use the heap prediction unconditionally.  It's simpler to do it that way, it's cheaper for the PredictionPropagationPhase (since you're using setPrediction() so you can move it to the code that runs once), and it's more precise.
Comment 94 Keith Miller 2016-06-21 16:50:15 PDT
Comment on attachment 281702 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281702&action=review

> Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:1540
> -    emitOpcode(opcodeID);
> +    UnlinkedValueProfile profile { };
> +    if (opcodeID == op_to_number)
> +        profile = emitProfiledOpcode(opcodeID);
> +    else
> +        emitOpcode(opcodeID);
> +
>      instructions().append(dst->index());
>      instructions().append(src->index());
> +
> +    if (opcodeID == op_to_number)
> +        instructions().append(profile);
> +
>      return dst;

This seems pretty hacky. Why not have an emitProfiledUnaryOp function instead?
Comment 95 Yusuke Suzuki 2016-06-22 03:49:25 PDT
Comment on attachment 281702 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281702&action=review

Thank you for your reviews! I'll upload the revised patch :)

>> Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:1540
>>      return dst;
> 
> This seems pretty hacky. Why not have an emitProfiledUnaryOp function instead?

Thanks!
op_to_number bytecode is emitted in 2 ways. One is calling BytecodeGenerator::emitToNumber, and another is UnaryOpNode's op_to_number case.
Some JS syntax (like +value) constructs the UnaryOpNode holding op_to_number as its unary operation.
And UnaryOpNode emits its bytecodes through BytecodeGenerator::emitUnaryOp.

That's why I encapsulated profile into emitUnaryOp; I would like to handle op_to_number as a one of unary op bytecode, and I don't want to care about whether unary op bytecode is profiled in the UnaryOpNode::emitBytecode side.

>> Source/JavaScriptCore/dfg/DFGConstantFoldingPhase.cpp:570
>> +                if (m_state.forNode(node->child1()).m_type & ~SpecFullNumber)
> 
> You use BytecodeNumber in AbstractInterpreterInlines.  I think that's correct and you should use that here.

Oops, right!
We say that ToNumber's result is BytecodeNumber, not this guard drains several unexpected number types.
Fixed.

>> Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:468
>> +                        changed |= mergePrediction(SpecBytecodeNumber);
> 
> I think it's better to use the heap prediction unconditionally.  It's simpler to do it that way, it's cheaper for the PredictionPropagationPhase (since you're using setPrediction() so you can move it to the code that runs once), and it's more precise.

OK. Now we already use `getPrediction()` for to_number bytecode. So if the heap prediction is not offered, we already emits ForceOSRExit.
So relying on the heap prediction here seems good.
Comment 96 Yusuke Suzuki 2016-06-22 03:55:23 PDT
Comment on attachment 281702 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281702&action=review

>>> Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:1540
>>>      return dst;
>> 
>> This seems pretty hacky. Why not have an emitProfiledUnaryOp function instead?
> 
> Thanks!
> op_to_number bytecode is emitted in 2 ways. One is calling BytecodeGenerator::emitToNumber, and another is UnaryOpNode's op_to_number case.
> Some JS syntax (like +value) constructs the UnaryOpNode holding op_to_number as its unary operation.
> And UnaryOpNode emits its bytecodes through BytecodeGenerator::emitUnaryOp.
> 
> That's why I encapsulated profile into emitUnaryOp; I would like to handle op_to_number as a one of unary op bytecode, and I don't want to care about whether unary op bytecode is profiled in the UnaryOpNode::emitBytecode side.

Ah, or, adding UnaryPlusNode::emitBytecode may be good. I'll change to that.
Comment 97 Yusuke Suzuki 2016-06-22 05:19:43 PDT
Created attachment 281837 [details]
Patch
Comment 98 Yusuke Suzuki 2016-06-22 05:32:42 PDT
Comment on attachment 281837 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281837&action=review

Uploaded the revised patch. It fixes the pointed issues. And I changed JIT emitting code slightly to encourage register reuse.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:3621
> +                notNumber = m_jit.branchIfNotNumber(argument.jsValueRegs(), scratch.gpr());

32bit branchIfNotNumber requires additional scratch register.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:3546
> +        GPRTemporary result(this, Reuse, argument);

Reuse the argument register if possible.

The result becomes like this.

      76:<!2:loc11>     ToNumber(Untyped:@68, JS|MustGen|UseAsOther, DoubleimpurenanTopEmpty, R:World, W:Heap, Exits, ClobbersExit, bc#48)  predicting DoubleimpurenanTopEmpty
          0x7f986d5fe693: test %rax, %r14
          0x7f986d5fe696: jz 0x7f986d5fe6a1
          0x7f986d5fe69c: jmp 0x7f986d5fe6d1
          0x7f986d5fe6a1: mov %rax, %rsi
          0x7f986d5fe6a4: mov %rbp, %rdi
          0x7f986d5fe6a7: mov $0x2, 0x24(%rbp)
          0x7f986d5fe6ae: mov $0x7f98711ea5f0, %r11
          0x7f986d5fe6b8: call *%r11
          0x7f986d5fe6bb: mov $0x7f982d3f72d0, %r11
          0x7f986d5fe6c5: mov (%r11), %r11
          0x7f986d5fe6c8: test %r11, %r11
          0x7f986d5fe6cb: jnz 0x7f986d5fe88c

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:3551
> +        argument.use();

Declare the use at this point.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:3561
> +            MacroAssembler::Jump notNumber = m_jit.branchIfNotNumber(argumentGPR);

64bit branchIfNotNumber does not require any temporary registers additionally. OK.
Comment 99 Yusuke Suzuki 2016-06-22 05:40:23 PDT
Comment on attachment 281837 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281837&action=review

> Source/JavaScriptCore/ChangeLog:28
> +        This value profiling is provided from either NumberConstructor's call operation or op_to_number.

I'll update this description,

ToNumber DFG node has a value profiling. This profiling is leveraged to determine the result number type of the ToNumber operation.
This value profiling is provided from either NumberConstructor's call operation or op_to_number.
Comment 100 Yusuke Suzuki 2016-06-22 08:46:21 PDT
Created attachment 281840 [details]
Patch

Ready for review
Comment 101 WebKit Commit Bot 2016-06-22 11:00:26 PDT
Attachment 281840 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4921:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 1 in 89 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 102 Yusuke Suzuki 2016-06-23 01:43:10 PDT
Created attachment 281902 [details]
Patch

Update the patch for 32bit build failure. Now, ready!
Comment 103 WebKit Commit Bot 2016-06-23 01:46:17 PDT
Attachment 281902 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ChangeLog:95:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:101:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:102:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:103:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:104:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:113:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:114:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:115:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:116:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:117:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:118:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:119:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:120:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:121:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:122:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:123:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:124:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:125:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ChangeLog:127:  Line contains tab character.  [whitespace/tab] [5]
ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4924:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 20 in 89 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 104 Yusuke Suzuki 2016-06-23 01:46:49 PDT
Created attachment 281903 [details]
Patch

Just removed the tabs in ChangeLog. Now, ready!
Comment 105 WebKit Commit Bot 2016-06-23 01:49:56 PDT
Attachment 281903 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4924:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 1 in 89 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 106 Yusuke Suzuki 2016-06-23 02:02:35 PDT
Created attachment 281904 [details]
Patch

Debug build fix
Comment 107 WebKit Commit Bot 2016-06-23 02:05:49 PDT
Attachment 281904 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4924:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 1 in 89 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 108 Yusuke Suzuki 2016-06-23 20:42:12 PDT
Committed r202413: <http://trac.webkit.org/changeset/202413>
Comment 109 Yusuke Suzuki 2016-06-24 02:56:18 PDT
While SunSpider, Kraken, and Octane show performance neutral (and AsmBench/n-body.c shows performance progression, yay!)

a test, that is not included in WebKit run-jsc-benchmarks, "assorted-misc-bugs-847389-jpeg2000", shows the significant degradation.
https://arewefastyet.com/#machine=29&view=single&suite=misc&subtest=bugs-847389-jpeg2000

After investigating the cause, interestingly, reverting DFGPredictionPropagationPhase.cpp change to the previous patch fixes the performance regression.
To fix that, I think

1. add the assorted tests into run-jsc-benchmarks
2. apply the change

is necessary.
Comment 111 WebKit Commit Bot 2016-06-24 11:47:46 PDT
Re-opened since this is blocked by bug 159097
Comment 112 Alexey Proskuryakov 2016-06-24 11:49:52 PDT
Tests were also broken on armv7.
Comment 113 Yusuke Suzuki 2016-06-24 16:07:52 PDT
Oops, thanks.
I'll update the patch w/ 32bit fix and perf fix.
Comment 114 Yusuke Suzuki 2016-06-25 05:34:38 PDT
Comment on attachment 281904 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=281904&action=review

> Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:962
>      callOpcodeSlowPath(_slow_path_to_number)

This dispatch(3) should be changed to dispatch(4).
Comment 115 Yusuke Suzuki 2016-06-26 06:56:07 PDT
Created attachment 282092 [details]
Patch
Comment 116 WebKit Commit Bot 2016-06-26 06:58:28 PDT
Attachment 282092 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4924:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 1 in 89 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 117 Yusuke Suzuki 2016-06-26 07:56:17 PDT
Created attachment 282093 [details]
Patch
Comment 118 WebKit Commit Bot 2016-06-26 07:59:00 PDT
Attachment 282093 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:4924:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Total errors found: 1 in 90 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 119 Yusuke Suzuki 2016-06-26 08:02:30 PDT
Comment on attachment 282093 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=282093&action=review

Request the review for the updated patch :D

> Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:1828
> +            }

This is the important change of this patch. Since the ToNumber node tell about its prediction based on its heap prediction,
we should convert the node into its type through ToNumber.
In misc-bugs-847389-jpeg2000 case (assorted tests https://arewefastyet.com/#machine=29&view=single&suite=misc&subtest=bugs-847389-jpeg2000), the case happens that
ToNumber's heap prediction is NonboolInt32 while the prediction of the child1() is Number. In the previous patch, we convert such node to Double Identity node and it makes the abstract interpreter execution invalid in CFA phase. And it causes 52% regression.

Now, we use DoubleAsInt32 for the mismatch case (the heap prediction says Int32, but the child's prediction says the other number).

> Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:963
> +    dispatch(4)

This was missed and it caused 32bit crash.
Comment 120 Yusuke Suzuki 2016-06-26 08:12:10 PDT
Now running the benchmark.
Comment 121 Yusuke Suzuki 2016-06-26 18:22:56 PDT
Here's the results.
https://gist.github.com/Constellation/19d4ecf8e785c64d6ab499c18ed82327

The same results to the previous one.

   to-number-string-number-string-number              3.5749+-0.0317     ^      3.3145+-0.0466        ^ definitely 1.0786x faster
   to-number-only-number                              0.8247+-0.0110            0.8118+-0.0108          might be 1.0158x faster
   to-number-constructor-string-number-string-number   
   global-isNaN                                      44.5953+-0.2675     ^     13.3903+-0.1755        ^ definitely 3.3304x faster
   to-number-constructor-only-number                  1.0059+-0.0091     ^      0.8304+-0.0148        ^ definitely 1.2112x faster
   Number-isNaN                                      40.6242+-0.1968     ^     13.3933+-0.1415        ^ definitely 3.0332x faster
   to-number-number-string-number-string              4.0311+-0.0623     ^      3.4088+-0.0469        ^ definitely 1.1825x faster
   to-number-constructor-number-string-number-string   
   to-number-constructor-only-string                  2.1306+-0.0203     ^      1.8518+-0.0319        ^ definitely 1.1506x faster
   to-number-only-string                              2.0544+-0.0146     ^      1.8138+-0.0220        ^ definitely 1.1326x faster
   n-body.c                                         921.7538+-5.3001     ^    903.4331+-10.0243       ^ definitely 1.0203x faster

And

LongSpider/hash-map
LongSpider/string-base64 (does not include to_number)
Octane/raytrace
Octane/closure (does not include to_number)
Kraken/json-stringify-tinderbox (does not include to_number)
Kraken/stanford-crypto-sha256-iterative (does not include to_number)

poses slowdown but, it's due to noise.
We can validate them by running again.

(A is Baseline, B is Patched)

Collected 50 samples per benchmark/VM, with 50 VM invocations per benchmark. Emitted a call to
gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used
the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark
execution times with 95% confidence intervals in milliseconds.

                            A                         B                                         

hash-map            155.8200+-1.2988          155.6257+-1.3206        

<geometric>         155.8200+-1.2988          155.6257+-1.3206          might be 1.0012x faster


Collected 50 samples per benchmark/VM, with 50 VM invocations per benchmark. Emitted a call to gc()
between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the
jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution
times with 95% confidence intervals in milliseconds.

                               A                         B                                         

raytrace       x2       0.93941+-0.00305          0.93674+-0.00263       

<geometric>             0.93941+-0.00305          0.93674+-0.00263         might be 1.0028x faster
Comment 122 Yusuke Suzuki 2016-06-29 05:29:55 PDT
Ping?

misc-bugs-847389-jpeg2000 result is the following. This patch improves it by ~2%.

Benchmark report for JSRegress on hanayamata.

VMs tested:
"A" at /home/yusukesuzuki/dev/WebKit/WebKitBuild/nan12-master/Release/bin/jsc
"B" at /home/yusukesuzuki/dev/WebKit/WebKitBuild/nan12/Release/bin/jsc

Collected 20 samples per benchmark/VM, with 20 VM invocations per benchmark. Emitted a call to gc() between
sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific
preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95%
confidence intervals in milliseconds.

                                       A                         B                                         

misc-bugs-847389-jpeg2000     4745.0222+-31.1721    ^   4653.7076+-53.5189       ^ definitely 1.0196x faster

<geometric>                   4745.0222+-31.1721    ^   4653.7076+-53.5189       ^ definitely 1.0196x faster


I didn't include this misc-bugs-847389-jpeg2000 js in the regress test currently since I found that it incurs assertion failure on the debug test (MacroAssember invoked by Air causes this assertion). Seeing the assertion itself, I think this is OK in fact. But I'll fix this in the separate test.
Comment 123 Keith Miller 2016-06-29 10:05:27 PDT
Comment on attachment 282093 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=282093&action=review

r=me.

> Source/JavaScriptCore/ChangeLog:97
> +        Alway rely on the heap prediction.

Alway => Always

> Source/JavaScriptCore/builtins/GlobalOperations.js:57
> +    return object === @undefined || object == null || typeof object === "object";

Do you need the "object === @undefined" case isn't that covered by "object == null"?
Comment 124 Keith Miller 2016-06-29 10:05:41 PDT
View in context: https://bugs.webkit.org/attachment.cgi?id=282093&action=review

r=me with comment.

> Source/JavaScriptCore/ChangeLog:97
> +        Alway rely on the heap prediction.

Alway => Always

> Source/JavaScriptCore/builtins/GlobalOperations.js:57
> +    return object === @undefined || object == null || typeof object === "object";

Do you need the "object === @undefined" case isn't that covered by "object == null"?
Comment 125 Yusuke Suzuki 2016-06-30 08:23:57 PDT
Comment on attachment 282093 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=282093&action=review

Thanks!

>> Source/JavaScriptCore/ChangeLog:97
>> +        Alway rely on the heap prediction.
> 
> Alway => Always

Fixed.

>> Source/JavaScriptCore/builtins/GlobalOperations.js:57
>> +    return object === @undefined || object == null || typeof object === "object";
> 
> Do you need the "object === @undefined" case isn't that covered by "object == null"?

Yes, it covers. Drop object === @undefined.
Comment 126 Yusuke Suzuki 2016-06-30 08:27:20 PDT
Committed r202680: <http://trac.webkit.org/changeset/202680>