Bug 146910 - compileArithSub in the DFG attempts to read a double constant as an int32
Summary: compileArithSub in the DFG attempts to read a double constant as an int32
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P1 Normal
Assignee: Matthew Mirman
URL:
Keywords: InRadar
Depends on: 146935
Blocks:
  Show dependency treegraph
 
Reported: 2015-07-13 12:33 PDT by Matthew Mirman
Modified: 2015-07-14 14:59 PDT (History)
4 users (show)

See Also:


Attachments
Patch. (8.58 KB, patch)
2015-07-13 12:43 PDT, Matthew Mirman
benjamin: review-
Details | Formatted Diff | Diff
Patch (8.68 KB, patch)
2015-07-13 16:24 PDT, Matthew Mirman
fpizlo: review+
Details | Formatted Diff | Diff
Performance results. (44.92 KB, text/plain)
2015-07-14 10:11 PDT, Matthew Mirman
no flags Details
Internal Performance results. (52.67 KB, text/plain)
2015-07-14 12:55 PDT, Matthew Mirman
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Matthew Mirman 2015-07-13 12:33:41 PDT
Patch forthcoming. 

rdar://problem/21729083
Comment 1 Radar WebKit Bug Importer 2015-07-13 12:35:31 PDT
<rdar://problem/21798270>
Comment 2 Matthew Mirman 2015-07-13 12:43:52 PDT
Created attachment 256713 [details]
Patch.
Comment 3 Benjamin Poulain 2015-07-13 12:57:18 PDT
Comment on attachment 256713 [details]
Patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=256713&action=review

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:2992
> +        if (node->child2()->isInt32Constant()) {

If we have Int32Use, we know that the constant is an integer or a Int32 stored in a double. We can still use the fast path with immediate, we just need convert the double to int32 to get the immediate.

ArithAdd should probably get the same.

> Source/JavaScriptCore/tests/stress/arith-add-with-constants.js:228
> +function arithSub42WrittenAsDouble(x) {

IMHO, it is worth adding a new test covering the hell out of Add and Sub for those "ASM-like" typing.
Comment 4 Filip Pizlo 2015-07-13 13:38:35 PDT
Comment on attachment 256713 [details]
Patch.

View in context: https://bugs.webkit.org/attachment.cgi?id=256713&action=review

>> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:2992
>> +        if (node->child2()->isInt32Constant()) {
> 
> If we have Int32Use, we know that the constant is an integer or a Int32 stored in a double. We can still use the fast path with immediate, we just need convert the double to int32 to get the immediate.
> 
> ArithAdd should probably get the same.

I disagree.  I believe that what should actually happen is an OSR exit.  The abstract interpreter will conclude that the non-int32 constant will fail the Int32 check.

> Source/JavaScriptCore/dfg/DFGValidate.cpp:547
> -            VALIDATE((node, edge), edge.useKind() == UntypedUse);
> +            VALIDATE((node, edge), edge.useKind() == UntypedUse || edge.useKind() == NumberUse || edge.useKind() == Int32Use);

Please don't do this.  We shouldn't change this invariant of the IR.
Comment 5 Matthew Mirman 2015-07-13 16:24:25 PDT
Created attachment 256744 [details]
Patch
Comment 6 Filip Pizlo 2015-07-13 16:25:24 PDT
Comment on attachment 256744 [details]
Patch

Can you run and post perf tests before landing?
Comment 7 Matthew Mirman 2015-07-14 10:11:56 PDT
Created attachment 256774 [details]
Performance results.

my patch might be 1.0004x faster.  Landing now.
Comment 8 Filip Pizlo 2015-07-14 10:12:46 PDT
(In reply to comment #7)
> Created attachment 256774 [details]
> Performance results.
> 
> my patch might be 1.0004x faster.  Landing now.

Can you run ../Internal/Tools/Scripts/run-jsc-benchmarks?
Comment 9 Matthew Mirman 2015-07-14 10:29:49 PDT
(In reply to comment #8)
> (In reply to comment #7)
> > Created attachment 256774 [details]
> > Performance results.
> > 
> > my patch might be 1.0004x faster.  Landing now.
> 
> Can you run ../Internal/Tools/Scripts/run-jsc-benchmarks?

I landed it before I got that comment: http://trac.webkit.org/changeset/186805
And yes, I'm doing that now.
Comment 10 Matthew Mirman 2015-07-14 12:55:12 PDT
Created attachment 256785 [details]
Internal Performance results.
Comment 11 Filip Pizlo 2015-07-14 13:04:08 PDT
Please roll out your patch.  You made Octane/raytrace 80% slower.
Comment 12 Matthew Mirman 2015-07-14 13:19:59 PDT
Reverted r186805 for reason:

Made raytracer on octane 80% slower

Committed r186811: <http://trac.webkit.org/changeset/186811>
Comment 13 Filip Pizlo 2015-07-14 13:46:52 PDT
Here's the performance I see on my machine:

Benchmark report for Octane on dethklok (MacBookPro9,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/jsc (r186811)
"FixSub" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc (r186811)

Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc()
between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the
jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution
times with 95% confidence intervals in milliseconds.

                           TipOfTree                   FixSub                                      

encrypt                 0.19983+-0.00736    ?     0.20379+-0.00622       ? might be 1.0198x slower
decrypt                 3.56811+-0.12414          3.48541+-0.13976         might be 1.0237x faster
deltablue      x2       0.19685+-0.00548          0.19574+-0.00586       
earley                  0.35817+-0.00351          0.35475+-0.00426       
boyer                   5.29290+-0.05199          5.23545+-0.08301         might be 1.0110x faster
navier-stokes  x2       5.47625+-0.22626    ?     5.51694+-0.10611       ?
raytrace       x2       1.33243+-0.05607          1.32627+-0.05485       
richards       x2       0.13620+-0.00609    ?     0.13754+-0.00701       ?
splay          x2       0.40679+-0.02080          0.40155+-0.01103         might be 1.0130x faster
regexp         x2      30.18984+-0.72081         29.70016+-0.70073         might be 1.0165x faster
pdfjs          x2      44.12776+-0.86649    ?    44.14128+-0.49305       ?
mandreel       x2      51.88066+-1.66762         51.74641+-1.57926       
gbemu          x2      46.96385+-4.09226         44.97621+-0.79699         might be 1.0442x faster
closure                 0.66344+-0.01368          0.64069+-0.01466         might be 1.0355x faster
jquery                  8.48724+-0.22021    ?     8.55310+-0.24830       ?
box2d          x2      12.46104+-0.16595    ?    12.58297+-0.33394       ?
zlib           x2     407.76607+-16.13783       400.69549+-15.13570        might be 1.0176x faster
typescript     x2     847.85144+-11.33011       843.33248+-10.91893      

<geometric>             6.79588+-0.06016          6.74915+-0.03678         might be 1.0069x faster


Seems like your perf testing was *super* noisy.  I suggest relanding this patch, and then investigating how to get your machine to have lower-noise performance results.  Typical tricks are:

- Use the --outer option to increase the number of iterations.  I find that 6 works well on my machine.
- Shut down most apps.  Usually it's best to just have Terminal running.
- Disconnect from the network.
Comment 14 Matthew Mirman 2015-07-14 14:59:32 PDT
Patch landed: http://trac.webkit.org/changeset/186819
Closing bug.