Bug 131423

Summary: DFG IR should keep the data flow of doubles and int52's separate from the data flow of JSValue's
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED    
Severity: Normal CC: atrick, barraclough, commit-queue, ggaren, juergen, mark.lam, mhahnenberg, msaboff, nrotem, oliver, sam
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 131424    
Bug Blocks: 131689, 131419    
Attachments:
Description Flags
it begins
none
more
none
even more
none
maybe 1/3 done
none
more progress
none
it's getting real
none
not done yet
none
almost there
none
it compiles!
none
it ran something
none
it sometimes passes some tests
none
passing tests on 64-bit
none
passes all tests
none
the patch
none
the patch
none
the patch ggaren: review+

Description Filip Pizlo 2014-04-08 20:22:18 PDT
This accomplishes two things:

- It makes all complex conversions explicit in IR.  Currently JSValue-to-Double is almost always implicit.
- It makes it easy to use, and property handle, values that fall outside of the JSValue domain, like heavy NaNs.
Comment 1 Filip Pizlo 2014-04-08 20:29:51 PDT
Created attachment 228931 [details]
it begins
Comment 2 Filip Pizlo 2014-04-12 11:52:17 PDT
Created attachment 229208 [details]
more
Comment 3 Filip Pizlo 2014-04-12 12:54:52 PDT
Created attachment 229210 [details]
even more

This is starting to get fun.
Comment 4 Filip Pizlo 2014-04-12 16:16:09 PDT
Created attachment 229213 [details]
maybe 1/3 done
Comment 5 Filip Pizlo 2014-04-13 11:46:44 PDT
Created attachment 229237 [details]
more progress
Comment 6 Filip Pizlo 2014-04-13 14:06:32 PDT
Created attachment 229244 [details]
it's getting real
Comment 7 Filip Pizlo 2014-04-13 16:59:40 PDT
Created attachment 229251 [details]
not done yet

I can't wait to land this monster.
Comment 8 Filip Pizlo 2014-04-13 17:22:24 PDT
Created attachment 229252 [details]
almost there
Comment 9 Filip Pizlo 2014-04-13 17:23:16 PDT
(In reply to comment #4)
> Created an attachment (id=229213) [details]
> maybe 1/3 done

48.42 * 3 = 145.26.

This implies that I have roughly five more kilobytes of code before this patch is done.
Comment 10 Filip Pizlo 2014-04-13 18:29:31 PDT
Created attachment 229253 [details]
it compiles!
Comment 11 Filip Pizlo 2014-04-13 23:59:42 PDT
Created attachment 229267 [details]
it ran something

It's starting to not be completely broken.
Comment 12 Filip Pizlo 2014-04-14 13:36:41 PDT
Created attachment 229303 [details]
it sometimes passes some tests
Comment 13 Filip Pizlo 2014-04-14 15:31:26 PDT
Created attachment 229312 [details]
passing tests on 64-bit
Comment 14 Filip Pizlo 2014-04-14 21:58:37 PDT
Created attachment 229345 [details]
passes all tests
Comment 15 Filip Pizlo 2014-04-14 23:30:53 PDT
Created attachment 229354 [details]
the patch
Comment 16 WebKit Commit Bot 2014-04-14 23:32:31 PDT
Attachment 229354 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jit/AssemblyHelpers.h:421:  The parameter name "fpr" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/dfg/DFGStrengthReductionPhase.cpp:190:  Multi line control clauses should use braces.  [whitespace/braces] [4]
ERROR: Source/JavaScriptCore/dfg/DFGStrengthReductionPhase.cpp:294:  Multi line control clauses should use braces.  [whitespace/braces] [4]
Total errors found: 3 in 47 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 17 Filip Pizlo 2014-04-14 23:33:35 PDT
Created attachment 229355 [details]
the patch
Comment 18 Filip Pizlo 2014-04-14 23:34:46 PDT
Created attachment 229357 [details]
the patch
Comment 19 WebKit Commit Bot 2014-04-14 23:37:33 PDT
Attachment 229357 [details] did not pass style-queue:


ERROR: Source/JavaScriptCore/jit/AssemblyHelpers.h:421:  The parameter name "fpr" adds no information, so it should be removed.  [readability/parameter_name] [5]
ERROR: Source/JavaScriptCore/dfg/DFGStrengthReductionPhase.cpp:190:  Multi line control clauses should use braces.  [whitespace/braces] [4]
Total errors found: 2 in 47 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 20 Filip Pizlo 2014-04-14 23:39:44 PDT
(In reply to comment #19)
> Attachment 229357 [details] did not pass style-queue:
> 
> 
> ERROR: Source/JavaScriptCore/jit/AssemblyHelpers.h:421:  The parameter name "fpr" adds no information, so it should be removed.  [readability/parameter_name] [5]

Fixed.

> ERROR: Source/JavaScriptCore/dfg/DFGStrengthReductionPhase.cpp:190:  Multi line control clauses should use braces.  [whitespace/braces] [4]

No.

> Total errors found: 2 in 47 files
> 
> 
> If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 21 Filip Pizlo 2014-04-15 00:01:14 PDT
Benchmark report for SunSpider, Octane, Kraken, and AsmBench on oldmac (MacPro4,1).

VMs tested:
"ToT" at /Volumes/Data/pizlo/secondary/OpenSource/WebKitBuild/Release/jsc (r167272)
"DisjointDataFlow" at /Volumes/Data/fromMiniMe/tertiary/OpenSource/WebKitBuild/Release/jsc (r167272)

Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to
get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

                                                   ToT                 DisjointDataFlow                                 
SunSpider:
   3d-cube                                    8.8556+-0.1658            8.8135+-0.1897        
   3d-morph                                   9.5043+-0.1738     ?      9.5260+-0.1173        ?
   3d-raytrace                               10.4978+-0.2492     ?     10.9246+-0.2754        ? might be 1.0407x slower
   access-binary-trees                        3.0791+-0.0494            3.0327+-0.0932          might be 1.0153x faster
   access-fannkuch                            9.2635+-0.2691            9.2375+-0.1992        
   access-nbody                               4.6771+-0.0702     !      4.9050+-0.0382        ! definitely 1.0487x slower
   access-nsieve                              5.7626+-0.1300     ?      5.8214+-0.1160        ? might be 1.0102x slower
   bitops-3bit-bits-in-byte                   2.0844+-0.1141            2.0511+-0.0997          might be 1.0163x faster
   bitops-bits-in-byte                        6.5818+-0.0625     ?      6.6650+-0.1064        ? might be 1.0126x slower
   bitops-bitwise-and                         3.1246+-0.0112            3.1223+-0.0481        
   bitops-nsieve-bits                         6.1300+-0.1292     ^      5.9076+-0.0246        ^ definitely 1.0376x faster
   controlflow-recursive                      3.4637+-0.0795     ?      3.5368+-0.1283        ? might be 1.0211x slower
   crypto-aes                                 6.0505+-0.1034     ?      6.0899+-0.0222        ?
   crypto-md5                                 3.4563+-0.0555     ?      3.5898+-0.1785        ? might be 1.0386x slower
   crypto-sha1                                4.7430+-0.2969            4.6967+-0.2188        
   date-format-tofte                         13.3832+-0.5011           13.1751+-0.5879          might be 1.0158x faster
   date-format-xparb                          9.9797+-0.1528            9.7536+-0.1629          might be 1.0232x faster
   math-cordic                                4.7658+-0.0667     ?      4.8741+-0.0544        ? might be 1.0227x slower
   math-partial-sums                         10.5080+-0.2057           10.3280+-0.1330          might be 1.0174x faster
   math-spectral-norm                         3.2244+-0.1131     ?      3.3599+-0.1325        ? might be 1.0420x slower
   regexp-dna                                12.0826+-0.1314           12.0682+-0.2592        
   string-base64                              6.1819+-0.0509     ?      6.2072+-0.0378        ?
   string-fasta                              11.4589+-0.2247     ?     11.5913+-0.2498        ? might be 1.0116x slower
   string-tagcloud                           16.9122+-0.2845     ?     16.9723+-0.3670        ?
   string-unpack-code                        33.1013+-0.2880     ?     33.1581+-0.1912        ?
   string-validate-input                      7.2788+-0.0794     ?      7.4886+-0.1569        ? might be 1.0288x slower

   <arithmetic> *                             8.3135+-0.0331     ?      8.3422+-0.0371        ? might be 1.0034x slower
   <geometric>                                6.8098+-0.0336     ?      6.8482+-0.0306        ? might be 1.0056x slower
   <harmonic>                                 5.7097+-0.0382     ?      5.7484+-0.0425        ? might be 1.0068x slower

                                                   ToT                 DisjointDataFlow                                 
Octane and V8v7:
   encrypt                                   0.44475+-0.00248          0.44453+-0.00156       
   decrypt                                   8.04512+-0.00860    ?     8.22488+-0.22501       ? might be 1.0223x slower
   deltablue                        x2       0.44441+-0.00607    ?     0.44577+-0.00213       ?
   earley                                    0.95063+-0.00746    ?     0.95481+-0.00341       ?
   boyer                                    10.65078+-0.07016         10.63740+-0.02689       
   navier-stokes                    x2       7.85221+-0.17849    ?     7.94955+-0.22564       ? might be 1.0124x slower
   raytrace                         x2       2.83404+-0.06045    ?     2.89518+-0.06709       ? might be 1.0216x slower
   regexp                           x2      32.36546+-1.59683         32.03733+-0.74428         might be 1.0102x faster
   richards                         x2       0.22832+-0.00646          0.22610+-0.00573       
   splay                            x2       0.65112+-0.02244          0.64333+-0.02370         might be 1.0121x faster
   pdfjs                            x2      98.71862+-1.00381         97.70550+-0.33678         might be 1.0104x faster
   mandreel                         x2     104.04172+-0.64701    ?   104.90456+-0.55541       ?
   gbemu                            x2      71.81361+-0.26283         71.46601+-0.90133       
   closure                                   0.89578+-0.00118    ?     0.89889+-0.00233       ?
   jquery                                   11.47915+-0.04427    ?    11.63422+-0.34224       ? might be 1.0135x slower
   box2d                            x2      27.53755+-0.25138    ?    27.83984+-0.27793       ? might be 1.0110x slower
   zlib                             x2     752.06250+-6.82752        739.43296+-27.22691        might be 1.0171x faster
   typescript                       x2    1250.82231+-15.87611      1247.02486+-7.30913       

V8v7:
   <arithmetic>                              6.80265+-0.19364          6.79101+-0.08695         might be 1.0017x faster
   <geometric> *                             2.02784+-0.02466    ?     2.03226+-0.01962       ? might be 1.0022x slower
   <harmonic>                                0.76635+-0.01124          0.76320+-0.00892         might be 1.0041x faster

Octane including V8v7:
   <arithmetic>                            157.70700+-0.93728        156.59789+-1.71884         might be 1.0071x faster
   <geometric> *                            12.10291+-0.07826    ?    12.11049+-0.07606       ? might be 1.0006x slower
   <harmonic>                                1.34976+-0.01854          1.34489+-0.01492         might be 1.0036x faster

                                                   ToT                 DisjointDataFlow                                 
Kraken:
   ai-astar                                  497.940+-9.174            495.352+-7.535         
   audio-beat-detection                      214.703+-1.289            213.870+-0.681         
   audio-dft                                 299.102+-4.001      ?     300.186+-5.285         ?
   audio-fft                                 121.517+-0.083      ?     121.562+-0.344         ?
   audio-oscillator                          385.390+-8.322            383.563+-1.129         
   imaging-darkroom                          292.080+-6.892            289.506+-0.556         
   imaging-desaturate                        113.918+-0.961            113.739+-0.575         
   imaging-gaussian-blur                     191.936+-4.664      ?     210.639+-44.432        ? might be 1.0974x slower
   json-parse-financial                       85.709+-2.721             85.371+-1.297         
   json-stringify-tinderbox                  106.445+-2.120      ?     107.718+-2.010         ? might be 1.0120x slower
   stanford-crypto-aes                        95.566+-2.141      ?      98.250+-2.946         ? might be 1.0281x slower
   stanford-crypto-ccm                       106.201+-2.901            101.793+-10.598          might be 1.0433x faster
   stanford-crypto-pbkdf2                    251.095+-0.946      ?     255.601+-6.666         ? might be 1.0179x slower
   stanford-crypto-sha256-iterative          112.441+-3.129            110.403+-0.196           might be 1.0185x faster

   <arithmetic> *                            205.289+-1.163      ?     206.254+-3.627         ? might be 1.0047x slower
   <geometric>                               174.845+-1.120      ?     175.462+-3.252         ? might be 1.0035x slower
   <harmonic>                                152.023+-1.141      ?     152.124+-3.070         ? might be 1.0007x slower

                                                   ToT                 DisjointDataFlow                                 
AsmBench:
   bigfib.cpp                               991.0034+-5.6074          988.5683+-10.4231       
   cray.c                                    57.6591+-0.4919     ?     58.5741+-0.6725        ? might be 1.0159x slower
   dry.c                                    935.4674+-76.5102         895.4475+-83.9800         might be 1.0447x faster
   FloatMM.c                               1792.0316+-40.6358    ?   1795.0815+-44.3661       ?
   gcc-loops.cpp                           2232.0565+-3.2209     ?   2235.8031+-7.7630        ?
   n-body.c                                3056.0710+-51.3423        3039.6989+-8.5335        
   Quicksort.c                               85.3543+-2.2289           84.5378+-0.5411        
   stepanov_container.cpp                  6255.2408+-13.1123    ?   6291.4020+-40.7525       ?
   Towers.c                                  71.7684+-0.4278     ?     72.1978+-0.4640        ?

   <arithmetic>                            1719.6281+-12.6983        1717.9234+-15.9505         might be 1.0010x faster
   <geometric> *                            663.6269+-7.2056          661.3768+-9.2199          might be 1.0034x faster
   <harmonic>                               193.2350+-1.5413     ?    194.0429+-1.5205        ? might be 1.0042x slower

                                                   ToT                 DisjointDataFlow                                 
All benchmarks:
   <arithmetic>                             294.9121+-1.6248          294.4772+-2.0231          might be 1.0015x faster
   <geometric>                               25.3706+-0.0800     ?     25.4292+-0.1100        ? might be 1.0023x slower
   <harmonic>                                 2.9347+-0.0338            2.9293+-0.0256          might be 1.0018x faster

                                                   ToT                 DisjointDataFlow                                 
Geomean of preferred means:
   <scaled-result>                           60.8462+-0.1486     ?     60.9265+-0.3885        ? might be 1.0013x slower
Comment 22 Geoffrey Garen 2014-04-15 12:25:24 PDT
Comment on attachment 229357 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=229357&action=review

r=me

Please make sure to test 32bit and iOS before landing.

> Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:263
> +                // DoubleAsInt42 node, which occurs after the Div/Mod node that the conversions

DoubleAsInt52?

> Source/JavaScriptCore/dfg/DFGNodeType.h:119
> +    macro(ValueRep, NodeResultJS) \

Seems like we should rename NodeResultJS to NodeResultValue to match.

> Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:2168
> +        // This is pretty annoying - boxDouble() on X86 clobbers the source. That's kinda gross.

A better way to say this is, here, "boxDouble() on X86 clobbers the source, so we need to make a copy", and then in the assembler, "FIXME: Don't clobber the source."
Comment 23 Filip Pizlo 2014-04-15 12:35:38 PDT
(In reply to comment #22)
> (From update of attachment 229357 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=229357&action=review
> 
> r=me
> 
> Please make sure to test 32bit and iOS before landing.
> 
> > Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:263
> > +                // DoubleAsInt42 node, which occurs after the Div/Mod node that the conversions
> 
> DoubleAsInt52?

Lol.  DoubleAsInt32.

Some day, we may have Int33, Int34, Int35, ..., all the way up to Int51.  But that day will hopefully not come anytime soon.

> 
> > Source/JavaScriptCore/dfg/DFGNodeType.h:119
> > +    macro(ValueRep, NodeResultJS) \
> 
> Seems like we should rename NodeResultJS to NodeResultValue to match.

Actually, I want to do more carnage to NodeResultMask.  Currently we have NodeResultNumber and NodeResultInt32, which are like NodeResultJS in almost every way.  I just filed: https://bugs.webkit.org/show_bug.cgi?id=131689.  Clearly, that bug is not a high priority - the nastiness it fixes isn't an immediate concern.

Because both this, and that, will be big changes, I'd rather not also search-and-replace NodeResultJS for now.

> 
> > Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:2168
> > +        // This is pretty annoying - boxDouble() on X86 clobbers the source. That's kinda gross.
> 
> A better way to say this is, here, "boxDouble() on X86 clobbers the source, so we need to make a copy", and then in the assembler, "FIXME: Don't clobber the source."

Good idea.  https://bugs.webkit.org/show_bug.cgi?id=131690
Comment 24 Filip Pizlo 2014-04-15 13:27:24 PDT
Landed in http://trac.webkit.org/changeset/167325