Bug 186218 - Make DFG to FTL OSR entry code more sane by removing bad RELEASE_ASSERTS and making it trigger compiles in outer loops before inner ones
Summary: Make DFG to FTL OSR entry code more sane by removing bad RELEASE_ASSERTS and ...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Saam Barati
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2018-06-01 18:29 PDT by Saam Barati
Modified: 2018-06-07 01:45 PDT (History)
13 users (show)

See Also:


Attachments
WIP (12.41 KB, patch)
2018-06-04 09:56 PDT, Saam Barati
no flags Details | Formatted Diff | Diff
patch (15.29 KB, patch)
2018-06-04 14:30 PDT, Saam Barati
saam: review-
Details | Formatted Diff | Diff
patch (16.22 KB, patch)
2018-06-04 15:51 PDT, Saam Barati
fpizlo: review+
Details | Formatted Diff | Diff
patch for landing (16.29 KB, patch)
2018-06-07 00:12 PDT, Saam Barati
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Saam Barati 2018-06-01 18:29:54 PDT
...
Comment 1 Saam Barati 2018-06-01 18:36:39 PDT
The code is pretty much spaghetti. I'm not sure I can succeed at making the code not be spaghetti, but I think I have the code paged in enough now to at least call BS on some stuff we're doing in it.

A few things:
- We don't attempt an OSR entry if the worklist tells us code is compiled. I have no idea why this line of code was added, but it looks wrong and makes no sense. We should obviously attempt to do OSR entry if we have compiled code.
- JF accidentally reverted the thing where we wanted to compile the outer most loop. Ben originally checked in code that walked each entry in a loop vector (from inner to outer loop) while updating some state. So after this loop, the state was set for the outermost loop. JF changed this to always use the inner-most loop meeting the condition that it has executed. I don't know which heuristic is better, but let's try going back to Ben's to see what happens. We should also pick our heuristic based on actually understanding what it's doing.
Comment 2 Saam Barati 2018-06-01 18:49:07 PDT
There are also some asserts in this code that make no sense and are causing some crashes.
Comment 3 Saam Barati 2018-06-01 18:49:35 PDT
<rdar://problem/38449540>
Comment 4 Saam Barati 2018-06-04 09:56:59 PDT
Created attachment 341907 [details]
WIP

Was neutral when running benchmarks yesterday.
Comment 5 Saam Barati 2018-06-04 14:30:10 PDT
Created attachment 341918 [details]
patch
Comment 6 Saam Barati 2018-06-04 15:11:46 PDT
Comment on attachment 341918 [details]
patch

Nice. My own test is crashing. Will investigate.
Comment 7 Saam Barati 2018-06-04 15:51:34 PDT
Created attachment 341929 [details]
patch
Comment 8 Saam Barati 2018-06-07 00:03:41 PDT
Seems neutral

Sunspider:
                                      og                      change                                      

3d-cube                         4.5335+-0.1077     ?      4.6629+-0.0807        ? might be 1.0285x slower
3d-morph                        5.2291+-0.1263            5.1328+-0.0658          might be 1.0188x faster
3d-raytrace                     4.7025+-0.1467            4.6687+-0.1229        
access-binary-trees             2.2823+-0.2093            2.1710+-0.1428          might be 1.0512x faster
access-fannkuch                 5.5396+-0.1272     ?      5.5463+-0.0916        ?
access-nbody                    2.7476+-0.1233            2.7282+-0.0779        
access-nsieve                   3.1159+-0.1296     ?      3.1821+-0.1498        ? might be 1.0212x slower
bitops-3bit-bits-in-byte        1.4203+-0.1298            1.3092+-0.0808          might be 1.0849x faster
bitops-bits-in-byte             2.9642+-0.0848     ?      3.0125+-0.1396        ? might be 1.0163x slower
bitops-bitwise-and              2.1775+-0.0766     ?      2.2126+-0.0845        ? might be 1.0161x slower
bitops-nsieve-bits              3.3507+-0.0700     ?      3.3660+-0.0982        ?
controlflow-recursive           2.3530+-0.1136            2.2644+-0.0733          might be 1.0391x faster
crypto-aes                      4.0562+-0.0772            4.0416+-0.0827        
crypto-md5                      2.7751+-0.1757            2.6402+-0.1053          might be 1.0511x faster
crypto-sha1                     2.8923+-0.1846            2.8741+-0.1434        
date-format-tofte               7.0804+-0.1632            7.0679+-0.1263        
date-format-xparb               5.4023+-0.0999            5.3899+-0.0985        
math-cordic                     2.9918+-0.1129            2.9692+-0.0900        
math-partial-sums               4.2851+-0.1162            4.2399+-0.1400          might be 1.0107x faster
math-spectral-norm              2.0139+-0.0494     ?      2.0525+-0.0791        ? might be 1.0192x slower
regexp-dna                      6.5818+-0.1474            6.5427+-0.1504        
string-base64                   3.9331+-0.1251     ?      3.9738+-0.0988        ? might be 1.0103x slower
string-fasta                    5.8509+-0.1165            5.8083+-0.1007        
string-tagcloud                 8.4712+-0.1865            8.4419+-0.1233        
string-unpack-code             18.1376+-0.2609     ?     18.2486+-0.2149        ?
string-validate-input           4.1932+-0.1413            4.0808+-0.1479          might be 1.0276x faster

<arithmetic>                    4.5800+-0.0283            4.5626+-0.0240          might be 1.0038x faster


Kraken:
                                              og                      change                                      

ai-astar                                86.353+-1.122             86.272+-0.970         
audio-beat-detection                    38.702+-0.824             37.905+-0.207           might be 1.0210x faster
audio-dft                               96.943+-0.911             96.321+-0.860         
audio-fft                               28.677+-0.319             28.555+-0.089         
audio-oscillator                        45.035+-0.874      ?      45.211+-0.740         ?
imaging-darkroom                        59.196+-0.420      ?      59.769+-0.621         ?
imaging-desaturate                      46.241+-1.354      ?      46.323+-1.287         ?
imaging-gaussian-blur                   61.823+-1.908      ?      61.889+-1.298         ?
json-parse-financial                    29.924+-0.602             29.341+-0.801           might be 1.0199x faster
json-stringify-tinderbox                19.285+-0.580      ?      19.461+-0.596         ?
stanford-crypto-aes                     45.266+-0.595             44.823+-0.694         
stanford-crypto-ccm                     40.188+-1.709      ?      42.222+-1.663         ? might be 1.0506x slower
stanford-crypto-pbkdf2                  59.240+-1.337             58.292+-0.998           might be 1.0163x faster
stanford-crypto-sha256-iterative        18.484+-0.198      ?      18.674+-0.319         ? might be 1.0103x slower

<arithmetic>                            48.240+-0.227             48.218+-0.196           might be 1.0004x faster

Octane:
                               og                      change                                      

encrypt                 0.14226+-0.00107    ?     0.14250+-0.00157       ?
decrypt                 2.52060+-0.05236          2.50127+-0.01857       
deltablue      x2       0.13210+-0.00695    ?     0.13668+-0.00739       ? might be 1.0347x slower
earley                  0.27072+-0.00190          0.27002+-0.00160       
boyer                   4.08958+-0.03067          4.06981+-0.03893       
navier-stokes  x2       4.71916+-0.03640    ?     4.72684+-0.02003       ?
raytrace       x2       0.69958+-0.00671          0.69796+-0.00355       
richards       x2       0.07551+-0.00057    ?     0.07637+-0.00087       ? might be 1.0114x slower
splay          x2       0.23196+-0.00337          0.22936+-0.00304         might be 1.0113x faster
regexp         x2      17.02444+-0.26960    ?    17.05077+-0.24658       ?
pdfjs          x2      32.98554+-0.27433         32.85914+-0.15934       
mandreel       x2      42.67114+-0.50024         42.62337+-0.48825       
gbemu          x2      30.25326+-0.37371         30.21056+-0.37130       
closure                 0.51465+-0.00412          0.51294+-0.00347       
jquery                  6.89535+-0.05234    ?     6.93038+-0.03677       ?
box2d          x2       8.17766+-0.04043          8.16934+-0.03734       
zlib           x2     290.31449+-1.30404    ?   291.47037+-1.64223       ?
typescript     x2     616.77960+-11.47325       616.16953+-6.53855       
splay-latency           2.29834+-0.15146          2.25637+-0.15124         might be 1.0186x faster

<geometric>             4.52523+-0.02523    ?     4.52979+-0.02359       ? might be 1.0010x slower
Comment 9 Saam Barati 2018-06-07 00:12:00 PDT
Created attachment 342131 [details]
patch for landing
Comment 10 WebKit Commit Bot 2018-06-07 01:45:05 PDT
Comment on attachment 342131 [details]
patch for landing

Clearing flags on attachment: 342131

Committed r232578: <https://trac.webkit.org/changeset/232578>
Comment 11 WebKit Commit Bot 2018-06-07 01:45:06 PDT
All reviewed patches have been landed.  Closing bug.