Bug 72312

Summary: DFG code blocks that have speculation checks on objects should refer to those objects weakly
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Nobody <webkit-unassigned>
Status: RESOLVED DUPLICATE    
Severity: Normal CC: ggaren
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on: 72313, 72346, 72367, 72467, 72563    
Bug Blocks: 72311    
Attachments:
Description Flags
the patch
none
the patch
none
the patch none

Description Filip Pizlo 2011-11-14 14:29:25 PST
If the DFG performs optimizations on things like get_by_id or call inlining, it will generate code that refers to objects that the source code did not have strong references to.  Instead, the references should be weak, so that if those objects are otherwise dead, the DFG's code does not keep them alive longer than necessary.
Comment 1 Filip Pizlo 2011-11-16 21:05:37 PST
Created attachment 115517 [details]
the patch

Still working on this, but it appears to work.  In the sense that it doesn't cause massive failures.  The mechanism should not kick in yet, though, since any weak references the DFG creates will be strongly referenced from the old JIT's code block for the same executable.
Comment 2 Filip Pizlo 2011-11-16 22:53:34 PST
This passes all tests and does not affect performance.



[pizlo@nitroflex bencher] ./bencher TipOfTree:/Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc DFGWeakFixpoint:/Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc --remote bigmac,oldmac --local
Packaging VM builds for remote hosts...
Sending VM builds to bigmac...
Running on bigmac...
376/376                                                              
Generating benchmark report at TipOfTree_DFGWeakFixpoint_SunSpiderV8Kraken_20111116_2231_benchReport.txt

Benchmark report for SunSpider, V8, and Kraken on bigmac.local (MacPro5,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc (r100556)
"DFGWeakFixpoint" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc (r100556)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime()
function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in
milliseconds.

                                            TipOfTree            DFGWeakFixpoint                                 
SunSpider:
   3d-cube                                7.4025+-0.0318    ?     7.4205+-0.0591       ?
   3d-morph                               8.5759+-0.1344    ^     8.3710+-0.0468       ^ definitely 1.0245x faster
   3d-raytrace                            7.7128+-0.0510          7.7036+-0.0581       
   access-binary-trees                    1.5980+-0.0042    ?     1.5990+-0.0170       ?
   access-fannkuch                        7.5230+-0.0142    ?     7.5241+-0.0135       ?
   access-nbody                           4.1804+-0.0190          4.1720+-0.0057       
   access-nsieve                          3.1509+-0.0438    ?     3.1579+-0.0404       ?
   bitops-3bit-bits-in-byte               1.2410+-0.0135          1.2386+-0.0150       
   bitops-bits-in-byte                    4.9122+-0.0111    ?     4.9190+-0.0144       ?
   bitops-bitwise-and                     3.2869+-0.0055          3.2857+-0.0082       
   bitops-nsieve-bits                     5.6567+-0.0421    ?     5.6683+-0.0517       ?
   controlflow-recursive                  2.2911+-0.0140    ?     2.2963+-0.0251       ?
   crypto-aes                             7.1446+-0.0363    ?     7.1503+-0.0403       ?
   crypto-md5                             2.5014+-0.0194    ?     2.5064+-0.0112       ?
   crypto-sha1                            2.1775+-0.0164    ?     2.1810+-0.0207       ?
   date-format-tofte                     10.6195+-0.0757    ?    10.6559+-0.0350       ?
   date-format-xparb                     11.0232+-0.1290         10.7497+-0.1798         might be 1.0254x faster
   math-cordic                            7.1711+-0.0601          7.1227+-0.0152       
   math-partial-sums                     10.4554+-0.0438         10.4411+-0.0189       
   math-spectral-norm                     2.6151+-0.0258          2.6041+-0.0217       
   regexp-dna                            12.9846+-0.0596         12.9728+-0.0542       
   string-base64                          3.9285+-0.0148    ?     3.9315+-0.0093       ?
   string-fasta                           7.3736+-0.0234    !     7.4478+-0.0469       ! definitely 1.0101x slower
   string-tagcloud                       12.9813+-0.0696    ?    13.0143+-0.0725       ?
   string-unpack-code                    22.2271+-0.0721         22.0984+-0.0928       
   string-validate-input                  5.7264+-0.2076          5.6436+-0.0373         might be 1.0147x faster

   <arithmetic> *                         6.7869+-0.0206          6.7644+-0.0206         might be 1.0033x faster
   <geometric>                            5.4023+-0.0186          5.3910+-0.0189         might be 1.0021x faster
   <harmonic>                             4.2005+-0.0174          4.1965+-0.0198         might be 1.0010x faster

                                            TipOfTree            DFGWeakFixpoint                                 
V8:
   crypto                                77.4661+-0.2099         77.2271+-0.2656       
   deltablue                            168.4244+-0.9719    ?   170.2991+-0.9884       ? might be 1.0111x slower
   earley-boyer                         104.4967+-1.4653    ?   104.6927+-1.4071       ?
   raytrace                              62.8356+-0.5364    ?    63.5544+-0.4021       ? might be 1.0114x slower
   regexp                               124.3039+-0.3228        123.5584+-0.5526       
   richards                             137.9984+-1.1186        137.4574+-0.2484       
   splay                                 90.2481+-1.0749    ?    90.3772+-1.1051       ?

   <arithmetic>                         109.3962+-0.4163    ?   109.5952+-0.3137       ? might be 1.0018x slower
   <geometric> *                        104.1090+-0.4122    ?   104.2988+-0.2974       ? might be 1.0018x slower
   <harmonic>                            98.9935+-0.4152    ?    99.2217+-0.2908       ? might be 1.0023x slower

                                            TipOfTree            DFGWeakFixpoint                                 
Kraken:
   ai-astar                             827.1420+-0.4496        817.6509+-10.4008        might be 1.0116x faster
   audio-beat-detection                 204.3278+-1.0583        203.6777+-0.6133       
   audio-dft                            260.7611+-2.6100        259.3759+-1.9791       
   audio-fft                            133.7633+-1.0539        133.0065+-0.6485       
   audio-oscillator                     293.5864+-0.5500        293.4817+-0.4592       
   imaging-darkroom                     338.4275+-5.2683        338.0507+-4.9685       
   imaging-desaturate                   240.7667+-0.0978    ?   240.8129+-0.0949       ?
   imaging-gaussian-blur                620.4826+-0.2404    ?   620.6064+-0.3426       ?
   json-parse-financial                  73.7686+-0.2356    !    74.3384+-0.0356       ! definitely 1.0077x slower
   json-stringify-tinderbox              86.8172+-0.6073         86.5426+-0.1903       
   stanford-crypto-aes                  117.9453+-1.3677        116.9190+-0.3102       
   stanford-crypto-ccm                  117.2711+-2.1838        115.7353+-0.7542         might be 1.0133x faster
   stanford-crypto-pbkdf2               233.0118+-0.9260    ^   231.2346+-0.6784       ^ definitely 1.0077x faster
   stanford-crypto-sha256-iterative      92.7507+-0.2058    ?    92.8232+-0.2092       ?

   <arithmetic> *                       260.0587+-0.4855        258.8754+-0.8583         might be 1.0046x faster
   <geometric>                          200.3032+-0.5880        199.5797+-0.3308         might be 1.0036x faster
   <harmonic>                           161.4670+-0.6430        161.0342+-0.1887         might be 1.0027x faster

                                            TipOfTree            DFGWeakFixpoint                                 
All benchmarks:
   <arithmetic>                          97.5118+-0.1409         97.1765+-0.2359         might be 1.0035x faster
   <geometric>                           24.6230+-0.0601         24.5745+-0.0557         might be 1.0020x faster
   <harmonic>                             7.4048+-0.0301          7.3978+-0.0342         might be 1.0009x faster

                                            TipOfTree            DFGWeakFixpoint                                 
Geomean of preferred means:
   <scaled-result>                       56.8516+-0.1218         56.7367+-0.1020         might be 1.0020x faster

Sending VM builds to oldmac...
Running on oldmac...
376/376                                                              
Generating benchmark report at TipOfTree_DFGWeakFixpoint_SunSpiderV8Kraken_20111116_2236_benchReport.txt

Benchmark report for SunSpider, V8, and Kraken on oldmac.local (MacPro4,1).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc (r100556)
"DFGWeakFixpoint" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc (r100556)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime()
function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in
milliseconds.

                                            TipOfTree            DFGWeakFixpoint                                 
SunSpider:
   3d-cube                                8.9939+-0.0825          8.9444+-0.0435       
   3d-morph                              10.5753+-0.1637    ^    10.1176+-0.0361       ^ definitely 1.0452x faster
   3d-raytrace                            9.2619+-0.1168          9.2029+-0.0666       
   access-binary-trees                    1.9065+-0.0090    ?     1.9100+-0.0080       ?
   access-fannkuch                        9.0863+-0.0115          9.0832+-0.0096       
   access-nbody                           5.0490+-0.0226          5.0372+-0.0054       
   access-nsieve                          3.7179+-0.0075          3.7178+-0.0181       
   bitops-3bit-bits-in-byte               1.4891+-0.0091    ?     1.4896+-0.0123       ?
   bitops-bits-in-byte                    5.9343+-0.0082          5.9278+-0.0139       
   bitops-bitwise-and                     3.9691+-0.0086    ?     3.9845+-0.0265       ?
   bitops-nsieve-bits                     6.8083+-0.0471    ?     6.8819+-0.0630       ? might be 1.0108x slower
   controlflow-recursive                  2.7662+-0.0176          2.7626+-0.0219       
   crypto-aes                             8.6018+-0.0531    ?     8.6409+-0.0500       ?
   crypto-md5                             2.9910+-0.0141    ?     2.9935+-0.0204       ?
   crypto-sha1                            2.6066+-0.0201    ?     2.6198+-0.0335       ?
   date-format-tofte                     12.9102+-0.0856    ?    13.1692+-0.1851       ? might be 1.0201x slower
   date-format-xparb                     13.8851+-0.4087    ^    13.2598+-0.1603       ^ definitely 1.0472x faster
   math-cordic                            8.6070+-0.0225    ?     8.6158+-0.0197       ?
   math-partial-sums                     12.5914+-0.0258    ?    12.6221+-0.0323       ?
   math-spectral-norm                     3.1252+-0.0049    ?     3.1270+-0.0066       ?
   regexp-dna                            15.7101+-0.0735         15.6987+-0.0602       
   string-base64                          4.7604+-0.0416          4.7557+-0.0330       
   string-fasta                           8.9272+-0.0259    ?     8.9639+-0.0251       ?
   string-tagcloud                       15.8031+-0.0985    ?    15.8938+-0.0961       ?
   string-unpack-code                    27.4174+-0.0675    ?    27.4486+-0.0542       ?
   string-validate-input                  6.8051+-0.0814    ?     6.8241+-0.0696       ?

   <arithmetic> *                         8.2423+-0.0338          8.2189+-0.0274         might be 1.0028x faster
   <geometric>                            6.5256+-0.0248          6.5147+-0.0231         might be 1.0017x faster
   <harmonic>                             5.0506+-0.0188          5.0500+-0.0194         might be 1.0001x faster

                                            TipOfTree            DFGWeakFixpoint                                 
V8:
   crypto                                93.4672+-0.3147         93.2981+-0.3405       
   deltablue                            202.9885+-1.2812    ?   204.3343+-0.5945       ?
   earley-boyer                         126.5185+-1.7474        125.8953+-1.3421       
   raytrace                              75.8272+-0.4731    ?    76.3673+-0.6252       ?
   regexp                               148.4860+-0.3750        147.8853+-0.2487       
   richards                             165.9017+-0.3001    ?   166.2819+-0.4990       ?
   splay                                106.3137+-0.7282    ?   106.3880+-1.1149       ?

   <arithmetic>                         131.3575+-0.3553    ?   131.4929+-0.3208       ? might be 1.0010x slower
   <geometric> *                        124.9888+-0.3483    ?   125.0944+-0.3708       ? might be 1.0008x slower
   <harmonic>                           118.8639+-0.3498    ?   118.9853+-0.4081       ? might be 1.0010x slower

                                            TipOfTree            DFGWeakFixpoint                                 
Kraken:
   ai-astar                             895.1609+-0.6140    ?   895.5434+-0.7446       ?
   audio-beat-detection                 250.0935+-1.7636        248.3998+-0.6289       
   audio-dft                            314.9273+-2.6615        314.3998+-2.6952       
   audio-fft                            162.4365+-1.1958    ?   162.7427+-1.1187       ?
   audio-oscillator                     351.4217+-1.3468    !   358.3105+-3.1365       ! definitely 1.0196x slower
   imaging-darkroom                     403.5768+-5.8311    ?   405.5841+-6.4234       ?
   imaging-desaturate                   291.2551+-0.0914        291.1852+-0.0766       
   imaging-gaussian-blur                750.6847+-0.1627    ?   750.7431+-0.1130       ?
   json-parse-financial                  90.0316+-0.3616    ?    90.6249+-0.3743       ?
   json-stringify-tinderbox             105.1410+-0.7503        105.0963+-0.5095       
   stanford-crypto-aes                  139.6053+-0.3937    !   145.2645+-3.7119       ! definitely 1.0405x slower
   stanford-crypto-ccm                  137.8774+-0.7153    ?   138.7943+-0.7749       ?
   stanford-crypto-pbkdf2               281.4804+-2.2261        281.2057+-2.3171       
   stanford-crypto-sha256-iterative     112.7609+-0.1899        112.6113+-0.2735       

   <arithmetic> *                       306.1752+-0.5808    ?   307.1790+-0.6449       ? might be 1.0033x slower
   <geometric>                          239.5798+-0.4921    !   240.7382+-0.6096       ! definitely 1.0048x slower
   <harmonic>                           194.4552+-0.4400    !   195.5881+-0.5319       ! definitely 1.0058x slower

                                            TipOfTree            DFGWeakFixpoint                                 
All benchmarks:
   <arithmetic>                         115.3246+-0.1789    ?   115.6308+-0.1878       ? might be 1.0027x slower
   <geometric>                           29.6285+-0.0779    ?    29.6475+-0.0724       ? might be 1.0006x slower
   <harmonic>                             8.9036+-0.0327          8.9033+-0.0336         might be 1.0000x faster

                                            TipOfTree            DFGWeakFixpoint                                 
Geomean of preferred means:
   <scaled-result>                       68.0707+-0.1521    ?    68.0999+-0.1367       ? might be 1.0004x slower

Running locally...
376/376                                                              
Generating benchmark report at TipOfTree_DFGWeakFixpoint_SunSpiderV8Kraken_20111116_2238_benchReport.txt

Benchmark report for SunSpider, V8, and Kraken on nitroflex.local (MacBookPro8,2).

VMs tested:
"TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc (r100556)
"DFGWeakFixpoint" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc (r100556)

Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample
measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime()
function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in
milliseconds.

                                            TipOfTree            DFGWeakFixpoint                                 
SunSpider:
   3d-cube                                6.9460+-0.1867          6.7194+-0.1208         might be 1.0337x faster
   3d-morph                               7.6314+-0.1379    ?     7.6325+-0.1602       ?
   3d-raytrace                            7.2052+-0.1985          7.1258+-0.1072         might be 1.0111x faster
   access-binary-trees                    1.5625+-0.0849          1.5398+-0.0517         might be 1.0147x faster
   access-fannkuch                        6.1227+-0.0908          6.1188+-0.0846       
   access-nbody                           3.4842+-0.0867          3.3981+-0.0667         might be 1.0254x faster
   access-nsieve                          2.4985+-0.0530    ?     2.5069+-0.0756       ?
   bitops-3bit-bits-in-byte               1.2494+-0.0190    ?     1.2716+-0.0358       ? might be 1.0178x slower
   bitops-bits-in-byte                    2.3695+-0.0817          2.3440+-0.0654         might be 1.0109x faster
   bitops-bitwise-and                     3.4721+-0.0672          3.4238+-0.0743         might be 1.0141x faster
   bitops-nsieve-bits                     5.1600+-0.0369    ?     5.2602+-0.0785       ? might be 1.0194x slower
   controlflow-recursive                  2.0428+-0.0329    ?     2.0540+-0.0554       ?
   crypto-aes                             6.9739+-0.1714          6.9190+-0.1607       
   crypto-md5                             2.3774+-0.0536          2.2922+-0.0375         might be 1.0372x faster
   crypto-sha1                            2.0697+-0.0512          2.0302+-0.0488         might be 1.0195x faster
   date-format-tofte                      9.9694+-0.2594    ?    10.0733+-0.1845       ? might be 1.0104x slower
   date-format-xparb                      9.9780+-0.1308    !    10.5261+-0.1992       ! definitely 1.0549x slower
   math-cordic                            6.3460+-0.0750    ?     6.3696+-0.1468       ?
   math-partial-sums                      7.4234+-0.1088          7.3576+-0.1024       
   math-spectral-norm                     2.3826+-0.0537          2.2977+-0.0415         might be 1.0369x faster
   regexp-dna                            10.7769+-0.1826    ?    10.9456+-0.1406       ? might be 1.0157x slower
   string-base64                          3.8982+-0.1135    ?     3.8995+-0.1257       ?
   string-fasta                           6.5992+-0.1566          6.5441+-0.1320       
   string-tagcloud                       11.5205+-0.2195         11.4117+-0.3031       
   string-unpack-code                    20.1020+-0.2994         19.8269+-0.3950         might be 1.0139x faster
   string-validate-input                  5.2314+-0.0889    ?     5.3145+-0.1251       ? might be 1.0159x slower

   <arithmetic> *                         5.9767+-0.0226          5.9693+-0.0255         might be 1.0012x faster
   <geometric>                            4.7696+-0.0276          4.7506+-0.0265         might be 1.0040x faster
   <harmonic>                             3.7707+-0.0358          3.7468+-0.0301         might be 1.0064x faster

                                            TipOfTree            DFGWeakFixpoint                                 
V8:
   crypto                                69.5986+-0.3279    ?    69.7588+-0.3738       ?
   deltablue                            149.5710+-1.8420    ?   149.6216+-1.6266       ?
   earley-boyer                          85.3260+-0.6411         84.8371+-0.8974       
   raytrace                              55.7197+-0.3247         55.5022+-0.4194       
   regexp                               102.2668+-0.3653        101.9430+-0.3022       
   richards                             116.7542+-0.4597    ?   117.0623+-0.7954       ?
   splay                                 70.5089+-0.7751    ?    71.5232+-1.3310       ? might be 1.0144x slower

   <arithmetic>                          92.8207+-0.3544    ?    92.8926+-0.4306       ? might be 1.0008x slower
   <geometric> *                         88.2391+-0.3096    ?    88.3206+-0.4070       ? might be 1.0009x slower
   <harmonic>                            84.0796+-0.2990    ?    84.1629+-0.4063       ? might be 1.0010x slower

                                            TipOfTree            DFGWeakFixpoint                                 
Kraken:
   ai-astar                             485.9056+-4.6336    ?   486.6909+-3.7186       ?
   audio-beat-detection                 189.0524+-2.1896        186.4712+-0.5929         might be 1.0138x faster
   audio-dft                            261.3420+-2.3663    ?   262.4310+-2.9045       ?
   audio-fft                            121.7954+-0.6730    ?   122.6599+-0.8257       ?
   audio-oscillator                     248.7274+-0.8064    ?   248.9647+-1.9696       ?
   imaging-darkroom                     299.4438+-6.1518        296.2831+-5.4980         might be 1.0107x faster
   imaging-desaturate                   221.6356+-0.6927    ?   221.8542+-0.6321       ?
   imaging-gaussian-blur                544.9619+-2.2368    ?   547.5263+-3.7770       ?
   json-parse-financial                  57.3320+-0.4201         57.1369+-0.2277       
   json-stringify-tinderbox              72.7738+-0.9726         72.2742+-0.3048       
   stanford-crypto-aes                   94.6696+-0.4103         94.5581+-0.7075       
   stanford-crypto-ccm                   98.3032+-0.4316    ?    98.3397+-0.6751       ?
   stanford-crypto-pbkdf2               186.2725+-0.6082    !   187.6683+-0.7824       ! definitely 1.0075x slower
   stanford-crypto-sha256-iterative      79.5267+-0.2744    ?    79.6907+-0.3087       ?

   <arithmetic> *                       211.5530+-0.5515    ?   211.6107+-0.6581       ? might be 1.0003x slower
   <geometric>                          168.6919+-0.2784        168.6176+-0.3932         might be 1.0004x faster
   <harmonic>                           135.9649+-0.2866        135.8237+-0.2465         might be 1.0010x faster

                                            TipOfTree            DFGWeakFixpoint                                 
All benchmarks:
   <arithmetic>                          80.1464+-0.1555    ?    80.1702+-0.1736       ? might be 1.0003x slower
   <geometric>                           21.3056+-0.0684         21.2587+-0.0709         might be 1.0022x faster
   <harmonic>                             6.6370+-0.0615          6.5960+-0.0518         might be 1.0062x faster

                                            TipOfTree            DFGWeakFixpoint                                 
Geomean of preferred means:
   <scaled-result>                       48.1403+-0.0747         48.1397+-0.1053         might be 1.0000x faster

[pizlo@nitroflex bencher]
Comment 3 Filip Pizlo 2011-11-17 00:00:20 PST
Created attachment 115537 [details]
the patch

Fixed 32-bit build.  All tests pass, performance is neutral.  Verified with printf's that the code was actually doing things; it was, and as expected: most code blocks find all their weak references to be live on the first pass, and all code blocks eventually always find all of their weak references to be live in the benchmarks we track.  The latter is true mainly because we still treat references from inline caches to be strong.
Comment 4 Filip Pizlo 2011-11-17 00:01:32 PST
Created attachment 115538 [details]
the patch

Removed bogus comment.
Comment 5 Oliver Hunt 2011-11-17 08:30:33 PST
Comment on attachment 115538 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=115538&action=review

> Source/JavaScriptCore/bytecode/CodeBlock.cpp:1685
> +    // If some weak references are dead, then this fixpoint iteration was
> +    // unsuccessful.
> +    if (!allAreLiveSoFar)
> +        return;

If we have weak references that are dead, where are we either clearing them, or removing the usage?  My reading of this says that we'll end up maintaining references to dead objects, that may subsequently become live again (through new allocations) -- this does not currently happen as all values are forced to be live via the constant tables, but if that were not the case this seems like it would be incorrect.  What have I missed?
Comment 6 Filip Pizlo 2011-11-17 13:45:29 PST
(In reply to comment #5)
> (From update of attachment 115538 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=115538&action=review
> 
> > Source/JavaScriptCore/bytecode/CodeBlock.cpp:1685
> > +    // If some weak references are dead, then this fixpoint iteration was
> > +    // unsuccessful.
> > +    if (!allAreLiveSoFar)
> > +        return;
> 
> If we have weak references that are dead, where are we either clearing them, or removing the usage?  My reading of this says that we'll end up maintaining references to dead objects, that may subsequently become live again (through new allocations) -- this does not currently happen as all values are forced to be live via the constant tables, but if that were not the case this seems like it would be incorrect.  What have I missed?

If we ever hit this point, then we would have also registered an unconditional finalizer.  If we finish GC without proving that the code block's weak references are live, then the unconditional finalizer will jettison the code block.

Weak references are never cleared.  If the code is executing (i.e. it's on the stack), we strongly mark its weak references.  If it's not executing, then we can jettison immediately during GC, so if any of its weak references are dead we just jettison and the code block dies immediately.
Comment 7 Geoffrey Garen 2011-11-17 13:53:24 PST
Comment on attachment 115538 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=115538&action=review

Thinking abstractly, I'm not sure this is the right strategy for making optimizations weakly referenced. But perhaps there's a concrete application I'm not considering, which validates this strategy. I would have a much easier time reasoning about this code if I could look at a benchmark or test case that demonstrated the problem we're trying to solve.

> Source/JavaScriptCore/bytecode/CodeBlock.cpp:1623
> +    // Add a weak reference harvester if we have not reached fixpoint and need to
> +    // run again.

To turn this into a "why" comment, I would write: GC doesn't have enough information yet for us to decide whether to keep our DFG data, so we need to register a handler to run again at the end of GC, when more information is available.

> Source/JavaScriptCore/bytecode/CodeBlock.cpp:1657
> +                // If the following three things are live, then the target of the
> +                // transition is also live:
> +                // - This code block. We know it's live already because otherwise
> +                //   we wouldn't be scanning ourselves.
> +                // - The code origin of the transition. Transitions may arise from
> +                //   code that was inlined. They are not relevant if the user's
> +                //   object that is required for the inlinee to run is no longer
> +                //   live.
> +                // - The source of the transition. The transition checks if some
> +                //   heap location holds the source, and if so, stores the target.
> +                //   Hence the source must be live for the transition to be live.

Why do we want structure transitions to perform checks on behalf of inlined functions? It would be much more natural for functions themselves to act as first-class weak citizens, such that a linked/inlined function disappearing would be sufficient cause to blow away an optimized code block.

More broadly, I don't understand why structure transitions get special treatment here. Why are they distinct from other weakly referenced optimizations in a DFG code block?

I'm concerned that an A->B transition keeps B alive if A is alive. This is the opposite of the structure marking strategy, which says that A should stay alive only if B is alive. The benefit of the structure marking strategy is that it ensures that a chain of structures can be retired if it becomes stale. In the case of object literals, A is the default empty object structure, which is permanent, so, with this patch's strategy, all transitions from an object literal are permanent. In the case of a constructor function, A is the function's .prototype's inheritorID, which will have been marked by the same object marking the CodeBlock, so, with this patch's strategy, the transition once again becomes permanent. (To clarify, by "permanent" I mean, "will live as long as a strong reference", not necessarily "memory leak".)

Maybe you're trying to make sure that a hot constructor CodeBlock survives even if the objects it constructs tend to be garbage? If so, I'm still not sure why the transitions are special compared to other weakly optimized assumptions. It seems like any weakly optimized assumption could pertain to a typically short-lived object, and cause jettison churn. Perhaps a better strategy would be an explicit jettison churn guard, instead.

> Source/JavaScriptCore/bytecode/CodeBlock.h:1111
> +            // Am I a DFG code block? If not, then I'm live if I am being scanned.
> +            if (!m_dfgData)
> +                return true;
> +            
> +            // If I am a DFG code block, then am I currently executing? If so,
> +            // then I'm definitely live.
> +            if (m_dfgData->mayBeExecuting)

I don't think these "what" comments add anything. A "why" comment explaining why baseline JIT code always assumes itself to be live might be helpful.
Comment 8 Filip Pizlo 2011-11-17 14:03:29 PST
(In reply to comment #7)
> (From update of attachment 115538 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=115538&action=review
> 
> Thinking abstractly, I'm not sure this is the right strategy for making optimizations weakly referenced. But perhaps there's a concrete application I'm not considering, which validates this strategy. I would have a much easier time reasoning about this code if I could look at a benchmark or test case that demonstrated the problem we're trying to solve.

// One place in your code:
function foo(f) {
    return f(42);
}

// Another place in your code:
function bar() {
    foo(someWindow.baz);
}

There's a good chance t hat someWindow.baz will be inlined into foo().  Now so long as foo() is alive, someWindow is alive.

> 
> > Source/JavaScriptCore/bytecode/CodeBlock.cpp:1623
> > +    // Add a weak reference harvester if we have not reached fixpoint and need to
> > +    // run again.
> 
> To turn this into a "why" comment, I would write: GC doesn't have enough information yet for us to decide whether to keep our DFG data, so we need to register a handler to run again at the end of GC, when more information is available.
> 
> > Source/JavaScriptCore/bytecode/CodeBlock.cpp:1657
> > +                // If the following three things are live, then the target of the
> > +                // transition is also live:
> > +                // - This code block. We know it's live already because otherwise
> > +                //   we wouldn't be scanning ourselves.
> > +                // - The code origin of the transition. Transitions may arise from
> > +                //   code that was inlined. They are not relevant if the user's
> > +                //   object that is required for the inlinee to run is no longer
> > +                //   live.
> > +                // - The source of the transition. The transition checks if some
> > +                //   heap location holds the source, and if so, stores the target.
> > +                //   Hence the source must be live for the transition to be live.
> 
> Why do we want structure transitions to perform checks on behalf of inlined functions? It would be much more natural for functions themselves to act as first-class weak citizens, such that a linked/inlined function disappearing would be sufficient cause to blow away an optimized code block.

They don't do anything on behalf of functions.  Functions are also separately weak referenced.  Transitions just honor the fact that the transition isn't going to happen unless the owner is alive.

> 
> More broadly, I don't understand why structure transitions get special treatment here. Why are they distinct from other weakly referenced optimizations in a DFG code block?

Say you have Foo used as constructor:

function Foo() {
   this.a = 1;
   this.b = 2;
   this.c = 3;
}

If during GC, the structure corresponding to empty->a->b->c may not be reachable from anywhere but Foo(), because Foo() hadn't run recently, or did, but it's result was short-lived.  But Foo() is alive.  So we want to keep empty->a->b->c alive as well, because otherwise, we'd have to throw away Foo().

Since Foo() may only exist in inlined form, the inline owner must look at the structure transitions of each of these assignments and check if both the owner (Foo) is alive, and the predecessor in the transition is alive, before deciding that the successor is alive.

> 
> I'm concerned that an A->B transition keeps B alive if A is alive. This is the opposite of the structure marking strategy, which says that A should stay alive only if B is alive. The benefit of the structure marking strategy is that it ensures that a chain of structures can be retired if it becomes stale. In the case of object literals, A is the default empty object structure, which is permanent, so, with this patch's strategy, all transitions from an object literal are permanent. In the case of a constructor function, A is the function's .prototype's inheritorID, which will have been marked by the same object marking the CodeBlock, so, with this patch's strategy, the transition once again becomes permanent. (To clarify, by "permanent" I mean, "will live as long as a strong reference", not necessarily "memory leak".)

No, they won't be permanent.  Because the prerequisite to the A->B transition keeping B alive is that the code owner is also alive.

> 
> Maybe you're trying to make sure that a hot constructor CodeBlock survives even if the objects it constructs tend to be garbage? If so, I'm still not sure why the transitions are special compared to other weakly optimized assumptions. It seems like any weakly optimized assumption could pertain to a typically short-lived object, and cause jettison churn. Perhaps a better strategy would be an explicit jettison churn guard, instead.

Not sure I follow.

The transition logic is not about making CodeBlocks survive.  It's about making the transitions they use remain valid.  Then separately the CodeBlock will survive if its weak references survive, and some of its weak references will be from its transitions (all of the values in WeakReferenceTransition are also in the CodeBlock's weak reference list).

> 
> > Source/JavaScriptCore/bytecode/CodeBlock.h:1111
> > +            // Am I a DFG code block? If not, then I'm live if I am being scanned.
> > +            if (!m_dfgData)
> > +                return true;
> > +            
> > +            // If I am a DFG code block, then am I currently executing? If so,
> > +            // then I'm definitely live.
> > +            if (m_dfgData->mayBeExecuting)
> 
> I don't think these "what" comments add anything. A "why" comment explaining why baseline JIT code always assumes itself to be live might be helpful.
Comment 9 Filip Pizlo 2011-11-17 23:14:45 PST
Comment on attachment 115538 [details]
the patch

I think it's better if this patch is just combined with https://bugs.webkit.org/show_bug.cgi?id=72311
Comment 10 Geoffrey Garen 2011-11-18 15:31:24 PST

*** This bug has been marked as a duplicate of bug 72311 ***