72312 – DFG code blocks that have speculation checks on objects should refer to those objects weakly

Filip Pizlo

Reported 2011-11-14 14:29:25 PST

If the DFG performs optimizations on things like get_by_id or call inlining, it will generate code that refers to objects that the source code did not have strong references to. Instead, the references should be weak, so that if those objects are otherwise dead, the DFG's code does not keep them alive longer than necessary.

Filip Pizlo

Comment 1 2011-11-16 21:05:37 PST

Created attachment 115517 [details] the patch Still working on this, but it appears to work. In the sense that it doesn't cause massive failures. The mechanism should not kick in yet, though, since any weak references the DFG creates will be strongly referenced from the old JIT's code block for the same executable.

Filip Pizlo

Comment 2 2011-11-16 22:53:34 PST

This passes all tests and does not affect performance. [pizlo@nitroflex bencher] ./bencher TipOfTree:/Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc DFGWeakFixpoint:/Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc --remote bigmac,oldmac --local Packaging VM builds for remote hosts... Sending VM builds to bigmac... Running on bigmac... 376/376 Generating benchmark report at TipOfTree_DFGWeakFixpoint_SunSpiderV8Kraken_20111116_2231_benchReport.txt Benchmark report for SunSpider, V8, and Kraken on bigmac.local (MacPro5,1). VMs tested: "TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc (r100556) "DFGWeakFixpoint" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc (r100556) Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree DFGWeakFixpoint SunSpider: 3d-cube 7.4025+-0.0318 ? 7.4205+-0.0591 ? 3d-morph 8.5759+-0.1344 ^ 8.3710+-0.0468 ^ definitely 1.0245x faster 3d-raytrace 7.7128+-0.0510 7.7036+-0.0581 access-binary-trees 1.5980+-0.0042 ? 1.5990+-0.0170 ? access-fannkuch 7.5230+-0.0142 ? 7.5241+-0.0135 ? access-nbody 4.1804+-0.0190 4.1720+-0.0057 access-nsieve 3.1509+-0.0438 ? 3.1579+-0.0404 ? bitops-3bit-bits-in-byte 1.2410+-0.0135 1.2386+-0.0150 bitops-bits-in-byte 4.9122+-0.0111 ? 4.9190+-0.0144 ? bitops-bitwise-and 3.2869+-0.0055 3.2857+-0.0082 bitops-nsieve-bits 5.6567+-0.0421 ? 5.6683+-0.0517 ? controlflow-recursive 2.2911+-0.0140 ? 2.2963+-0.0251 ? crypto-aes 7.1446+-0.0363 ? 7.1503+-0.0403 ? crypto-md5 2.5014+-0.0194 ? 2.5064+-0.0112 ? crypto-sha1 2.1775+-0.0164 ? 2.1810+-0.0207 ? date-format-tofte 10.6195+-0.0757 ? 10.6559+-0.0350 ? date-format-xparb 11.0232+-0.1290 10.7497+-0.1798 might be 1.0254x faster math-cordic 7.1711+-0.0601 7.1227+-0.0152 math-partial-sums 10.4554+-0.0438 10.4411+-0.0189 math-spectral-norm 2.6151+-0.0258 2.6041+-0.0217 regexp-dna 12.9846+-0.0596 12.9728+-0.0542 string-base64 3.9285+-0.0148 ? 3.9315+-0.0093 ? string-fasta 7.3736+-0.0234 ! 7.4478+-0.0469 ! definitely 1.0101x slower string-tagcloud 12.9813+-0.0696 ? 13.0143+-0.0725 ? string-unpack-code 22.2271+-0.0721 22.0984+-0.0928 string-validate-input 5.7264+-0.2076 5.6436+-0.0373 might be 1.0147x faster <arithmetic> * 6.7869+-0.0206 6.7644+-0.0206 might be 1.0033x faster <geometric> 5.4023+-0.0186 5.3910+-0.0189 might be 1.0021x faster <harmonic> 4.2005+-0.0174 4.1965+-0.0198 might be 1.0010x faster TipOfTree DFGWeakFixpoint V8: crypto 77.4661+-0.2099 77.2271+-0.2656 deltablue 168.4244+-0.9719 ? 170.2991+-0.9884 ? might be 1.0111x slower earley-boyer 104.4967+-1.4653 ? 104.6927+-1.4071 ? raytrace 62.8356+-0.5364 ? 63.5544+-0.4021 ? might be 1.0114x slower regexp 124.3039+-0.3228 123.5584+-0.5526 richards 137.9984+-1.1186 137.4574+-0.2484 splay 90.2481+-1.0749 ? 90.3772+-1.1051 ? <arithmetic> 109.3962+-0.4163 ? 109.5952+-0.3137 ? might be 1.0018x slower <geometric> * 104.1090+-0.4122 ? 104.2988+-0.2974 ? might be 1.0018x slower <harmonic> 98.9935+-0.4152 ? 99.2217+-0.2908 ? might be 1.0023x slower TipOfTree DFGWeakFixpoint Kraken: ai-astar 827.1420+-0.4496 817.6509+-10.4008 might be 1.0116x faster audio-beat-detection 204.3278+-1.0583 203.6777+-0.6133 audio-dft 260.7611+-2.6100 259.3759+-1.9791 audio-fft 133.7633+-1.0539 133.0065+-0.6485 audio-oscillator 293.5864+-0.5500 293.4817+-0.4592 imaging-darkroom 338.4275+-5.2683 338.0507+-4.9685 imaging-desaturate 240.7667+-0.0978 ? 240.8129+-0.0949 ? imaging-gaussian-blur 620.4826+-0.2404 ? 620.6064+-0.3426 ? json-parse-financial 73.7686+-0.2356 ! 74.3384+-0.0356 ! definitely 1.0077x slower json-stringify-tinderbox 86.8172+-0.6073 86.5426+-0.1903 stanford-crypto-aes 117.9453+-1.3677 116.9190+-0.3102 stanford-crypto-ccm 117.2711+-2.1838 115.7353+-0.7542 might be 1.0133x faster stanford-crypto-pbkdf2 233.0118+-0.9260 ^ 231.2346+-0.6784 ^ definitely 1.0077x faster stanford-crypto-sha256-iterative 92.7507+-0.2058 ? 92.8232+-0.2092 ? <arithmetic> * 260.0587+-0.4855 258.8754+-0.8583 might be 1.0046x faster <geometric> 200.3032+-0.5880 199.5797+-0.3308 might be 1.0036x faster <harmonic> 161.4670+-0.6430 161.0342+-0.1887 might be 1.0027x faster TipOfTree DFGWeakFixpoint All benchmarks: <arithmetic> 97.5118+-0.1409 97.1765+-0.2359 might be 1.0035x faster <geometric> 24.6230+-0.0601 24.5745+-0.0557 might be 1.0020x faster <harmonic> 7.4048+-0.0301 7.3978+-0.0342 might be 1.0009x faster TipOfTree DFGWeakFixpoint Geomean of preferred means: <scaled-result> 56.8516+-0.1218 56.7367+-0.1020 might be 1.0020x faster Sending VM builds to oldmac... Running on oldmac... 376/376 Generating benchmark report at TipOfTree_DFGWeakFixpoint_SunSpiderV8Kraken_20111116_2236_benchReport.txt Benchmark report for SunSpider, V8, and Kraken on oldmac.local (MacPro4,1). VMs tested: "TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc (r100556) "DFGWeakFixpoint" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc (r100556) Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree DFGWeakFixpoint SunSpider: 3d-cube 8.9939+-0.0825 8.9444+-0.0435 3d-morph 10.5753+-0.1637 ^ 10.1176+-0.0361 ^ definitely 1.0452x faster 3d-raytrace 9.2619+-0.1168 9.2029+-0.0666 access-binary-trees 1.9065+-0.0090 ? 1.9100+-0.0080 ? access-fannkuch 9.0863+-0.0115 9.0832+-0.0096 access-nbody 5.0490+-0.0226 5.0372+-0.0054 access-nsieve 3.7179+-0.0075 3.7178+-0.0181 bitops-3bit-bits-in-byte 1.4891+-0.0091 ? 1.4896+-0.0123 ? bitops-bits-in-byte 5.9343+-0.0082 5.9278+-0.0139 bitops-bitwise-and 3.9691+-0.0086 ? 3.9845+-0.0265 ? bitops-nsieve-bits 6.8083+-0.0471 ? 6.8819+-0.0630 ? might be 1.0108x slower controlflow-recursive 2.7662+-0.0176 2.7626+-0.0219 crypto-aes 8.6018+-0.0531 ? 8.6409+-0.0500 ? crypto-md5 2.9910+-0.0141 ? 2.9935+-0.0204 ? crypto-sha1 2.6066+-0.0201 ? 2.6198+-0.0335 ? date-format-tofte 12.9102+-0.0856 ? 13.1692+-0.1851 ? might be 1.0201x slower date-format-xparb 13.8851+-0.4087 ^ 13.2598+-0.1603 ^ definitely 1.0472x faster math-cordic 8.6070+-0.0225 ? 8.6158+-0.0197 ? math-partial-sums 12.5914+-0.0258 ? 12.6221+-0.0323 ? math-spectral-norm 3.1252+-0.0049 ? 3.1270+-0.0066 ? regexp-dna 15.7101+-0.0735 15.6987+-0.0602 string-base64 4.7604+-0.0416 4.7557+-0.0330 string-fasta 8.9272+-0.0259 ? 8.9639+-0.0251 ? string-tagcloud 15.8031+-0.0985 ? 15.8938+-0.0961 ? string-unpack-code 27.4174+-0.0675 ? 27.4486+-0.0542 ? string-validate-input 6.8051+-0.0814 ? 6.8241+-0.0696 ? <arithmetic> * 8.2423+-0.0338 8.2189+-0.0274 might be 1.0028x faster <geometric> 6.5256+-0.0248 6.5147+-0.0231 might be 1.0017x faster <harmonic> 5.0506+-0.0188 5.0500+-0.0194 might be 1.0001x faster TipOfTree DFGWeakFixpoint V8: crypto 93.4672+-0.3147 93.2981+-0.3405 deltablue 202.9885+-1.2812 ? 204.3343+-0.5945 ? earley-boyer 126.5185+-1.7474 125.8953+-1.3421 raytrace 75.8272+-0.4731 ? 76.3673+-0.6252 ? regexp 148.4860+-0.3750 147.8853+-0.2487 richards 165.9017+-0.3001 ? 166.2819+-0.4990 ? splay 106.3137+-0.7282 ? 106.3880+-1.1149 ? <arithmetic> 131.3575+-0.3553 ? 131.4929+-0.3208 ? might be 1.0010x slower <geometric> * 124.9888+-0.3483 ? 125.0944+-0.3708 ? might be 1.0008x slower <harmonic> 118.8639+-0.3498 ? 118.9853+-0.4081 ? might be 1.0010x slower TipOfTree DFGWeakFixpoint Kraken: ai-astar 895.1609+-0.6140 ? 895.5434+-0.7446 ? audio-beat-detection 250.0935+-1.7636 248.3998+-0.6289 audio-dft 314.9273+-2.6615 314.3998+-2.6952 audio-fft 162.4365+-1.1958 ? 162.7427+-1.1187 ? audio-oscillator 351.4217+-1.3468 ! 358.3105+-3.1365 ! definitely 1.0196x slower imaging-darkroom 403.5768+-5.8311 ? 405.5841+-6.4234 ? imaging-desaturate 291.2551+-0.0914 291.1852+-0.0766 imaging-gaussian-blur 750.6847+-0.1627 ? 750.7431+-0.1130 ? json-parse-financial 90.0316+-0.3616 ? 90.6249+-0.3743 ? json-stringify-tinderbox 105.1410+-0.7503 105.0963+-0.5095 stanford-crypto-aes 139.6053+-0.3937 ! 145.2645+-3.7119 ! definitely 1.0405x slower stanford-crypto-ccm 137.8774+-0.7153 ? 138.7943+-0.7749 ? stanford-crypto-pbkdf2 281.4804+-2.2261 281.2057+-2.3171 stanford-crypto-sha256-iterative 112.7609+-0.1899 112.6113+-0.2735 <arithmetic> * 306.1752+-0.5808 ? 307.1790+-0.6449 ? might be 1.0033x slower <geometric> 239.5798+-0.4921 ! 240.7382+-0.6096 ! definitely 1.0048x slower <harmonic> 194.4552+-0.4400 ! 195.5881+-0.5319 ! definitely 1.0058x slower TipOfTree DFGWeakFixpoint All benchmarks: <arithmetic> 115.3246+-0.1789 ? 115.6308+-0.1878 ? might be 1.0027x slower <geometric> 29.6285+-0.0779 ? 29.6475+-0.0724 ? might be 1.0006x slower <harmonic> 8.9036+-0.0327 8.9033+-0.0336 might be 1.0000x faster TipOfTree DFGWeakFixpoint Geomean of preferred means: <scaled-result> 68.0707+-0.1521 ? 68.0999+-0.1367 ? might be 1.0004x slower Running locally... 376/376 Generating benchmark report at TipOfTree_DFGWeakFixpoint_SunSpiderV8Kraken_20111116_2238_benchReport.txt Benchmark report for SunSpider, V8, and Kraken on nitroflex.local (MacBookPro8,2). VMs tested: "TipOfTree" at /Volumes/Data/pizlo/quinary/OpenSource/WebKitBuild/Release/jsc (r100556) "DFGWeakFixpoint" at /Volumes/Data/pizlo/tertiary/OpenSource/WebKitBuild/Release/jsc (r100556) Collected 12 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. TipOfTree DFGWeakFixpoint SunSpider: 3d-cube 6.9460+-0.1867 6.7194+-0.1208 might be 1.0337x faster 3d-morph 7.6314+-0.1379 ? 7.6325+-0.1602 ? 3d-raytrace 7.2052+-0.1985 7.1258+-0.1072 might be 1.0111x faster access-binary-trees 1.5625+-0.0849 1.5398+-0.0517 might be 1.0147x faster access-fannkuch 6.1227+-0.0908 6.1188+-0.0846 access-nbody 3.4842+-0.0867 3.3981+-0.0667 might be 1.0254x faster access-nsieve 2.4985+-0.0530 ? 2.5069+-0.0756 ? bitops-3bit-bits-in-byte 1.2494+-0.0190 ? 1.2716+-0.0358 ? might be 1.0178x slower bitops-bits-in-byte 2.3695+-0.0817 2.3440+-0.0654 might be 1.0109x faster bitops-bitwise-and 3.4721+-0.0672 3.4238+-0.0743 might be 1.0141x faster bitops-nsieve-bits 5.1600+-0.0369 ? 5.2602+-0.0785 ? might be 1.0194x slower controlflow-recursive 2.0428+-0.0329 ? 2.0540+-0.0554 ? crypto-aes 6.9739+-0.1714 6.9190+-0.1607 crypto-md5 2.3774+-0.0536 2.2922+-0.0375 might be 1.0372x faster crypto-sha1 2.0697+-0.0512 2.0302+-0.0488 might be 1.0195x faster date-format-tofte 9.9694+-0.2594 ? 10.0733+-0.1845 ? might be 1.0104x slower date-format-xparb 9.9780+-0.1308 ! 10.5261+-0.1992 ! definitely 1.0549x slower math-cordic 6.3460+-0.0750 ? 6.3696+-0.1468 ? math-partial-sums 7.4234+-0.1088 7.3576+-0.1024 math-spectral-norm 2.3826+-0.0537 2.2977+-0.0415 might be 1.0369x faster regexp-dna 10.7769+-0.1826 ? 10.9456+-0.1406 ? might be 1.0157x slower string-base64 3.8982+-0.1135 ? 3.8995+-0.1257 ? string-fasta 6.5992+-0.1566 6.5441+-0.1320 string-tagcloud 11.5205+-0.2195 11.4117+-0.3031 string-unpack-code 20.1020+-0.2994 19.8269+-0.3950 might be 1.0139x faster string-validate-input 5.2314+-0.0889 ? 5.3145+-0.1251 ? might be 1.0159x slower <arithmetic> * 5.9767+-0.0226 5.9693+-0.0255 might be 1.0012x faster <geometric> 4.7696+-0.0276 4.7506+-0.0265 might be 1.0040x faster <harmonic> 3.7707+-0.0358 3.7468+-0.0301 might be 1.0064x faster TipOfTree DFGWeakFixpoint V8: crypto 69.5986+-0.3279 ? 69.7588+-0.3738 ? deltablue 149.5710+-1.8420 ? 149.6216+-1.6266 ? earley-boyer 85.3260+-0.6411 84.8371+-0.8974 raytrace 55.7197+-0.3247 55.5022+-0.4194 regexp 102.2668+-0.3653 101.9430+-0.3022 richards 116.7542+-0.4597 ? 117.0623+-0.7954 ? splay 70.5089+-0.7751 ? 71.5232+-1.3310 ? might be 1.0144x slower <arithmetic> 92.8207+-0.3544 ? 92.8926+-0.4306 ? might be 1.0008x slower <geometric> * 88.2391+-0.3096 ? 88.3206+-0.4070 ? might be 1.0009x slower <harmonic> 84.0796+-0.2990 ? 84.1629+-0.4063 ? might be 1.0010x slower TipOfTree DFGWeakFixpoint Kraken: ai-astar 485.9056+-4.6336 ? 486.6909+-3.7186 ? audio-beat-detection 189.0524+-2.1896 186.4712+-0.5929 might be 1.0138x faster audio-dft 261.3420+-2.3663 ? 262.4310+-2.9045 ? audio-fft 121.7954+-0.6730 ? 122.6599+-0.8257 ? audio-oscillator 248.7274+-0.8064 ? 248.9647+-1.9696 ? imaging-darkroom 299.4438+-6.1518 296.2831+-5.4980 might be 1.0107x faster imaging-desaturate 221.6356+-0.6927 ? 221.8542+-0.6321 ? imaging-gaussian-blur 544.9619+-2.2368 ? 547.5263+-3.7770 ? json-parse-financial 57.3320+-0.4201 57.1369+-0.2277 json-stringify-tinderbox 72.7738+-0.9726 72.2742+-0.3048 stanford-crypto-aes 94.6696+-0.4103 94.5581+-0.7075 stanford-crypto-ccm 98.3032+-0.4316 ? 98.3397+-0.6751 ? stanford-crypto-pbkdf2 186.2725+-0.6082 ! 187.6683+-0.7824 ! definitely 1.0075x slower stanford-crypto-sha256-iterative 79.5267+-0.2744 ? 79.6907+-0.3087 ? <arithmetic> * 211.5530+-0.5515 ? 211.6107+-0.6581 ? might be 1.0003x slower <geometric> 168.6919+-0.2784 168.6176+-0.3932 might be 1.0004x faster <harmonic> 135.9649+-0.2866 135.8237+-0.2465 might be 1.0010x faster TipOfTree DFGWeakFixpoint All benchmarks: <arithmetic> 80.1464+-0.1555 ? 80.1702+-0.1736 ? might be 1.0003x slower <geometric> 21.3056+-0.0684 21.2587+-0.0709 might be 1.0022x faster <harmonic> 6.6370+-0.0615 6.5960+-0.0518 might be 1.0062x faster TipOfTree DFGWeakFixpoint Geomean of preferred means: <scaled-result> 48.1403+-0.0747 48.1397+-0.1053 might be 1.0000x faster [pizlo@nitroflex bencher]

Filip Pizlo

Comment 3 2011-11-17 00:00:20 PST

Created attachment 115537 [details] the patch Fixed 32-bit build. All tests pass, performance is neutral. Verified with printf's that the code was actually doing things; it was, and as expected: most code blocks find all their weak references to be live on the first pass, and all code blocks eventually always find all of their weak references to be live in the benchmarks we track. The latter is true mainly because we still treat references from inline caches to be strong.

Filip Pizlo

Comment 4 2011-11-17 00:01:32 PST

Created attachment 115538 [details] the patch Removed bogus comment.

Oliver Hunt

Comment 5 2011-11-17 08:30:33 PST

Comment on attachment 115538 [details] the patch View in context: https://bugs.webkit.org/attachment.cgi?id=115538&action=review > Source/JavaScriptCore/bytecode/CodeBlock.cpp:1685 > + // If some weak references are dead, then this fixpoint iteration was > + // unsuccessful. > + if (!allAreLiveSoFar) > + return; If we have weak references that are dead, where are we either clearing them, or removing the usage? My reading of this says that we'll end up maintaining references to dead objects, that may subsequently become live again (through new allocations) -- this does not currently happen as all values are forced to be live via the constant tables, but if that were not the case this seems like it would be incorrect. What have I missed?

Filip Pizlo

Comment 6 2011-11-17 13:45:29 PST

(In reply to comment #5) > (From update of attachment 115538 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=115538&action=review > > > Source/JavaScriptCore/bytecode/CodeBlock.cpp:1685 > > + // If some weak references are dead, then this fixpoint iteration was > > + // unsuccessful. > > + if (!allAreLiveSoFar) > > + return; > > If we have weak references that are dead, where are we either clearing them, or removing the usage? My reading of this says that we'll end up maintaining references to dead objects, that may subsequently become live again (through new allocations) -- this does not currently happen as all values are forced to be live via the constant tables, but if that were not the case this seems like it would be incorrect. What have I missed? If we ever hit this point, then we would have also registered an unconditional finalizer. If we finish GC without proving that the code block's weak references are live, then the unconditional finalizer will jettison the code block. Weak references are never cleared. If the code is executing (i.e. it's on the stack), we strongly mark its weak references. If it's not executing, then we can jettison immediately during GC, so if any of its weak references are dead we just jettison and the code block dies immediately.

Geoffrey Garen

Comment 7 2011-11-17 13:53:24 PST

Comment on attachment 115538 [details] the patch View in context: https://bugs.webkit.org/attachment.cgi?id=115538&action=review Thinking abstractly, I'm not sure this is the right strategy for making optimizations weakly referenced. But perhaps there's a concrete application I'm not considering, which validates this strategy. I would have a much easier time reasoning about this code if I could look at a benchmark or test case that demonstrated the problem we're trying to solve. > Source/JavaScriptCore/bytecode/CodeBlock.cpp:1623 > + // Add a weak reference harvester if we have not reached fixpoint and need to > + // run again. To turn this into a "why" comment, I would write: GC doesn't have enough information yet for us to decide whether to keep our DFG data, so we need to register a handler to run again at the end of GC, when more information is available. > Source/JavaScriptCore/bytecode/CodeBlock.cpp:1657 > + // If the following three things are live, then the target of the > + // transition is also live: > + // - This code block. We know it's live already because otherwise > + // we wouldn't be scanning ourselves. > + // - The code origin of the transition. Transitions may arise from > + // code that was inlined. They are not relevant if the user's > + // object that is required for the inlinee to run is no longer > + // live. > + // - The source of the transition. The transition checks if some > + // heap location holds the source, and if so, stores the target. > + // Hence the source must be live for the transition to be live. Why do we want structure transitions to perform checks on behalf of inlined functions? It would be much more natural for functions themselves to act as first-class weak citizens, such that a linked/inlined function disappearing would be sufficient cause to blow away an optimized code block. More broadly, I don't understand why structure transitions get special treatment here. Why are they distinct from other weakly referenced optimizations in a DFG code block? I'm concerned that an A->B transition keeps B alive if A is alive. This is the opposite of the structure marking strategy, which says that A should stay alive only if B is alive. The benefit of the structure marking strategy is that it ensures that a chain of structures can be retired if it becomes stale. In the case of object literals, A is the default empty object structure, which is permanent, so, with this patch's strategy, all transitions from an object literal are permanent. In the case of a constructor function, A is the function's .prototype's inheritorID, which will have been marked by the same object marking the CodeBlock, so, with this patch's strategy, the transition once again becomes permanent. (To clarify, by "permanent" I mean, "will live as long as a strong reference", not necessarily "memory leak".) Maybe you're trying to make sure that a hot constructor CodeBlock survives even if the objects it constructs tend to be garbage? If so, I'm still not sure why the transitions are special compared to other weakly optimized assumptions. It seems like any weakly optimized assumption could pertain to a typically short-lived object, and cause jettison churn. Perhaps a better strategy would be an explicit jettison churn guard, instead. > Source/JavaScriptCore/bytecode/CodeBlock.h:1111 > + // Am I a DFG code block? If not, then I'm live if I am being scanned. > + if (!m_dfgData) > + return true; > + > + // If I am a DFG code block, then am I currently executing? If so, > + // then I'm definitely live. > + if (m_dfgData->mayBeExecuting) I don't think these "what" comments add anything. A "why" comment explaining why baseline JIT code always assumes itself to be live might be helpful.

Filip Pizlo

Comment 8 2011-11-17 14:03:29 PST

(In reply to comment #7) > (From update of attachment 115538 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=115538&action=review > > Thinking abstractly, I'm not sure this is the right strategy for making optimizations weakly referenced. But perhaps there's a concrete application I'm not considering, which validates this strategy. I would have a much easier time reasoning about this code if I could look at a benchmark or test case that demonstrated the problem we're trying to solve. // One place in your code: function foo(f) { return f(42); } // Another place in your code: function bar() { foo(someWindow.baz); } There's a good chance t hat someWindow.baz will be inlined into foo(). Now so long as foo() is alive, someWindow is alive. > > > Source/JavaScriptCore/bytecode/CodeBlock.cpp:1623 > > + // Add a weak reference harvester if we have not reached fixpoint and need to > > + // run again. > > To turn this into a "why" comment, I would write: GC doesn't have enough information yet for us to decide whether to keep our DFG data, so we need to register a handler to run again at the end of GC, when more information is available. > > > Source/JavaScriptCore/bytecode/CodeBlock.cpp:1657 > > + // If the following three things are live, then the target of the > > + // transition is also live: > > + // - This code block. We know it's live already because otherwise > > + // we wouldn't be scanning ourselves. > > + // - The code origin of the transition. Transitions may arise from > > + // code that was inlined. They are not relevant if the user's > > + // object that is required for the inlinee to run is no longer > > + // live. > > + // - The source of the transition. The transition checks if some > > + // heap location holds the source, and if so, stores the target. > > + // Hence the source must be live for the transition to be live. > > Why do we want structure transitions to perform checks on behalf of inlined functions? It would be much more natural for functions themselves to act as first-class weak citizens, such that a linked/inlined function disappearing would be sufficient cause to blow away an optimized code block. They don't do anything on behalf of functions. Functions are also separately weak referenced. Transitions just honor the fact that the transition isn't going to happen unless the owner is alive. > > More broadly, I don't understand why structure transitions get special treatment here. Why are they distinct from other weakly referenced optimizations in a DFG code block? Say you have Foo used as constructor: function Foo() { this.a = 1; this.b = 2; this.c = 3; } If during GC, the structure corresponding to empty->a->b->c may not be reachable from anywhere but Foo(), because Foo() hadn't run recently, or did, but it's result was short-lived. But Foo() is alive. So we want to keep empty->a->b->c alive as well, because otherwise, we'd have to throw away Foo(). Since Foo() may only exist in inlined form, the inline owner must look at the structure transitions of each of these assignments and check if both the owner (Foo) is alive, and the predecessor in the transition is alive, before deciding that the successor is alive. > > I'm concerned that an A->B transition keeps B alive if A is alive. This is the opposite of the structure marking strategy, which says that A should stay alive only if B is alive. The benefit of the structure marking strategy is that it ensures that a chain of structures can be retired if it becomes stale. In the case of object literals, A is the default empty object structure, which is permanent, so, with this patch's strategy, all transitions from an object literal are permanent. In the case of a constructor function, A is the function's .prototype's inheritorID, which will have been marked by the same object marking the CodeBlock, so, with this patch's strategy, the transition once again becomes permanent. (To clarify, by "permanent" I mean, "will live as long as a strong reference", not necessarily "memory leak".) No, they won't be permanent. Because the prerequisite to the A->B transition keeping B alive is that the code owner is also alive. > > Maybe you're trying to make sure that a hot constructor CodeBlock survives even if the objects it constructs tend to be garbage? If so, I'm still not sure why the transitions are special compared to other weakly optimized assumptions. It seems like any weakly optimized assumption could pertain to a typically short-lived object, and cause jettison churn. Perhaps a better strategy would be an explicit jettison churn guard, instead. Not sure I follow. The transition logic is not about making CodeBlocks survive. It's about making the transitions they use remain valid. Then separately the CodeBlock will survive if its weak references survive, and some of its weak references will be from its transitions (all of the values in WeakReferenceTransition are also in the CodeBlock's weak reference list). > > > Source/JavaScriptCore/bytecode/CodeBlock.h:1111 > > + // Am I a DFG code block? If not, then I'm live if I am being scanned. > > + if (!m_dfgData) > > + return true; > > + > > + // If I am a DFG code block, then am I currently executing? If so, > > + // then I'm definitely live. > > + if (m_dfgData->mayBeExecuting) > > I don't think these "what" comments add anything. A "why" comment explaining why baseline JIT code always assumes itself to be live might be helpful.

Filip Pizlo

Comment 9 2011-11-17 23:14:45 PST

Comment on attachment 115538 [details] the patch I think it's better if this patch is just combined with https://bugs.webkit.org/show_bug.cgi?id=72311

Geoffrey Garen

Comment 10 2011-11-18 15:31:24 PST

*** This bug has been marked as a duplicate of bug 72311 ***

Attachments
the patch (14.33 KB, patch) 2011-11-16 21:05 PST, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch (15.52 KB, patch) 2011-11-17 00:00 PST, Filip Pizlo	no flags	Details Formatted Diff Diff
the patch (15.35 KB, patch) 2011-11-17 00:01 PST, Filip Pizlo	no flags	Details Formatted Diff Diff
Show Obsolete (2) View All Add attachment proposed patch, testcase, etc.