There is code in the treatment of op_ret in DFGByteCodeParser.cpp::parseBlock() that checks if an early return is from the first block. In that case, it skips the rest of the function, knowing that all other blocks are dead code. I would like to measure whether this is something that ever happens in actual code, because I do not see why it would happen, and if it does not, we could get rid of this code, and of everything that touches the shouldContinueParsing variable.
(In reply to Robin Morisset from comment #0) > There is code in the treatment of op_ret in > DFGByteCodeParser.cpp::parseBlock() that checks if an early return is from > the first block. > In that case, it skips the rest of the function, knowing that all other > blocks are dead code. > I would like to measure whether this is something that ever happens in > actual code, because I do not see why it would happen, and if it does not, > we could get rid of this code, and of everything that touches the > shouldContinueParsing variable. Lol. I think I remember why. Say you do: function foo() { return 42; } This will turn into this bytecode: op_enter op_ret 42 op_ret undefined This is because the parser views "return" as just a statement, and then emits a "op_ret undefined" at the end of foo because if foo did fall through, it would have returned undefined. The original DFG only supported single-basic-block functions. I think that was before I even started working on WebKit. So, probably, that check was there to even make it possible for the DFG to compile single basic block functions as single basic block functions. Also, if there is anything in the DFG that optimizes for single basic block, those optimizations would fail if you disabled this. Anyway, enough history. :-) I think it's a good idea to test removing these silly "optimizations". If it is performance-neutral, then it's good for the sanity of the code to just remove it.
Created attachment 323182 [details] benchmarks results I ran benchmarks with --inner 5 --outer 5 (I was not sure of what parameters to use), comparing top of tree ("Baseline"), with my patch from https://bugs.webkit.org/show_bug.cgi?id=177922 ("Refactor") and with a combination of that patch + removing the optimisation that stops parsing when there is a return in the first block of the function ("Return"). I am not sure exactly how to read the results: even after having disabled spotlight, wifi, inSync and every other application, it was still quite noisy. Overall, there does not seem to be a large difference on average, but a few micro-benchmarks have significant performance changes.
I ran some basic benchmarks in which I removed this "optimisation": Benchmark report for Octane on MacBook-Pro-4 (MacBookPro14,3). VMs tested: "Baseline" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/Baseline/Release/jsc "Return" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/ReturnSimpl/Release/jsc Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. Baseline Return encrypt 0.12978+-0.00274 ? 0.12990+-0.00180 ? decrypt 2.27387+-0.02004 2.26779+-0.01558 deltablue x2 0.12433+-0.00526 0.12070+-0.00666 might be 1.0300x faster earley 0.23988+-0.00299 0.23789+-0.00510 boyer 4.14790+-0.04999 ? 4.17062+-0.03407 ? navier-stokes x2 4.75290+-0.03824 ? 4.78339+-0.08019 ? raytrace x2 0.65215+-0.00778 0.65068+-0.00691 richards x2 0.07699+-0.00072 0.07680+-0.00075 splay x2 0.30160+-0.00452 0.29970+-0.00216 regexp x2 14.76220+-0.28300 ? 14.89334+-0.18142 ? pdfjs x2 37.43778+-0.94895 ? 37.52657+-0.59151 ? mandreel x2 36.69220+-0.56194 ? 37.27994+-1.44801 ? might be 1.0160x slower gbemu x2 27.81859+-0.16985 27.81324+-0.31214 closure 0.43844+-0.00260 ? 0.43908+-0.00490 ? jquery 5.70626+-0.06706 5.65665+-0.04046 box2d x2 7.13657+-0.06160 ? 7.18821+-0.06391 ? zlib x2 296.23094+-9.83051 ? 296.76506+-2.64597 ? typescript x2 595.48682+-5.07882 588.20536+-6.52180 might be 1.0124x faster <geometric> 4.45677+-0.01361 4.45171+-0.02024 might be 1.0011x faster And Benchmark report for Octane on MacBook-Pro-4 (MacBookPro14,3). VMs tested: "Baseline" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/Baseline/Release/jsc "Return" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/ReturnSimpl/Release/jsc Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. Baseline Return encrypt 0.12978+-0.00274 ? 0.12990+-0.00180 ? decrypt 2.27387+-0.02004 2.26779+-0.01558 deltablue x2 0.12433+-0.00526 0.12070+-0.00666 might be 1.0300x faster earley 0.23988+-0.00299 0.23789+-0.00510 boyer 4.14790+-0.04999 ? 4.17062+-0.03407 ? navier-stokes x2 4.75290+-0.03824 ? 4.78339+-0.08019 ? raytrace x2 0.65215+-0.00778 0.65068+-0.00691 richards x2 0.07699+-0.00072 0.07680+-0.00075 splay x2 0.30160+-0.00452 0.29970+-0.00216 regexp x2 14.76220+-0.28300 ? 14.89334+-0.18142 ? pdfjs x2 37.43778+-0.94895 ? 37.52657+-0.59151 ? mandreel x2 36.69220+-0.56194 ? 37.27994+-1.44801 ? might be 1.0160x slower gbemu x2 27.81859+-0.16985 27.81324+-0.31214 closure 0.43844+-0.00260 ? 0.43908+-0.00490 ? jquery 5.70626+-0.06706 5.65665+-0.04046 box2d x2 7.13657+-0.06160 ? 7.18821+-0.06391 ? zlib x2 296.23094+-9.83051 ? 296.76506+-2.64597 ? typescript x2 595.48682+-5.07882 588.20536+-6.52180 might be 1.0124x faster <geometric> 4.45677+-0.01361 4.45171+-0.02024 might be 1.0011x faster I think it shows pretty clearly that this "optimisation" does not bring anything in practice. So I will upload a patch that removes it (as soon as https://bugs.webkit.org/show_bug.cgi?id=177922 lands).
Created attachment 323253 [details] Patch
Just realized I had some copy-paste failure in my comment on benchmarks results above. One of the two benchmarks run should have shown the results of sunspider instead of octane. It also had no statistically significant regression.
Comment on attachment 323253 [details] Patch r=me
Comment on attachment 323253 [details] Patch Clearing flags on attachment: 323253 Committed r223112: <http://trac.webkit.org/changeset/223112>
All reviewed patches have been landed. Closing bug.
<rdar://problem/34900752>