177925 – Evaluate the benefit of skipping dead code in the DFGByteCodeParser when a function returns in its first block

RESOLVED FIXED 177925

Evaluate the benefit of skipping dead code in the DFGByteCodeParser when a function returns in its first block

https://bugs.webkit.org/show_bug.cgi?id=177925

Summary Evaluate the benefit of skipping dead code in the DFGByteCodeParser when a fu...

Robin Morisset

Reported 2017-10-05 02:38:22 PDT

There is code in the treatment of op_ret in DFGByteCodeParser.cpp::parseBlock() that checks if an early return is from the first block. In that case, it skips the rest of the function, knowing that all other blocks are dead code. I would like to measure whether this is something that ever happens in actual code, because I do not see why it would happen, and if it does not, we could get rid of this code, and of everything that touches the shouldContinueParsing variable.

Attachments
benchmarks results (103.90 KB, text/plain) 2017-10-09 10:02 PDT, Robin Morisset	no flags	Details
Patch (6.14 KB, patch) 2017-10-09 17:26 PDT, Robin Morisset	no flags	Details Formatted Diff Diff
Show Obsolete (1) View All Add attachment proposed patch, testcase, etc.

Filip Pizlo

Comment 1 2017-10-06 14:58:27 PDT

(In reply to Robin Morisset from comment #0) > There is code in the treatment of op_ret in > DFGByteCodeParser.cpp::parseBlock() that checks if an early return is from > the first block. > In that case, it skips the rest of the function, knowing that all other > blocks are dead code. > I would like to measure whether this is something that ever happens in > actual code, because I do not see why it would happen, and if it does not, > we could get rid of this code, and of everything that touches the > shouldContinueParsing variable. Lol. I think I remember why. Say you do: function foo() { return 42; } This will turn into this bytecode: op_enter op_ret 42 op_ret undefined This is because the parser views "return" as just a statement, and then emits a "op_ret undefined" at the end of foo because if foo did fall through, it would have returned undefined. The original DFG only supported single-basic-block functions. I think that was before I even started working on WebKit. So, probably, that check was there to even make it possible for the DFG to compile single basic block functions as single basic block functions. Also, if there is anything in the DFG that optimizes for single basic block, those optimizations would fail if you disabled this. Anyway, enough history. :-) I think it's a good idea to test removing these silly "optimizations". If it is performance-neutral, then it's good for the sanity of the code to just remove it.

Robin Morisset

Comment 2 2017-10-09 10:02:01 PDT

Created attachment 323182 [details] benchmarks results I ran benchmarks with --inner 5 --outer 5 (I was not sure of what parameters to use), comparing top of tree ("Baseline"), with my patch from https://bugs.webkit.org/show_bug.cgi?id=177922 ("Refactor") and with a combination of that patch + removing the optimisation that stops parsing when there is a return in the first block of the function ("Return"). I am not sure exactly how to read the results: even after having disabled spotlight, wifi, inSync and every other application, it was still quite noisy. Overall, there does not seem to be a large difference on average, but a few micro-benchmarks have significant performance changes.

Robin Morisset

Comment 3 2017-10-09 17:03:55 PDT

I ran some basic benchmarks in which I removed this "optimisation": Benchmark report for Octane on MacBook-Pro-4 (MacBookPro14,3). VMs tested: "Baseline" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/Baseline/Release/jsc "Return" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/ReturnSimpl/Release/jsc Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. Baseline Return encrypt 0.12978+-0.00274 ? 0.12990+-0.00180 ? decrypt 2.27387+-0.02004 2.26779+-0.01558 deltablue x2 0.12433+-0.00526 0.12070+-0.00666 might be 1.0300x faster earley 0.23988+-0.00299 0.23789+-0.00510 boyer 4.14790+-0.04999 ? 4.17062+-0.03407 ? navier-stokes x2 4.75290+-0.03824 ? 4.78339+-0.08019 ? raytrace x2 0.65215+-0.00778 0.65068+-0.00691 richards x2 0.07699+-0.00072 0.07680+-0.00075 splay x2 0.30160+-0.00452 0.29970+-0.00216 regexp x2 14.76220+-0.28300 ? 14.89334+-0.18142 ? pdfjs x2 37.43778+-0.94895 ? 37.52657+-0.59151 ? mandreel x2 36.69220+-0.56194 ? 37.27994+-1.44801 ? might be 1.0160x slower gbemu x2 27.81859+-0.16985 27.81324+-0.31214 closure 0.43844+-0.00260 ? 0.43908+-0.00490 ? jquery 5.70626+-0.06706 5.65665+-0.04046 box2d x2 7.13657+-0.06160 ? 7.18821+-0.06391 ? zlib x2 296.23094+-9.83051 ? 296.76506+-2.64597 ? typescript x2 595.48682+-5.07882 588.20536+-6.52180 might be 1.0124x faster <geometric> 4.45677+-0.01361 4.45171+-0.02024 might be 1.0011x faster And Benchmark report for Octane on MacBook-Pro-4 (MacBookPro14,3). VMs tested: "Baseline" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/Baseline/Release/jsc "Return" at /Users/rmorisset/Webkit2/OpenSource/WebKitBuild/ReturnSimpl/Release/jsc Collected 6 samples per benchmark/VM, with 6 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds. Baseline Return encrypt 0.12978+-0.00274 ? 0.12990+-0.00180 ? decrypt 2.27387+-0.02004 2.26779+-0.01558 deltablue x2 0.12433+-0.00526 0.12070+-0.00666 might be 1.0300x faster earley 0.23988+-0.00299 0.23789+-0.00510 boyer 4.14790+-0.04999 ? 4.17062+-0.03407 ? navier-stokes x2 4.75290+-0.03824 ? 4.78339+-0.08019 ? raytrace x2 0.65215+-0.00778 0.65068+-0.00691 richards x2 0.07699+-0.00072 0.07680+-0.00075 splay x2 0.30160+-0.00452 0.29970+-0.00216 regexp x2 14.76220+-0.28300 ? 14.89334+-0.18142 ? pdfjs x2 37.43778+-0.94895 ? 37.52657+-0.59151 ? mandreel x2 36.69220+-0.56194 ? 37.27994+-1.44801 ? might be 1.0160x slower gbemu x2 27.81859+-0.16985 27.81324+-0.31214 closure 0.43844+-0.00260 ? 0.43908+-0.00490 ? jquery 5.70626+-0.06706 5.65665+-0.04046 box2d x2 7.13657+-0.06160 ? 7.18821+-0.06391 ? zlib x2 296.23094+-9.83051 ? 296.76506+-2.64597 ? typescript x2 595.48682+-5.07882 588.20536+-6.52180 might be 1.0124x faster <geometric> 4.45677+-0.01361 4.45171+-0.02024 might be 1.0011x faster I think it shows pretty clearly that this "optimisation" does not bring anything in practice. So I will upload a patch that removes it (as soon as https://bugs.webkit.org/show_bug.cgi?id=177922 lands).

Robin Morisset

Comment 4 2017-10-09 17:26:47 PDT

Created attachment 323253 [details] Patch

Robin Morisset

Comment 5 2017-10-09 17:29:32 PDT

Just realized I had some copy-paste failure in my comment on benchmarks results above. One of the two benchmarks run should have shown the results of sunspider instead of octane. It also had no statistically significant regression.

Saam Barati

Comment 6 2017-10-09 17:50:16 PDT

Comment on attachment 323253 [details] Patch r=me

WebKit Commit Bot

Comment 7 2017-10-09 18:30:06 PDT

Comment on attachment 323253 [details] Patch Clearing flags on attachment: 323253 Committed r223112: <http://trac.webkit.org/changeset/223112>

WebKit Commit Bot

Comment 8 2017-10-09 18:30:07 PDT

All reviewed patches have been landed. Closing bug.

Radar WebKit Bug Importer

Comment 9 2017-10-09 18:31:03 PDT

<rdar://problem/34900752>

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component JavaScriptCore

Assignee

Robin Morisset

Reported

2017-10-05 02:38 PDT

Modified

2022-05-14 00:19 PDT History

CC List

9 users Show

URL

Keywords InRadar

Depends on

177922

Blocks

Dependencies

tree graph