Bug 160784 - [JSC] Revert most of r203808
Summary: [JSC] Revert most of r203808
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Benjamin Poulain
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-08-11 15:10 PDT by Benjamin Poulain
Modified: 2017-05-08 16:11 PDT (History)
3 users (show)

See Also:


Attachments
Patch (41.29 KB, patch)
2016-08-11 15:15 PDT, Benjamin Poulain
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Benjamin Poulain 2016-08-11 15:10:39 PDT
[JSC] Revert most of r203808
Comment 1 Benjamin Poulain 2016-08-11 15:15:45 PDT
Created attachment 285860 [details]
Patch
Comment 2 Geoffrey Garen 2016-08-11 16:27:25 PDT
Comment on attachment 285860 [details]
Patch

r=me

Ben mentioned that he saw time spent in madvise using fastMalloc. I think I can probably fix that with a little dedicated work.
Comment 3 WebKit Commit Bot 2016-08-11 17:23:52 PDT
Comment on attachment 285860 [details]
Patch

Clearing flags on attachment: 285860

Committed r204393: <http://trac.webkit.org/changeset/204393>
Comment 4 WebKit Commit Bot 2016-08-11 17:23:56 PDT
All reviewed patches have been landed.  Closing bug.
Comment 5 Saam Barati 2016-08-18 20:13:24 PDT
It looks like this progressed JetStream as expected by around 1%.
Comment 6 Geoffrey Garen 2016-08-19 09:44:27 PDT
Can you link to the result? Was it MacBook Air?
Comment 7 Geoffrey Garen 2017-05-08 15:26:58 PDT
I tested an up-to-date variant of this patch on a 2012 MacBook Air today and it appeared to be performance-neutral:

Benchmark report for Octane on admins-MacBook-Air-2 (MacBookAir5,2).

VMs tested:
"Baseline" at /Volumes/Big/ggaren/OpenSource/WebKitBuildBaseline/Release/jsc
"Patch" at /Volumes/Big/ggaren/OpenSource/WebKitBuild/Release/jsc (r215991)

Collected 4 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc()
between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the
jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution
times with 95% confidence intervals in milliseconds.

                            Baseline                   Patch                                       

encrypt                 0.21080+-0.01634          0.20457+-0.00450         might be 1.0304x faster
decrypt                 3.53045+-0.03419    ?     3.57919+-0.15867       ? might be 1.0138x slower
deltablue      x2       0.17226+-0.01387    ?     0.17273+-0.00168       ?
earley                  0.36054+-0.00550    ?     0.36670+-0.01024       ? might be 1.0171x slower
boyer                   6.06766+-0.16303          6.01835+-0.14738       
navier-stokes  x2       5.94829+-0.05020          5.94349+-0.03419       
raytrace       x2       1.08885+-0.03810    ?     1.09791+-0.03960       ?
richards       x2       0.11626+-0.00874    ?     0.11770+-0.00836       ? might be 1.0124x slower
splay          x2       0.44481+-0.00488    ?     0.45395+-0.01899       ? might be 1.0205x slower
regexp         x2      25.87122+-1.85551    ?    25.87526+-0.73672       ?
pdfjs          x2      53.79756+-1.25736         53.56009+-0.63223       
mandreel       x2      73.84762+-3.91344    ?    78.50292+-18.05527      ? might be 1.0630x slower
gbemu          x2      52.98956+-11.26955        48.84833+-1.53337         might be 1.0848x faster
closure                 0.71097+-0.00467    ?     0.71797+-0.02552       ?
jquery                  9.94422+-0.20686          9.89728+-0.44068       
box2d          x2      13.90082+-1.11011    ?    14.98441+-2.20534       ? might be 1.0780x slower
zlib           x2     477.03310+-45.25841   ?   480.52492+-15.20169      ?
typescript     x2    1294.37280+-685.71484     1043.26562+-40.76296        might be 1.2407x faster

<geometric>             7.29211+-0.33598          7.25063+-0.10316         might be 1.0057x faster


Benchmark report for AsmBench on admins-MacBook-Air-2 (MacBookAir5,2).

VMs tested:
"Baseline" at /Volumes/Big/ggaren/OpenSource/WebKitBuildBaseline/Release/jsc
"Patch" at /Volumes/Big/ggaren/OpenSource/WebKitBuild/Release/jsc (r215991)

Collected 4 samples per benchmark/VM, with 4 VM invocations per benchmark. Emitted a call to gc()
between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the
jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times
with 95% confidence intervals in milliseconds.

                                 Baseline                   Patch                                       

bigfib.cpp                  702.9005+-25.5903         678.8968+-63.2468         might be 1.0354x faster
cray.c                      513.4209+-25.0330    ?    521.9055+-30.1923       ? might be 1.0165x slower
dry.c                       618.8755+-185.6885   ?    637.2598+-169.5742      ? might be 1.0297x slower
FloatMM.c                   939.8940+-14.4910    ?    940.7554+-15.4689       ?
gcc-loops.cpp              4916.3853+-44.1230        4843.4827+-50.9797         might be 1.0151x faster
n-body.c                   1106.8958+-13.9958        1105.8504+-13.3921       
Quicksort.c                 511.9797+-41.7437    ?    512.0046+-24.8828       ?
stepanov_container.cpp     4315.5320+-224.2974       4297.3511+-66.7650       
Towers.c                    300.1309+-7.9935          296.7681+-12.0559         might be 1.0113x faster

<geometric>                 968.3045+-36.8348         966.3857+-33.1399         might be 1.0020x faster
Comment 8 Geoffrey Garen 2017-05-08 15:29:16 PDT
(Also neutral on Mac Pro.)
Comment 9 Benjamin Poulain 2017-05-08 15:39:08 PDT
(In reply to Geoffrey Garen from comment #7)
> I tested an up-to-date variant of this patch on a 2012 MacBook Air today and
> it appeared to be performance-neutral:

Awesome!

Did you have to change the allocator?
Comment 10 Geoffrey Garen 2017-05-08 16:11:05 PDT
> Did you have to change the allocator?

I didn't end up changing the allocator.

My plan was to speed up the allocator by profiling -- but then I couldn't reproduce the regression. 

This might mean that some other changes since last year have resolved the issue -- but I guess there's still a chance that my computer just doesn't agree with the bot.

I'm going to re-test in-browser.