Chromium should use TCMalloc on Mac to go fast
http://code.google.com/p/chromium/issues/detail?id=158645
Created attachment 175330 [details] Patch
Created attachment 175364 [details] dom-modify results (first three without patch; last three with patch)
This appears to be a 5-10% win on dom-modify, which is a fairly allocation-heavy benchmark. I haven't run any other benchmarks. It might be worth landing this patch experimentally and seeing its effect on the perf bots.
Created attachment 175374 [details] Patch
One thing we need to check here is whether we still crash on failed allocations for objects that don't override operator new. If not, we might need to create a new configuration that uses the system malloc for the global operators but not for the per-class operators.
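To make the distinction between the global operators and the per-class operators concrete, here is a minimal sketch. It assumes a hypothetical `fastAllocate`/`fastFree` pair and a hypothetical `FastAllocated` base class; none of these names come from the patch, and this is not WebKit's actual fastMalloc. Classes that opt in route `new` through the custom allocator, which aborts on failure, while every other type keeps using the untouched global `operator new`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Counts calls to the hypothetical fast allocator so the routing is observable.
static std::size_t g_fastAllocations = 0;

// Hypothetical stand-in for a custom allocator. It wraps malloc here only to
// keep the sketch self-contained, and it crashes on a failed allocation.
void* fastAllocate(std::size_t size) {
    ++g_fastAllocations;
    void* p = std::malloc(size);
    if (!p)
        std::abort();
    return p;
}

void fastFree(void* p) { std::free(p); }

// Classes that inherit from this (hypothetical) base get per-class operators.
struct FastAllocated {
    static void* operator new(std::size_t size) { return fastAllocate(size); }
    static void operator delete(void* p) { fastFree(p); }
};

// Opted in: `new Node` calls FastAllocated::operator new.
struct Node : FastAllocated {
    int value = 0;
};

// Not opted in: `new Plain` calls the global operator new, so whether it
// crashes on failure depends entirely on the global configuration.
struct Plain {
    int value = 0;
};
```

With this split, `new Node` increments `g_fastAllocations` and `new Plain` does not, which is why objects that don't override operator new need a separate check for crash-on-failure behavior.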
Comment on attachment 175374 [details] Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=175374&action=review
> Source/WTF/ChangeLog:21
> + Fortunately, WebKit has a mechanism for selectively enablin TCMalloc
enabling
> Source/WTF/wtf/Platform.h:581
> + * WebKit isn't memory tight.
"memory tight" is not the most descriptive phrase. :)
> Source/WTF/wtf/Platform.h:593
> + * FIXME: I believe there is actually an extra branch in malloc in this configuration
> + * because our "system" malloc already crashes on failed allocations. There's no need
> + * for the branch in the USE(SYSTEM_MALLOC) implementation of fastMalloc that crashes
> + * on failed allocations.
You should just file a bug and link to it.
Thank you for following up with this.
You might want to announce this on chromium-dev, as it may confuse some.
(In reply to comment #5)
> One thing we need to check here is whether we still crash on failed allocations for objects that don't override operator new. If not, we might need to create a new configuration that uses the system malloc for the global operators but not for the per-class operators.
Yup. Easy to test with a gtest: http://stackoverflow.com/questions/6569713/testing-for-crash-with-google-test
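The linked answer uses Google Test's death tests (`EXPECT_DEATH`). As a framework-free sketch of the same idea, the code below swaps in plain POSIX `fork`/`waitpid`: it runs the allocation in a child process and checks that the child was killed by a signal. All names here (`allocateOrCrash`, `alwaysFails`, `systemAllocate`, `childCrashes`) are hypothetical, and the failing allocator is a stub, so this shows only the shape of the test, not how the real Chromium or WebKit check is written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <sys/wait.h>
#include <unistd.h>

using Allocator = void* (*)(std::size_t);

// Hypothetical crash-on-failure wrapper: abort if the allocator returns null.
void* allocateOrCrash(Allocator allocate, std::size_t size) {
    void* p = allocate(size);
    if (!p)
        std::abort();
    return p;
}

// Stub that simulates an out-of-memory condition deterministically.
void* alwaysFails(std::size_t) { return nullptr; }

// Thin wrapper so the system allocator matches the Allocator signature.
void* systemAllocate(std::size_t size) { return std::malloc(size); }

// Runs the allocation in a forked child. Returns true only if the child was
// terminated by a signal (for example SIGABRT from std::abort).
bool childCrashes(Allocator allocate, std::size_t size) {
    pid_t pid = fork();
    if (pid < 0)
        return false; // fork failed; nothing was tested
    if (pid == 0) {
        allocateOrCrash(allocate, size);
        _exit(0); // reached only when the allocation succeeded
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return false;
    return WIFSIGNALED(status);
}

// A failed allocation must crash the child; a successful one must not.
void checkCrashBehavior() {
    assert(childCrashes(alwaysFails, 64));
    assert(!childCrashes(systemAllocate, 64));
}
```

Running the check in a child process keeps the crash from taking down the test runner, which is the same reason gtest runs death tests in a separate process.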
Comment on attachment 175374 [details] Patch
cq- pending death test.
I support landing this and seeing what happens on the perf bots.
This bug is related to bug 103027, which is about optimizing malloc on Windows.
Based on some discussion in bug 103027, it looks like http://src.chromium.org/viewvc/chrome/trunk/src/base/process_util_mac.mm is how Chromium causes the process to crash on out-of-memory, so the branch in fastMalloc is redundant on Mac as well as Windows.
We need to test to make sure chromium-mac crashes on out-of-memory with this patch, but if it does, it sounds like this patch is correct.
Created attachment 175873 [details] Patch for landing
Comment on attachment 175873 [details] Patch for landing
Clearing flags on attachment: 175873
Committed r135666: <http://trac.webkit.org/changeset/135666>
All reviewed patches have been landed. Closing bug.
As expected, there was a 10% improvement in dom-modify:
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/dromaeo_domcoremodify/report.html?history=100&rev=169394
intl1 also got faster by 7.5%:
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/intl1/report.html?history=100&rev=169394
intl2 also got faster by 4.5%:
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/intl2/report.html?history=100&rev=169394
dom-perf improved by 9%:
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/dom_perf/report.html?history=100&rev=169394
I'm glad this worked out so well. Thanks for taking this on. I suspect we'll want to investigate how to use chromium's copy of tcmalloc directly (eventually).
3% improvement:
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/dromaeo_domcoretraverse/report.html?rev=169690&graph=dom_traverse_previousSibling&trace=score&history=150
8% improvement:
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/dromaeo_jslibeventjquery/report.html?rev=169690&graph=jslib_event_jquery_jQuery___trigger&trace=score&history=150
25% improvement:
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/dromaeo_jslibstylejquery/report.html?rev=169690&graph=jslib_style_jquery_jQuery____is__visible_&trace=score&history=150
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/dromaeo_jslibstylejquery/report.html?rev=169693&graph=jslib_style_jquery_jQuery____show__&trace=score&history=150
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/dromaeo_jslibstylejquery/report.html?rev=169693&graph=jslib_style_jquery_jQuery___width___x10&trace=score&history=150
7% improvement:
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/moz/report.html?rev=169690&graph=times&trace=t_extwr&history=150
7.4% improvement: http://build.chromium.org/f/chromium/perf/gpu-mac-release-intel/gpu_throughput/report.html?rev=169724&graph=many_images&trace=gpu_thread&history=150
12% improvement on Animation/balls: http://webkit-perf.appspot.com/graph.html#tests=[[3976565,2001,3001]]&sel=1353810236487.3518,1353841390292.866,32.294862429997565,37.20112003895788&displayrange=90&datatype=running
6.7% on html5-full-render.html as well: http://webkit-perf.appspot.com/graph.html#tests=[[3068,2001,32196],[3068,2001,7288486],[3068,2001,3001]]&sel=1353762378720.7466,1353879008367.7524,3796.4345053622596,4262.170346323358&displayrange=30&datatype=running