Bug 100872 - [v8] r132913 caused a bunch of performance regressions and progressions
: [v8] r132913 caused a bunch of performance regressions and progressions
Status: RESOLVED FIXED
: WebKit
Text
: 528+ (Nightly build)
: Unspecified Unspecified
: P2 Normal
Assigned To:
:
: PerfRegression
:
:
  Show dependency treegraph
 
Reported: 2012-10-31 11:36 PST by
Modified: 2012-12-03 16:07 PST (History)


Attachments
Patch (15.85 KB, patch)
2012-11-02 11:02 PST, Ojan Vafai
no flags Review Patch | Details | Formatted Diff | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2012-10-31 11:36:23 PST
http://trac.webkit.org/changeset/132913/ caused a bunch of performance regressions and progressions. Typically, we have a no perf regression policy in the tree. I think we should rollout for now and see if there is a fix that only improves performance. This will also help to verify that the other perf regressions I'm seeing are not from this patch. Sound OK?

Regressions:
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/dom_perf/report.html?rev=165183&graph=Events&trace=score&history=150
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/dromaeo_domcorequery/report.html?rev=164983&graph=dom_query_getElementById&trace=score&history=150


Progressions:
http://build.chromium.org/f/chromium/perf/chromium-rel-win7-webkit/dromaeo_jslibattrjquery/report.html?rev=165203&graph=jslib_attr_jquery_jQuery___addClass&trace=score&history=150
http://build.chromium.org/f/chromium/perf/chromium-rel-win7-webkit/dromaeo_jslibattrjquery/report.html?rev=165139&graph=jslib_attr_jquery_jQuery___hasClass_x10&trace=score&history=150
http://build.chromium.org/f/chromium/perf/chromium-rel-win7-webkit/dom_perf/report.html?rev=165231&graph=Total&trace=score&history=150
http://build.chromium.org/f/chromium/perf/chromium-rel-win7-webkit/dom_perf/report.html?rev=165213&graph=Get%20Elements&trace=score&history=150
http://build.chromium.org/f/chromium/perf/chromium-rel-win7-webkit/dom_perf/report.html?rev=165231&graph=DOMDivWalk&trace=score&history=150
------- Comment #2 From 2012-10-31 12:28:18 PST -------
This patch was expected to improve performance in many benchmarks, but regress a few. There is no patch that will only progress. I really need this patch to get tested in Canary for a while, since it's the first of a series of patches that should eventually regression free. If you could hold off on a rollback for a while, that would be optimal.
------- Comment #3 From 2012-10-31 12:41:29 PST -------
I think it makes sense to keep the patch in.  Mostly it's important to be aware of these issues and to make sure we're making the right trade-offs here.
------- Comment #4 From 2012-10-31 12:45:30 PST -------
(In reply to comment #3)
> I think it makes sense to keep the patch in.  Mostly it's important to be aware of these issues and to make sure we're making the right trade-offs here.

Is a 33% regression on getElementById OK? In general, we have this policy of not leaving regressions in because it's hard to see when (if!) the regression gets fixed due to other patches going in that affect the performance.

Can we understand why this test case regressed so much? Not sure what having this patch on the Canary helps with in terms of addressing this benchmark.
------- Comment #5 From 2012-11-01 10:09:29 PST -------
It's hard to be sure, but I think this patch greatly increase the variance of http://build.chromium.org/f/chromium/perf/linux-release-webkit-latest/dom_perf/report.html?rev=165238&graph=Get%20Elements&history=150. When it performs poorly it's a 60% regression.

I really think we need to rollout until this is better understood. Having high variance like this makes it impossible to identify future regressions.

It's hard to know for sure it was this patch, but this was the most likely candidate and a rollout would confirm.
------- Comment #6 From 2012-11-02 11:02:12 PST -------
Created an attachment (id=172088) [details]
Patch
------- Comment #7 From 2012-11-02 11:08:32 PST -------
(From update of attachment 172088 [details])
I am a big fan of the patch, but it looks like we need to investigate performance issues before landing.
------- Comment #8 From 2012-11-02 11:16:55 PST -------
Completely agree. If we believe we can make a patch that is regression free, then we should do so. Otherwise, we risk masking other performance regressions.

When we're sure that the regressions really are fundamental and it's just a trade-off of which scenarios we care more about, *then* we can evaluate whether we're making the right trade-offs. I think in this case, we'd still want to move forward with this change even if there are some unavoidable regressions, but lets try to get it so that it's strictly an improvement first.
------- Comment #9 From 2012-11-02 11:34:22 PST -------
(From update of attachment 172088 [details])
Clearing flags on attachment: 172088

Committed r133319: <http://trac.webkit.org/changeset/133319>
------- Comment #10 From 2012-11-02 11:34:26 PST -------
All reviewed patches have been landed.  Closing bug.
------- Comment #11 From 2012-11-02 11:59:53 PST -------
I'm very interested in seeing this patch come back.  I'm happy to help with the regressions next week Dan, if you like!
------- Comment #12 From 2012-11-09 02:01:16 PST -------
(In reply to comment #11)
> I'm very interested in seeing this patch come back.  I'm happy to help with the regressions next week Dan, if you like!

Eric, if you have time to figure out where the regression is coming in, that would be  really great. I can think of only 2 possible places that could cause slowdown.  You'll see in one spot I added the following:

// FIXME: There is no inline fast path for 8 bit atomic strings.

It could be that the stack assigned array in fromV8String for 16 bit strings is significantly faster than the malloc'd version used in the 8 bit call. If that's the case (and I assume it is), the fix is easy -just implement the 8 bit version. Also, might want to experiment with this value.

static const int inlineBufferSize = 16;

if the benchmarks uses a lot of length 18 strings, for instance, we could maybe get some speedup here.


The second thing could be the invocation of containsOnlyASCII here:

    if (string.is8Bit() && string.containsOnlyASCII()) {

it doesn't look like much, but it scans the entire string and won't return early. For larger strings, this could be costly as it could blow all the cpu cache lines. Once I'm done the V8 side of this patch, I will remove the call completely, so if this is causing a big regression, the solution is to either leave the patch out until the V8 side changes are in or to perform the check only for small strings.
------- Comment #13 From 2012-11-16 07:18:33 PST -------
(In reply to comment #12)
> (In reply to comment #11)
> > I'm very interested in seeing this patch come back.  I'm happy to help with the regressions next week Dan, if you like!
> 
> Eric, if you have time to figure out where the regression is coming in, that would be  really great. I can think of only 2 possible places that could cause slowdown.  You'll see in one spot I added the following:
> 
> // FIXME: There is no inline fast path for 8 bit atomic strings.
> 
> It could be that the stack assigned array in fromV8String for 16 bit strings is significantly faster than the malloc'd version used in the 8 bit call. If that's the case (and I assume it is), the fix is easy -just implement the 8 bit version. Also, might want to experiment with this value.


It turned out that the 8 fast path for atomic string creation actually had been implemented since I first created that patch. Once I changed the code to use it, performance shot up across the board.  I tested all benchmarks carefully and got no regressions in dromeao/dom-query or anywhere in dom_perf, with the possible exception of a couple of the benchmarks in dom_perf/accessors. I couldn't tell if variations there were caused by major gcs or something more fundamental. I'll keep an eye on it after the patch is committed. Test variance generally dropped across the board as well, but I'm not sure if that holds in all cases.