Bug 100872

Summary: [v8] r132913 caused a bunch of performance regressions and progressions
Product: WebKit Reporter: Ojan Vafai <ojan>
Component: TextAssignee: Ojan Vafai <ojan>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, dcarney, eric, haraken, japhet, webkit.review.bot
Priority: P2 Keywords: PerfRegression
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch none

Description Ojan Vafai 2012-10-31 11:36:23 PDT
http://trac.webkit.org/changeset/132913/ caused a bunch of performance regressions and progressions. Typically, we have a no perf regression policy in the tree. I think we should rollout for now and see if there is a fix that only improves performance. This will also help to verify that the other perf regressions I'm seeing are not from this patch. Sound OK?

Regressions:
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/dom_perf/report.html?rev=165183&graph=Events&trace=score&history=150
http://build.chromium.org/f/chromium/perf/mac-release-10.6-webkit-latest/dromaeo_domcorequery/report.html?rev=164983&graph=dom_query_getElementById&trace=score&history=150


Progressions:
http://build.chromium.org/f/chromium/perf/chromium-rel-win7-webkit/dromaeo_jslibattrjquery/report.html?rev=165203&graph=jslib_attr_jquery_jQuery___addClass&trace=score&history=150
http://build.chromium.org/f/chromium/perf/chromium-rel-win7-webkit/dromaeo_jslibattrjquery/report.html?rev=165139&graph=jslib_attr_jquery_jQuery___hasClass_x10&trace=score&history=150
http://build.chromium.org/f/chromium/perf/chromium-rel-win7-webkit/dom_perf/report.html?rev=165231&graph=Total&trace=score&history=150
http://build.chromium.org/f/chromium/perf/chromium-rel-win7-webkit/dom_perf/report.html?rev=165213&graph=Get%20Elements&trace=score&history=150
http://build.chromium.org/f/chromium/perf/chromium-rel-win7-webkit/dom_perf/report.html?rev=165231&graph=DOMDivWalk&trace=score&history=150
Comment 2 Dan Carney 2012-10-31 12:28:18 PDT
This patch was expected to improve performance in many benchmarks, but regress a few. There is no patch that will only progress. I really need this patch to get tested in Canary for a while, since it's the first of a series of patches that should eventually regression free. If you could hold off on a rollback for a while, that would be optimal.
Comment 3 Adam Barth 2012-10-31 12:41:29 PDT
I think it makes sense to keep the patch in.  Mostly it's important to be aware of these issues and to make sure we're making the right trade-offs here.
Comment 4 Ojan Vafai 2012-10-31 12:45:30 PDT
(In reply to comment #3)
> I think it makes sense to keep the patch in.  Mostly it's important to be aware of these issues and to make sure we're making the right trade-offs here.

Is a 33% regression on getElementById OK? In general, we have this policy of not leaving regressions in because it's hard to see when (if!) the regression gets fixed due to other patches going in that affect the performance.

Can we understand why this test case regressed so much? Not sure what having this patch on the Canary helps with in terms of addressing this benchmark.
Comment 5 Ojan Vafai 2012-11-01 10:09:29 PDT
It's hard to be sure, but I think this patch greatly increase the variance of http://build.chromium.org/f/chromium/perf/linux-release-webkit-latest/dom_perf/report.html?rev=165238&graph=Get%20Elements&history=150. When it performs poorly it's a 60% regression.

I really think we need to rollout until this is better understood. Having high variance like this makes it impossible to identify future regressions.

It's hard to know for sure it was this patch, but this was the most likely candidate and a rollout would confirm.
Comment 6 Ojan Vafai 2012-11-02 11:02:12 PDT
Created attachment 172088 [details]
Patch
Comment 7 Kentaro Hara 2012-11-02 11:08:32 PDT
Comment on attachment 172088 [details]
Patch

I am a big fan of the patch, but it looks like we need to investigate performance issues before landing.
Comment 8 Ojan Vafai 2012-11-02 11:16:55 PDT
Completely agree. If we believe we can make a patch that is regression free, then we should do so. Otherwise, we risk masking other performance regressions.

When we're sure that the regressions really are fundamental and it's just a trade-off of which scenarios we care more about, *then* we can evaluate whether we're making the right trade-offs. I think in this case, we'd still want to move forward with this change even if there are some unavoidable regressions, but lets try to get it so that it's strictly an improvement first.
Comment 9 WebKit Review Bot 2012-11-02 11:34:22 PDT
Comment on attachment 172088 [details]
Patch

Clearing flags on attachment: 172088

Committed r133319: <http://trac.webkit.org/changeset/133319>
Comment 10 WebKit Review Bot 2012-11-02 11:34:26 PDT
All reviewed patches have been landed.  Closing bug.
Comment 11 Eric Seidel (no email) 2012-11-02 11:59:53 PDT
I'm very interested in seeing this patch come back.  I'm happy to help with the regressions next week Dan, if you like!
Comment 12 Dan Carney 2012-11-09 02:01:16 PST
(In reply to comment #11)
> I'm very interested in seeing this patch come back.  I'm happy to help with the regressions next week Dan, if you like!

Eric, if you have time to figure out where the regression is coming in, that would be  really great. I can think of only 2 possible places that could cause slowdown.  You'll see in one spot I added the following:

// FIXME: There is no inline fast path for 8 bit atomic strings.

It could be that the stack assigned array in fromV8String for 16 bit strings is significantly faster than the malloc'd version used in the 8 bit call. If that's the case (and I assume it is), the fix is easy -just implement the 8 bit version. Also, might want to experiment with this value.

static const int inlineBufferSize = 16;

if the benchmarks uses a lot of length 18 strings, for instance, we could maybe get some speedup here.


The second thing could be the invocation of containsOnlyASCII here:

    if (string.is8Bit() && string.containsOnlyASCII()) {

it doesn't look like much, but it scans the entire string and won't return early. For larger strings, this could be costly as it could blow all the cpu cache lines. Once I'm done the V8 side of this patch, I will remove the call completely, so if this is causing a big regression, the solution is to either leave the patch out until the V8 side changes are in or to perform the check only for small strings.
Comment 13 Dan Carney 2012-11-16 07:18:33 PST
(In reply to comment #12)
> (In reply to comment #11)
> > I'm very interested in seeing this patch come back.  I'm happy to help with the regressions next week Dan, if you like!
> 
> Eric, if you have time to figure out where the regression is coming in, that would be  really great. I can think of only 2 possible places that could cause slowdown.  You'll see in one spot I added the following:
> 
> // FIXME: There is no inline fast path for 8 bit atomic strings.
> 
> It could be that the stack assigned array in fromV8String for 16 bit strings is significantly faster than the malloc'd version used in the 8 bit call. If that's the case (and I assume it is), the fix is easy -just implement the 8 bit version. Also, might want to experiment with this value.


It turned out that the 8 fast path for atomic string creation actually had been implemented since I first created that patch. Once I changed the code to use it, performance shot up across the board.  I tested all benchmarks carefully and got no regressions in dromeao/dom-query or anywhere in dom_perf, with the possible exception of a couple of the benchmarks in dom_perf/accessors. I couldn't tell if variations there were caused by major gcs or something more fundamental. I'll keep an eye on it after the patch is committed. Test variance generally dropped across the board as well, but I'm not sure if that holds in all cases.