Summary: [v8] r132913 caused a bunch of performance regressions and progressions
Product: WebKit
Component: Text
Status: RESOLVED FIXED
Severity: Normal
Priority: P2
Version: 528+ (Nightly build)
Hardware: Unspecified
OS: Unspecified
Reporter: Ojan Vafai <ojan>
Assignee: Ojan Vafai <ojan>
CC: abarth, dcarney, eric, haraken, japhet, webkit.review.bot
Keywords: PerfRegression
Attachments: Patch (attachment 172088)
Description
Ojan Vafai
2012-10-31 11:36:23 PDT
Another progression: http://build.chromium.org/f/chromium/perf/linux-release-webkit-latest/dromaeo_jslibattrjquery/report.html?rev=164991&graph=jslib_attr_jquery_jQuery___hasClass_x10&trace=score&history=150

This patch was expected to improve performance in many benchmarks, but regress a few. There is no patch that will only progress. I really need this patch to get tested in Canary for a while, since it's the first of a series of patches that should eventually be regression free. If you could hold off on a rollback for a while, that would be optimal.

I think it makes sense to keep the patch in. Mostly it's important to be aware of these issues and to make sure we're making the right trade-offs here.

(In reply to comment #3)
> I think it makes sense to keep the patch in. Mostly it's important to be aware of these issues and to make sure we're making the right trade-offs here.

Is a 33% regression on getElementById OK? In general, we have a policy of not leaving regressions in, because it's hard to see when (if!) the regression gets fixed due to other patches going in that affect the performance. Can we understand why this test case regressed so much? I'm not sure what having this patch on the Canary helps with in terms of addressing this benchmark.

It's hard to be sure, but I think this patch greatly increased the variance of http://build.chromium.org/f/chromium/perf/linux-release-webkit-latest/dom_perf/report.html?rev=165238&graph=Get%20Elements&history=150. When it performs poorly, it's a 60% regression. I really think we need to roll this out until it is better understood. Having high variance like this makes it impossible to identify future regressions. It's hard to know for sure it was this patch, but it was the most likely candidate, and a rollout would confirm.

Created attachment 172088 [details]
Patch
Comment on attachment 172088 [details]
Patch
I am a big fan of the patch, but it looks like we need to investigate performance issues before landing.
Completely agree. If we believe we can make a patch that is regression free, then we should do so. Otherwise, we risk masking other performance regressions. When we're sure that the regressions really are fundamental and it's just a trade-off of which scenarios we care more about, *then* we can evaluate whether we're making the right trade-offs. I think in this case we'd still want to move forward with this change even if there are some unavoidable regressions, but let's try to get it so that it's strictly an improvement first.

Comment on attachment 172088 [details]
Patch

Clearing flags on attachment: 172088

Committed r133319: <http://trac.webkit.org/changeset/133319>

All reviewed patches have been landed. Closing bug.

I'm very interested in seeing this patch come back. I'm happy to help with the regressions next week Dan, if you like!

(In reply to comment #11)
> I'm very interested in seeing this patch come back. I'm happy to help with the regressions next week Dan, if you like!

Eric, if you have time to figure out where the regression is coming from, that would be really great. I can think of only two possible places that could cause a slowdown. You'll see in one spot I added the following:

// FIXME: There is no inline fast path for 8 bit atomic strings.

It could be that the stack-allocated array in fromV8String for 16 bit strings is significantly faster than the malloc'd version used in the 8 bit call. If that's the case (and I assume it is), the fix is easy: just implement the 8 bit version. You also might want to experiment with this value:

static const int inlineBufferSize = 16;

If the benchmark uses a lot of length-18 strings, for instance, we could maybe get some speedup here.

The second thing could be the invocation of containsOnlyASCII here:

if (string.is8Bit() && string.containsOnlyASCII()) {

It doesn't look like much, but it scans the entire string and won't return early. For larger strings this could be costly, as it could blow all the CPU cache lines.
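To make the containsOnlyASCII concern concrete, here is a rough, self-contained sketch of what such a check over 8-bit (Latin-1) data can look like. This is a hypothetical stand-in, not WebKit's actual WTF implementation: the function names and shape are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Full-scan variant: every byte is OR'd into an accumulator, so the loop
// always walks the whole string even if the very first byte is non-ASCII.
// For a large string this touches every cache line the data occupies,
// which is the cost being described above.
static bool containsOnlyASCII(const uint8_t* data, size_t length)
{
    uint8_t accumulated = 0;
    for (size_t i = 0; i < length; ++i)
        accumulated |= data[i];
    return !(accumulated & 0x80); // an 8-bit byte is ASCII iff the high bit is clear
}

// Early-exit variant: bail out at the first non-ASCII byte, which bounds the
// cost for strings that are quickly found to be non-ASCII.
static bool containsOnlyASCIIEarlyExit(const uint8_t* data, size_t length)
{
    for (size_t i = 0; i < length; ++i) {
        if (data[i] & 0x80)
            return false;
    }
    return true;
}
```

Note that an early exit only helps inputs that actually contain a non-ASCII byte; for an all-ASCII string both variants still scan the whole thing, which is why limiting the check to short strings (or removing the call entirely) is the more robust mitigation.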
Once I'm done with the V8 side of this patch, I will remove the call completely, so if this is causing a big regression, the solution is either to leave the patch out until the V8 side changes are in or to perform the check only for small strings.

(In reply to comment #12)
> (In reply to comment #11)
> > I'm very interested in seeing this patch come back. I'm happy to help with the regressions next week Dan, if you like!
>
> Eric, if you have time to figure out where the regression is coming from, that would be really great. I can think of only two possible places that could cause a slowdown. You'll see in one spot I added the following:
>
> // FIXME: There is no inline fast path for 8 bit atomic strings.
>
> It could be that the stack-allocated array in fromV8String for 16 bit strings is significantly faster than the malloc'd version used in the 8 bit call. If that's the case (and I assume it is), the fix is easy: just implement the 8 bit version. You also might want to experiment with this value.

It turned out that the 8 bit fast path for atomic string creation had actually been implemented since I first created that patch. Once I changed the code to use it, performance shot up across the board. I tested all benchmarks carefully and got no regressions in dromaeo/dom-query or anywhere in dom_perf, with the possible exception of a couple of the benchmarks in dom_perf/accessors. I couldn't tell if variations there were caused by major GCs or something more fundamental. I'll keep an eye on it after the patch is committed. Test variance generally dropped across the board as well, but I'm not sure if that holds in all cases.
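For readers following along, the "inline fast path" discussed throughout the thread boils down to a stack-buffer-versus-malloc trade-off, which can be sketched roughly as below. Only the inlineBufferSize = 16 constant is quoted from the patch; the function name and everything else here are invented for illustration and do not reflect WebKit's actual fromV8String code.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>

// Strings at or below the threshold are staged through a fixed-size stack
// array (no heap allocation for the temporary buffer); longer strings fall
// back to a malloc'd buffer.
static const int inlineBufferSize = 16;

static std::string copyThroughBuffer(const char* data, size_t length)
{
    if (length <= static_cast<size_t>(inlineBufferSize)) {
        // Fast path: the temporary buffer lives on the stack.
        char stackBuffer[inlineBufferSize];
        memcpy(stackBuffer, data, length);
        return std::string(stackBuffer, length);
    }
    // Slow path: a length-18 string, say, just misses the threshold and pays
    // for a heap allocation, which is why tuning the constant can matter for
    // benchmarks dominated by strings of a particular length.
    std::unique_ptr<char[]> heapBuffer(new char[length]);
    memcpy(heapBuffer.get(), data, length);
    return std::string(heapBuffer.get(), length);
}
```

The observed fix matches this picture: once the 8-bit conversion used a fast path equivalent to the 16-bit one (avoiding the per-call malloc), the regression disappeared.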