RESOLVED FIXED Bug 145762
JetStream should have a more rational story for jitter-oriented latency tests
https://bugs.webkit.org/show_bug.cgi?id=145762
Summary JetStream should have a more rational story for jitter-oriented latency tests
Filip Pizlo
Reported 2015-06-08 11:05:27 PDT
Currently we have some latency tests that are meant to measure jitter. They do this by computing the RMS, but the RMS is a pretty bad metric: the thing it rewards isn't really the thing you'd want your browser to do. These RMS-based tests take the geomean of the RMS of some samples and the sample average. The lower the geomean, the better. (In the JetStream harness we then invert the scores so that higher is better, but let's ignore that for this discussion and assume that lower is better.)

Here's an example of how this can go bad. A browser that always computes a task in some horribly long time (say, 1000ms) but never varies that time will perform better than a browser that usually computes the task super quickly (say, 10ms) and sometimes just a little bit less quickly (say, 15ms). The former browser will have an RMS of 0 and an average of 1000. The latter will have an RMS somewhere around 3.5 and an average of 12.5 (assuming equal probability of 10ms and 15ms). The geomean of (0, 1000) is 0; the geomean of (3.5, 12.5) is 6.6. Lower is better, so the former browser scores higher - even though it's obviously never better to have a browser always complete a task in 1000ms when a different browser can do it in 15ms in the worst case.

JetStream should not have this pathology. The right way to avoid it is to replace RMS with some other metric of how bad things get. A good metric is the average of the worst percentile; averaging the worst 1% or the worst 5% would work well. This still catches cases where the VM jittered due to JIT or GC, but it will never have the pathology that we end up giving the better score to a VM whose best case is worse than another VM's worst case.
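For illustration, here is a minimal sketch of the proposed "average of the worst percentile" metric. The name mirrors the BenchmarkSuite.AverageAbovePercentile call that shows up in the review below, but the body here is an illustrative assumption, not the actual JetStream implementation:

function averageAbovePercentile(samples, percentile) {
    // Sort ascending and average only the slowest (100 - percentile)% of samples.
    var sorted = samples.slice().sort(function (a, b) { return a - b; });
    var cutoff = Math.min(Math.floor(sorted.length * percentile / 100), sorted.length - 1);
    var worst = sorted.slice(cutoff);
    var sum = 0;
    for (var i = 0; i < worst.length; ++i)
        sum += worst[i];
    return sum / worst.length;
}

Applied to the example above, the browser that always takes 1000ms averages 1000ms over its worst 5% of samples, while the browser that takes 10ms or 15ms averages 15ms, so the consistently faster browser now scores better.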
Attachments
the patch (12.78 KB, patch)
2015-06-08 11:10 PDT, Filip Pizlo
ggaren: review+
Filip Pizlo
Comment 1 2015-06-08 11:10:14 PDT
Created attachment 254500 [details] the patch
Filip Pizlo
Comment 2 2015-06-08 11:11:31 PDT
Comment on attachment 254500 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=254500&action=review

> PerformanceTests/JetStream/JetStreamDriver.js:1
> -// Copyright (C) 2014 Apple Inc. All rights reserved.
> +// Copyright (C) 2014, 2015 Apple Inc. All rights reserved.

I'll revert this.
Geoffrey Garen
Comment 3 2015-06-08 12:36:37 PDT
Comment on attachment 254500 [details]
the patch

View in context: https://bugs.webkit.org/attachment.cgi?id=254500&action=review

r=me

> PerformanceTests/JetStream/Octane2/base.js:196
> + console.log("Called with numbers.length = " + numbers.length);

Revert?

> PerformanceTests/JetStream/Octane2/base.js:202
> + console.log("Done sorting.");

Revert?

> PerformanceTests/JetStream/Octane2/base.js:217
> + console.log("numbers we want: " + numbersWeWant);

Revert?

> PerformanceTests/JetStream/Octane2/base.js:224
> + console.log("result = " + result);

Revert?

> PerformanceTests/JetStream/Octane2/base.js:267
> + console.log("Pushing result: " + result.time + " " + result.latency);

Revert?

> PerformanceTests/JetStream/Octane2/base.js:342
> + var latency = BenchmarkSuite.AverageAbovePercentile(latencySamples, /* percentile = */ 95) * 1000;

Can we just put 95 into a local named percentile instead?
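For reference, a hedged sketch of what the suggested tweak might look like (the surrounding code and variable names beyond the quoted line are assumptions):

var percentile = 95;
var latency = BenchmarkSuite.AverageAbovePercentile(latencySamples, percentile) * 1000;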
Filip Pizlo
Comment 4 2015-06-08 12:38:59 PDT
(In reply to comment #3)
> Comment on attachment 254500 [details]
> the patch
>
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=254500&action=review
>
> r=me
>
> > PerformanceTests/JetStream/Octane2/base.js:196
> > + console.log("Called with numbers.length = " + numbers.length);
>
> Revert?
>
> > PerformanceTests/JetStream/Octane2/base.js:202
> > + console.log("Done sorting.");
>
> Revert?
>
> > PerformanceTests/JetStream/Octane2/base.js:217
> > + console.log("numbers we want: " + numbersWeWant);
>
> Revert?
>
> > PerformanceTests/JetStream/Octane2/base.js:224
> > + console.log("result = " + result);
>
> Revert?
>
> > PerformanceTests/JetStream/Octane2/base.js:267
> > + console.log("Pushing result: " + result.time + " " + result.latency);
>
> Revert?
>
> > PerformanceTests/JetStream/Octane2/base.js:342
> > + var latency = BenchmarkSuite.AverageAbovePercentile(latencySamples, /* percentile = */ 95) * 1000;
>
> Can we just put 95 into a local named percentile instead?

Yup, I fixed all of those.
Ryosuke Niwa
Comment 5 2015-06-08 12:46:04 PDT
Oops, it looks like run-benchmark currently hard-codes 1.0.1 as the benchmark directory and this will break the script :( Could you hold off on landing this patch until Dewei or I fix it?
Filip Pizlo
Comment 6 2015-06-10 11:47:22 PDT