Summary: | SunSpider run times are not reproducible | ||
---|---|---|---|
Product: | WebKit | Reporter: | Paul Biggar <pbiggar> |
Component: | Tools / Tests | Assignee: | Nobody <webkit-unassigned> |
Status: | RESOLVED DUPLICATE | ||
Severity: | Normal | CC: | mjs, oliver |
Priority: | P2 | ||
Version: | 528+ (Nightly build) | ||
Hardware: | PC | ||
OS: | All | ||
Bug Depends on: | 32804, 43257, 43255, 43256 | ||
Bug Blocks: |
Description
Paul Biggar
2010-07-30 08:43:26 PDT
How are you running SunSpider? I tend to see variance on the order of 0.3-0.5% with 30 runs.

I've seen large variability with even 1000 runs. I run it on Linux, where I am told variability is worse.

I don't believe the number reported on the TOTAL line is accurate. It seems to be based on the difference between the two run-times, rather than the sum of the differences across all the benchmarks. As a simple example, the 3bit-bits-in-byte benchmark runs in either 0ms or 1ms for me, and has massive variability, but that is completely masked by using the TOTAL value.

I wrote https://bug580532.bugzilla.mozilla.org/attachment.cgi?id=459618 as a better measure of variability. It's not perfect, but it allows you to see how variable a test run is.

I don't think summing the absolute differences from the mean for each subtest is a statistically valid procedure, for computing the variance of the total score. You are correct that individual subtests, particularly the tests that now have very short runtimes in modern implementations, have much higher variance than the total.

(In reply to comment #3)
> I don't think summing the absolute differences from the mean for each subtest is a statistically valid procedure, for computing the variance of the total score.

This is true. I ran it past our resident stats expert dmandelin, who confirmed it to be "not too bad as a rough measure". It seemed to work so I stopped there. As I understand it, this is complicated by the fact that the benchmarks run for wildly different lengths of time. I don't know if you'd consider solving this for 0.9.2.
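
To make the masking effect discussed in this thread concrete, here is a minimal Python sketch. It is not the script attached to the Mozilla bug (its contents are not shown here) and it is not how SunSpider computes its TOTAL line. The function names, the `runs` data layout, and the timings are all made up for illustration. It compares the relative spread of the per-run total with the rough measure described above, the summed mean absolute deviation of each subtest from its own mean.

```python
from statistics import mean, stdev

# Hypothetical timings in ms: one dict per run, keyed by subtest name.
# "short" flips between 0 and 1 ms, like the subtest mentioned in the thread;
# "long" is an invented longer subtest whose jitter happens to cancel it out.
runs = [
    {"short": 0, "long": 100},
    {"short": 1, "long": 99},
    {"short": 0, "long": 100},
    {"short": 1, "long": 99},
]


def total_spread(runs):
    """Sample standard deviation of the per-run totals, relative to their mean."""
    totals = [sum(run.values()) for run in runs]
    return stdev(totals) / mean(totals)


def summed_abs_deviation(runs):
    """Sum, over subtests, of the mean absolute deviation from that subtest's
    mean, relative to the mean total. A rough measure only; as noted in the
    thread, it is not a statistically valid estimate of the total's variance."""
    totals = [sum(run.values()) for run in runs]
    deviation = 0.0
    for name in runs[0]:
        times = [run[name] for run in runs]
        centre = mean(times)
        deviation += mean(abs(t - centre) for t in times)
    return deviation / mean(totals)


if __name__ == "__main__":
    print(f"spread of total:      {total_spread(runs):.2%}")
    print(f"summed abs deviation: {summed_abs_deviation(runs):.2%}")
```

With this deliberately contrived data the total is identical in every run, so its spread is zero, while the per-subtest measure still reports the jitter. Real runs will not cancel this neatly, and because subtests run for wildly different lengths of time, a per-subtest relative measure may be more informative than either number.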