Right now page loading performance tests (PerformanceTests/PageLoad/; most notably all SVG perf. tests) and replay tests (PerformanceTests/Replay) measure the load time in Python. We should do this measurement in C++ code instead to reduce the variance.
Created attachment 164652
Sample output

On second thought, I'm not certain this is an overall improvement. The variance has increased after the change.
(In reply to comment #1)
> Created an attachment (id=164652)
> Sample output
>
> On second thought, I'm not certain this is an overall improvement. The variance has increased after the change.

Just out of curiosity, do you have a WIP patch for this?
Created attachment 164654
Change used to generate the output
(In reply to comment #3)
> Created an attachment (id=164654)
> Change used to generate the output

Thanks for the patch.

This needs DRT help from each port and records only the start and the end.
If we can track some other timings, this might be worth having.
But if this is only for the start-end measurement, touching the DRTs looks like overkill.
(In reply to comment #4)
> This needs DRT help from each port and records only the start and the end.
> If we can track some other timings, this might be worth having.
> But if this is only for the start-end measurement, touching the DRTs looks like overkill.

Right. I had initially thought this would reduce the variance for some tests (e.g. https://bugs.webkit.org/show_bug.cgi?id=97062), but it turned out that the variance was only amplified because the total time went down. With this patch, the variance is ~150% on that test :(

I'm inclined to say this change is probably not worth the effort.
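
To illustrate the amplification described above, here is a minimal sketch that assumes "variance" means the spread relative to the mean. All the numbers are hypothetical and are not taken from the attached sample output; `overhead_ms` stands in for whatever fixed cost is included when the time is measured from Python.

```python
import statistics


def relative_stdev(samples):
    """Return the sample standard deviation as a fraction of the mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


# Hypothetical run-to-run jitter (ms), identical for both ways of measuring.
jitter_ms = [-30.0, -10.0, 0.0, 10.0, 30.0]

base_ms = 100.0      # hypothetical time actually spent loading the page
overhead_ms = 400.0  # hypothetical fixed cost included when timing from Python

timed_in_python = [base_ms + overhead_ms + j for j in jitter_ms]
timed_in_drt = [base_ms + j for j in jitter_ms]

# The absolute spread is the same in both series; only the mean differs,
# so the relative spread is larger for the shorter measurement.
print("Python:", relative_stdev(timed_in_python))
print("DRT:   ", relative_stdev(timed_in_drt))
```

Under these assumptions the absolute jitter is unchanged, and the relative figure grows only because the denominator shrinks.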
(In reply to comment #5)
> (In reply to comment #4)
> > This needs DRT help from each port and records only the start and the end.
> > If we can track some other timings, this might be worth having.
> > But if this is only for the start-end measurement, touching the DRTs looks like overkill.
>
> Right. I had initially thought this would reduce the variance for some tests (e.g. https://bugs.webkit.org/show_bug.cgi?id=97062), but it turned out that the variance was only amplified because the total time went down. With this patch, the variance is ~150% on that test :(
>
> I'm inclined to say this change is probably not worth the effort.

It should be more precise to measure from DRT, and we should modify DRT anyway because of the memory measurements. Keeping the variance low just by using Python for the measurements doesn't seem like a good idea to me.