Bug 160505

Summary: Use requestAnimationFrame() to schedule new frames on the performance test Animation/balls.html
Product: WebKit Reporter: Carlos Alberto Lopez Perez <clopez>
Component: Tools / TestsAssignee: Carlos Alberto Lopez Perez <clopez>
Status: NEW ---    
Severity: Normal CC: benjamin, bugs-noreply, cdumez, cgarcia, commit-queue, dpranke, jdapena, lforschler, ossy, rniwa, simon.fraser, tomz, tonikitoo, tony
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch simon.fraser: review-

Description Carlos Alberto Lopez Perez 2016-08-03 06:46:06 PDT
On the GTK+ port we recently switched by default to use the new threaded compositor. This has improved overall performance.

However, the performance test Animation/balls.html shows a performance drop of 22% https://perf.webkit.org/#mode=charts&chartList=%5B%5B%22gtk%22%2C%22Animation%2Fballs%3AFrameRate%22%5D%5D&days=100

And if you run the test manually, it is completely evident that rather than a performance drop there is huge performance boost.

 - Before (no threaded compositor): https://people.igalia.com/clopez/wkbug/animation-balls-threaded-compositor/videos/webkitgtk-simplecompositor_balls-old-settinterval.mp4
 - Now  (with threaded compositor): https://people.igalia.com/clopez/wkbug/animation-balls-threaded-compositor/videos/webkitgtk-threadedcompositor_balls-old-settinterval.mp4

So, as you can see the balls move better and the FPS rate seems improved with the new threaded compositor.
However the FPS values measured by the test don't match what it is being painted on the screen.

After a detailed analysis, I have concluded that the issue is on the test itself.

The test is measuring much more FPS than it is actually drawing, because it uses "setInterval(animate,1)" to schedule new frames.
And the issues with this are:

  - Issue #1: There is no guarantee that a given call to animate() would have finished when the next one arrives. So this is calling animate() each 1 millisecond without caring about if the previous call to animate() has finished.

  - Issue #2: There is no guarantee that after animate() has finished updating the ball positions this new positions will be draw at all on the screen. 


I have tried also to use setTimeout(animate,0) and this fixes the first issue but not the second. It is very possible that the test thinks that it is drawing a FPS when in reality is just completing the animate() callback updating the ball positions, but nothing on the screen has been still drawn because the internal repaint timer has not been fired.

So, the best solution (IMHO) is to rely on requestAnimationFrame() that guarantees that any call to animate() will be drawn.

I have uploaded the 3 versions of the benchmark here, in case you can check if there is any difference on the FPS values reported with your browser:

 - https://people.igalia.com/clopez/wkbug/animation-balls-threaded-compositor/Animation/


I also have done a detailed analysis of the measured performance and real performance with some videos:

 - https://people.igalia.com/clopez/wkbug/animation-balls-threaded-compositor/videos/analisis.html

As you can see the version of the test that measures a FPS value nearest to the one later obtained by inspecting the video frames is the one using requestAnimationFrame() for all cases.
Comment 1 Carlos Alberto Lopez Perez 2016-08-03 06:53:55 PDT
Created attachment 285225 [details]
Patch
Comment 2 Carlos Garcia Campos 2016-08-03 07:30:49 PDT
Wow, awesome work!
Comment 3 Simon Fraser (smfr) 2016-08-03 07:38:16 PDT
Comment on attachment 285225 [details]
Patch

I don't think you should change this test. If it was written using setInterval, then it should stay that way. Feel free to add a new rad-based test alongside this one.
Comment 4 Simon Fraser (smfr) 2016-08-03 07:38:36 PDT
^rAF-based
Comment 5 Carlos Alberto Lopez Perez 2016-08-03 09:06:24 PDT
(In reply to comment #0)
> The test is measuring much more FPS than it is actually drawing, because it
> uses "setInterval(animate,1)" to schedule new frames.
> And the issues with this are:
> 
>   - Issue #1: There is no guarantee that a given call to animate() would
> have finished when the next one arrives. So this is calling animate() each 1
> millisecond without caring about if the previous call to animate() has
> finished.
> 

I'm wrong in this point...

setInterval() will not fire a new call to the function if the previous has not returned. It will wait until the previous call returns.
Comment 6 Carlos Alberto Lopez Perez 2016-08-03 09:09:49 PDT
(In reply to comment #3)
> Comment on attachment 285225 [details]
> Patch
> 
> I don't think you should change this test. If it was written using
> setInterval, then it should stay that way. Feel free to add a new rad-based
> test alongside this one.

Well, I think the the test is broken.

It tries to measure the FPS of the animation, but it not measuring that.

It is just measuring how many animate() function calls are executed per second, but is completing ignoring if the new position of the balls is displayed or not on the screen (i.e. the frame has been drawn)

Why you think we should not fix a broken test?
Or on the other hand do you have some reason to think the test is not broken?
Comment 7 Simon Fraser (smfr) 2016-08-03 09:13:27 PDT
(In reply to comment #6)
> (In reply to comment #3)
> > Comment on attachment 285225 [details]
> > Patch
> > 
> > I don't think you should change this test. If it was written using
> > setInterval, then it should stay that way. Feel free to add a new rad-based
> > test alongside this one.
> 
> Well, I think the the test is broken.
> 
> It tries to measure the FPS of the animation, but it not measuring that.
> 
> It is just measuring how many animate() function calls are executed per
> second, but is completing ignoring if the new position of the balls is
> displayed or not on the screen (i.e. the frame has been drawn)
> 
> Why you think we should not fix a broken test?
> Or on the other hand do you have some reason to think the test is not broken?

Because this test represents the way that many old web pages do animation. Even though it doesn't detect real framerate, it will detect changes in setInterval and layout behavior.

We have plenty of other perf tests using requestAnimationFrame (in content-animation, and Animometer).
Comment 8 Carlos Alberto Lopez Perez 2016-08-03 09:22:17 PDT
(In reply to comment #7)
> 
> Because this test represents the way that many old web pages do animation.
> Even though it doesn't detect real framerate, it will detect changes in
> setInterval and layout behavior.
> 

So, then I'm a bit confused.

How should I interpret the results we get with the new threaded compositor of WebKitGTK+?

Without the threaded compositor, the test reports more completed calls to animate() in the first seconds (more FPS according to the the test), but the frames draw are much lower.

Is this telling us something useful?
Comment 9 Simon Fraser (smfr) 2016-08-03 09:29:57 PDT
(In reply to comment #8)
> (In reply to comment #7)
> > 
> > Because this test represents the way that many old web pages do animation.
> > Even though it doesn't detect real framerate, it will detect changes in
> > setInterval and layout behavior.
> > 
> 
> So, then I'm a bit confused.
> 
> How should I interpret the results we get with the new threaded compositor
> of WebKitGTK+?
> 
> Without the threaded compositor, the test reports more completed calls to
> animate() in the first seconds (more FPS according to the the test), but the
> frames draw are much lower.
> 
> Is this telling us something useful?

It's tell you the round-trip time of setInterval(1) with some style/layout changes in between.

We know this isn't a good way to measure actual user-visible framerate. That's why we added Animometer.
Comment 10 Carlos Alberto Lopez Perez 2016-08-03 09:58:15 PDT
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > 
> > > Because this test represents the way that many old web pages do animation.
> > > Even though it doesn't detect real framerate, it will detect changes in
> > > setInterval and layout behavior.
> > > 
> > 
> > So, then I'm a bit confused.
> > 
> > How should I interpret the results we get with the new threaded compositor
> > of WebKitGTK+?
> > 
> > Without the threaded compositor, the test reports more completed calls to
> > animate() in the first seconds (more FPS according to the the test), but the
> > frames draw are much lower.
> > 
> > Is this telling us something useful?
> 
> It's tell you the round-trip time of setInterval(1) with some style/layout
> changes in between.
> 

the round-trip time of setInterval(1) with some style/layout changes _that were not displayed_.

Is this really the useful information that we should benchmark and try to optimize for?

The purpose of some benchmark test is to have something you want to optimize for, and not measure some internal round-trip that goes nowhere.

What if somebody reports a bug with a regression in the FPS rate of Animation/balls? What we should do? Ignore the bug and close it saying that the test is just broken but that we don't want to fix it?

I think I will ignore this test from now on.

But I also think it does a disservice to all developers having a performance test that reports something that nobody should be optimizing for.

> We know this isn't a good way to measure actual user-visible framerate.
> That's why we added Animometer.

Its the first time I hear about it.

At first I tried running it with the run-perf-tests because its under the directory PerformanceTests, but it don't works if run it directly or with run-perf-tests.

However it works with the script run-benchmark, so I guess that is how is meant to be executed.

Unfortunately the data from the run-benchmark script that the GTK+ bot produces goes nowhere, because (AFAIK) https://perf.webkit.org only supports the data from the script run-benchmark.
Comment 11 Carlos Alberto Lopez Perez 2016-08-03 10:00:50 PDT
(In reply to comment #10)
> Unfortunately the data from the run-benchmark script that the GTK+ bot
> produces goes nowhere, because (AFAIK) https://perf.webkit.org only supports
> the data from the script run-benchmark.

I mean: https://perf.webkit.org only supports the data from the script run-perf-tests.
Comment 12 Carlos Alberto Lopez Perez 2016-08-03 11:02:25 PDT
(In reply to comment #10)
> > We know this isn't a good way to measure actual user-visible framerate.
> > That's why we added Animometer.
> 
> Its the first time I hear about it.
> 
> At first I tried running it with the run-perf-tests because its under the
> directory PerformanceTests, but it don't works if run it directly or with
> run-perf-tests.
> 
> However it works with the script run-benchmark, so I guess that is how is
> meant to be executed.
> 

This are some numbers I just got for this test:

- Without the threaded compositor:

Animometer:Score:Geometric: 41.8pt stdev=0.0%
          Animometer:Score:Geometric: 41.8pt stdev=0.0%
                    Canvas Arcs:Score: 163pt stdev=0.0%
                    Canvas Lines:Score: 1.00pt stdev=0.0%
                    Design:Score: 14.9pt stdev=0.0%
                    Focus:Score: 80.2pt stdev=0.0%
                    Images:Score: 6.00pt stdev=0.0%
                    Leaves:Score: 29.2pt stdev=0.0%
                    Multiply:Score: 271pt stdev=0.0%
                    Paths:Score: 451pt stdev=0.0%
                    Suits:Score: 94.5pt stdev=0.0%


- With the threaded compositor:

Animometer:Score:Geometric: 74.1pt stdev=0.0%
          Animometer:Score:Geometric: 74.1pt stdev=0.0%
                    Canvas Arcs:Score: 166pt stdev=0.0%
                    Canvas Lines:Score: 1.00pt stdev=0.0%
                    Design:Score: 15.1pt stdev=0.0%
                    Focus:Score: 328pt stdev=0.0%
                    Images:Score: 196pt stdev=0.0%
                    Leaves:Score: 34.4pt stdev=0.0%
                    Multiply:Score: 258pt stdev=0.0%
                    Paths:Score: 499pt stdev=0.0%
                    Suits:Score: 94.2pt stdev=0.0%