Bug 35115 - REGRESSION: r54919 can cause image animation to start late
Summary: REGRESSION: r54919 can cause image animation to start late
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Images (show other bugs)
Version: 528+ (Nightly build)
Hardware: PC All
: P2 Normal
Assignee: Peter Kasting
URL:
Keywords:
Depends on: 35287
Blocks: 35065
  Show dependency treegraph
 
Reported: 2010-02-18 11:03 PST by Peter Kasting
Modified: 2010-02-24 13:52 PST (History)
4 users (show)

See Also:


Attachments
patch v1 (3.32 KB, patch)
2010-02-18 12:56 PST, Peter Kasting
no flags Details | Formatted Diff | Diff
patch v2 (3.31 KB, patch)
2010-02-18 13:03 PST, Peter Kasting
abarth: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Peter Kasting 2010-02-18 11:03:09 PST
The patch on bug 35065 introduced this bug.  Fix is easy.
Comment 1 Peter Kasting 2010-02-18 12:56:08 PST
Created attachment 49035 [details]
patch v1
Comment 2 WebKit Review Bot 2010-02-18 13:01:52 PST
Attachment 49035 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
WebCore/platform/graphics/BitmapImage.cpp:271:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
Total errors found: 1 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Peter Kasting 2010-02-18 13:03:51 PST
Created attachment 49037 [details]
patch v2

Fix style nit
Comment 4 Adam Barth 2010-02-19 16:36:32 PST
Comment on attachment 49037 [details]
patch v2

Are we not able to test image animation bugs?
Comment 5 Peter Kasting 2010-02-19 16:42:00 PST
I have utterly no clue how to test this particular case.
Comment 6 Peter Kasting 2010-02-19 16:46:00 PST
Fixed in r55039.
Comment 7 Mark Rowe (bdash) 2010-02-21 16:42:17 PST
I filed bug 35222 about a regression this introduced.
Comment 8 Mark Rowe (bdash) 2010-02-22 03:19:37 PST
Reopening as this was rolled out in r55076 to address bug 35222.
Comment 9 Peter Kasting 2010-02-22 12:54:01 PST
Let me circle back and describe what's going on with all these bugs.

Skipping frames during image load is possible when the network is bursty -- if you are given some data, then no data, then data for multiple frames at once, you can skip to the last available frame in one shot.  The more common behavior is that data streams in slowly and you see each frame show up until we have the next one.  In either case, we won't skip past the end -- when we load the image completely (and it finishes animating if necessary) we'll start animating from the beginning at the correct speed.

The patch on this bug does not change this behavior.  This behavior comes from the original set of timing fixes a couple of years ago.  The patch on this bug merely preserves the behavior that we had before bug 35065 was fixed, where the timer starts from when we first ask the image to animate, rather than when we are able to advance to frame 2.

This can make a visible change in the following scenario.  The network gives us frame 1 of the animation, then hangs a long time, then provides most of the rest of the data all at once.  The behaviors are:
* Before bug 35065 landed: We would erroneously keep adding the frame 1 start time every time through this function, resulting in the image not animating at all for a long time during and after load, and then starting to animate smoothly.
* After bug 35065 landed: We would erroneously fail to start the animation timer until frame 2 arrived, at which point we'd begin animating smoothly.
* After this patch landed: We start the timer immediately, and when the rest of the data arrives, skip forward.

HOWEVER, note the following similar case: if we get frames 1 _and 2_, then a pause, then the rest of the data, the behavior before and after this patch landed is identical, and corresponds to the third bullet above.

What's really happening here is that the fix on bug 35065 eliminated a bunch of unintentional delays during the initial load and exposed the (designed) behavior that we can frameskip during initial load (although we usually don't).  So this patch did not really "cause" bug 35222, although it made it possible to occur in one more frame's worth of cases.

We can decide that we are OK with this behavior (better for sites that sync GIFs to each other or other page elements), in which case we should simply reland this patch and move on.  Or we can decide that we'd like each frame to appear for at least its desired duration during load, in which case I can rewrite the timing code to accomplish this, which will probably equate to relanding this patch and adding some conditionals.
Comment 10 Mark Rowe (bdash) 2010-02-22 15:10:07 PST
Can you please provide a concrete example of the issue that you’re attempting to describe here?  It’s much easier to discuss a concrete problem than a hypothetical issue.
Comment 11 Peter Kasting 2010-02-22 15:14:41 PST
Which issue are you referring to?  Comment 9 is applicable to any animated GIF, it is merely dependent on the network.

(If possible I'd really like to discuss over IRC, I think it will be faster and easier than bugzilla)
Comment 12 Mark Rowe (bdash) 2010-02-22 18:38:02 PST
(In reply to comment #11)
> Which issue are you referring to?  Comment 9 is applicable to any animated GIF,
> it is merely dependent on the network.
> 

The issue that this bug was filed about.
Comment 13 Peter Kasting 2010-02-22 20:34:45 PST
The issue that this was filed about can apply to any animated GIF, its magnitude is merely dependent on network performance.  As I tried to say on bug 35065, the issue is that after the patch on that bug, we display the first frame until we have the complete second frame, and then start the first frame's animation timer.  This means that the animation start lags by however long it takes us to get the second frame's data, which depends on the network.  Note that the issue here is not about how we display frames in general as the image is loading -- as I tried to say in comment 9, that behavior, such as whether bursty networks can cause frameskipping during the initial load, is a more general issue.  This bug was only about whether we start the animation timer when we receive frame 1 or when we receive frame 2, nothing more.

Whatever decision we make about how the user should see frames during load, we want this patch or something similar.  If the network is fast, it makes us display the animation with perfect timing, and if the network is slow and we want to modify the current display algorithm, we'd want to do it by ensuring we show each frame for its entire duration, not the behavior that this patch fixes (which is just a plain and simple bug).
Comment 14 Mark Rowe (bdash) 2010-02-22 23:25:25 PST
(In reply to comment #13)
> The issue that this was filed about can apply to any animated GIF, its
> magnitude is merely dependent on network performance.  As I tried to say on bug
> 35065, the issue is that after the patch on that bug, we display the first
> frame until we have the complete second frame, and then start the first frame's
> animation timer.  This means that the animation start lags by however long it
> takes us to get the second frame's data, which depends on the network.  Note
> that the issue here is not about how we display frames in general as the image
> is loading -- as I tried to say in comment 9, that behavior, such as whether
> bursty networks can cause frameskipping during the initial load, is a more
> general issue.  This bug was only about whether we start the animation timer
> when we receive frame 1 or when we receive frame 2, nothing more.

Ok, now I see what you’re saying.  I have a concern here that is based on looking at the computed values for BitmapImage::frameIsCompleteAtIndex inside BitmapImage::startAnimation while loading some animated GIFs over a slow-ish network: frameIsCompleteAtIndex appears to always return false until the image has completely loaded.  This happens even for frame 0 when the initial frame of the animation has already painted on-screen.  This doesn’t seem to match the intended behavior of the function.

> Whatever decision we make about how the user should see frames during load, we
> want this patch or something similar.  If the network is fast, it makes us
> display the animation with perfect timing, and if the network is slow and we
> want to modify the current display algorithm, we'd want to do it by ensuring we
> show each frame for its entire duration, not the behavior that this patch fixes
> (which is just a plain and simple bug).

It is not clear to me what you mean by “modify the current display algorithm”.  Are you talking about the algorithm as designed, or as it currently behaves?  From my observations these seem to be two different things.

It seems that the “as designed” behavior is that which I describe as being wrong in bug 35222: after painting the first frame the animation starts, and we drop any frames that aren’t loaded by the time they would have painted had the image been animating smoothly.  Given the behavior of BitmapImage::frameIsCompleteAtIndex that I describe above, you can see that this results in no animation occurring while the image loads and then us immediately skipping many frames in the animation to catch up to where the animation would have been.

The current behavior is different: after the second frame of the animation is available, the animation starts.  Due to the behavior of BitmapImage::frameIsCompleteAtIndex always returning false until the image is completely loaded, this leads to the animation being paused until the image is completely loaded meaning that no frames will be dropped even when the network is slow.  In the case that the image is already cached it presumably results in the initial frame being displayed for slightly longer than was intended.

I suspect that if BitmapImage::frameIsCompleteAtIndex was behaving as expected that the only difference between the two behaviors would be that the second would lead to initial frame being displayed for longer than was intended (e.g., the original topic of this bug).  If we fix the behavior of BitmapImage::frameIsCompleteAtIndex the original patch on this bug may then be what is needed.
Comment 15 Peter Kasting 2010-02-23 00:46:19 PST
(In reply to comment #14)
> I have a concern here that is based on
> looking at the computed values for BitmapImage::frameIsCompleteAtIndex inside
> BitmapImage::startAnimation while loading some animated GIFs over a slow-ish
> network: frameIsCompleteAtIndex appears to always return false until the image
> has completely loaded.  This happens even for frame 0 when the initial frame of
> the animation has already painted on-screen.  This doesn’t seem to match the
> intended behavior of the function.

Yeah, that's not right at all.  I suspect a CG-specific bug here; for CG this eventually boils down to the statement (in ImageSourceCG.cpp):

CGImageSourceGetStatusAtIndex(m_decoder, index) == kCGImageStatusComplete

It may be that this function is not returning "complete" for any index until the whole image is done, or perhaps it's just doing that for < Snow Leopard (since I know some GIF shouldn't-need-the-whole-image-but-we-do issues were supposed to be fixed in Snow Leopard).  Seems like you could probably test this fairly quickly?

I haven't tested closely, but based on the overall behavior I see and the code in them, I think the open-source decoders return the desired values for this function.  This means that Safari and Chromium would see vastly different behavior here which would help explain why we've been talking past each other.

I could be wrong and there's a bug in BitmapImage.cpp instead, but a quick inspection didn't highlight anything obvious and I've never seen the behavior you describe locally.

> It seems that the “as designed” behavior is that which I describe as being
> wrong in bug 35222: after painting the first frame the animation starts, and we
> drop any frames that aren’t loaded by the time they would have painted had the
> image been animating smoothly.

With the caveat of "if we have frames beyond them to skip to", I think this is accurate.

> Given the behavior of
> BitmapImage::frameIsCompleteAtIndex that I describe above, you can see that
> this results in no animation occurring while the image loads and then us
> immediately skipping many frames in the animation to catch up to where the
> animation would have been.

Yes, that's obviously gruesomely bad.  That turns what I described as a rare case into the common case, even for moderately fast networks.

There is a small saving grace here, which is that if the load has taken longer than the entire length of the animation, the code (with the buggy frameIsCompleteAtIndex()) will start the animation at the beginning (of its second cycle, not that that's visible to the user), so on REALLY slow networks, this doesn't look as bad.  But that's fairly irrelevant as we definitely want to fix this broken oracle function.

> The current behavior is different: after the second frame of the animation is
> available, the animation starts.  Due to the behavior of
> BitmapImage::frameIsCompleteAtIndex always returning false until the image is
> completely loaded, this leads to the animation being paused until the image is
> completely loaded meaning that no frames will be dropped even when the network
> is slow.  In the case that the image is already cached it presumably results in
> the initial frame being displayed for slightly longer than was intended.

That sounds accurate to me.

> I suspect that if BitmapImage::frameIsCompleteAtIndex was behaving as expected
> that the only difference between the two behaviors would be that the second
> would lead to initial frame being displayed for longer than was intended (e.g.,
> the original topic of this bug).  If we fix the behavior of
> BitmapImage::frameIsCompleteAtIndex the original patch on this bug may then be
> what is needed.

Thanks for the great analysis.  Fixing this function should certainly be the top priority, and it's nice to know why the seemingly-innocent patch on this bug could have caused such a large-seeming regression.  Once we fix, then we should be back in the world described by comment 9, and can move forward.
Comment 16 Mark Rowe (bdash) 2010-02-23 01:22:51 PST
(In reply to comment #15)
> (In reply to comment #14)
> > I have a concern here that is based on
> > looking at the computed values for BitmapImage::frameIsCompleteAtIndex inside
> > BitmapImage::startAnimation while loading some animated GIFs over a slow-ish
> > network: frameIsCompleteAtIndex appears to always return false until the image
> > has completely loaded.  This happens even for frame 0 when the initial frame of
> > the animation has already painted on-screen.  This doesn’t seem to match the
> > intended behavior of the function.
> 
> Yeah, that's not right at all.  I suspect a CG-specific bug here; for CG this
> eventually boils down to the statement (in ImageSourceCG.cpp):
> 
> CGImageSourceGetStatusAtIndex(m_decoder, index) == kCGImageStatusComplete
> 
> It may be that this function is not returning "complete" for any index until
> the whole image is done, or perhaps it's just doing that for < Snow Leopard
> (since I know some GIF shouldn't-need-the-whole-image-but-we-do issues were
> supposed to be fixed in Snow Leopard).  Seems like you could probably test this
> fairly quickly?

That’s basically the conclusion that I had arrived at too.  It seems CGImageSourceGetStatusAtIndex claims that all frames of an animated GIF are incomplete when using an incremental image source that has not yet received the complete data of the image.  This certainly isn’t the behavior that the author of ImageSource::frameIsCompleteAtIndex (or any reasonable person) expects.  I’m not sure if this behavior changed from Leopard to SnowLeopard, but I filed a bug against CGImageSourceGetStatusAtIndex expressing my displeasure at the misleading results that it gives (<rdar://problem/7679174>).

> I haven't tested closely, but based on the overall behavior I see and the code
> in them, I think the open-source decoders return the desired values for this
> function.  This means that Safari and Chromium would see vastly different
> behavior here which would help explain why we've been talking past each other.

That would explain a lot!

> Thanks for the great analysis.  Fixing this function should certainly be the
> top priority, and it's nice to know why the seemingly-innocent patch on this
> bug could have caused such a large-seeming regression.  Once we fix, then we
> should be back in the world described by comment 9, and can move forward.

I think I may have a solution for this.  I’ll write up a bug about BitmapImage::frameIsCompleteAtIndex and will throw a patch up after some more testing.
Comment 17 Mark Rowe (bdash) 2010-02-23 01:46:39 PST
Bug for that issue is bug 35287.
Comment 18 Peter Kasting 2010-02-23 15:15:15 PST
Now that bug 35287 is fixed (thanks!), should we re-land this?  And then do you want to take a further look at how we should behave during image load, or do you think the current behavior is OK?
Comment 19 Mark Rowe (bdash) 2010-02-23 18:14:14 PST
(In reply to comment #18)
> Now that bug 35287 is fixed (thanks!), should we re-land this?  And then do you
> want to take a further look at how we should behave during image load, or do
> you think the current behavior is OK?

Yes, I think it’s fine to land this change now.
Comment 20 Peter Kasting 2010-02-24 13:52:28 PST
Fixed in r55199.