Bug 111144

Summary: GIFImageReader to count frames and decode in one pass
Product: WebKit Reporter: Hin-Chung Lam <hclam>
Component: ImagesAssignee: Hin-Chung Lam <hclam>
Status: RESOLVED FIXED    
Severity: Normal CC: jschuh, pkasting, senorblanco, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 98098, 111395, 112099    
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch
none
Patch for landing none

Description Hin-Chung Lam 2013-02-28 22:51:18 PST
This means that GIFImageReader can parse (count frame) and decode a GIF file in one pass. The code currently do frame counting and decoding separately. The frame counting is particularly bad and has O(n^2) behavior on a slow network. By doing parse plus decode in one pass we can efficiently decode a GIF file. This also enables random frame seeking and deferred decoding for GIF.
Comment 1 Hin-Chung Lam 2013-03-09 17:47:17 PST
Created attachment 192359 [details]
Patch
Comment 2 Hin-Chung Lam 2013-03-09 17:51:54 PST
Created attachment 192360 [details]
Patch
Comment 3 Hin-Chung Lam 2013-03-09 17:57:57 PST
Created attachment 192361 [details]
Patch
Comment 4 Hin-Chung Lam 2013-03-09 18:18:05 PST
Comment on attachment 192361 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192361&action=review

This patch might be hard to review. I've added inline comments to the review tool.

Done exhausting testing locally to cover a lot of corner cases, in addition to bit-exact as before I ensure states (e.g. loop count, frame count, frame status) remains the same.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:-380
> -            // m_bytesToConsume is the current component size because it hasn't been updated.

This and line 387 don't do prepare / decode any more. Save the information for later decoding step.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:-543
> -            if (m_frameContext) {

This blob of code remains the same, just always saved to a new frame context.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:-669
> -            if (query == GIFImageDecoder::GIFFullQuery && !m_frameContext)

This blob of code remains the same as before. New code saves frame information regardless of query and frameContext is always valid so we can get rid of one scope of {}.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:-755
> -                if (m_frameContext && !m_frameContext->rowsRemaining) {

This blob code doesn't make sense, removed. Also note that rowsRemaining is irrelevant here, decoding state is moved to GIFLZWContext and done in a later time,

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:-760
> -            } else {

This blob of code is removed, notification of frameComplete() is moved to decode() above.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:784
> +    if (m_frames.isEmpty() || m_frames.last()->isComplete())

This new method is to ensure we don't add a new frame if the current frame is not complete yet.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:87
> +    GIFLZWContext(WebCore::GIFImageDecoder* client, GIFFrameContext* frameContext)

Reshuffling of some members and moved GIFLZWContext to the top of file. There's basically no new members defined in GIFLZWContext.

The ownership is now GIFImageReader owns GIFFrameContext owns GIFLZWContext. Each GIFFrameContext can be independently decoded.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:137
> +struct GIFLZWBlock {

New data structure for each individual LZW block.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:164
> +    int datasize;

This readonly state is moved from GIFLZWContext to here. The concept is that GIFLZWContext is a state machine, while GIFFrameContext is mostly readonly.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:211
> +    size_t m_currentLzwBlock;

These are new member for keeping tracking of decoding states.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:279
> +    bool isFirstFrame() const

Helper method to detect if the current frame is the first frame.
Comment 5 Hin-Chung Lam 2013-03-09 20:59:13 PST
Comment on attachment 192361 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192361&action=review

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:101
> +bool GIFLZWContext::outputRow()

I moved this method from GIFImageReader to GIFLZWContext. I also moved some of the output state machine into this class. There's no change in logic in this method. Note that now everything m_frameContext-> are all readonly. Cleanup will come in a later patch.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:201
> +bool GIFLZWContext::doLZW(const unsigned char* block, size_t bytesInBlock)

This method is now moved from GIFImageReader to GIFLZWContext. Again no major change in logic.
Comment 6 Peter Kasting 2013-03-11 16:59:51 PDT
Comment on attachment 192361 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192361&action=review

> Source/WebCore/ChangeLog:15
> +        This change fixed the performance problem of decoding GIF files when

Nit: fixed -> fixes.  You might want to refer here to the existing description of the particular performance problem that you're removing from GIFImageDecoder::frameCount().

> Source/WebCore/ChangeLog:24
> +        implementaiton. However this algorithm does not perform any decoding

Nit: implementaiton -> implementation

> Source/WebCore/ChangeLog:29
> +        implementation look through all frame information saved and decode

Nit: look -> looks, decode -> decodes

> Source/WebCore/ChangeLog:32
> +        Because of the separate of parse and decode, each frame can be decoded

Nit: separate of parse and decode -> separation of parsing and decoding

> Source/WebCore/ChangeLog:33
> +        separately. This paves the way to support seeking in GIF file.

Nit: file -> files

> Source/WebCore/ChangeLog:38
> +        Exhaustive testing is done locally with a corpus of 60k images.

Nit: is -> was

> Source/WebCore/ChangeLog:40
> +        The results was bit-exact as the previous implementation with only

Nit: was bit-exact as -> were bit-identical when compared to

> Source/WebCore/ChangeLog:43
> +        other image viewers, e.g. Preview on Mac.

I'd be interested in knowing what kind of bustage, specifically, is decoded differently.

> Source/WebCore/ChangeLog:45
> +        These were also no crashing when decoding the entire corpus.

Nit: These were -> There was

> Source/WebCore/platform/image-decoders/gif/GIFImageDecoder.cpp:111
> +    else if (m_reader && (!m_reader->imagesCount() || m_reader->parseFailed()))

Nit: I'd just combine this with the previous conditional with ||

> Source/WebCore/platform/image-decoders/gif/GIFImageDecoder.cpp:327
> +    const int oldSize = m_frameBufferCache.size();

Nit: size_t

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:246
> +                return false;

Does returning false here always cause us to set the decode failed bit?  It's not instantly obvious, especially the way decoding responsibility now seems to use more layers than before.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:325
> +    // Check for no remaining rows can handle bad GIF data with extra blocks.

I can't parse this.  Does it mean "Some bad GIFs have extra blocks beyond the last row, which we don't want to decode."?

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:336
> +    if (m_currentLzwBlock == m_lzwBlocks.size() || !m_lzwContext->hasRemainingRows()) {

Doesn't this have to be true in order for control flow to reach here?

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:347
> +    // Try to be tolerant and don't report parsing errors but we will not parse again.

Why is important to be tolerant here?  Why not just fail entirely, which is what I think the old code did?  Wouldn't this remove the "parse failed" variable and method?  Is this the source of the behavior change?

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:354
> +    while (m_currentDecodingFrame < m_frames.size() && m_currentDecodingFrame < haltAtFrame) {

Nit: Simpler:

while (m_currentDecodingFrame < std::min(m_frames.size(), haltAtFrame)) {

This also suggests that haltAtFrame (and other frame count numbers anywhere else) should perhaps be a size_t?

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:359
> +        if (!success)

Just call decode() directly in this statement and eliminate the temp.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:371
> +    // All frames decoded.

Nit: Comment adds nothing

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:395
> +        const size_t currentComponentSize = m_bytesToConsume;

This second temp doesn't seem necessary; can't we just call addLzwBlock() with m_bytesToConsume, as the old doLZW() call used to do?

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:567
> +            if (*currentComponent & 0x1) {
> +                currentFrame->tpixel = currentComponent[3];
> +                currentFrame->isTransparent = true;
> +            } else {
> +                currentFrame->isTransparent = false;
> +                // ignoring gfx control extension
>              }

Nit: Simpler, and more accurate comment.  You may need a "!!" or "static_cast<bool>()" if the first line produces a compiler warning.  Even with that, the code is not only shorter, but it reads more clearly to me, because the conditional is testing a named variable whose meaning is obvious.

currentFrame->isTransparent = *currentComponent & 0x1;
if (currentFrame->isTransparent)
    currentFrame->tpixel = currentComponent[3];

// We ignore the "user input" bit.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:571
> +            currentFrame->disposalMethod = (WebCore::ImageFrame::FrameDisposalMethod)disposalMethod;

Nit: Let's switch this to a C++-style cast

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:574
> +            // Some specs say 3rd bit (value 4), other specs say value 3
> +            // Let's choose 3 (the more popular)

Better comment:

// Some specs say that disposal method 3 is "overwrite previous", others that setting the third bit of the field (i.e. method 4) is.  We map both to the same value.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:697
> +            if (currentComponent[8] & 0x40)
> +                currentFrame->interlaced = true;
> +            else
> +                currentFrame->interlaced = false;

Nit: Simpler.  Same comment about compiler warning applies.

currentFrame->interlaced = currentComponent[8] & 0x40;

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:708
> +                // Overlaying interlaced, transparent GIFs over
> +                // existing image data using the Haeberli display hack
> +                // requires saving the underlying image in order to
> +                // avoid jaggies at the transparency edges. We are
> +                // unprepared to deal with that, so don't display such
> +                // images progressively

Nit: If you hoist this comment to live above the conditional, the code becomes:

currentFrame->progressiveDisplay = isFirstFrame();

I wonder if the comment implies that we could expand this to cover non-transparent frames too, or something.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:716
> +            // bits per pixel is currentComponent[8]&0x07
> +
>              // has a local colormap?
>              if (currentComponent[8] & 0x80) {
>                  int numColors = 2 << (currentComponent[8] & 0x7);

Nit: How about this.  (Compiler warning note, again)

currentFrame->isLocalColormapDefined = currentComponent[8] & 0x80;
if (currentFrame->isLocalColormapDefined) {
    // The three low-order bits of currentComponent[8] specify the bits per pixel.
    int numColors = 2 << (currentComponent[8] & 0x7);
...

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:768
>              break;

Nit: I'd make this return false

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:784
>> +    if (m_frames.isEmpty() || m_frames.last()->isComplete())
> 
> This new method is to ensure we don't add a new frame if the current frame is not complete yet.

OK... I'm a little fuzzy on this, the two callsites look kind of like they unconditionally added a frame, but I'm not really sure.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:279
>> +    bool isFirstFrame() const
> 
> Helper method to detect if the current frame is the first frame.

Maybe it should be named "decodingFirstFrame()"?  "currentFrameIsFirstFrame()"?

> Source/WebKit/chromium/tests/GIFImageDecoderTest.cpp:178
> +    // There are two frames scanned because we have not encountered  a decoding

Nit: Extra space
Comment 7 Hin-Chung Lam 2013-03-12 01:18:21 PDT
Comment on attachment 192361 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192361&action=review

>> Source/WebCore/ChangeLog:15
>> +        This change fixed the performance problem of decoding GIF files when
> 
> Nit: fixed -> fixes.  You might want to refer here to the existing description of the particular performance problem that you're removing from GIFImageDecoder::frameCount().

Done.

>> Source/WebCore/ChangeLog:24
>> +        implementaiton. However this algorithm does not perform any decoding
> 
> Nit: implementaiton -> implementation

Done.

>> Source/WebCore/ChangeLog:29
>> +        implementation look through all frame information saved and decode
> 
> Nit: look -> looks, decode -> decodes

Done.

>> Source/WebCore/ChangeLog:32
>> +        Because of the separate of parse and decode, each frame can be decoded
> 
> Nit: separate of parse and decode -> separation of parsing and decoding

Done.

>> Source/WebCore/ChangeLog:33
>> +        separately. This paves the way to support seeking in GIF file.
> 
> Nit: file -> files

Done.

>> Source/WebCore/ChangeLog:38
>> +        Exhaustive testing is done locally with a corpus of 60k images.
> 
> Nit: is -> was

Done.

>> Source/WebCore/ChangeLog:43
>> +        other image viewers, e.g. Preview on Mac.
> 
> I'd be interested in knowing what kind of bustage, specifically, is decoded differently.

The difference here is because of this: New code adds a new GIFFrameContext for GIFControlExtension and GIFImageHeader and those two files stop right between these two states, result in an empty frame. Old code doesn't add a new frame for GIFControlExtension but only at GIFImageHeader.

To solve this difference I can check if image header is defined before counting as a new frame. However no matter I add a new frame for GIFControlExtension or GIFImageHeader, both will result in an empty frame if the file truncates before a LZW block appear. See the fix in setHeaderDefined() and imagesCount().

>> Source/WebCore/ChangeLog:45
>> +        These were also no crashing when decoding the entire corpus.
> 
> Nit: These were -> There was

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageDecoder.cpp:111
>> +    else if (m_reader && (!m_reader->imagesCount() || m_reader->parseFailed()))
> 
> Nit: I'd just combine this with the previous conditional with ||

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageDecoder.cpp:327
>> +    const int oldSize = m_frameBufferCache.size();
> 
> Nit: size_t

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:246
>> +                return false;
> 
> Does returning false here always cause us to set the decode failed bit?  It's not instantly obvious, especially the way decoding responsibility now seems to use more layers than before.

Yes, return false here always results in m_client->setFailed() being called. I'll comments to clarify.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:325
>> +    // Check for no remaining rows can handle bad GIF data with extra blocks.
> 
> I can't parse this.  Does it mean "Some bad GIFs have extra blocks beyond the last row, which we don't want to decode."?

Right. Updated comments.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:336
>> +    if (m_currentLzwBlock == m_lzwBlocks.size() || !m_lzwContext->hasRemainingRows()) {
> 
> Doesn't this have to be true in order for control flow to reach here?

You're right. This is actually a bug. I've added a unit test for this. The bug was that it reset the LZW context when progressively decoding the file, as the decoder processes more blocks. The unit test uncovers another problem that LZW decoding is preformed before |datasize| and other fields properly defined. They are fixed in the latest patch.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:347
>> +    // Try to be tolerant and don't report parsing errors but we will not parse again.
> 
> Why is important to be tolerant here?  Why not just fail entirely, which is what I think the old code did?  Wouldn't this remove the "parse failed" variable and method?  Is this the source of the behavior change?

Added comments for this. Previous implementation uses a separate GIFImageReader for counting frames, which means counting can fail but decoding cannot. The code here is to match previous implementation.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:354
>> +    while (m_currentDecodingFrame < m_frames.size() && m_currentDecodingFrame < haltAtFrame) {
> 
> Nit: Simpler:
> 
> while (m_currentDecodingFrame < std::min(m_frames.size(), haltAtFrame)) {
> 
> This also suggests that haltAtFrame (and other frame count numbers anywhere else) should perhaps be a size_t?

Added a FIXME for this. Try not to change this because this involves changing GIFImageDecoder as well.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:359
>> +        if (!success)
> 
> Just call decode() directly in this statement and eliminate the temp.

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:371
>> +    // All frames decoded.
> 
> Nit: Comment adds nothing

Removed.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:395
>> +        const size_t currentComponentSize = m_bytesToConsume;
> 
> This second temp doesn't seem necessary; can't we just call addLzwBlock() with m_bytesToConsume, as the old doLZW() call used to do?

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:567
>>              }
> 
> Nit: Simpler, and more accurate comment.  You may need a "!!" or "static_cast<bool>()" if the first line produces a compiler warning.  Even with that, the code is not only shorter, but it reads more clearly to me, because the conditional is testing a named variable whose meaning is obvious.
> 
> currentFrame->isTransparent = *currentComponent & 0x1;
> if (currentFrame->isTransparent)
>     currentFrame->tpixel = currentComponent[3];
> 
> // We ignore the "user input" bit.

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:571
>> +            currentFrame->disposalMethod = (WebCore::ImageFrame::FrameDisposalMethod)disposalMethod;
> 
> Nit: Let's switch this to a C++-style cast

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:574
>> +            // Let's choose 3 (the more popular)
> 
> Better comment:
> 
> // Some specs say that disposal method 3 is "overwrite previous", others that setting the third bit of the field (i.e. method 4) is.  We map both to the same value.

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:697
>> +                currentFrame->interlaced = false;
> 
> Nit: Simpler.  Same comment about compiler warning applies.
> 
> currentFrame->interlaced = currentComponent[8] & 0x40;

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:708
>> +                // images progressively
> 
> Nit: If you hoist this comment to live above the conditional, the code becomes:
> 
> currentFrame->progressiveDisplay = isFirstFrame();
> 
> I wonder if the comment implies that we could expand this to cover non-transparent frames too, or something.

Done. Added a FIXME for the case you mentioned.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:716
>>                  int numColors = 2 << (currentComponent[8] & 0x7);
> 
> Nit: How about this.  (Compiler warning note, again)
> 
> currentFrame->isLocalColormapDefined = currentComponent[8] & 0x80;
> if (currentFrame->isLocalColormapDefined) {
>     // The three low-order bits of currentComponent[8] specify the bits per pixel.
>     int numColors = 2 << (currentComponent[8] & 0x7);
> ...

currentFrame->isLocalColormapDefined actually has an additional meaning that the entire local colormap is received so we can't set it early (See line 742). Changing this to accomodate comments.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:768
>>              break;
> 
> Nit: I'd make this return false

Done.

>>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:784
>>> +    if (m_frames.isEmpty() || m_frames.last()->isComplete())
>> 
>> This new method is to ensure we don't add a new frame if the current frame is not complete yet.
> 
> OK... I'm a little fuzzy on this, the two callsites look kind of like they unconditionally added a frame, but I'm not really sure.

There are two control blocks that can adda a new frame, which is not desirable and avoided here. For the case if frame list empty we should always add a new frame.

The second condition is tricky: in the stream, either GIFControlExtension or GIFImageHeader can come first, but we don't want to add a new frame twice. We know that there will only be one "data complete" signal for each frame, and we only want to start a new frame if the previous frame is data complete.

>>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:279
>>> +    bool isFirstFrame() const
>> 
>> Helper method to detect if the current frame is the first frame.
> 
> Maybe it should be named "decodingFirstFrame()"?  "currentFrameIsFirstFrame()"?

Done.

>> Source/WebKit/chromium/tests/GIFImageDecoderTest.cpp:178
>> +    // There are two frames scanned because we have not encountered  a decoding
> 
> Nit: Extra space

Done. There's no need to update these two tests.
Comment 8 Hin-Chung Lam 2013-03-12 01:33:55 PDT
Created attachment 192666 [details]
Patch
Comment 9 Hin-Chung Lam 2013-03-12 01:36:43 PDT
Comment on attachment 192666 [details]
Patch

Thanks for Peter for the thorough review.

Major changes in the last patch:
* Fixed the problem of empty last frame. No more differences in the 60k images corpus.
* Fixed two bugs uncovered by your review. Added a unit test to verify.
* Fixed the 3 bytes out of bound read.
Comment 10 Stephen White 2013-03-12 08:48:18 PDT
Comment on attachment 192666 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192666&action=review

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:83
> +// FIXME: Make this a class and hide private members.

Looks like a class to me.  Is this already done?

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:98
> +        , irow(0)

Unless I'm missing something, it seems unfortunate to mix m_-prefixed members and non-prefixed.

Perhaps you could fix this in a renaming-only patch.
Comment 11 Stephen White 2013-03-12 08:48:38 PDT
Comment on attachment 192361 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192361&action=review

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:101
>> +bool GIFLZWContext::outputRow()
> 
> I moved this method from GIFImageReader to GIFLZWContext. I also moved some of the output state machine into this class. There's no change in logic in this method. Note that now everything m_frameContext-> are all readonly. Cleanup will come in a later patch.

Could the member var be made pointer-to-const then?  (I have no idea, I'm just guessing).
Comment 12 Hin-Chung Lam 2013-03-12 11:54:05 PDT
Comment on attachment 192666 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192666&action=review

Fixed several comments by Stephen. Changed GIFFrameContext* to const GFC*.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:83
>> +// FIXME: Make this a class and hide private members.
> 
> Looks like a class to me.  Is this already done?

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:98
>> +        , irow(0)
> 
> Unless I'm missing something, it seems unfortunate to mix m_-prefixed members and non-prefixed.
> 
> Perhaps you could fix this in a renaming-only patch.

Yes I plan to fix in a later patch.
Comment 13 Hin-Chung Lam 2013-03-12 11:54:37 PDT
Created attachment 192779 [details]
Patch
Comment 14 Hin-Chung Lam 2013-03-13 16:29:22 PDT
Review ping.
Comment 15 Peter Kasting 2013-03-13 22:18:23 PDT
Sorry, I have been flooded.  I will try to get to this tomorrow.
Comment 16 Hin-Chung Lam 2013-03-13 23:12:36 PDT
No problem. Thanks for taking your time to look at this.
Comment 17 Peter Kasting 2013-03-14 18:24:08 PDT
Comment on attachment 192779 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192779&action=review

Most stuff is nitpicks, the integer overflow worry is the biggest concern I have.

> Source/WebCore/ChangeLog:16
> +        it is received very slowly. Existing implementation creates a new

Nit: it is -> they are

> Source/WebCore/ChangeLog:17
> +        GIFImageReader for counting frames for every byte that comes in, this

Nit: Is this accurate?  Shouldn't "for every byte that comes in" read "every time it is notified that new data has been received"?

> Source/WebKit/chromium/ChangeLog:112
> +        that progressive decoding produces the same results as a truncated file.

Nit: This is unclear, how about "...verify that continually re-decoding an image one additional byte at a time produces the same outputs as repeatedly decoding (with brand new decoders) truncated versions of the image that are one byte longer each time."

> Source/WebCore/platform/image-decoders/gif/GIFImageDecoder.cpp:79
> +    decode(std::numeric_limits<unsigned>::max(), GIFFrameCountQuery);

Nit: Personally, I'd probably just pass static_cast<unsigned>(-1), which does the same thing in a slightly less verbose way and avoids the need for the #include... up to you

> Source/WebCore/platform/image-decoders/gif/GIFImageDecoder.cpp:110
> +    if (failed()
> +        || (m_reader && (!m_reader->imagesCount() || m_reader->parseFailed())))

Tiny nit: With WebKit's lack of a column limit, I find combining these on one line fractionally clearer

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:202
> +// Return true if decoding block was successful and the entire block was consumed.
> +// Return false if there was a critical decoding error. The error should be reported to client.

It looks like maybe OUTPUT_ROW can return true if more block remains but no rows remain.  Perhaps the comment should be:

Returns true if decoding was successful.  In this case the block will have been completely consumed and/or rowsRemaining will be 0.  Otherwise, decoding failed; returns false in this case, which will always cause the GIFImageReader to set the "decode failed" flag.

(Hopefully this is accurate?)

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:316
> +// Return true if decoding for this frame has progressed.
> +// Return false if there was a critical decoding error. The error should be reported to client.

This also seems inaccurate, since sometimes decoding hasn't progressed and we still return true (e.g. the very first return statement in the function).  Maybe:

Returns false if a decoding error occurred.  This is a fatal error and causes the GIFImageReader to set the "decode failed" flag.  Otherwise, either not enough data is available to decode further than before, or the new data has been decoded successfully; returns true in this case.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:338
> +        if (blockPosition + blockSize > length)

Are either of the additions in here subject to a potential integer overflow condition?  They look exploitable if they are.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:348
> +    // Note that some broken GIF files have no more LZW blocks but still some rows remaining.
> +    // We treat it as frame complete as well, to be consistent with previous implementation.

This second half of the comment seems like it belongs on isComplete(), not here.

Also, I'm concerned about anywhere where a comment mentions the "previous implementation".  For one, readers may not know what the "previous implementation" was; for another, this comment might become misleading over time if we change or refactor the implementation more; and for a third, consistency with the past is not really a great reason on its own.  I'm willing to change some of these behaviors if doing so would be more sane or compatible; otherwise, we should probably phrase things more like "do XYZ because GIFs in the wild rely on it".

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:365
> +    // Try to be tolerant and don't report parsing errors but we will not parse again.
> +    // Previous implementation used a separate GIFImageReader to count frames, so parsing error was tolerated
> +    // but not decoding error. Do a separate case here to match previous implementation.

Can we ever actually successfully decode after a failed parse?  Does that just result in the image decoding up to the parse failure and then claiming to be complete?

I'm just wondering if the complexity here can be simplified without a visible change to the consumer.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:379
> +        // Exit this loop and wait for more data to continue decoding.

Nit: It's a little misleading to put this here since it really only applies if the conditional is true.  How about writing "// We need more data to continue decoding." as an EOL comment on the break statement?

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:582
> +            // NOTE: This relies on the values in the FrameDisposalMethod enum

Nit: I'd add a blank line above here because the NOTE isn't connected to the line above it.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:586
> +

Nit: ...and I'd probably take away the blank here.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:620
> +            // GIFConsumeNetscapeExtension alway reads 3 bytes from the stream; We should at least wait for this amount.

Nit: alway -> always

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:722
> +            // bits per pixel is currentComponent[8]&0x07
> +

Nit: Remove these two lines, the comment is restated more clearly below
Comment 18 Hin-Chung Lam 2013-03-15 12:17:24 PDT
Created attachment 193350 [details]
Patch
Comment 19 Hin-Chung Lam 2013-03-15 12:18:18 PDT
Comment on attachment 192779 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192779&action=review

>> Source/WebCore/platform/image-decoders/gif/GIFImageDecoder.cpp:110
>> +        || (m_reader && (!m_reader->imagesCount() || m_reader->parseFailed())))
> 
> Tiny nit: With WebKit's lack of a column limit, I find combining these on one line fractionally clearer

okay

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:202
>> +// Return false if there was a critical decoding error. The error should be reported to client.
> 
> It looks like maybe OUTPUT_ROW can return true if more block remains but no rows remain.  Perhaps the comment should be:
> 
> Returns true if decoding was successful.  In this case the block will have been completely consumed and/or rowsRemaining will be 0.  Otherwise, decoding failed; returns false in this case, which will always cause the GIFImageReader to set the "decode failed" flag.
> 
> (Hopefully this is accurate?)

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:316
>> +// Return false if there was a critical decoding error. The error should be reported to client.
> 
> This also seems inaccurate, since sometimes decoding hasn't progressed and we still return true (e.g. the very first return statement in the function).  Maybe:
> 
> Returns false if a decoding error occurred.  This is a fatal error and causes the GIFImageReader to set the "decode failed" flag.  Otherwise, either not enough data is available to decode further than before, or the new data has been decoded successfully; returns true in this case.

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:338
>> +        if (blockPosition + blockSize > length)
> 
> Are either of the additions in here subject to a potential integer overflow condition?  They look exploitable if they are.

When this code is reached, the LZW block should be complete (in parse()). This is really being extra careful about going out of bounds. I don't thin they have integer flow problem, they are both size_t and blockSize is no larger than 255. For the integer overflow to occur the entire file has to exceed the memory limit (32bits or 64bits), which I don't think is possible.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:348
>> +    // We treat it as frame complete as well, to be consistent with previous implementation.
> 
> This second half of the comment seems like it belongs on isComplete(), not here.
> 
> Also, I'm concerned about anywhere where a comment mentions the "previous implementation".  For one, readers may not know what the "previous implementation" was; for another, this comment might become misleading over time if we change or refactor the implementation more; and for a third, consistency with the past is not really a great reason on its own.  I'm willing to change some of these behaviors if doing so would be more sane or compatible; otherwise, we should probably phrase things more like "do XYZ because GIFs in the wild rely on it".

I dropped the comment that mentioned previous implementation but keep the explanation of the behavior of the code for handling wild GIFs.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:365
>> +    // but not decoding error. Do a separate case here to match previous implementation.
> 
> Can we ever actually successfully decode after a failed parse?  Does that just result in the image decoding up to the parse failure and then claiming to be complete?
> 
> I'm just wondering if the complexity here can be simplified without a visible change to the consumer.

You are right. It is totally possible that parse failure has occurred but all frames are fully decoded and complete. This is to match previous implementation, I'd change the behavior in a later patch if we want to. Also dropped the comment about previous implementation.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:379
>> +        // Exit this loop and wait for more data to continue decoding.
> 
> Nit: It's a little misleading to put this here since it really only applies if the conditional is true.  How about writing "// We need more data to continue decoding." as an EOL comment on the break statement?

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:582
>> +            // NOTE: This relies on the values in the FrameDisposalMethod enum
> 
> Nit: I'd add a blank line above here because the NOTE isn't connected to the line above it.

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:586
>> +
> 
> Nit: ...and I'd probably take away the blank here.

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:620
>> +            // GIFConsumeNetscapeExtension alway reads 3 bytes from the stream; We should at least wait for this amount.
> 
> Nit: alway -> always

Done.

>> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:722
>> +
> 
> Nit: Remove these two lines, the comment is restated more clearly below

Done.
Comment 20 Hin-Chung Lam 2013-03-15 12:19:08 PDT
Comment on attachment 192779 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192779&action=review

>> Source/WebCore/platform/image-decoders/gif/GIFImageDecoder.cpp:79
>> +    decode(std::numeric_limits<unsigned>::max(), GIFFrameCountQuery);
> 
> Nit: Personally, I'd probably just pass static_cast<unsigned>(-1), which does the same thing in a slightly less verbose way and avoids the need for the #include... up to you

Using ::max() makes it more clear to me.
Comment 21 Peter Kasting 2013-03-15 21:26:52 PDT
Comment on attachment 193350 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=193350&action=review

It seems like some of my comments, in particular about the change description, you didn't respond up or down to.

This generally looks OK to me.  I wonder if we can simplify imagesCount() a bit.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:347
> +    // Note that some broken GIF files have no more LZW blocks but still some rows remaining, we treat it as frame complete as well.

I take back what I said about putting this last line on isComplete() -- instead it makes sense to put a note like this in the state machine where we call setComplete() (but not here).

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:362
> +    // Try to be tolerant and don't report parsing errors but we will not parse again.

Consider adding a TODO to just fail on parsing errors?

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:617
> +            // GIFConsumeNetscapeExtension always reads 3 bytes from the stream; We should at least wait for this amount.

Nit: We -> we

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.h:262
> +        // GIFControlExtension adds a new frame but is not counted until GIFImageHeader is reached.

Does this actually matter for anything?  I can't think why it's important.
Comment 22 Hin-Chung Lam 2013-03-17 20:38:42 PDT
Created attachment 193482 [details]
Patch
Comment 23 Hin-Chung Lam 2013-03-17 20:40:05 PDT
Comment on attachment 192779 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=192779&action=review

Updated ChangeLog as well.

>> Source/WebCore/ChangeLog:16
>> +        it is received very slowly. Existing implementation creates a new
> 
> Nit: it is -> they are

Done.

>> Source/WebCore/ChangeLog:17
>> +        GIFImageReader for counting frames for every byte that comes in, this
> 
> Nit: Is this accurate?  Shouldn't "for every byte that comes in" read "every time it is notified that new data has been received"?

Done.

>> Source/WebKit/chromium/ChangeLog:112
>> +        that progressive decoding produces the same results as a truncated file.
> 
> Nit: This is unclear, how about "...verify that continually re-decoding an image one additional byte at a time produces the same outputs as repeatedly decoding (with brand new decoders) truncated versions of the image that are one byte longer each time."

Done.
Comment 24 Hin-Chung Lam 2013-03-18 15:21:44 PDT
Review ping.
Comment 25 Peter Kasting 2013-03-18 20:02:28 PDT
(In reply to comment #24)
> Review ping.

In case you're waiting on me, I'm assuming at this point that further review is going to be done by a real reviewer, since you've addressed all my issues.
Comment 26 Hin-Chung Lam 2013-03-18 20:04:49 PDT
pkasting: Thanks for your thorough review. It was very helpful.
Comment 27 Stephen White 2013-03-19 07:10:58 PDT
Comment on attachment 193482 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=193482&action=review

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:355
> +// This method uses GIFFrameContext:decode() to decode each frame, decoding error is reported to client as a critical failure.

Nit:  Comma splice.  Split into two sentences, or use a semicolon.

> Source/WebCore/platform/image-decoders/gif/GIFImageReader.cpp:356
> +// Return true if decoding has progressed. return false; if an error has occurred.

Nit:  English and C mixed is cute, but I'd stick to one or the other for both the true and false.
Comment 28 Stephen White 2013-03-19 07:11:37 PDT
+jschuh for security issues, if any
Comment 29 Stephen White 2013-03-19 07:12:14 PDT
Comment on attachment 193482 [details]
Patch

OK.  r=me
Comment 30 Hin-Chung Lam 2013-03-19 12:09:07 PDT
Created attachment 193893 [details]
Patch for landing
Comment 31 WebKit Review Bot 2013-03-19 12:31:32 PDT
Comment on attachment 193893 [details]
Patch for landing

Clearing flags on attachment: 193893

Committed r146237: <http://trac.webkit.org/changeset/146237>
Comment 32 WebKit Review Bot 2013-03-19 12:31:37 PDT
All reviewed patches have been landed.  Closing bug.