74062 – WebPImageDecoder progressive decodes fail to decode valid images

RESOLVED FIXED 74062

WebPImageDecoder progressive decodes fail to decode valid images

https://bugs.webkit.org/show_bug.cgi?id=74062

Summary WebPImageDecoder progressive decodes fail to decode valid images

noel gordon

Reported 2011-12-08 00:05:08 PST

crbug.com/102598

Attachments
Patch (2.46 KB, patch) 2011-12-08 13:49 PST, noel gordon	no flags	Details Formatted Diff Diff
send progressive image perl (917 bytes, text/plain) 2011-12-09 15:32 PST, noel gordon	no flags	Details
Patch (59.50 KB, patch) 2012-08-23 01:24 PDT, noel gordon	no flags	Details Formatted Diff Diff
Patch (1.07 MB, patch) 2012-08-23 21:42 PDT, noel gordon	no flags	Details Formatted Diff Diff
Fix linux ews (1.55 MB, patch) 2012-08-23 22:09 PDT, noel gordon	no flags	Details Formatted Diff Diff
Add Linux ews exceptions (1.07 MB, patch) 2012-08-23 23:46 PDT, noel gordon	no flags	Details Formatted Diff Diff
Linux image diff (413.13 KB, image/png) 2012-08-27 01:34 PDT, noel gordon	no flags	Details
webp-progressive-load-actual-linux.png (367.96 KB, image/png) 2012-08-27 19:59 PDT, noel gordon	no flags	Details
webp-progressive-load-expected-linux.png (367.96 KB, image/png) 2012-08-27 19:59 PDT, noel gordon	no flags	Details
Patch sync to head (1.07 MB, patch) 2012-08-28 02:14 PDT, noel gordon	no flags	Details Formatted Diff Diff
Patch (443.49 KB, patch) 2012-08-28 22:14 PDT, noel gordon	no flags	Details Formatted Diff Diff
Patch for landing (443.49 KB, patch) 2012-08-29 17:58 PDT, noel gordon	no flags	Details Formatted Diff Diff
Show Obsolete (7) View All Add attachment proposed patch, testcase, etc.

noel gordon

Comment 1 2011-12-08 01:23:36 PST

The WEBP header is followed by a so-called P0 header, then some data to decode. A partial P0 header can make WebPIDecGetRGB return false during progressive decodes. In the decoder we find ... if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) return setFailed(); and hence the image decode fails.

Pascal Massimino

Comment 2 2011-12-08 05:16:16 PST

Yes, exactly. libwebp needs to have the full P0 header available before being able to show a pixel row. Note: a future-proof safe handling could be: if (!WebPIDecGetRGB(...)) { if (m_lastVisibleRow > 0) { // case A: that's a real error. Something bad happened _after_ we started showing pixels. return setFailed(); } else { return false; // case B: not fully seen P0 yet. "Buffering..." } }

noel gordon

Comment 3 2011-12-08 09:09:06 PST

Would "case A" be caught by the previous lines? const VP8StatusCode status = WebPIUpdate(m_decoder, dataBytes, dataSize); if (status != VP8_STATUS_OK && status != VP8_STATUS_SUSPENDED) return setFailed(); // Case A detected here? if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) return setFailed();

Pascal Massimino

Comment 4 2011-12-08 09:36:33 PST

(In reply to comment #3) > Would "case A" be caught by the previous lines? Not necessarily: when i said 'something bad happened', it's related to the blitting / video-surface rather than decoding itself. Like: the user closed the tab in-between and the rendering surface is now invalid, for instance. WebPIDecGetRGB() could return NULL in this case, even if the decoding was ok. But that hypothetical future. Right now, WebPIDecGetRGB() will only return NULL if we didn't buffer enough of the P0 header. Note: if you have a bitstream error (in WebPIUpdate()), you can still show the current rgb buffer. In this case, WebPIUpdate() would return an error but you coud still call WebPIDecGetRGB() and see what you have to show. > > const VP8StatusCode status = WebPIUpdate(m_decoder, dataBytes, dataSize); > if (status != VP8_STATUS_OK && status != VP8_STATUS_SUSPENDED) > return setFailed(); // Case A detected here? > if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) > return setFailed();

noel gordon

Comment 5 2011-12-08 11:21:48 PST

(In reply to comment #4) > (In reply to comment #3) > > Would "case A" be caught by the previous lines? > > Not necessarily: when i said 'something bad happened', it's related to the blitting / video-surface rather than decoding itself. Like: the user closed the tab in-between and the rendering surface is now invalid, for instance. WebPIDecGetRGB() could return NULL in this case, even if the decoding was ok. WebKit always has a valid renderering surface. Surface creation might fail, but in that case Webkit won't display the image, and so does not need an image decoder. You can assume the drawing surface, a pixel buffer, is properly allocated and valid at all times and the same for the image decoder needed to write decoded pixels onto that surface. I should note that the surface is logically created before the image decoder and that it gets filled with transparent black rgba(0,0,0,0) on successful creation. > But that hypothetical future. Right now, WebPIDecGetRGB() will only return NULL if we didn't buffer enough of the P0 header. And that return NULL behavior suggests if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) return false; is enough to fix the current bug. > Note: if you have a bitstream error (in WebPIUpdate()), you can still show the current rgb buffer. In this case, WebPIUpdate() would return an error but you coud still call WebPIDecGetRGB() and see what you have to show. In a future change, I intend to remove the WebPIDecGetRGB() calls, since I don't need or care about how many lines I've shown. Pumping input data into WebPIUpdate() as the data arrives, should be enough to successfully decode, or else suspend, or else fail decoding on a bit-stream error, just by reading WebPIUpdate()s return value. I don't need to call WebPIDecGetRGB() to help with any of that.

Pascal Massimino

Comment 6 2011-12-08 12:32:28 PST

(In reply to comment #5) > (In reply to comment #4) > > (In reply to comment #3) > > > Would "case A" be caught by the previous lines? > > > > Not necessarily: when i said 'something bad happened', it's related to the blitting / video-surface rather than decoding itself. Like: the user closed the tab in-between and the rendering surface is now invalid, for instance. WebPIDecGetRGB() could return NULL in this case, even if the decoding was ok. > > WebKit always has a valid renderering surface. Surface creation might fail, but in that case Webkit > won't display the image, and so does not need an image decoder. > > You can assume the drawing surface, a pixel buffer, is properly allocated and valid at all times and > the same for the image decoder needed to write decoded pixels onto that surface. > > I should note that the surface is logically created before the image decoder and that it gets filled > with transparent black rgba(0,0,0,0) on successful creation. > > > > But that hypothetical future. Right now, WebPIDecGetRGB() will only return NULL if we didn't buffer enough of the P0 header. > > And that return NULL behavior suggests > > if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) > return false; > > is enough to fix the current bug. yes, that's correct. > > > > Note: if you have a bitstream error (in WebPIUpdate()), you can still show the current rgb buffer. In this case, WebPIUpdate() would return an error but you coud still call WebPIDecGetRGB() and see what you have to show. > > In a future change, I intend to remove the WebPIDecGetRGB() calls, since I don't need or care about > how many lines I've shown. > > Pumping input data into WebPIUpdate() as the data arrives, should be enough to successfully decode, > or else suspend, or else fail decoding on a bit-stream error, just by reading WebPIUpdate()s return > value. I don't need to call WebPIDecGetRGB() to help with any of that. sgtm, since you're ensuring the display buffer out-lives the decoder you pass it to.

noel gordon

Comment 7 2011-12-08 13:29:03 PST

(In reply to comment #6) > > And that return NULL behavior suggests > > > > if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) > > return false; > > > > is enough to fix the current bug. > > yes, that's correct. OK good. James Zern suggested exactly the same fix and some more explanation that I will include in the ChangeLog. > > Pumping input data into WebPIUpdate() as the data arrives, should be enough to successfully decode, > > or else suspend, or else fail decoding on a bit-stream error, just by reading WebPIUpdate()s return > > value. I don't need to call WebPIDecGetRGB() to help with any of that. > > sgtm, since you're ensuring the display buffer out-lives the decoder you pass it to. Awesome. Thanks for the advice here Pascal, I appreciate it. Let's submit the fix.

noel gordon

Comment 8 2011-12-08 13:49:30 PST

Created attachment 118462 [details] Patch

Adam Barth

Comment 9 2011-12-08 13:55:42 PST

Comment on attachment 118462 [details] Patch Ok. We might be able to test this actually. You just need to serve a WebP image using PHP and introduce a sleep in the middle of the byte stream at the appropriate byte. That will trigger a network packet boundary and cause us to do an incremental decode at exactly the right place to trigger this bug.

noel gordon

Comment 10 2011-12-08 14:10:59 PST

Might we ask Pascal to provide such a test :) Note the bug is due to use of WebPIDecGetRGB(), and as I mentioned above, I plan to ditch it in a soon-to-appear followup patch ... > Pumping input data into WebPIUpdate() as the data arrives, should be enough to successfully decode, > or else suspend, or else fail decoding on a bit-stream error, just by reading WebPIUpdate()s return > value. I don't need to call WebPIDecGetRGB() to help with any of that.

noel gordon

Comment 11 2011-12-09 15:31:19 PST

I built a PHP test with a sleep in middle of the output stream, then dropped it in preference for a Perl-based test to get reliable output flushing. I tested with the image from the user bug report. That page is http://www.paginaswebynnova.com/aplicaciones/webp30/test.html Without my change, that page breaks for me in chromium on a non-cache page refresh. Attached the perl test I used to attempt to simulate the decoding error using DRT ...

noel gordon

Comment 12 2011-12-09 15:32:55 PST

Created attachment 118657 [details] send progressive image perl

noel gordon

Comment 13 2011-12-09 15:55:17 PST

I tried using various methods for sleep (sleep/usleep/select), varied the initial packet size and packet delay time, but the webp decoder never enters a progressive decode.

Adam Barth

Comment 14 2011-12-10 00:40:53 PST

Comment on attachment 118462 [details] Patch Thanks for trying to write a test. Noel and I discussed this topic off-bug, and we couldn't come up a test that actually worked. If anyone else can figure out how to test this patch, please let me or Noel know.

WebKit Review Bot

Comment 15 2011-12-10 01:54:54 PST

Comment on attachment 118462 [details] Patch Clearing flags on attachment: 118462 Committed r102519: <http://trac.webkit.org/changeset/102519>

WebKit Review Bot

Comment 16 2011-12-10 01:54:59 PST

All reviewed patches have been landed. Closing bug.

noel gordon

Comment 17 2012-08-23 01:20:04 PDT

(In reply to comment #14) > (From update of attachment 118462 [details]) > Thanks for trying to write a test. Noel and I discussed this topic off-bug, and we couldn't come up a test that actually worked. If anyone else can figure out how to test this patch, please let me or Noel know. Reopened: found a way with http/tests load and pause php script. (In reply to comment #9) > (From update of attachment 118462 [details]) > Ok. We might be able to test this actually. You just need to serve a WebP image using PHP and introduce a sleep in the middle of the byte stream at the appropriate byte. That will trigger a network packet boundary and cause us to do an incremental decode at exactly the right place to trigger this bug. P0 header is after byte 30, so a progressive load test should pause at byte 31 say and then continue. Also add a partial load test.

noel gordon

Comment 18 2012-08-23 01:24:38 PDT

Created attachment 160110 [details] Patch

noel gordon

Comment 19 2012-08-23 21:42:08 PDT

Created attachment 160324 [details] Patch

noel gordon

Comment 20 2012-08-23 21:54:11 PDT

jzern provided the test image (large.webp) to test the progressive decode case and suggested that the pause be at 5003 per the original bug, so let's do that.

noel gordon

Comment 21 2012-08-23 22:09:10 PDT

Created attachment 160327 [details] Fix linux ews

noel gordon

Comment 22 2012-08-23 23:42:57 PDT

Linux results are indeed different, mark them as such until the webp folks fix it. Colors are different in the large image test (maybe the webp premultiplier is buggy?). For the partial load test, the decoded image band is smaller for the same amount of encoded image data.

noel gordon

Comment 23 2012-08-23 23:46:22 PDT

Created attachment 160344 [details] Add Linux ews exceptions

noel gordon

Comment 24 2012-08-27 01:34:40 PDT

Created attachment 160664 [details] Linux image diff

noel gordon

Comment 25 2012-08-27 01:35:51 PDT

> Colors are different in the large image test (maybe the webp premultiplier is buggy?). The differences are small, see Linux image diff.

Pascal Massimino

Comment 26 2012-08-27 02:36:07 PDT

(In reply to comment #25) > > Colors are different in the large image test (maybe the webp premultiplier is buggy?). > > The differences are small, see Linux image diff. Noel, could you upload the pair of images (instead of the diff) please? Could it be that Linux uses the built-in premultiplication and Mac doesn't (or the converse)? How can we align both code pathes?

noel gordon

Comment 27 2012-08-27 19:58:20 PDT

(In reply to comment #26) > Noel, could you upload the pair of images (instead of the diff) please? Attached the expected and actual images from chrome linux DRT.

noel gordon

Comment 28 2012-08-27 19:59:01 PDT

Created attachment 160888 [details] webp-progressive-load-actual-linux.png

noel gordon

Comment 29 2012-08-27 19:59:30 PDT

Created attachment 160889 [details] webp-progressive-load-expected-linux.png

noel gordon

Comment 30 2012-08-27 20:59:22 PDT

(In reply to comment #26) > Could it be that Linux uses the built-in premultiplication and Mac doesn't (or the converse)? Nope. The webp decoder does not use the built-in premultiplication code on any platform.

noel gordon

Comment 31 2012-08-27 22:53:21 PDT

(In reply to comment #22) The other issue for Linux is why does it produce a diff for the partial load test? > For the partial load test, the decoded image band is smaller for the same amount of encoded image data. CachedImage.cpp provides the shared buffer containing the encoded image data. I examined the shared buffer size when the cache sends a pointer to this data to the image decoder. This happens each time data arrives from the network. http://trac.webkit.org/browser/trunk/Source/WebCore/loader/cache/CachedImage.cpp?rev=126094#L342 win 2400 mac 1024, 2400 linux 2048 Linux network data returns 2048 bytes, while win and mac provide 2400 bytes. Hence the difference in the test image result. The results for the progressive loading test, which requests 5003 bytes of data, pauses for 1 second, and then continues, are similar: win 3933, 5003, 21387, 37771, 37798, 37798 mac 1024, 5003, 21387, 37771, 37798, 37798 linux 2048, 4096, 20480, 36864, 37798, 37798 The Linux DRT network data (2048, 4096, 20480 = 10 * 2048, 36864 = 18 * 2048,...) is returned in multiples of 2048 bytes until all data is received (37798 bytes).

noel gordon

Comment 32 2012-08-28 02:14:44 PDT

Created attachment 160932 [details] Patch sync to head

Pascal Massimino

Comment 33 2012-08-28 10:24:27 PDT

(In reply to comment #31) > (In reply to comment #22) > > The other issue for Linux is why does it produce a diff for the partial load test? > > > For the partial load test, the decoded image band is smaller for the same amount of encoded image data. > > CachedImage.cpp provides the shared buffer containing the encoded image data. I examined the shared buffer size when the cache sends a pointer to this data to the image decoder. This happens each time data arrives from the network. > > http://trac.webkit.org/browser/trunk/Source/WebCore/loader/cache/CachedImage.cpp?rev=126094#L342 > > win 2400 > mac 1024, 2400 > linux 2048 > > Linux network data returns 2048 bytes, while win and mac provide 2400 bytes. Hence the difference in the test image result. The results for the progressive loading test, which requests 5003 bytes of data, pauses for 1 second, and then continues, are similar: > > win 3933, 5003, 21387, 37771, 37798, 37798 > mac 1024, 5003, 21387, 37771, 37798, 37798 > linux 2048, 4096, 20480, 36864, 37798, 37798 > > The Linux DRT network data (2048, 4096, 20480 = 10 * 2048, 36864 = 18 * 2048,...) is returned in multiples of 2048 bytes until all data is received (37798 bytes). Here's some remarks and tentative explanation: * i tested the 3 sequences of partial buffers (on linux) and couldn't see any problem (including valgrind test). That's not a certitude there aren't any, but just a strong indication. * large.webp has no alpha channel, and is lossy. So lossless and premultiply is likely not the problem here. * large.webp is originally a 800x600 * the expected/actual PNGs are rescaled to something like 700x500 * the difference comes from the fact that linux / windows rescaler are different One way to be sure would be forcing the test <img> to actually use a 800x600 canvas exactly.

noel gordon

Comment 34 2012-08-28 22:11:07 PDT

(In reply to comment #33) > * large.webp is originally a 800x600 > * the expected/actual PNGs are rescaled to something like 700x500 > * the difference comes from the fact that linux / windows rescaler are different It's the rescaler. Want to avoid scroll bars in the test output. James created a 700 x 500 px version and suggested offset 4237 to reproduce the original bug.

noel gordon

Comment 35 2012-08-28 22:14:23 PDT

Created attachment 161135 [details] Patch

WebKit Review Bot

Comment 36 2012-08-29 09:58:28 PDT

Comment on attachment 161135 [details] Patch Rejecting attachment 161135 [details] from commit-queue. Failed to run "['/mnt/git/webkit-commit-queue/Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '-..." exit_code: 1 ERROR: /mnt/git/webkit-commit-queue/LayoutTests/ChangeLog neither lists a valid reviewer nor contains the string "Unreviewed" or "Rubber stamp" (case insensitive). Full output: http://queues.webkit.org/results/13695251

noel gordon

Comment 37 2012-08-29 17:58:13 PDT

Created attachment 161372 [details] Patch for landing

WebKit Review Bot

Comment 38 2012-08-29 22:19:35 PDT

Comment on attachment 161372 [details] Patch for landing Clearing flags on attachment: 161372 Committed r127101: <http://trac.webkit.org/changeset/127101>

WebKit Review Bot

Comment 39 2012-08-29 22:19:41 PDT

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component New Bugs

Assignee

noel gordon

Reported

2011-12-08 00:05 PST

Modified

2012-08-29 22:19 PDT History

CC List

5 users Show

URL

Keywords

Depends on

Blocks

58851

Dependencies

tree graph