crbug.com/102598
The WEBP header is followed by a so-called P0 header, then some data to decode. A partial P0 header can make WebPIDecGetRGB return false during progressive decodes. In the decoder we find ... if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) return setFailed(); and hence the image decode fails.
Yes, exactly. libwebp needs to have the full P0 header available before being able to show a pixel row. Note: a future-proof safe handling could be: if (!WebPIDecGetRGB(...)) { if (m_lastVisibleRow > 0) { // case A: that's a real error. Something bad happened _after_ we started showing pixels. return setFailed(); } else { return false; // case B: not fully seen P0 yet. "Buffering..." } }
Would "case A" be caught by the previous lines? const VP8StatusCode status = WebPIUpdate(m_decoder, dataBytes, dataSize); if (status != VP8_STATUS_OK && status != VP8_STATUS_SUSPENDED) return setFailed(); // Case A detected here? if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) return setFailed();
(In reply to comment #3) > Would "case A" be caught by the previous lines? Not necessarily: when i said 'something bad happened', it's related to the blitting / video-surface rather than decoding itself. Like: the user closed the tab in-between and the rendering surface is now invalid, for instance. WebPIDecGetRGB() could return NULL in this case, even if the decoding was ok. But that hypothetical future. Right now, WebPIDecGetRGB() will only return NULL if we didn't buffer enough of the P0 header. Note: if you have a bitstream error (in WebPIUpdate()), you can still show the current rgb buffer. In this case, WebPIUpdate() would return an error but you coud still call WebPIDecGetRGB() and see what you have to show. > > const VP8StatusCode status = WebPIUpdate(m_decoder, dataBytes, dataSize); > if (status != VP8_STATUS_OK && status != VP8_STATUS_SUSPENDED) > return setFailed(); // Case A detected here? > if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) > return setFailed();
(In reply to comment #4) > (In reply to comment #3) > > Would "case A" be caught by the previous lines? > > Not necessarily: when i said 'something bad happened', it's related to the blitting / video-surface rather than decoding itself. Like: the user closed the tab in-between and the rendering surface is now invalid, for instance. WebPIDecGetRGB() could return NULL in this case, even if the decoding was ok. WebKit always has a valid renderering surface. Surface creation might fail, but in that case Webkit won't display the image, and so does not need an image decoder. You can assume the drawing surface, a pixel buffer, is properly allocated and valid at all times and the same for the image decoder needed to write decoded pixels onto that surface. I should note that the surface is logically created before the image decoder and that it gets filled with transparent black rgba(0,0,0,0) on successful creation. > But that hypothetical future. Right now, WebPIDecGetRGB() will only return NULL if we didn't buffer enough of the P0 header. And that return NULL behavior suggests if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) return false; is enough to fix the current bug. > Note: if you have a bitstream error (in WebPIUpdate()), you can still show the current rgb buffer. In this case, WebPIUpdate() would return an error but you coud still call WebPIDecGetRGB() and see what you have to show. In a future change, I intend to remove the WebPIDecGetRGB() calls, since I don't need or care about how many lines I've shown. Pumping input data into WebPIUpdate() as the data arrives, should be enough to successfully decode, or else suspend, or else fail decoding on a bit-stream error, just by reading WebPIUpdate()s return value. I don't need to call WebPIDecGetRGB() to help with any of that.
(In reply to comment #5) > (In reply to comment #4) > > (In reply to comment #3) > > > Would "case A" be caught by the previous lines? > > > > Not necessarily: when i said 'something bad happened', it's related to the blitting / video-surface rather than decoding itself. Like: the user closed the tab in-between and the rendering surface is now invalid, for instance. WebPIDecGetRGB() could return NULL in this case, even if the decoding was ok. > > WebKit always has a valid renderering surface. Surface creation might fail, but in that case Webkit > won't display the image, and so does not need an image decoder. > > You can assume the drawing surface, a pixel buffer, is properly allocated and valid at all times and > the same for the image decoder needed to write decoded pixels onto that surface. > > I should note that the surface is logically created before the image decoder and that it gets filled > with transparent black rgba(0,0,0,0) on successful creation. > > > > But that hypothetical future. Right now, WebPIDecGetRGB() will only return NULL if we didn't buffer enough of the P0 header. > > And that return NULL behavior suggests > > if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) > return false; > > is enough to fix the current bug. yes, that's correct. > > > > Note: if you have a bitstream error (in WebPIUpdate()), you can still show the current rgb buffer. In this case, WebPIUpdate() would return an error but you coud still call WebPIDecGetRGB() and see what you have to show. > > In a future change, I intend to remove the WebPIDecGetRGB() calls, since I don't need or care about > how many lines I've shown. > > Pumping input data into WebPIUpdate() as the data arrives, should be enough to successfully decode, > or else suspend, or else fail decoding on a bit-stream error, just by reading WebPIUpdate()s return > value. I don't need to call WebPIDecGetRGB() to help with any of that. sgtm, since you're ensuring the display buffer out-lives the decoder you pass it to.
(In reply to comment #6) > > And that return NULL behavior suggests > > > > if (!WebPIDecGetRGB(m_decoder, &newLastVisibleRow, 0, 0, 0)) > > return false; > > > > is enough to fix the current bug. > > yes, that's correct. OK good. James Zern suggested exactly the same fix and some more explanation that I will include in the ChangeLog. > > Pumping input data into WebPIUpdate() as the data arrives, should be enough to successfully decode, > > or else suspend, or else fail decoding on a bit-stream error, just by reading WebPIUpdate()s return > > value. I don't need to call WebPIDecGetRGB() to help with any of that. > > sgtm, since you're ensuring the display buffer out-lives the decoder you pass it to. Awesome. Thanks for the advice here Pascal, I appreciate it. Let's submit the fix.
Created attachment 118462 [details] Patch
Comment on attachment 118462 [details] Patch Ok. We might be able to test this actually. You just need to serve a WebP image using PHP and introduce a sleep in the middle of the byte stream at the appropriate byte. That will trigger a network packet boundary and cause us to do an incremental decode at exactly the right place to trigger this bug.
Might we ask Pascal to provide such a test :) Note the bug is due to use of WebPIDecGetRGB(), and as I mentioned above, I plan to ditch it in a soon-to-appear followup patch ... > Pumping input data into WebPIUpdate() as the data arrives, should be enough to successfully decode, > or else suspend, or else fail decoding on a bit-stream error, just by reading WebPIUpdate()s return > value. I don't need to call WebPIDecGetRGB() to help with any of that.
I built a PHP test with a sleep in middle of the output stream, then dropped it in preference for a Perl-based test to get reliable output flushing. I tested with the image from the user bug report. That page is http://www.paginaswebynnova.com/aplicaciones/webp30/test.html Without my change, that page breaks for me in chromium on a non-cache page refresh. Attached the perl test I used to attempt to simulate the decoding error using DRT ...
Created attachment 118657 [details] send progressive image perl
I tried using various methods for sleep (sleep/usleep/select), varied the initial packet size and packet delay time, but the webp decoder never enters a progressive decode.
Comment on attachment 118462 [details] Patch Thanks for trying to write a test. Noel and I discussed this topic off-bug, and we couldn't come up a test that actually worked. If anyone else can figure out how to test this patch, please let me or Noel know.
Comment on attachment 118462 [details] Patch Clearing flags on attachment: 118462 Committed r102519: <http://trac.webkit.org/changeset/102519>
All reviewed patches have been landed. Closing bug.
(In reply to comment #14) > (From update of attachment 118462 [details]) > Thanks for trying to write a test. Noel and I discussed this topic off-bug, and we couldn't come up a test that actually worked. If anyone else can figure out how to test this patch, please let me or Noel know. Reopened: found a way with http/tests load and pause php script. (In reply to comment #9) > (From update of attachment 118462 [details]) > Ok. We might be able to test this actually. You just need to serve a WebP image using PHP and introduce a sleep in the middle of the byte stream at the appropriate byte. That will trigger a network packet boundary and cause us to do an incremental decode at exactly the right place to trigger this bug. P0 header is after byte 30, so a progressive load test should pause at byte 31 say and then continue. Also add a partial load test.
Created attachment 160110 [details] Patch
Created attachment 160324 [details] Patch
jzern provided the test image (large.webp) to test the progressive decode case and suggested that the pause be at 5003 per the original bug, so let's do that.
Created attachment 160327 [details] Fix linux ews
Linux results are indeed different, mark them as such until the webp folks fix it. Colors are different in the large image test (maybe the webp premultiplier is buggy?). For the partial load test, the decoded image band is smaller for the same amount of encoded image data.
Created attachment 160344 [details] Add Linux ews exceptions
Created attachment 160664 [details] Linux image diff
> Colors are different in the large image test (maybe the webp premultiplier is buggy?). The differences are small, see Linux image diff.
(In reply to comment #25) > > Colors are different in the large image test (maybe the webp premultiplier is buggy?). > > The differences are small, see Linux image diff. Noel, could you upload the pair of images (instead of the diff) please? Could it be that Linux uses the built-in premultiplication and Mac doesn't (or the converse)? How can we align both code pathes?
(In reply to comment #26) > Noel, could you upload the pair of images (instead of the diff) please? Attached the expected and actual images from chrome linux DRT.
Created attachment 160888 [details] webp-progressive-load-actual-linux.png
Created attachment 160889 [details] webp-progressive-load-expected-linux.png
(In reply to comment #26) > Could it be that Linux uses the built-in premultiplication and Mac doesn't (or the converse)? Nope. The webp decoder does not use the built-in premultiplication code on any platform.
(In reply to comment #22) The other issue for Linux is why does it produce a diff for the partial load test? > For the partial load test, the decoded image band is smaller for the same amount of encoded image data. CachedImage.cpp provides the shared buffer containing the encoded image data. I examined the shared buffer size when the cache sends a pointer to this data to the image decoder. This happens each time data arrives from the network. http://trac.webkit.org/browser/trunk/Source/WebCore/loader/cache/CachedImage.cpp?rev=126094#L342 win 2400 mac 1024, 2400 linux 2048 Linux network data returns 2048 bytes, while win and mac provide 2400 bytes. Hence the difference in the test image result. The results for the progressive loading test, which requests 5003 bytes of data, pauses for 1 second, and then continues, are similar: win 3933, 5003, 21387, 37771, 37798, 37798 mac 1024, 5003, 21387, 37771, 37798, 37798 linux 2048, 4096, 20480, 36864, 37798, 37798 The Linux DRT network data (2048, 4096, 20480 = 10 * 2048, 36864 = 18 * 2048,...) is returned in multiples of 2048 bytes until all data is received (37798 bytes).
Created attachment 160932 [details] Patch sync to head
(In reply to comment #31) > (In reply to comment #22) > > The other issue for Linux is why does it produce a diff for the partial load test? > > > For the partial load test, the decoded image band is smaller for the same amount of encoded image data. > > CachedImage.cpp provides the shared buffer containing the encoded image data. I examined the shared buffer size when the cache sends a pointer to this data to the image decoder. This happens each time data arrives from the network. > > http://trac.webkit.org/browser/trunk/Source/WebCore/loader/cache/CachedImage.cpp?rev=126094#L342 > > win 2400 > mac 1024, 2400 > linux 2048 > > Linux network data returns 2048 bytes, while win and mac provide 2400 bytes. Hence the difference in the test image result. The results for the progressive loading test, which requests 5003 bytes of data, pauses for 1 second, and then continues, are similar: > > win 3933, 5003, 21387, 37771, 37798, 37798 > mac 1024, 5003, 21387, 37771, 37798, 37798 > linux 2048, 4096, 20480, 36864, 37798, 37798 > > The Linux DRT network data (2048, 4096, 20480 = 10 * 2048, 36864 = 18 * 2048,...) is returned in multiples of 2048 bytes until all data is received (37798 bytes). Here's some remarks and tentative explanation: * i tested the 3 sequences of partial buffers (on linux) and couldn't see any problem (including valgrind test). That's not a certitude there aren't any, but just a strong indication. * large.webp has no alpha channel, and is lossy. So lossless and premultiply is likely not the problem here. * large.webp is originally a 800x600 * the expected/actual PNGs are rescaled to something like 700x500 * the difference comes from the fact that linux / windows rescaler are different One way to be sure would be forcing the test <img> to actually use a 800x600 canvas exactly.
(In reply to comment #33) > * large.webp is originally a 800x600 > * the expected/actual PNGs are rescaled to something like 700x500 > * the difference comes from the fact that linux / windows rescaler are different It's the rescaler. Want to avoid scroll bars in the test output. James created a 700 x 500 px version and suggested offset 4237 to reproduce the original bug.
Created attachment 161135 [details] Patch
Comment on attachment 161135 [details] Patch Rejecting attachment 161135 [details] from commit-queue. Failed to run "['/mnt/git/webkit-commit-queue/Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '-..." exit_code: 1 ERROR: /mnt/git/webkit-commit-queue/LayoutTests/ChangeLog neither lists a valid reviewer nor contains the string "Unreviewed" or "Rubber stamp" (case insensitive). Full output: http://queues.webkit.org/results/13695251
Created attachment 161372 [details] Patch for landing
Comment on attachment 161372 [details] Patch for landing Clearing flags on attachment: 161372 Committed r127101: <http://trac.webkit.org/changeset/127101>