Bug 57960

Summary: [Chromium] Accelerated 2D Canvas is slow to execute putImageData
Product: WebKit Reporter: Justin Novosad <junov>
Component: Layout and RenderingAssignee: Justin Novosad <junov>
Status: RESOLVED WONTFIX    
Severity: Normal CC: caseq, commit-queue, dglazkov, kbr, senorblanco, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Windows 7   
Attachments:
Description Flags
Fixes putImageData performnce for gpu accelerated canvas by drawing a textured quad rather than blitting to a read-back buffer
eric: review-
Response to review comments by Stephen White
none
Fixed chromium linux build and style failures
none
Patch
none
Patch
none
Patch
none
Patch none

Description Justin Novosad 2011-04-06 10:07:11 PDT
In Chromium, with accelerated Canvas enabled, display a web page that relies heavily on putImageData for rendering such as the following Chrome experiment:
http://www.chromeexperiments.com/detail/back-to-visuals/?f=

Notice that it runs much, much slower than with the accelerated canvas disabled.

The current implementation of putImageData forces a readback from the GPU. The readback would not be necessary with a GPU-aware implementation.

Related Chromium bug:
http://code.google.com/p/chromium/issues/detail?id=78176
Comment 1 Stephen White 2011-04-06 10:27:45 PDT
(In reply to comment #0)
> In Chromium, with accelerated Canvas enabled, display a web page that relies heavily on putImageData for rendering such as the following Chrome experiment:
> http://www.chromeexperiments.com/detail/back-to-visuals/?f=
> 
> Notice that it runs much, much slower than with the accelerated canvas disabled.
> 
> The current implementation of putImageData forces a readback from the GPU. The readback would not be necessary with a GPU-aware implementation.
> 
> Related Chromium bug:
> http://code.google.com/p/chromium/issues/detail?id=78176

Another speedup could be gained by doing the unpremultiply step on the GPU (in a shader), but that can wait for another patch.
Comment 2 Justin Novosad 2011-04-06 10:30:40 PDT
Created attachment 88466 [details]
Fixes putImageData performnce for gpu accelerated canvas by drawing a textured quad rather than blitting to a read-back buffer
Comment 3 WebKit Review Bot 2011-04-06 10:33:08 PDT
Attachment 88466 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style']" exit_code: 1

Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:335:  This { should be at the end of the previous line  [whitespace/braces] [4]
Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:336:  Local variables should never be PassRefPtr (see http://webkit.org/coding/RefPtr.html).  [readability/pass_ptr] [5]
Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:337:  Should have a space between // and comment  [whitespace/comments] [4]
Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:347:  An else should appear on the same line as the preceding }  [whitespace/newline] [4]
Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:348:  This { should be at the end of the previous line  [whitespace/braces] [4]
Total errors found: 5 in 4 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 Stephen White 2011-04-06 10:44:56 PDT
Comment on attachment 88466 [details]
Fixes putImageData performnce for gpu accelerated canvas by drawing a textured quad rather than blitting to a read-back buffer

View in context: https://bugs.webkit.org/attachment.cgi?id=88466&action=review

This is a good first draft, but I think it could cleaned up.

> Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:344
> +            context->drawImage(image.get(), ColorSpaceDeviceRGB, IntPoint(destX, destY), IntRect(0, 0, numColumns, numRows), CompositeCopy, true);

This is a bit of a mess:  you're setting state on the PlatformContextSkia, the GLES2Canvas, and then calling GraphicsContext::drawImage().  If possible, you should probably restrict yourself to one API level (probably GraphicsContext).  Do the save on there, set the CTM temporarily, etc.

Another alternative would be to avoid messing with the state at all, and use the same path that PlatformContext::uploadSoftwareToHardware() uses, namely gpuCanvas()->drawTexturedRect() (third version).  This one will bypass the CTM, clipping, compositing mode, etc entirely.  You might have to refactor out some of uploadSoftwareToHardware() (for the texture creation, etc) into a new function in PlatformContextSkia.  I'd prefer this approach, if possible, since it avoids a bunch of unnecessary state change.

> Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:348
> +        {

Single-statement clauses don't need braces (the style bot will nag you about this).  FYI, you can run the style check locally with Tools/Scripts/check-webkit-style.
Comment 5 Eric Seidel (no email) 2011-04-06 10:53:40 PDT
Comment on attachment 88466 [details]
Fixes putImageData performnce for gpu accelerated canvas by drawing a textured quad rather than blitting to a read-back buffer

Needs another round.  (per stylebot and stephen). Thanks for the patch!
Comment 6 WebKit Review Bot 2011-04-06 11:15:47 PDT
Attachment 88466 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/8344292
Comment 7 Kenneth Russell 2011-04-06 17:52:38 PDT
(In reply to comment #4)
> Single-statement clauses don't need braces (the style bot will nag you about this).  FYI, you can run the style check locally with Tools/Scripts/check-webkit-style.

The 'webkit-patch upload' script also runs the same style checks the bots do. check-webkit-style will report false positives on untouched regions of files containing preexisting style errors.
Comment 8 Justin Novosad 2011-04-07 08:18:56 PDT
Created attachment 88642 [details]
Response to review comments by Stephen White

The hardware rendering path no longer requires a temp buffer for swizzling or premultiplication, so this approach is even faster that the previously proposed patch
Comment 9 WebKit Review Bot 2011-04-07 08:21:33 PDT
Attachment 88642 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1

Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:262:  Place brace on its own line for function definitions.  [whitespace/braces] [4]
Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:262:  Missing space before {  [whitespace/braces] [5]
Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:264:  Missing space after ,  [whitespace/comma] [3]
Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:265:  Should have a space between // and comment  [whitespace/comments] [4]
Total errors found: 4 in 5 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 10 WebKit Review Bot 2011-04-07 08:23:17 PDT
Attachment 88642 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/8345589
Comment 11 Justin Novosad 2011-04-07 08:38:13 PDT
Created attachment 88645 [details]
Fixed chromium linux build and style failures
Comment 12 Stephen White 2011-04-07 08:55:58 PDT
View in context: https://bugs.webkit.org/attachment.cgi?id=88642&action=review

OK, I like your new approach (very cool that you're doing the premultiply in hardware), but now that you're addressing this, I'm going to suggest another direction:  implement putImageData() directly in GLES2Canvas (see last comment below).  Sorry that I'm making you do more work, but I think this would be more in line with the design I had in mind for GLES2Canvas.  It would also allow you to leave the GPU_SKIA and software paths untouched.  Let me know what you think.

> Source/WebCore/platform/graphics/chromium/GLES2Canvas.cpp:414
>  {

Rather than add another bool, we should probably create a DrawFlags enum on GLES2Canvas, and "or" the appropriate bits together.  (My fault for the bad precedent; see https://bugs.webkit.org/show_bug.cgi?id=52743).

> Source/WebCore/platform/graphics/chromium/GLES2Canvas.cpp:556
> +}

Now that you've exposed this here, you could probably clean up PlatformContextSkia::uploadSoftwareToHardware a bit (can call your new function).

> Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:266
> +    RefPtr<Texture> texture = platformContext->gpuCanvas()->createTexture(Texture::RGBA8, srcRowPixels, texRect.height());

Since texture creation can be expensive, it'd be nice if we could cache this texture between calls, the way PlatformContextSkia does.  I wonder if it might be better to simply have a GLES2Canvas::putImageData() function that takes a void* (or ByteArray), and manages the texture internally, does the upload, and draw.  You could short-circuit ImageBuffer::put*ImageData() and early-return in the GPU case, and leave the rest of the code the same.  This would require some cut-and-paste from the existing code (the sourceRect and destPoint computation, and all that), but that's ok.

The idea with GLES2Canvas was that it eventually would implement the whole GraphicsContext API, so eventually it should be weaned off any code in platform/graphics/skia.
Comment 13 Kenneth Russell 2011-04-07 13:39:02 PDT
Comment on attachment 88645 [details]
Fixed chromium linux build and style failures

Could you address Stephen's review feedback? Marking r- as he is the expert in this area and it looks like more work is needed.
Comment 14 Justin Novosad 2011-04-11 08:38:24 PDT
Created attachment 89010 [details]
Patch
Comment 15 Stephen White 2011-04-11 10:51:50 PDT
Comment on attachment 89010 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=89010&action=review

> Source/WebCore/platform/graphics/chromium/GLES2Canvas.cpp:44
> +#include "NativeImageSkia.h"

This shouldn't be necessary (and should be protected by #if USE(SKIA) if it is).

> Source/WebCore/platform/graphics/chromium/GLES2Canvas.cpp:410
> +    drawTexturedRect(texture, srcRect, dstRect, m_state->m_ctm, m_state->m_alpha, colorSpace, compositeOp, m_state->clippingEnabled() ? DrawTextureClip : 0);

"DrawTextureClip" is a little ambiguous.  I know what it means, but a future reader might confuse it with "draw the texture's clip" or something.  Same for "DrawTextureMultiply".  What is it multiplying?  Since the enum is scoped to GLES2Canvas, maybe we could just skip the "DrawTexture" prefix and make the enums "EnableClipping" and "MultiplySourceAlpha".

> Source/WebCore/platform/graphics/chromium/GLES2Canvas.cpp:414
> +void GLES2Canvas::drawTexturedRect(Texture* texture, const FloatRect& srcRect, const FloatRect& dstRect, const AffineTransform& transform, float alpha, ColorSpace colorSpace, CompositeOperator compositeOp, unsigned opt)

This should be "drawTextureOptions", as it is in the .h, rather than "opt".  Actually, there seems to be more precedent for Flags in WebKit that Options, so I'd call this DrawTextureFlags.

> Source/WebCore/platform/graphics/chromium/GLES2Canvas.h:62
> +

See above:  How about DrawTextureFlags, consisting of EnableClipping and MultiplySourceAlpha.

> Source/WebCore/platform/graphics/chromium/GLES2Canvas.h:96
> +    // This version is called by the above, by putImageDataHardware, and by the software->hardware uploads.

putImageDataHardware -> putImageData.

> Source/WebCore/platform/graphics/gpu/Texture.cpp:170
> +    bool needTempBuff = swizzle || updateRect.width() != m_tiles.totalSizeY() || m_tiles.numTilesX() > 1;

Hmmm.  This is comparing width against totalSizeY().  Is this correct?  Should we compare both width and height?

> Source/WebCore/platform/graphics/skia/ImageBufferSkia.cpp:342
> +    putImageData(source, sourceSize, sourceRect, destPoint, context()->platformContext(), m_size, Unmultiplied);

Please put the useGPU() check here and change the signature of the GLES2Canvas function to match that of ImageBuffer, as we discussed.
Comment 16 Justin Novosad 2011-04-11 13:01:31 PDT
Comment on attachment 89010 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=89010&action=review

>> Source/WebCore/platform/graphics/gpu/Texture.cpp:170
>> +    bool needTempBuff = swizzle || updateRect.width() != m_tiles.totalSizeY() || m_tiles.numTilesX() > 1;
> 
> Hmmm.  This is comparing width against totalSizeY().  Is this correct?  Should we compare both width and height?

Ooh, good catch, I obviously mean totalSizeX().  No need to check height here.  What we are verifying with this condition is that row stride is equal to row data size.
Comment 17 Kenneth Russell 2011-04-11 15:27:17 PDT
Comment on attachment 89010 [details]
Patch

Needs revision per Stephen's review comments.
Comment 18 Justin Novosad 2011-04-13 11:00:00 PDT
Created attachment 89404 [details]
Patch
Comment 19 Kenneth Russell 2011-04-13 11:13:16 PDT
Stephen, would you do the unofficial review of Justin's latest patch?
Comment 20 Stephen White 2011-04-13 11:24:57 PDT
Comment on attachment 89404 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=89404&action=review

Looks good.

> Source/WebCore/platform/graphics/chromium/GLES2Canvas.cpp:420
> +        // The multiply option hijacks the blend func, so it can't be combine with a compositing op

nit:  combine -> combined
Comment 21 Kenneth Russell 2011-04-13 11:34:03 PDT
Comment on attachment 89404 [details]
Patch

Thanks, Stephen; r=me . Justin, let me know if you want to land this as is (mark it "cq?") or do an update to the comment.
Comment 22 Justin Novosad 2011-04-13 11:48:48 PDT
I will make the correction.  In the mean-time I am rerunning gpu layout tests just to be safe 'cause I hadn't rerun them with the very last code iteration.
Comment 23 Justin Novosad 2011-04-13 11:50:25 PDT
(In reply to comment #22)
> I will make the correction.  In the mean-time I am rerunning gpu layout tests just to be safe 'cause I hadn't rerun them with the very last code iteration.

Abort! I have a layout test failure!
Comment 24 Justin Novosad 2011-04-13 15:29:42 PDT
Created attachment 89479 [details]
Patch
Comment 25 WebKit Review Bot 2011-04-13 15:33:30 PDT
Attachment 89479 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/canv..." exit_code: 1

Source/WebCore/ChangeLog:31:  Line contains tab character.  [whitespace/tab] [5]
Total errors found: 1 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 26 Kenneth Russell 2011-04-13 18:20:41 PDT
Comment on attachment 89479 [details]
Patch

Sorry but could you please fix the style issue? If you use "webkit-patch upload" it should catch these things.
Comment 27 Stephen White 2011-04-14 10:58:33 PDT
Comment on attachment 89479 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=89479&action=review

> LayoutTests/canvas/philip/tests/2d.gradient.interpolate.colouralpha.html:-22
> -_assertPixelApprox(canvas, 75,25, 63,63,191,191, "75,25", "63,63,191,191", 3);

Unfortunately, since this is in canvas/philip, was can't really modify the expected results:   they will get overwritten when the test suite is next updated.  However, since it's just the one test, we can probably log a bug, mark this in test_expectations as failing and deal with it later.  (Justin explained to me offline that this failure is due to skia doing some compositing internally with gradients that does not use rounding).
Comment 28 Justin Novosad 2011-04-14 11:40:22 PDT
Created attachment 89613 [details]
Patch
Comment 29 Kenneth Russell 2011-04-14 12:18:22 PDT
Comment on attachment 89613 [details]
Patch

Stephen, could you re-review?
Comment 30 Stephen White 2011-04-14 13:28:48 PDT
Looks good.
Comment 31 Kenneth Russell 2011-04-14 14:12:18 PDT
Comment on attachment 89613 [details]
Patch

Thanks.
Comment 32 WebKit Commit Bot 2011-04-14 22:41:06 PDT
The commit-queue encountered the following flaky tests while processing attachment 89613 [details]:

http/tests/workers/shared-worker-importScripts.html bug 58634 (author: atwilson@chromium.org)
The commit-queue is continuing to process your patch.
Comment 33 WebKit Commit Bot 2011-04-14 22:43:45 PDT
Comment on attachment 89613 [details]
Patch

Clearing flags on attachment: 89613

Committed r83949: <http://trac.webkit.org/changeset/83949>
Comment 34 WebKit Commit Bot 2011-04-14 22:43:51 PDT
All reviewed patches have been landed.  Closing bug.
Comment 35 WebKit Commit Bot 2011-04-14 23:44:20 PDT
The commit-queue encountered the following flaky tests while processing attachment 89613 [details]:

inspector/debugger/source-frame.html bug 57399 (author: podivilov@chromium.org)
The commit-queue is continuing to process your patch.
Comment 36 Andrey Kosyakov 2011-04-15 01:58:31 PDT
This broke 31 layout tests on chromium linux & windows:

http://build.chromium.org/p/chromium.webkit/builders/Webkit%20Linux/builds/6074

I'm going to rollback, the diffs look quite big.
Comment 37 Andrey Kosyakov 2011-04-15 02:30:05 PDT
Reverted r83949 for reason:

broke 31 tests in chromium win & linux

Committed r83960: <http://trac.webkit.org/changeset/83960>
Comment 38 Justin Novosad 2011-06-09 14:20:54 PDT
This bug is no longer relevant, now that the accelerated 2d canvas renders through skia. Performance of the skia rendering path is good since there is no readback from the GPU when putImageData is called.