Bug 44528

Summary: Improve FEGaussianBlur algorithm to access image buffer memory directly
Product: WebKit Reporter: Alejandro G. Castro <alex>
Component: Layout and RenderingAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: ariya.hidayat, dglazkov, eric, hyatt, krit, mrobinson, simon.fraser, webkit.review.bot, zimmermann
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Linux   
Bug Depends on:    
Bug Blocks: 68469, 26389, 48174    
Attachments:
Description Flags
Proposed patch
none
Proposed patch
none
Proposed patch
none
Proposed patch krit: review-

Description Alejandro G. Castro 2010-08-24 09:15:58 PDT
Currently the FEGaussianBlur uses CanvasPixelArrays objects to handle pixel operations, this is very slow, we need to add support to get and handle image buffer memory directly, avoiding copies if possible.
Comment 1 Alejandro G. Castro 2010-08-24 09:17:40 PDT
Created attachment 65282 [details]
Proposed patch
Comment 2 Eric Seidel (no email) 2010-08-24 09:31:57 PDT
Attachment 65282 [details] did not build on mac:
Build output: http://queues.webkit.org/results/3768598
Comment 3 Alejandro G. Castro 2010-08-24 09:44:32 PDT
Created attachment 65284 [details]
Proposed patch
Comment 4 WebKit Review Bot 2010-08-24 10:24:02 PDT
Attachment 65284 [details] did not build on chromium:
Build output: http://queues.webkit.org/results/3790427
Comment 5 Alejandro G. Castro 2010-08-24 13:41:06 PDT
Created attachment 65314 [details]
Proposed patch
Comment 6 Alejandro G. Castro 2010-08-27 01:28:25 PDT
After talking to dhyatt I'm going review the patch and create begin/end functions instead of data and dirty.
Comment 7 Alejandro G. Castro 2010-08-27 04:37:23 PDT
Created attachment 65698 [details]
Proposed patch

Replaced data/makeDirty with beginAccessData/endAccessData and reviewed some of the port functions to avoid a copy, probably someone of each port should check those functions are correct for each technology.
Comment 8 Dirk Schulze 2010-10-23 05:22:27 PDT
Comment on attachment 65698 [details]
Proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=65698&action=review

Have you tried to use the original code, but take the WTF::ByteArray instead of the CanvasPixelArray? I'm interested in how much perf win this could bring. Surely not as much as your code, but still. Also it is not needed here to mark the surface as dirty. This is only necessary, if you want to access the context after pixel manipulations. That isn't the case here or on any other filter effect result.
You also ignored, that some platforms may use unpremultiplied colors. Would be great if we can find a way to manage pixel access with premultiplied and unpremultiplied colors, that would be a win for the other filter effects as well.

I know that you were working on that to make CSS shadows faster on Cairo. But it would still be useful for SVG filter effects (that are supported by WebKitGtk as well).

> WebCore/platform/graphics/wx/ImageBufferWx.cpp:72
> +     notImplemented();

indention wrong.
Comment 9 Alejandro G. Castro 2010-10-25 05:13:28 PDT
Thanks for the review krit.

(In reply to comment #8)
> (From update of attachment 65698 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=65698&action=review
> 
> Have you tried to use the original code, but take the WTF::ByteArray instead of the CanvasPixelArray? I'm interested in how much perf win this could bring. Surely not as much as your code, but still. Also it is not needed here to mark the surface as dirty. This is only necessary, if you want to access the context after pixel manipulations. That isn't the case here or on any other filter effect result.
> You also ignored, that some platforms may use unpremultiplied colors. Would be great if we can find a way to manage pixel access with premultiplied and unpremultiplied colors, that would be a win for the other filter effects as well.
> 

Yeah, you are right, I'll try to check if there is any option and check the performance in that case.
Comment 10 Dirk Schulze 2010-10-25 05:46:04 PDT
(In reply to comment #9)
> Thanks for the review krit.
> 
> (In reply to comment #8)
> > (From update of attachment 65698 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=65698&action=review
> > 
> > Have you tried to use the original code, but take the WTF::ByteArray instead of the CanvasPixelArray? I'm interested in how much perf win this could bring. Surely not as much as your code, but still. Also it is not needed here to mark the surface as dirty. This is only necessary, if you want to access the context after pixel manipulations. That isn't the case here or on any other filter effect result.
> > You also ignored, that some platforms may use unpremultiplied colors. Would be great if we can find a way to manage pixel access with premultiplied and unpremultiplied colors, that would be a win for the other filter effects as well.
> > 
> 
> Yeah, you are right, I'll try to check if there is any option and check the performance in that case.

I already checked and tested ByteArray* an unsgined char* and couldn't see a difference in performance. I took the attached SVG on bug 48174 as reference. On Mac put/ and getImageData just take 0.8% of the time, GaussianBlur algo itself took 96% of the time. On my machine I could reduce the rendering time from 31s to 21s just by eliminating the -> operators and accessing the pixel data via ByteArray instead of CanvasPixelArray. 21s is still to much, but I don't see huge improvement by implementing direct pixel data access on ImageBuffer right now. Even so I'd support it!

Do you plan to investigate on this more?
Comment 11 Dirk Schulze 2010-10-25 05:48:41 PDT
(In reply to comment #10)
> Do you plan to investigate on this more?

Oh, I forgot to mention that the patch that makes use of ByteArray on FEGaussianBlur is already in trunk.
Comment 12 Alejandro G. Castro 2010-10-25 06:03:15 PDT
(In reply to comment #10)
>
> [...]
>
> I already checked and tested ByteArray* an unsgined char* and couldn't see a difference in performance. I took the attached SVG on bug 48174 as reference. On Mac put/ and getImageData just take 0.8% of the time, GaussianBlur algo itself took 96% of the time. On my machine I could reduce the rendering time from 31s to 21s just by eliminating the -> operators and accessing the pixel data via ByteArray instead of CanvasPixelArray. 21s is still to much, but I don't see huge improvement by implementing direct pixel data access on ImageBuffer right now. Even so I'd support it!
> 
> Do you plan to investigate on this more?
>

Oh, nice, good to know. If you think it makes sense I can spend time doing some profiling and check how we could do it (integrate contextshadow I guess it is an option). Althougth I'm afraid that this operation is not going to be good enough if we can not make it in specific hardware or reduce the amount of blurring area.
Comment 13 Simon Fraser (smfr) 2010-12-11 19:30:53 PST
See also bug 50881.
Comment 14 Simon Fraser (smfr) 2010-12-11 19:31:46 PST
Maybe obsoleted by the patch in bug 49907?
Comment 15 Alejandro G. Castro 2010-12-12 02:13:56 PST
(In reply to comment #14)
> Maybe obsoleted by the patch in bug 49907?

Yes, the patch must be reviewed, and not sure if after that patch the time spent in these operations are relevant at all. Anyway the option to use directly the memory buffer is open even after that patch, just need to check if it is interesting or not. Sorry for not doing the testing before, I'll try to check it soon and close the bug depending on the results, thanks for the heads up.
Comment 16 Martin Robinson 2012-06-13 17:03:07 PDT
I guess we've abandoned this? Perhaps we can just close the bug?
Comment 17 Alejandro G. Castro 2012-06-14 01:54:07 PDT
(In reply to comment #16)
> I guess we've abandoned this? Perhaps we can just close the bug?

I agree, thanks for the heads-up.