43962 – Improve FEGaussianBlur algorithm performance

RESOLVED DUPLICATE of bug 45599 43962

Improve FEGaussianBlur algorithm performance

https://bugs.webkit.org/show_bug.cgi?id=43962

Summary Improve FEGaussianBlur algorithm performance

Alejandro G. Castro

Reported 2010-08-13 03:46:59 PDT

I created this bug to follow information about Ariya's proposal to improve blur algorithm performance: http://gitorious.org/ofi-labs/x2/blobs/master/graphics/shadowblur/shadowblur.cpp

Attachments
Ariya's proposal implemented in the FEGaussianBlur (7.27 KB, patch) 2010-08-13 03:54 PDT, Alejandro G. Castro	no flags	Details Formatted Diff Diff
Oprofile log with current algorithm (1.02 MB, image/svg+xml) 2010-08-13 10:39 PDT, Alejandro G. Castro	no flags	Details
Oprofile log with ariya's proposed algorithm (886.97 KB, image/svg+xml) 2010-08-13 10:45 PDT, Alejandro G. Castro	no flags	Details
Ariya's proposal implemented in the FEGaussianBlur (8.89 KB, patch) 2010-08-17 00:42 PDT, Alejandro G. Castro	no flags	Details Formatted Diff Diff
Show Obsolete (1) View All Add attachment proposed patch, testcase, etc.

Alejandro G. Castro

Comment 1 2010-08-13 03:54:22 PDT

Created attachment 64318 [details] Ariya's proposal implemented in the FEGaussianBlur For testing purposes you can use this patch, use CURRENT_ALGO to choose between the current algorithm or the proposal. I've seen visual errors with some deviation values, but it mostly work. Now we can do some performance tests.

Alejandro G. Castro

Comment 2 2010-08-13 03:56:42 PDT

The patch it is created over the patchset of the bug 39582 applied to the trunk of the svn.

Ariya Hidayat

Comment 3 2010-08-13 09:16:38 PDT

Comment on attachment 64318 [details] Ariya's proposal implemented in the FEGaussianBlur This looks good! :) > #include "GraphicsContext.h" > #include "ImageData.h" > #include <wtf/MathExtras.h> Needs the Sencha copyright (see also Patch 2 in https://bugs.webkit.org/show_bug.cgi?id=34479). > +#define CURRENT_ALGO 0 Have you profiled the different between using CURRENT_ALGO and not using it? > +// Note: image must be RGB32 format > +static void blurHorizontal(unsigned char* image, int imgWidth, int imgHeight, int strideLine, int radius, bool swap = false) The comment does not apply anymore since you prepare 'image' manually by yourself (hence, you know it's guaranteed RGB32 :) > + OwnPtr<ImageBuffer> tmpImageBuffer = ImageBuffer::create(resultImage()->size()); > + > + AffineTransform transform; > + transform.rotate(90); > + RefPtr<Pattern> pattern = Pattern::create(m_in->resultImage()->image(), false, false); > + tmpImageBuffer->context()->concatCTM(transform); > + tmpImageBuffer->context()->translate(0, -imageRect.height()); > + tmpImageBuffer->context()->setFillPattern(pattern.get()); > + tmpImageBuffer->context()->fillRect(imageRect); This works only if every platform graphics stack supports fast 90 degree rotation (aka transposed). Qt supports this hence why I use it. Otherwise, it is faster to transpose the pixels ourselves. Possible idea: provide a default ImageBuffer::transpose and let the platform overrides it if it has a faster version. BTW, track also Hyatt's refactoring on Canvas. This might or might not conflict with some of his changes, but just in case. https://bugs.webkit.org/show_bug.cgi?id=43507

Alejandro G. Castro

Comment 4 2010-08-13 09:59:13 PDT

(In reply to comment #3) > (From update of attachment 64318 [details]) > This looks good! :) > > > #include "GraphicsContext.h" > > #include "ImageData.h" > > #include <wtf/MathExtras.h> > > Needs the Sencha copyright (see also Patch 2 in https://bugs.webkit.org/show_bug.cgi?id=34479). > Oh, sorry, I'll do it. > > +#define CURRENT_ALGO 0 > > Have you profiled the different between using CURRENT_ALGO and not using it? > Not yet, but I'll do it :) > > + tmpImageBuffer->context()->setFillPattern(pattern.get()); > > + tmpImageBuffer->context()->fillRect(imageRect); > > This works only if every platform graphics stack supports fast 90 degree rotation (aka transposed). Qt supports this hence why I use it. > Otherwise, it is faster to transpose the pixels ourselves. > > Possible idea: provide a default ImageBuffer::transpose and let the platform overrides it if it has a faster version. > Uhu, not sure what cairo is doing, I'm going to profile and we can check if it is a problem. > BTW, track also Hyatt's refactoring on Canvas. This might or might not conflict with some of his changes, but just in case. https://bugs.webkit.org/show_bug.cgi?id=43507 Yep, probably we will sabe the two image copies.

Alejandro G. Castro

Comment 5 2010-08-13 10:39:31 PDT

Created attachment 64353 [details] Oprofile log with current algorithm Basically scrolling identi.ca automatically in epiphany for some time (big blur under the main box in that page) FEGaussianBlur::apply 8.16% percent of the time 6.68% of the time in the blur algorithm It is that fast because it is using tiling to create the shadow from a small one.

Alejandro G. Castro

Comment 6 2010-08-13 10:45:48 PDT

Created attachment 64355 [details] Oprofile log with ariya's proposed algorithm Same test scrolling identi.ca automatically in epiphany for some time (big blur under the main box in that page) FEGaussianBlur::apply 23.81% percent of the time divided mainly in GraphicsContext::fillRect 16.13% copying the rotated image FEGaussianBlur::blurHorizontal 5.05% the real algorithm It is again fast because it is using tiling to create the shadow from a small one. Anyway, if we could avoid the fillRects I would say the two algorithms are pretty similar, we could even add the fixed point operation to the current one and improve a little bit more. I hope it helps.

Alejandro G. Castro

Comment 7 2010-08-17 00:42:37 PDT

Created attachment 64558 [details] Ariya's proposal implemented in the FEGaussianBlur Added Sencha copyright

Alejandro G. Castro

Comment 8 2010-09-16 09:39:32 PDT

*** This bug has been marked as a duplicate of bug 45599 ***

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution DUPLICATE

of bug 45599

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware PC

OS OS X 10.5

Product WebKit

Component SVG

Assignee

Nobody

Reported

2010-08-13 03:46 PDT

Modified

2010-09-16 09:39 PDT History

CC List

3 users Show

URL

Keywords

Depends on

40793

Blocks

Dependencies

tree graph