I created this bug to track Ariya's proposal to improve the performance of the blur algorithm: http://gitorious.org/ofi-labs/x2/blobs/master/graphics/shadowblur/shadowblur.cpp
Created attachment 64318
Ariya's proposal implemented in the FEGaussianBlur

You can use this patch for testing purposes: set CURRENT_ALGO to choose between the current algorithm and the proposal. I've seen visual errors with some deviation values, but it mostly works. Now we can do some performance tests.
The patch is based on the patchset from bug 39582, applied on top of SVN trunk.
Comment on attachment 64318
Ariya's proposal implemented in the FEGaussianBlur

This looks good! :)

> #include "GraphicsContext.h"
> #include "ImageData.h"
> #include <wtf/MathExtras.h>

Needs the Sencha copyright (see also Patch 2 in https://bugs.webkit.org/show_bug.cgi?id=34479).

> +#define CURRENT_ALGO 0

Have you profiled the difference between using CURRENT_ALGO and not using it?

> +// Note: image must be RGB32 format
> +static void blurHorizontal(unsigned char* image, int imgWidth, int imgHeight, int strideLine, int radius, bool swap = false)

The comment no longer applies, since you prepare 'image' yourself (hence, you know it's guaranteed to be RGB32 :)

> + OwnPtr<ImageBuffer> tmpImageBuffer = ImageBuffer::create(resultImage()->size());
> +
> + AffineTransform transform;
> + transform.rotate(90);
> + RefPtr<Pattern> pattern = Pattern::create(m_in->resultImage()->image(), false, false);
> + tmpImageBuffer->context()->concatCTM(transform);
> + tmpImageBuffer->context()->translate(0, -imageRect.height());
> + tmpImageBuffer->context()->setFillPattern(pattern.get());
> + tmpImageBuffer->context()->fillRect(imageRect);

This works only if every platform graphics stack supports fast 90-degree rotation (aka transpose). Qt supports this, which is why I use it.
Otherwise, it is faster to transpose the pixels ourselves.

Possible idea: provide a default ImageBuffer::transpose and let the platform override it if it has a faster version.

BTW, also track Hyatt's refactoring of Canvas. This might or might not conflict with some of his changes, but just in case: https://bugs.webkit.org/show_bug.cgi?id=43507
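To make the "transpose the pixels ourselves" idea concrete, here is a minimal sketch of a generic fallback. It is only an illustration under stated assumptions: `transposePixels` is a hypothetical free function, not an existing WebKit or ImageBuffer API, and it assumes a tightly packed buffer with one 32-bit value per pixel (no row padding).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a WebKit API): transposes a tightly packed
// 32-bit-per-pixel image of size width x height into a height x width image.
// The pixel at (x, y) in the source ends up at (y, x) in the result.
std::vector<uint32_t> transposePixels(const std::vector<uint32_t>& src,
                                      std::size_t width, std::size_t height)
{
    // Assumption: no stride padding, so the buffer holds exactly width * height pixels.
    assert(src.size() == width * height);

    std::vector<uint32_t> dst(src.size());
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x)
            dst[x * height + y] = src[y * width + x];
    }
    return dst;
}
```

With something like this as the default, a horizontal-only blur pass could be reused for the vertical direction by transposing, blurring, and transposing back, and a platform with a fast native path could replace the loop.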
(In reply to comment #3)
> (From update of attachment 64318)
> This looks good! :)
>
> > #include "GraphicsContext.h"
> > #include "ImageData.h"
> > #include <wtf/MathExtras.h>
>
> Needs the Sencha copyright (see also Patch 2 in https://bugs.webkit.org/show_bug.cgi?id=34479).

Oh, sorry, I'll do it.

> > +#define CURRENT_ALGO 0
>
> Have you profiled the difference between using CURRENT_ALGO and not using it?

Not yet, but I'll do it :)

> > + tmpImageBuffer->context()->setFillPattern(pattern.get());
> > + tmpImageBuffer->context()->fillRect(imageRect);
>
> This works only if every platform graphics stack supports fast 90-degree rotation (aka transpose). Qt supports this, which is why I use it.
> Otherwise, it is faster to transpose the pixels ourselves.
>
> Possible idea: provide a default ImageBuffer::transpose and let the platform override it if it has a faster version.

Uhu, I'm not sure what cairo is doing. I'm going to profile it and we can check whether it is a problem.

> BTW, also track Hyatt's refactoring of Canvas. This might or might not conflict with some of his changes, but just in case: https://bugs.webkit.org/show_bug.cgi?id=43507

Yep, we will probably save the two image copies.
Created attachment 64353
Oprofile log with current algorithm

Basically scrolling identi.ca automatically in Epiphany for some time (there is a big blur under the main box on that page).

FEGaussianBlur::apply: 8.16% of the time
6.68% of the time in the blur algorithm

It is that fast because it uses tiling to create the shadow from a small one.
Created attachment 64355
Oprofile log with Ariya's proposed algorithm

Same test: scrolling identi.ca automatically in Epiphany for some time (there is a big blur under the main box on that page).

FEGaussianBlur::apply: 23.81% of the time, divided mainly into:
GraphicsContext::fillRect: 16.13% (copying the rotated image)
FEGaussianBlur::blurHorizontal: 5.05% (the real algorithm)

Again, it is fast because it uses tiling to create the shadow from a small one.

Anyway, if we could avoid the fillRects, I would say the two algorithms perform pretty similarly. We could even add the fixed-point arithmetic to the current one and improve it a little more.

I hope it helps.
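For readers unfamiliar with the "fixed point" remark, the sketch below shows the general idea on a plain box blur of one row of 8-bit samples. It is an assumption-laden illustration, not the code from the patch or from shadowblur.cpp: `boxBlurRow` is a hypothetical name, edges are handled by repeating the border sample, and the division by the window size is replaced by a multiplication with a 16.16 fixed-point reciprocal.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: box-blurs one row of 8-bit samples.
// Assumes radius >= 0. Edge samples are repeated (clamped indexing).
std::vector<uint8_t> boxBlurRow(const std::vector<uint8_t>& row, int radius)
{
    const int n = static_cast<int>(row.size());
    std::vector<uint8_t> out(row.size());
    if (n == 0)
        return out;

    const int window = 2 * radius + 1;
    // 16.16 fixed-point approximation of 1 / window (rounded down), so the
    // inner loop needs a multiply and a shift instead of a division.
    // For very large radii the truncation can cause off-by-one results.
    const uint32_t reciprocal = (1u << 16) / static_cast<uint32_t>(window);

    auto at = [&](int i) { return row[std::min(std::max(i, 0), n - 1)]; };

    // Sum of the window centered on x = 0.
    uint32_t sum = 0;
    for (int i = -radius; i <= radius; ++i)
        sum += at(i);

    for (int x = 0; x < n; ++x) {
        // Multiply by the reciprocal, add 0.5 in fixed point, drop the fraction.
        out[x] = static_cast<uint8_t>((sum * reciprocal + (1u << 15)) >> 16);
        // Slide the window one sample to the right.
        sum += at(x + radius + 1);
        sum -= at(x - radius);
    }
    return out;
}
```

The running sum keeps the per-sample cost independent of the radius, and the fixed-point multiply avoids an integer division per sample; whether that is a measurable win in FEGaussianBlur would still need profiling.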
Created attachment 64558
Ariya's proposal implemented in the FEGaussianBlur

Added Sencha copyright
*** This bug has been marked as a duplicate of bug 45599 ***