I've been playing with FEGaussian blur for CSS shadows, and have some optimizations.
Created attachment 76311 [details] Patch
Attachment 76311 [details] did not build on chromium: Build output: http://queues.webkit.org/results/7000058
Created attachment 76312 [details] Patch
Attachment 76311 [details] did not build on win: Build output: http://queues.webkit.org/results/6918061
Using the fixed-point math (BlurSumShift) from ContextShadow gets us to 2x performance, with some very minor pixel differences.
Comment on attachment 76312 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=76312&action=review > WebCore/ChangeLog:15 > + These changes giave a 1.42x performance improvement. Typo: giave. > WebCore/platform/graphics/filters/FEGaussianBlur.cpp:133 > +inline void boxBlurAlphaOnly(ByteArray* srcPixelArray, ByteArray* dstPixelArray, > + unsigned dx, int dxLeft, int dxRight, int stride, int strideLine, int effectWidth, int effectHeight) Weird alignment here.
How and on what test case you measured the perf impovement? You could also add a new WebCore/manual-test.
I'd also like to see a perf test for gaussian blur in general, not only for this patch. (In reply to comment #5) > Using the fixed-point math (BlurSumShift) from ContextShadow gets us to 2x performance, with some very minor pixel differences. Did your last patch still cause some pixel differences? Alex and others spend a lot of time in making FEGaussianBlur as compatible and comparable as possible with other viewers. Would like to know if the difference is noticeable. Even so, can you update the pixel results please? PS: Thanks for working on filters :-).
The patch posted here is math-identical with the current code. I'll update pixel results if I post another patch with the fixed-point math.
I was measuring perf with some custom code that blurred a 382x382 rectangle.
Attachment 76312 [details] was posted by a committer and has review+, assigning to Simon Fraser for commit.
Any news on this bug? Simon, do you land the patch?
I doubt it is still apply...
Comment on attachment 76312 [details] Patch Clearing review flag on this old patch.
Bah, I pretty much rewrote this patch from scratch today.
(In reply to comment #15) > Bah, I pretty much rewrote this patch from scratch today. Where is this patch? Can we mark this one as duplicated or do you reuse this bug report?
Looks nice. I am just thinking that the following part: // Fill the kernel for (int i = 0; i < maxKernelSize; ++i) { unsigned offset = line + i * stride; unsigned char* srcPtr = srcPixelArray->data() + offset; sumR += *srcPtr++; sumG += *srcPtr++; sumB += *srcPtr++; sumA += *srcPtr; } could be changed to: unsigned char* srcPtr = srcPixelArray->data() + line; for (int i = 0; i < maxKernelSize; ++i, srcPtr += stride) { sumR += srcPtr[0]; sumG += srcPtr[1]; sumB += srcPtr[2]; sumA += srcPtr[3]; } or: unsigned char* srcPtr = srcPixelArray->data() + line; unsigned char* end = srcPtr + maxKernelSize * stride; while (srcPtr < end) { sumR += srcPtr[0]; sumG += srcPtr[1]; sumB += srcPtr[2]; sumA += srcPtr[3]; srcPtr += stride } Something similar could be done in the internal loop.
The real killer right now is bug 120813, which swamps any other optimization twiddling we try to do in this function.
Created attachment 227342 [details] Patch
I observed an improvement of 7 to 10% in SVG perf test MtSaintHelens.
Created attachment 227343 [details] 6 executions (3 vanilla, 3 patched)
Comment on attachment 227342 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=227342&action=review smfr or dino, do you have further input? > Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:94 > +inline void boxBlurAlphaOnly(Uint8ClampedArray* srcPixelArray, Uint8ClampedArray* dstPixelArray, is it really not possible to combine that with the code following? Looks sad that we have a separate code path just for SourceAlpha cases. Ah, looked at the following code again... So you save iterating 4 times through the image data. > Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:142 > + if (alphaImage) { > + return boxBlurAlphaOnly(srcPixelArray, dstPixelArray, dx, dxLeft, dxRight, stride, strideLine, > + effectWidth, effectHeight, maxKernelSize); > + } You can omit braces here. > Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:225 > + no unnecessary new lines > Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:244 > + no unnecessary new lines > Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:259 > + Ditto.
Comment on attachment 227342 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=227342&action=review > Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:95 > +inline void boxBlurAlphaOnly(Uint8ClampedArray* srcPixelArray, Uint8ClampedArray* dstPixelArray, > + unsigned dx, int& dxLeft, int& dxRight, int& stride, int& strideLine, int& effectWidth, int& effectHeight, const int& maxKernelSize) The Uint8ClampedArray* should be Uint8ClampedArray& (in the caller too).
(In reply to comment #23) > (From update of attachment 227342 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=227342&action=review > > > Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:95 > > +inline void boxBlurAlphaOnly(Uint8ClampedArray* srcPixelArray, Uint8ClampedArray* dstPixelArray, > > + unsigned dx, int& dxLeft, int& dxRight, int& stride, int& strideLine, int& effectWidth, int& effectHeight, const int& maxKernelSize) > > The Uint8ClampedArray* should be Uint8ClampedArray& (in the caller too). Adenilson investigates in a potential regression. If it turns out to be a false negative r=me as well. Otherwise we need to investigate the reason first.
Created attachment 227457 [details] filters (css3/svg) and fast tests results in Vanilla (trunk of today)
Created attachment 227459 [details] Patched resultsfilters (css3/svg) and fast tests results
Created attachment 227465 [details] Patch
Comment on attachment 227465 [details] Patch Clearing flags on attachment: 227465 Committed r166084: <http://trac.webkit.org/changeset/166084>
All reviewed patches have been landed. Closing bug.