WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 139310
[Apple] Use Accelerate framework to speed-up FEGaussianBlur
https://bugs.webkit.org/show_bug.cgi?id=139310
Summary
[Apple] Use Accelerate framework to speed-up FEGaussianBlur
Dean Jackson
Reported
2014-12-05 13:19:45 PST
Accelerate.framework provides a better speedup than parallelJobs on Apple platforms, assuming the kernel is compatible. We should use it. <
rdar://problem/18434594
>
Attachments
Patch
(12.88 KB, patch)
2014-12-05 14:01 PST
,
Dean Jackson
no flags
Details
Formatted Diff
Diff
Archive of layout-test-results from webkit-ews-10 for mac-mountainlion-wk2
(572.95 KB, application/zip)
2014-12-05 16:19 PST
,
Build Bot
no flags
Details
Archive of layout-test-results from webkit-ews-12 for mac-mountainlion-wk2
(511.67 KB, application/zip)
2014-12-05 18:06 PST
,
Build Bot
no flags
Details
Archive of layout-test-results from webkit-ews-04 for mac-mountainlion
(588.94 KB, application/zip)
2014-12-05 19:24 PST
,
Build Bot
no flags
Details
Archive of layout-test-results from webkit-ews-07 for mac-mountainlion
(652.90 KB, application/zip)
2014-12-05 21:38 PST
,
Build Bot
no flags
Details
Patch
(15.89 KB, patch)
2014-12-08 15:04 PST
,
Dean Jackson
simon.fraser
: review+
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Dean Jackson
Comment 1
2014-12-05 14:01:26 PST
Created
attachment 242659
[details]
Patch
Tim Horton
Comment 2
2014-12-05 14:28:03 PST
Comment on
attachment 242659
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=242659&action=review
> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:33 > +#if PLATFORM(COCOA) && HAVE(ACCELERATE)
Anders thinks this should just be HAVE(ACCELERATE) (and elsewhere)
> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:290 > + vImageBoxConvolve_ARGB8888(&effectInBuffer, &effectOutBuffer, 0, 0, 0, radius, radius, 0, kvImageEdgeExtend); > + vImageBoxConvolve_ARGB8888(&effectOutBuffer, &effectInBuffer, 0, 0, 0, radius, radius, 0, kvImageEdgeExtend); > + vImageBoxConvolve_ARGB8888(&effectInBuffer, &effectOutBuffer, 0, 0, 0, radius, radius, 0, kvImageEdgeExtend);
We should consider using a shared temp buffer here instead of allocating three. There are instructs in the docs for figuring out how big to make it.
> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:304 > + else { > +#endif
Can we write this differently so we don't have a brace inside an ifdef with its closing brace in a different one?
> ManualTests/blur-filter-timing.html:1 > +<!DOCTYPE html>
Should be in PerformanceTests/Interactive, I think?
> ManualTests/blur-filter-timing.html:13 > + var WIDTH = 600;
why so many caps?
Simon Fraser (smfr)
Comment 3
2014-12-05 14:28:18 PST
Comment on
attachment 242659
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=242659&action=review
> Source/WebCore/ChangeLog:14 > + especially on retina devices. The problem was blur, and that we > + fixed a bug about 9 months ago that computed filters at the
Would be good to reference the exact revision.
> Source/WebCore/ChangeLog:34 > + FWIW, I also tried a CoreImage implementation, but the expense of > + copying back from the GPU was too high. When we have a fully accelerated > + filter chain (without needing compositing) I expect this approach > + will win.
This belongs in the bug, but not in the changelog.
> Source/WebCore/ChangeLog:36 > + There is a manual timing test: ManualTests/blur-filter-timing.html
I think you should add a perf test.
> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:274 > + // The accelerated box convolve approximation only works with an odd radius. > + if (radius % 2 != 1) > + radius += 1;
So this doesn't exactly match the other impl? Is this visible?
Build Bot
Comment 4
2014-12-05 16:19:26 PST
Comment on
attachment 242659
[details]
Patch
Attachment 242659
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.appspot.com/results/5812017482956800
New failing tests: svg/filters/svg-gaussianblur-edgeMode-duplicate.svg
Build Bot
Comment 5
2014-12-05 16:19:30 PST
Created
attachment 242673
[details]
Archive of layout-test-results from webkit-ews-10 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-10 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 6
2014-12-05 18:06:39 PST
Comment on
attachment 242659
[details]
Patch
Attachment 242659
[details]
did not pass mac-wk2-ews (mac-wk2): Output:
http://webkit-queues.appspot.com/results/6422366695456768
New failing tests: svg/filters/svg-gaussianblur-edgeMode-duplicate.svg
Build Bot
Comment 7
2014-12-05 18:06:43 PST
Created
attachment 242690
[details]
Archive of layout-test-results from webkit-ews-12 for mac-mountainlion-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: webkit-ews-12 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.5
Build Bot
Comment 8
2014-12-05 19:24:38 PST
Comment on
attachment 242659
[details]
Patch
Attachment 242659
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.appspot.com/results/5611548273803264
New failing tests: svg/filters/svg-gaussianblur-edgeMode-duplicate.svg
Build Bot
Comment 9
2014-12-05 19:24:42 PST
Created
attachment 242695
[details]
Archive of layout-test-results from webkit-ews-04 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-04 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Build Bot
Comment 10
2014-12-05 21:38:32 PST
Comment on
attachment 242659
[details]
Patch
Attachment 242659
[details]
did not pass mac-ews (mac): Output:
http://webkit-queues.appspot.com/results/6322940886908928
New failing tests: svg/filters/svg-gaussianblur-edgeMode-duplicate.svg
Build Bot
Comment 11
2014-12-05 21:38:35 PST
Created
attachment 242701
[details]
Archive of layout-test-results from webkit-ews-07 for mac-mountainlion The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: webkit-ews-07 Port: mac-mountainlion Platform: Mac OS X 10.8.5
Dean Jackson
Comment 12
2014-12-08 11:52:11 PST
Comment on
attachment 242659
[details]
Patch I think Simon meant r-. Either way, updated patch is coming.
Dean Jackson
Comment 13
2014-12-08 12:00:11 PST
Comment on
attachment 242659
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=242659&action=review
>> Source/WebCore/ChangeLog:14 >> + fixed a bug about 9 months ago that computed filters at the > > Would be good to reference the exact revision.
OK.
>> Source/WebCore/ChangeLog:34 >> + will win. > > This belongs in the bug, but not in the changelog.
OK.
>> Source/WebCore/ChangeLog:36 >> + There is a manual timing test: ManualTests/blur-filter-timing.html > > I think you should add a perf test.
OK.
>> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:33 >> +#if PLATFORM(COCOA) && HAVE(ACCELERATE) > > Anders thinks this should just be HAVE(ACCELERATE) (and elsewhere)
OK.
>> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:274 >> + radius += 1; > > So this doesn't exactly match the other impl? Is this visible?
This code comes from an example posted at WWDC by the Accelerate team (it was called UIImage+ImageEffects, but seems to have disappeared from the Apple site). In that example it is emulating a Gaussian blur, and references the SVG specification for how to do it with 3 box blurs.
http://www.w3.org/TR/SVG/filters.html#feGaussianBlurElement
That includes "... if d is odd, use three box-blurs of size 'd', centered on the output pixel." This is also the same approach that our software path uses, so we're ok here.
>> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:290 >> + vImageBoxConvolve_ARGB8888(&effectInBuffer, &effectOutBuffer, 0, 0, 0, radius, radius, 0, kvImageEdgeExtend); > > We should consider using a shared temp buffer here instead of allocating three. There are instructs in the docs for figuring out how big to make it.
Great point.
>> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:304 >> +#endif > > Can we write this differently so we don't have a brace inside an ifdef with its closing brace in a different one?
I tried a few ways to do this, but they were all slightly ugly. <rant>I have no idea how a method could be called platformApplyGeneric :) It makes no sense!</rant> I'll fix this up by splitting both the new code and the old code into different static methods, to see if that makes things cleaner.
>> ManualTests/blur-filter-timing.html:13 >> + var WIDTH = 600; > > why so many caps?
Because that's what I do in JS to represent constants :)
Dean Jackson
Comment 14
2014-12-08 14:57:40 PST
A similar approach could speed up some of the other filters, but blur is the one that has the biggest performance suck. FWIW, I also tried a CoreImage implementation, but the expense of copying back from the GPU was too high. When we have a fully accelerated filter chain (without needing compositing) I expect this approach will win.
Dean Jackson
Comment 15
2014-12-08 15:04:32 PST
Created
attachment 242852
[details]
Patch
Simon Fraser (smfr)
Comment 16
2014-12-08 16:56:02 PST
Comment on
attachment 242852
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=242852&action=review
> Source/WebCore/ChangeLog:17 > + We were getting reports of page slowdown using CSS filters, > + especially on retina devices. The problem was blur, and that we > + fixed a bug about 7 months ago that computed filters at the > + correct resolution (
r168577
). When not composited, we were now operating > + on 4x the pixels, and performing an expensive operation multiple > + times.
This is still a bit of a "sit down, let me tell you a story"
> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:52 > +inline void kernelPosition(int boxBlur, unsigned& std, int& dLeft, int& dRight)
std, dLeft, dRight are terrible names.
> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:55 > + switch (boxBlur) {
is boxBlur really the step? Could use a better name, like blurStep.
> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:322 > + RefPtr<Uint8ClampedArray> tmpBuffer = Uint8ClampedArray::createUninitialized(tmpBufferSize);
Can't the buffer just be a malloc?
Dean Jackson
Comment 17
2014-12-08 17:31:41 PST
Committed
r177000
: <
http://trac.webkit.org/changeset/177000
>
Carlos Alberto Lopez Perez
Comment 18
2014-12-10 05:46:25 PST
(In reply to
comment #17
)
> Committed
r177000
: <
http://trac.webkit.org/changeset/177000
>
The perf test Interactive/blur-filter-timing.html added by this commit is failing on ALL platforms except Apple Yosemite. GTK:
https://build.webkit.org/builders/GTK%20Linux%2064-bit%20Release%20%28Perf%29
Apple MountainLion:
https://build.webkit.org/builders/Apple%20MountainLion%20Release%20%28Perf%29
Apple Mavericks:
https://build.webkit.org/builders/Apple%20Mavericks%20Release%20%28Perf%29
EFL:
https://build.webkit.org/builders/EFL%20Linux%2064-bit%20Release%20WK2%20%28Perf%29
For example, on GTK it fails with: $ Tools/Scripts/run-perf-tests --no-show-results --platform gtk --release -2 Interactive/blur-filter-timing.html Running 1 tests Running Interactive/blur-filter-timing.html (1 of 1) ERROR: layer at (0,0) size 785x673 FAILED Finished: 2.328078 s Also, it seems is testing for a feature that is specific to Mac, so can we make it run only on the platforms where is supposed to work?
David Kilzer (:ddkilzer)
Comment 19
2014-12-23 22:30:23 PST
(In reply to
comment #18
)
> (In reply to
comment #17
) > > Committed
r177000
: <
http://trac.webkit.org/changeset/177000
> > > The perf test Interactive/blur-filter-timing.html added by this commit is > failing on ALL platforms except Apple Yosemite. > > GTK: >
https://build.webkit.org/builders/GTK%20Linux%2064-bit%20Release%20%28Perf%29
> Apple MountainLion: >
https://build.webkit.org/builders/Apple%20MountainLion%20Release%20%28Perf%29
> Apple Mavericks: >
https://build.webkit.org/builders/Apple%20Mavericks%20Release%20%28Perf%29
> EFL: >
https://build.webkit.org/builders/EFL%20Linux%2064
- > bit%20Release%20WK2%20%28Perf%29 > > For example, on GTK it fails with: > > $ Tools/Scripts/run-perf-tests --no-show-results --platform gtk --release -2 > Interactive/blur-filter-timing.html > Running 1 tests > Running Interactive/blur-filter-timing.html (1 of 1) > ERROR: layer at (0,0) size 785x673 > FAILED > Finished: 2.328078 s > > Also, it seems is testing for a feature that is specific to Mac, so can we > make it run only on the platforms where is supposed to work?
Test fixed by:
Bug 139462
: Blur filter performance test doesn't provide results <
https://bugs.webkit.org/show_bug.cgi?id=139462
> <
http://trac.webkit.org/changeset/177088
>
Benjamin Poulain
Comment 20
2014-12-24 13:44:43 PST
Comment on
attachment 242852
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=242852&action=review
> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:381 > +#if HAVE(ACCELERATE)
Did you test ARMv7? I'd be curious to know the difference between accelerate and Neon on ARMv7. The amortized perf of accelerate is okay but it tends to be bad on small loads. Here it is even worst because accelerate needs to go multiple time over the buffer. Is that really faster than the NEON intrinsics? ARM64 has AdvSIMD which is very close do Neon. It would be easy to make a fast implementation directly instead of using accelerate.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug