WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
59447
Optimizing gaussian blur filter to ARM-neon SIMD instruction set
https://bugs.webkit.org/show_bug.cgi?id=59447
Summary
Optimizing gaussian blur filter to ARM-neon SIMD instruction set
Zoltan Herczeg
Reported
2011-04-26 04:51:09 PDT
Faster shadow casting.
Attachments
patch
(26.34 KB, patch)
2011-04-26 05:07 PDT
,
Zoltan Herczeg
krit
: review-
Details
Formatted Diff
Diff
clever patch
(33.54 KB, patch)
2011-04-27 00:44 PDT
,
Zoltan Herczeg
no flags
Details
Formatted Diff
Diff
next try
(57.54 KB, patch)
2011-04-27 04:15 PDT
,
Zoltan Herczeg
no flags
Details
Formatted Diff
Diff
mac fix
(63.69 KB, patch)
2011-04-27 07:05 PDT
,
Zoltan Herczeg
no flags
Details
Formatted Diff
Diff
final patch
(65.31 KB, patch)
2011-04-27 23:54 PDT
,
Zoltan Herczeg
no flags
Details
Formatted Diff
Diff
patch
(65.37 KB, patch)
2011-04-28 03:56 PDT
,
Zoltan Herczeg
zimmermann
: review+
Details
Formatted Diff
Diff
Show Obsolete
(5)
View All
Add attachment
proposed patch, testcase, etc.
Zoltan Herczeg
Comment 1
2011-04-26 05:07:10 PDT
Created
attachment 91092
[details]
patch More arm magic.
WebKit Review Bot
Comment 2
2011-04-26 05:09:45 PDT
Attachment 91092
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/platform/graphics/filters/arm/FEGaussianBlurNEON.cpp:182: The parameter name "NL" adds no information, so it should be removed. [readability/parameter_name] [5] Total errors found: 1 in 8 files If any of these errors are false positives, please file a bug against check-webkit-style.
Dirk Schulze
Comment 3
2011-04-26 05:40:00 PDT
Comment on
attachment 91092
[details]
patch View in context:
https://bugs.webkit.org/attachment.cgi?id=91092&action=review
> Source/WebCore/ChangeLog:10 > + This patch contains two sub-routines, one for speeding up > + alpha channel only gaussian blur (by 2.5 times) and one > + for speeding up all channel blur (by 4 times).
I wish you could move more code to the NEON files. I fear that we run into the same disaster like SVG and HTML Canvas were three years ago. Everyone is adding his platform dependent code to the main file and no one is able to work with the code. What do you think about a similar design like GraphicsContext. GraphicsContext.cpp/.h interacts as Layer between WebCore and the platform dependent code. E.g. GraphicsContext::setStroke calls setPlatformStroke and so on. In your case the apply function could call platformApply. Every code that could be shared would still be part of apply or other functions in FEGaussianBlur. The rest moves to <platform>/FEGaussianBlur<platform>.cpp This way we could avoid all the #if PLATFORM clauses in FEGaussianBlue.cpp. What do you think?
> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:202 > + int widthMultipliedByFour = 4 * paintSize.width();
this can be const
Zoltan Herczeg
Comment 4
2011-04-26 06:06:45 PDT
> I wish you could move more code to the NEON files. I fear that we run into the same disaster like SVG and HTML Canvas were three years ago. Everyone is adding his platform dependent code to the main file and no one is able to work with the code. What do you think about a similar design like GraphicsContext. GraphicsContext.cpp/.h interacts as Layer between WebCore and the platform dependent code. E.g. GraphicsContext::setStroke calls setPlatformStroke and so on. In your case the apply function could call platformApply. Every code that could be shared would still be part of apply or other functions in FEGaussianBlur. The rest moves to <platform>/FEGaussianBlur<platform>.cpp > This way we could avoid all the #if PLATFORM clauses in FEGaussianBlue.cpp. What do you think?
I like the idea although I have no idea how to implement it. We usually replace something in the middle of a big chunk of code, which depend on several local variables. Would be good if we could use inline functions... And perhaps dynamic CPU feature detection (later). So would it be a good idea to move the current code into an inline functions? Or how?
Nikolas Zimmermann
Comment 5
2011-04-26 10:59:28 PDT
(In reply to
comment #4
)
> > I wish you could move more code to the NEON files. I fear that we run into the same disaster like SVG and HTML Canvas were three years ago. Everyone is adding his platform dependent code to the main file and no one is able to work with the code. What do you think about a similar design like GraphicsContext. GraphicsContext.cpp/.h interacts as Layer between WebCore and the platform dependent code. E.g. GraphicsContext::setStroke calls setPlatformStroke and so on. In your case the apply function could call platformApply. Every code that could be shared would still be part of apply or other functions in FEGaussianBlur. The rest moves to <platform>/FEGaussianBlur<platform>.cpp > > This way we could avoid all the #if PLATFORM clauses in FEGaussianBlue.cpp. What do you think?
I think Dirks idea is very wise, we should go that way.
> > I like the idea although I have no idea how to implement it. We usually replace something in the middle of a big chunk of code, which depend on several local variables. Would be good if we could use inline functions... And perhaps dynamic CPU feature detection (later).
I'd say we just add a "void platformApply(xxx)" method, in eg. FEGaussianBlur.h where xxx are the needed arguments, which shall be passed from the apply() method -> apply() should call plaformApply(xxx). In your patch, the first ifdef section (for CPU(NEON)), would go into FEGaussianBlurNEON.cpp, and the #else part into FEGaussianBlurDefault.cpp. The Neon version compiles in FEGaussianBlur.cpp & FEGaussianBlurNeon.cpp, and the non-Neon builds FEGaussiaunBlur.cpp & FEGaussianBlurDefault.cpp.
> So would it be a good idea to move the current code into an inline functions? Or how?
Does my approach make sense?
Eric Seidel (no email)
Comment 6
2011-04-26 15:14:05 PDT
Krit, would you please review this today as part of the reviewathon?
Dirk Schulze
Comment 7
2011-04-26 15:19:46 PDT
Comment on
attachment 91092
[details]
patch I'm more for discussing patches if there is something to discuss than giving r- and discuss it afterwards. But I fear Eric ;-). r- See comments of Niko and me.
Eric Seidel (no email)
Comment 8
2011-04-26 18:28:48 PDT
Bwahahaha! Thanks Dirk! :)
Zoltan Herczeg
Comment 9
2011-04-26 21:56:06 PDT
> > So would it be a good idea to move the current code into an inline functions? Or how? > Does my approach make sense?
I fear the code duplications and performance losses. For example, the non-NEON lighting filter code is a simple loop, since all the utility functions are available. If I would move it to a separate directory I would have to duplicite them, since we only speed-up the internal pixel drawing. The border pixels are still drawn in a common way (and those loops using those common functions). If I would make them non-inline (to avoid code duplication), it would be a huge performance loss. We want to speed up drawing not slowing it down :) I am totally agree about that we need to avoid ugly code, but unlike GraphicsContext, we have a lot of common core code we want to share among the platforms.
Nikolas Zimmermann
Comment 10
2011-04-26 23:46:04 PDT
(In reply to
comment #9
)
> > > So would it be a good idea to move the current code into an inline functions? Or how? > > Does my approach make sense? > > I fear the code duplications and performance losses. For example, the non-NEON lighting filter code is a simple loop, since all the utility functions are available. If I would move it to a separate directory I would have to duplicite them, since we only speed-up the internal pixel drawing. The border pixels are still drawn in a common way (and those loops using those common functions). If I would make them non-inline (to avoid code duplication), it would be a huge performance loss. We want to speed up drawing not slowing it down :)
Okay, but why not leave every code which can be shared in FEGaussianBlur itself? We don't need to move the inner loop and duplicate it. Why shall we make them non-inline? No need to, eh?
Zoltan Herczeg
Comment 11
2011-04-26 23:48:08 PDT
> Okay, but why not leave every code which can be shared in FEGaussianBlur itself? > We don't need to move the inner loop and duplicate it. > Why shall we make them non-inline? No need to, eh?
There are several possible solutions, and I can't choose among them :) If you are online, could we discuss this in #irc?
Zoltan Herczeg
Comment 12
2011-04-27 00:44:06 PDT
Created
attachment 91248
[details]
clever patch How about this solution?
WebKit Review Bot
Comment 13
2011-04-27 00:46:45 PDT
Attachment 91248
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:31: Alphabetical sorting problem. [build/include_order] [4] Total errors found: 1 in 11 files If any of these errors are false positives, please file a bug against check-webkit-style.
Build Bot
Comment 14
2011-04-27 01:24:58 PDT
Attachment 91248
[details]
did not build on win: Build output:
http://queues.webkit.org/results/8508897
Nikolas Zimmermann
Comment 15
2011-04-27 01:30:15 PDT
Comment on
attachment 91248
[details]
clever patch This looks great to me, doesn't build on win/efl though, can you look into that first?
WebKit Review Bot
Comment 16
2011-04-27 01:36:50 PDT
Attachment 91248
[details]
did not build on mac: Build output:
http://queues.webkit.org/results/8511308
Zoltan Herczeg
Comment 17
2011-04-27 04:15:09 PDT
Created
attachment 91266
[details]
next try
WebKit Review Bot
Comment 18
2011-04-27 04:41:26 PDT
Attachment 91266
[details]
did not build on mac: Build output:
http://queues.webkit.org/results/8509942
Zoltan Herczeg
Comment 19
2011-04-27 07:05:45 PDT
Created
attachment 91280
[details]
mac fix
Zoltan Herczeg
Comment 20
2011-04-27 08:15:30 PDT
Wow all build work! I know the Changelog should be updated, but otherwise, this patch is ready for review!
Dirk Schulze
Comment 21
2011-04-27 08:24:12 PDT
(In reply to
comment #20
)
> Wow all build work! I know the Changelog should be updated, but otherwise, this patch is ready for review!
Gtk is not ready yet ;-)
Dirk Schulze
Comment 22
2011-04-27 08:32:12 PDT
Comment on
attachment 91280
[details]
mac fix View in context:
https://bugs.webkit.org/attachment.cgi?id=91280&action=review
Much better! I'd like to let Niko make the final review. He may has more suggestions. The graphics code looks sane to me.
> Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:141 > +inline void FEGaussianBlur::platformApply(ByteArray* srcPixelArray, ByteArray* tmpPixelArray, unsigned kernelSizeX, unsigned kernelSizeY, IntSize& paintSize) > +{ > + // The selection here eventually should happen dynamically on some platforms. > +#if CPU(ARM_NEON) && COMPILER(GCC) > + platformApplyNeon(srcPixelArray, tmpPixelArray, kernelSizeX, kernelSizeY, paintSize); > +#else > + platformApplyGeneric(srcPixelArray, tmpPixelArray, kernelSizeX, kernelSizeY, paintSize); > +#endif > +}
I'd like to see this in the header as well. This way we would have a platform independent cpp file.
> Source/WebCore/platform/graphics/filters/FELighting.cpp:252 > +inline void FELighting::platformApply(LightingData& data, LightSource::PaintingData& paintingData) > +{ > + // The selection here eventually should happen dynamically on some platforms. > +#if CPU(ARM_NEON) && COMPILER(GCC) > + platformApplyNeon(data, paintingData); > +#else > + platformApplyGeneric(data, paintingData); > +#endif > +}
Ditto.
Collabora GTK+ EWS bot
Comment 23
2011-04-27 08:34:29 PDT
Attachment 91280
[details]
did not build on gtk: Build output:
http://queues.webkit.org/results/8507989
Zoltan Herczeg
Comment 24
2011-04-27 23:54:49 PDT
Created
attachment 91437
[details]
final patch If WildFox likes it xD
Early Warning System Bot
Comment 25
2011-04-28 00:10:35 PDT
Attachment 91437
[details]
did not build on qt: Build output:
http://queues.webkit.org/results/8519271
WebKit Review Bot
Comment 26
2011-04-28 00:12:44 PDT
Attachment 91437
[details]
did not build on chromium: Build output:
http://queues.webkit.org/results/8518286
WebKit Review Bot
Comment 27
2011-04-28 00:50:03 PDT
Attachment 91437
[details]
did not build on chromium: Build output:
http://queues.webkit.org/results/8520264
Zoltan Herczeg
Comment 28
2011-04-28 01:11:00 PDT
(In reply to
comment #22
)
> (From update of
attachment 91280
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=91280&action=review
> > Much better! I'd like to let Niko make the final review. He may has more suggestions. The graphics code looks sane to me. > > > Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp:141 > > +inline void FEGaussianBlur::platformApply(ByteArray* srcPixelArray, ByteArray* tmpPixelArray, unsigned kernelSizeX, unsigned kernelSizeY, IntSize& paintSize) > > +{ > > + // The selection here eventually should happen dynamically on some platforms. > > +#if CPU(ARM_NEON) && COMPILER(GCC) > > + platformApplyNeon(srcPixelArray, tmpPixelArray, kernelSizeX, kernelSizeY, paintSize); > > +#else > > + platformApplyGeneric(srcPixelArray, tmpPixelArray, kernelSizeX, kernelSizeY, paintSize); > > +#endif > > +} > > I'd like to see this in the header as well. This way we would have a platform independent cpp file.
Seems some compilers does not like the idea :(
WebKit Review Bot
Comment 29
2011-04-28 01:50:01 PDT
Attachment 91437
[details]
did not build on mac: Build output:
http://queues.webkit.org/results/8520276
Zoltan Herczeg
Comment 30
2011-04-28 03:56:38 PDT
Created
attachment 91465
[details]
patch
Nikolas Zimmermann
Comment 31
2011-04-28 05:11:04 PDT
Comment on
attachment 91465
[details]
patch Looks great to me! As always I can't say anything regarding the assembler part, but I'll trust you since you're the expert in this area! :-)
Zoltan Herczeg
Comment 32
2011-04-28 07:19:59 PDT
Landed in
http://trac.webkit.org/changeset/85180
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug