Summary: | feGaussianBlur CoreImage implementation | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Tim Horton <thorton> | ||||||||||||
Component: | SVG | Assignee: | Tim Horton <thorton> | ||||||||||||
Status: | ASSIGNED --- | ||||||||||||||
Severity: | Normal | CC: | dglazkov, dino, krit, sam, senorblanco, simon.fraser, webkit-bug-importer, webkit.review.bot, zherczeg, zimmermann | ||||||||||||
Priority: | P2 | Keywords: | InRadar | ||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||
Hardware: | Unspecified | ||||||||||||||
OS: | Unspecified | ||||||||||||||
Bug Depends on: | |||||||||||||||
Bug Blocks: | 68469, 68479 | ||||||||||||||
Attachments: |
|
Description
Tim Horton
2011-09-20 14:31:26 PDT
Created attachment 108053 [details]
initial (i.e. no changelogs, some tests fail, etc.) patch
Created attachment 108055 [details]
style fixes
I seem to need to update :-\ Also, some of the blurs in the tests are slightly different, I'm not sure what that's about. It's like... the gamma of the blur, or something, if that makes sense. Comment on attachment 108055 [details] style fixes View in context: https://bugs.webkit.org/attachment.cgi?id=108055&action=review > Source/WebCore/platform/graphics/filters/ci/FECITools.h:1 > +/* I suggest filters/coreimage/, not 'ci' > Source/WebCore/platform/graphics/filters/ci/FECITools.h:36 > +#import <QuartzCore/CIImagePrivate.h> // TODO: WKSI? (this is for kCIFormatRGBA8) Yep. > Source/WebCore/platform/graphics/filters/ci/FECITools.mm:33 > +CIImage* ciImageFromByteArray(ByteArray* src, const IntSize& size) This should return a RetainPtr > Source/WebCore/platform/graphics/filters/ci/FEGaussianBlurCI.mm:38 > + if(kernelSizeX != kernelSizeY) Space after if, brace on new line. Is there a measurable performance gain? If so, what is it? (In reply to comment #6) > Is there a measurable performance gain? If so, what is it? The frame rate bump I'm seeing on a 300x300 ellipse being blurred with stdDeviation=20 appears to be a little over 2x, on otherwise equivalent release builds, on my MacBook Pro. Also appears to be a significant decrease (80% of a core to 40% of a core) in CPU usage. (In reply to comment #4) > Also, some of the blurs in the tests are slightly different, I'm not sure what that's about. It's like... the gamma of the blur, or something, if that makes sense. premultiplied vs non? Looking at this I wonder if we need a way to store the CIFilter between applications. That would save us the compile and linking step (although maybe CI is really smart about it). (In reply to comment #8) > (In reply to comment #4) > > > Also, some of the blurs in the tests are slightly different, I'm not sure what that's about. It's like... the gamma of the blur, or something, if that makes sense. > > premultiplied vs non? Possible, good call, I'll check that. > Looking at this I wonder if we need a way to store the CIFilter between applications. That would save us the compile and linking step (although maybe CI is really smart about it). Hadn't considered that; I had wondered if it would be nice if we could pass the CIImage (as platform-specific userdata ala other parts of WebCore) through the filter chain instead of recreating it for each, but that might get messy, so I'm avoiding it for now. Created attachment 108092 [details]
new patch, with Simon's changes and WKSI update
Comment on attachment 108092 [details] new patch, with Simon's changes and WKSI update Attachment 108092 [details] did not pass cr-mac-ews (chromium): Output: http://queues.webkit.org/results/9772487 Tim, the purpose of platformApply, platformApplyGeneric grouping is dynamicaly detectable features. I.e: platformApply: if (RunOnARM && hasNEON) platformApplyNEON(...) if (hasOpenCLI) platformApplyOpenCLI(...) platformApplyGeneric(...) I know this is not yet implemented, but planned work this way. So the generic code always call platformApply which might call a single function on some platforms, or do feature dependent selection on others. So your man function could look like: platformApply: #if MAC if (kernelX == kernelY) platformApplyCF() #endif platformApplyGeneric(...) (In reply to comment #12) > Tim, the purpose of platformApply, platformApplyGeneric grouping is dynamicaly detectable features. > > I.e: > > platformApply: > if (RunOnARM && hasNEON) > platformApplyNEON(...) > if (hasOpenCLI) > platformApplyOpenCLI(...) > platformApplyGeneric(...) > > I know this is not yet implemented, but planned work this way. So the generic code always call platformApply which might call a single function on some platforms, or do feature dependent selection on others. > > So your man function could look like: > platformApply: > #if MAC > if (kernelX == kernelY) > platformApplyCF() > #endif > platformApplyGeneric(...) Ah, ok. I was quite confused because it seemed both a) different than everywhere else, and b) unnecessary, but it makes sense knowing that there were eventually plans to use it more seriously. I'll switch that part back for now. Created attachment 108237 [details]
example of brokenness
Attaching an example of what's going wrong. Here there's a white rectangle and a black rectangle on a gray background, the two rects overlap and are *together* in a blurred group.
It's not a premultiplication problem here since there's no alpha involved.
Not sure what's up.
Err, maybe I'm just getting the radius wrong. But I'm not sure why. No, nevermind, the radius is fine, obviously, since only tests with overlapping elements blurred together fail. Created attachment 108240 [details]
"example of brokenness" SVG source
This is really interesting. If you check the curve at the bottom-right corner of the white box, the "+patch" image clearly has an extra blurred circle. I mean the blur looks like: | Normal: | / _____/ | +Path: / ___ / \____/ Even if 2x speedup is quite good. I'd expect better results with a OpenCL implementation of filters (with a better usage of the GPU). OpenCL gets also supported by more and more SoCs. I wouldn't be surprised if the next iPhone ships with support for OpenCL. But we can try to go both ways. (In reply to comment #19) > Even if 2x speedup is quite good. I'd expect better results with a OpenCL implementation of filters (with a better usage of the GPU). OpenCL gets also supported by more and more SoCs. I wouldn't be surprised if the next iPhone ships with support for OpenCL. But we can try to go both ways. I would take a bet that we'd get a bigger speedup if we could avoid copying the CIImage back for rendering by CG. However this would require some surgery in SVG rendering. (In reply to comment #20) > (In reply to comment #19) > > Even if 2x speedup is quite good. I'd expect better results with a OpenCL implementation of filters (with a better usage of the GPU). OpenCL gets also supported by more and more SoCs. I wouldn't be surprised if the next iPhone ships with support for OpenCL. But we can try to go both ways. > > I would take a bet that we'd get a bigger speedup if we could avoid copying the CIImage back for rendering by CG. However this would require some surgery in SVG rendering. Maybe yes, but you'd need to do a compete new filters implementation. Means you need a new filter chain implemented in parallel. If you like to do that, it would be very hard to share any filter effect related work with other platforms (even with Safari Win). However, the code to do that is still in the revision history of SVN. You wouldn't need to start from scratch. IIRC it was removed even before I started working on WebKit (more than 5 years ago?). Might be related: https://bugs.webkit.org/show_bug.cgi?id=68899 I've noticed that some of the filters (e.g., FEOffset) use GraphicsContext and ImageBuffer directly. Since several ports already have accelerated implementations of GraphicsContext already, I think it's possible to trivially accelerate these filters today, simply by setting the "Accelerated" flag on ImageBuffer creation. Inspired by that, for the remaining filters, what do people think about either placing the required new APIs (e.g., gaussianBlur) on GraphicsContext itself? If that seems out of place on GraphicsContext (most of these operations would be essentially image-to-image, and would ignore most GraphicsContext state), another alternative would be to provide a platform-independant CoreImage-like API for other ports to implement. For example, we could have a class, call it ImageContext, which wraps a CIContext as GraphicsContext wraps a CGContext. Among other things, I think this would allow the Mac port to do efficient CGImage -> CIImage conversions, and also allow other ports to implement the filters using other backends, with only platform-independent code in the platform/graphics/filters directory. For example, the Skia backend has some image processing capabilities, but does not have a distinction between images used for image processing or drawing, so it would be nice to take advantage of that if possible. |