Before: 1 blend 2083500 pixel image took 8408 us 2 blend 2083500 pixel image took 8576 us 3 blend 2083500 pixel image took 9006 us 4 blend 2083500 pixel image took 9942 us 5 blend 2083500 pixel image took 9592 us After: 1 blend 2083500 pixel image took 6792 us 2 blend 2083500 pixel image took 6592 us 3 blend 2083500 pixel image took 6896 us 4 blend 2083500 pixel image took 8368 us 5 blend 2083500 pixel image took 7974 us 20% speedup
<rdar://problem/10215221>
Created attachment 109310 [details] patch
(In reply to comment #2) > Created an attachment (id=109310) [details] > patch + caching bbox autofill :-\
Attachment 109310 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1 Source/WebCore/platform/graphics/filters/FEBlend.cpp:124: A case label should not be indented, but line up with its switch statement. [whitespace/indent] [4] Total errors found: 1 in 2 files If any of these errors are false positives, please file a bug against check-webkit-style.
Created attachment 109312 [details] style
Comment on attachment 109312 [details] style Putting the switch inside the loop speeds things up by 20%. Could you speed up even more by putting the switch outside the loop? With template functions you could even inline the functions and have the switch statement outside the entire apply function. How much more would that speed things up?
(In reply to comment #6) > (From update of attachment 109312 [details]) > Putting the switch inside the loop speeds things up by 20%. Could you speed up even more by putting the switch outside the loop? With template functions you could even inline the functions and have the switch statement outside the entire apply function. How much more would that speed things up? Probably not much thanks to the advanced branch predictors. But templates probably could be an easy way to do this refactor, and we can see what happens.
Comment on attachment 109312 [details] style Clearing flags on attachment: 109312 Committed r96407: <http://trac.webkit.org/changeset/96407>
All reviewed patches have been landed. Closing bug.
(In reply to comment #7) > (In reply to comment #6) > > (From update of attachment 109312 [details] [details]) > > Putting the switch inside the loop speeds things up by 20%. Could you speed up even more by putting the switch outside the loop? With template functions you could even inline the functions and have the switch statement outside the entire apply function. How much more would that speed things up? > > Probably not much thanks to the advanced branch predictors. But templates probably could be an easy way to do this refactor, and we can see what happens. Seems to be a very small change, if at all; I guess I got it to the point where the compiler could do the rest of the work and hoist the switch out of the loop. I'm testing with release builds; I bet your proposed optimizations would show a bigger change in a debug build.