RESOLVED FIXED 219546
Improve vectorization in SincResampler
https://bugs.webkit.org/show_bug.cgi?id=219546
Summary Improve vectorization in SincResampler
Chris Dumez
Reported 2020-12-04 11:29:39 PST
Improve vectorization in SincResampler.
Attachments
Patch (15.46 KB, patch)
2020-12-04 11:42 PST, Chris Dumez
no flags
Patch (15.84 KB, patch)
2020-12-04 12:12 PST, Chris Dumez
no flags
Patch (15.76 KB, patch)
2020-12-04 13:11 PST, Chris Dumez
no flags
Chris Dumez
Comment 1 2020-12-04 11:42:49 PST
Chris Dumez
Comment 2 2020-12-04 12:12:20 PST
Darin Adler
Comment 3 2020-12-04 12:24:01 PST
Comment on attachment 415442 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=415442&action=review How did you test all four versions? > Source/WebCore/platform/audio/SincResampler.cpp:348 > + vDSP_dotpr(inputP, 1, k1, 1, &sum1, kernelSize); > + vDSP_dotpr(inputP, 1, k2, 1, &sum2, kernelSize); Is there a reason that we don’t write a 4 different implementations of a "dot product" function instead of having the algorithm differ a level above that? > Source/WebCore/platform/audio/SincResampler.cpp:351 > + return static_cast<float>((1.0 - kernelInterpolationFactor) * sum1 + kernelInterpolationFactor * sum2); Is it important do do this as double and then convert back to float? I notice we do that in both versions.
Chris Dumez
Comment 4 2020-12-04 12:32:40 PST
(In reply to Darin Adler from comment #3) > Comment on attachment 415442 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=415442&action=review > > How did you test all four versions? I have an Intel Mac so I was able to test ACCELERATE, SSE and slow implementations. The only implementation I could not test was NEON but this implementation is a straight import from Chromium. CPU(ARM_NEON) does not appear to be true on iPhone unless I tested wrong. > > > Source/WebCore/platform/audio/SincResampler.cpp:348 > > + vDSP_dotpr(inputP, 1, k1, 1, &sum1, kernelSize); > > + vDSP_dotpr(inputP, 1, k2, 1, &sum2, kernelSize); > > Is there a reason that we don’t write a 4 different implementations of a > "dot product" function instead of having the algorithm differ a level above > that? I guess we could do that. I was trying to stay in sync with Chromium here. ACCELERATE() is the only implementation that is not present in Chromium but that I chose to add for better performance on Cocoa platforms. > > > Source/WebCore/platform/audio/SincResampler.cpp:351 > > + return static_cast<float>((1.0 - kernelInterpolationFactor) * sum1 + kernelInterpolationFactor * sum2); > > Is it important do do this as double and then convert back to float? I > notice we do that in both versions. Probably not since our vectorized versions only work with floats. I will test and fix.
Chris Dumez
Comment 5 2020-12-04 13:11:27 PST
EWS
Comment 6 2020-12-04 14:21:45 PST
Committed r270457: <https://trac.webkit.org/changeset/270457> All reviewed patches have been landed. Closing bug and clearing flags on attachment 415453 [details].
Radar WebKit Bug Importer
Comment 7 2020-12-04 14:22:19 PST
Pablo Saavedra
Comment 8 2020-12-09 03:45:44 PST
Note You need to log in before you can comment on or make changes to this bug.