WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
UNCONFIRMED
101187
Reduce the memory usage of ConvolverNode
https://bugs.webkit.org/show_bug.cgi?id=101187
Summary
Reduce the memory usage of ConvolverNode
Xingnan Wang
Reported
2012-11-04 23:18:53 PST
As title.
Attachments
Patch
(21.73 KB, patch)
2012-11-04 23:33 PST
,
Xingnan Wang
no flags
Details
Formatted Diff
Diff
Patch
(13.56 KB, patch)
2012-11-08 21:15 PST
,
Xingnan Wang
no flags
Details
Formatted Diff
Diff
Patch
(13.58 KB, patch)
2012-11-12 01:27 PST
,
Xingnan Wang
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Xingnan Wang
Comment 1
2012-11-04 23:33:36 PST
Created
attachment 172275
[details]
Patch
Xingnan Wang
Comment 2
2012-11-04 23:36:11 PST
(In reply to
comment #1
)
> Created an attachment (id=172275) [details] > Patch
From my test, it reduced around half of memory usage. The patch is only for ffmpeg back-end currently.
Build Bot
Comment 3
2012-11-04 23:41:27 PST
Comment on
attachment 172275
[details]
Patch
Attachment 172275
[details]
did not pass mac-ews (mac): Output:
http://queues.webkit.org/results/14719701
Chris Rogers
Comment 4
2012-11-05 10:08:47 PST
The memory saving looks good, but I'm a bit concerned about the CPU hit, any comments about that?
Xingnan Wang
Comment 5
2012-11-05 22:22:38 PST
(In reply to
comment #4
)
> The memory saving looks good, but I'm a bit concerned about the CPU hit, any comments about that?
There are not obvious difference of CPU usage before and after applying my patch from my test. For the dram machine demo(
http://chromium.googlecode.com/svn/trunk/samples/audio/shiny-drum-machine.html
), the CPU hit nearly doesn't change. And the memory usage reduce ~40M, which it should come from ConvolverNode. The patch should not increase the CPU usage for the convolver, because I just replace multiply of the non-interleaved data with multiply of interleaved data and avoid the interleaving/de-interleaving before and after the "multiply()". But for getting the non-interleaving data(from the two functions real() and imag()), it needs to do the in-place shuffle which may consume some more CPU, but it's not in the hot path.
Chris Rogers
Comment 6
2012-11-07 12:13:05 PST
Comment on
attachment 172275
[details]
Patch Xingnan, thanks for looking into reducing memory usage for ConvolverNode. I think there's a potentially much better way to achieve even much higher memory savings and not impact performance. Even if your tests show little performance hit with the interleaving/deinterleaving, there are a range of processor architectures and chip models that this runs on. We'd have to test them all very carefully. Plus we don't want to take *any* performance hit at all. Here's my idea: FFTConvolver has a "m_frame" member variable which could be shared among all similar FFTConvolver objects of the same size (and which are running on the same thread). The process() method could be changed to take an additional argument which is the FFTFrame (shared). It could even be made more fancy where this additional argument is optional, and the FFTConvolver would lazily create its own "m_frame" if one wasn't provided. As an example, you can see where FFTConvolver is used in HRTFPanner which has four of them. Not only could a single FFTFrame be shared for all four, but it could also be shared among other HRTFPanner objects running in that same thread. Similarly, for ConvolverNode, all the ReverbConvolverStage's of the same size (and running on the same thread) can be given a shared FFTFrame. The memory savings this way could be very significant and would help ALL the different FFT back-ends, instead of just the FFmpeg one.
Xingnan Wang
Comment 7
2012-11-08 21:15:33 PST
Created
attachment 173194
[details]
Patch
Xingnan Wang
Comment 8
2012-11-08 21:26:08 PST
(In reply to
comment #7
)
> Created an attachment (id=173194) [details] > Patch
(In reply to
comment #7
)
> Created an attachment (id=173194) [details] > Patch
That`s a good idea to share the FFTFrame, and it really simplifies the problem. Update the patch which covers your idea and removed the non-interleaved multiply.
WebKit Review Bot
Comment 9
2012-11-09 00:42:35 PST
Comment on
attachment 173194
[details]
Patch
Attachment 173194
[details]
did not pass chromium-ews (chromium-xvfb): Output:
http://queues.webkit.org/results/14786062
New failing tests: inspector-protocol/debugger-terminate-dedicated-worker-while-paused.html
Xingnan Wang
Comment 10
2012-11-12 01:27:31 PST
Created
attachment 173586
[details]
Patch
Xingnan Wang
Comment 11
2012-11-12 18:01:03 PST
(In reply to
comment #10
)
> Created an attachment (id=173586) [details] > Patch
Resubmitted the patch, and made sure my patch did not result in last failing test.
Chris Rogers
Comment 12
2012-11-13 17:54:03 PST
Hi Xingnan, sorry this is taking me a little time to look at. Thanks for trying the new approach, I'll try to review as soon as I can...
Xingnan Wang
Comment 13
2012-11-13 18:10:51 PST
(In reply to
comment #12
)
> Hi Xingnan, sorry this is taking me a little time to look at. Thanks for trying the new approach, I'll try to review as soon as I can...
Chris, that doesn't matter, I am not in a hurry.
Anders Carlsson
Comment 14
2014-02-05 11:07:27 PST
Comment on
attachment 173586
[details]
Patch Clearing review flag on patches from before 2014. If this patch is still relevant, please reset the r? flag.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug