Bug 74592 - Optimize AudioBufferSourceNode process by avoiding interpolation when pitchRate==1
Summary: Optimize AudioBufferSourceNode process by avoiding interpolation when pitchRa...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Audio (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Wei James (wistoch)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-12-15 00:09 PST by Wei James (wistoch)
Modified: 2012-01-03 17:28 PST (History)
6 users (show)

See Also:


Attachments
Patch (8.71 KB, patch)
2011-12-15 00:19 PST, Wei James (wistoch)
no flags Details | Formatted Diff | Diff
Patch (8.70 KB, patch)
2011-12-15 00:52 PST, Wei James (wistoch)
no flags Details | Formatted Diff | Diff
Patch (7.97 KB, patch)
2011-12-20 18:51 PST, Wei James (wistoch)
no flags Details | Formatted Diff | Diff
Patch (8.04 KB, patch)
2011-12-21 19:00 PST, Wei James (wistoch)
no flags Details | Formatted Diff | Diff
Patch (8.75 KB, patch)
2011-12-22 22:10 PST, Wei James (wistoch)
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Wei James (wistoch) 2011-12-15 00:09:27 PST
Optimize AudioBufferSourceNode process by avoiding interpolation when pitchRate==1
Comment 1 Wei James (wistoch) 2011-12-15 00:19:45 PST
Created attachment 119389 [details]
Patch
Comment 2 Wei James (wistoch) 2011-12-15 00:52:01 PST
Created attachment 119391 [details]
Patch
Comment 3 Raymond Toy 2011-12-16 10:25:03 PST
Comment on attachment 119391 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=119391&action=review

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:255
> +    if (!needInterpolation) {

needInterpolation isn't used anywhere else, so remove it, and change to if (pitchRate == 1).

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:258
> +        virtualReadIndex += framesToProcess;

virtualReadIndex isn't used in the if block below.  It would be nice to move this down near line 298 where it's actually used.

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:261
> +            memcpy(destinationL, sourceL + readIndex, sizeof(float) * framesToProcess);

I think sizeof(*sourceL) or sizeof(*destinationL) is better than sizeof(float).  Same applies to other memcpy places below.

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:273
> +            framesToProcess -= framesToEnd;

Maybe add ASSERT(framesToProcess >= 0) here?  Currently, this is always true, but this is an important invariant for the memcpy's below so we don't suddenly copy a huge number of samples to the destination.

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:277
> +                for (unsigned k = 0; k < rounds; ++k) {

Why not just use a while loop:

while (framesToProcess > 0) {
 ...
|

instead of the for loop with rounds?

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:279
> +                    destinationL += deltaFrames;

Is this right?  sourceL and startFrame are not modified in the for loop so we keep copying the same data from sourceL+startFrame to consecutive places in destinationL.

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:298
>          if (virtualReadIndex >= endFrame) {

Maybe move this into the else case of the if above (place near line 295)?  I think this condition can only happen in that case.

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:300
> +            virtualReadIndex -= ((virtualReadIndex - endFrame) / deltaFrames + 1) * deltaFrames;

Is this correct?  If virtualReadIndex - endFrame = 1, we get (1/deltaFrames + 1)*deltaFrames = deltaFrames + 1 (because virtualReadIndex is a float and ignoring roundoff).  I think we want to subtract just deltaFrames in this case.
Comment 4 Wei James (wistoch) 2011-12-17 01:39:58 PST
Comment on attachment 119391 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=119391&action=review

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:255
>> +    if (!needInterpolation) {
> 
> needInterpolation isn't used anywhere else, so remove it, and change to if (pitchRate == 1).

got it. thanks.

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:258
>> +        virtualReadIndex += framesToProcess;
> 
> virtualReadIndex isn't used in the if block below.  It would be nice to move this down near line 298 where it's actually used.

thanks. will modify it.

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:261
>> +            memcpy(destinationL, sourceL + readIndex, sizeof(float) * framesToProcess);
> 
> I think sizeof(*sourceL) or sizeof(*destinationL) is better than sizeof(float).  Same applies to other memcpy places below.

ok.

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:273
>> +            framesToProcess -= framesToEnd;
> 
> Maybe add ASSERT(framesToProcess >= 0) here?  Currently, this is always true, but this is an important invariant for the memcpy's below so we don't suddenly copy a huge number of samples to the destination.

got it.

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:277
>> +                for (unsigned k = 0; k < rounds; ++k) {
> 
> Why not just use a while loop:
> 
> while (framesToProcess > 0) {
>  ...
> |
> 
> instead of the for loop with rounds?

if the deltaFrames is small enough. we have to copy them repeatly. it will be more than once.

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:279
>> +                    destinationL += deltaFrames;
> 
> Is this right?  sourceL and startFrame are not modified in the for loop so we keep copying the same data from sourceL+startFrame to consecutive places in destinationL.

it is for the case that the deltaFrame is small enough. We have to copy the frames frome startFrame to endFrame more than once. we can not just wrap around once to get enought data. 

correct me if I misunderstanding the underlying algorithm.

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:298
>>          if (virtualReadIndex >= endFrame) {
> 
> Maybe move this into the else case of the if above (place near line 295)?  I think this condition can only happen in that case.

the case virtualReadIndex == endFrame will hit both blocks. in this case, we also need to finish the playing. correct me if I am wrong. thanks

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:300
>> +            virtualReadIndex -= ((virtualReadIndex - endFrame) / deltaFrames + 1) * deltaFrames;
> 
> Is this correct?  If virtualReadIndex - endFrame = 1, we get (1/deltaFrames + 1)*deltaFrames = deltaFrames + 1 (because virtualReadIndex is a float and ignoring roundoff).  I think we want to subtract just deltaFrames in this case.

the reason for using this statement is to resolve the issue that deltaFrame may be very small. Substracting deltaFrames once might still exceed endFrame. 

I will check the math again for float issue. thanks
Comment 5 Raymond Toy 2011-12-19 09:17:12 PST
Comment on attachment 119391 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=119391&action=review

>>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:277
>>> +                for (unsigned k = 0; k < rounds; ++k) {
>> 
>> Why not just use a while loop:
>> 
>> while (framesToProcess > 0) {
>>  ...
>> |
>> 
>> instead of the for loop with rounds?
> 
> if the deltaFrames is small enough. we have to copy them repeatly. it will be more than once.

Agreed.  I missed that case.

>>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:279
>>> +                    destinationL += deltaFrames;
>> 
>> Is this right?  sourceL and startFrame are not modified in the for loop so we keep copying the same data from sourceL+startFrame to consecutive places in destinationL.
> 
> it is for the case that the deltaFrame is small enough. We have to copy the frames frome startFrame to endFrame more than once. we can not just wrap around once to get enought data. 
> 
> correct me if I misunderstanding the underlying algorithm.

You are right.

>>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:300
>>> +            virtualReadIndex -= ((virtualReadIndex - endFrame) / deltaFrames + 1) * deltaFrames;
>> 
>> Is this correct?  If virtualReadIndex - endFrame = 1, we get (1/deltaFrames + 1)*deltaFrames = deltaFrames + 1 (because virtualReadIndex is a float and ignoring roundoff).  I think we want to subtract just deltaFrames in this case.
> 
> the reason for using this statement is to resolve the issue that deltaFrame may be very small. Substracting deltaFrames once might still exceed endFrame. 
> 
> I will check the math again for float issue. thanks

I didn't mean that we should always just subtract deltaFrames; just deltaFrames in this example.

Perhaps you meant to truncate (virtualReadIndex - endFrame)/deltaFrames to an integer first before adding one?  This will tell us how many deltaFrames past the endFrame that have been processed.
Comment 6 Chris Rogers 2011-12-19 16:21:14 PST
Comment on attachment 119391 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=119391&action=review

Hi James, it'll be good to get this optimization.  I'm sure we'll hit this case quite a bit (maybe majority of the time).

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:254
> +    bool needInterpolation = (pitchRate != 1);

The entire codeblock within "if (!needInterpolation) {" can be significantly simplified I believe.

In the current patch, there are 4 separate memcpy() sections (line 261, 266, 278, 286).  It should be possible to reduce this down to a single memcpy() section with the appropriate use of a "while()" loop, and by getting rid of the "for()" loop:

                unsigned rounds = framesToProcess / deltaFrames;
                for (unsigned k = 0; k < rounds; ++k) {

For example, you could use a "framesRemaining" variable which is initialized to "framesToProcess":

int framesRemaining = framesToProcess;
while (framesRemaining > ) {
    int framesThisTime = ....;
    memcpy() stuff...
    framesRemaining -= framesThisTime;
}

You'll have to add appropriate checking for end of buffer and looping, but I'm quite sure the whole codeblock can be a lot simpler than it currently is...

>>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:255
>>> +    if (!needInterpolation) {
>> 
>> needInterpolation isn't used anywhere else, so remove it, and change to if (pitchRate == 1).
> 
> got it. thanks.

We can't *just* check (pitchRate == 1).  We also need to check that virtualReadIndex is an integer.  I think in the normal case where this optimization will be used, we're talking about simply loading up a buffer at the default playback rate (of 1) and playing it with noteOn().  In this case virtualReadIndex will be an integer.  But, if we're in the middle of fiddling with the playbackRate (tweaking it with a slider to change pitch), then we could hit a case where playback rate is (temporarily) 1, but the virtualReadIndex is at a sub-sample position.  We don't want to do a simple memcpy then.

So please check both that (pitchRate == 1 ) and that virtualReadIndex is an integer (has no fractional portion).
Comment 7 Wei James (wistoch) 2011-12-20 18:51:33 PST
Created attachment 120134 [details]
Patch
Comment 8 Wei James (wistoch) 2011-12-20 18:59:41 PST
Comment on attachment 119391 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=119391&action=review

thanks for the comments. I have updated the patch according to your feedback. please help to review. thanks

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:254
>> +    bool needInterpolation = (pitchRate != 1);
> 
> The entire codeblock within "if (!needInterpolation) {" can be significantly simplified I believe.
> 
> In the current patch, there are 4 separate memcpy() sections (line 261, 266, 278, 286).  It should be possible to reduce this down to a single memcpy() section with the appropriate use of a "while()" loop, and by getting rid of the "for()" loop:
> 
>                 unsigned rounds = framesToProcess / deltaFrames;
>                 for (unsigned k = 0; k < rounds; ++k) {
> 
> For example, you could use a "framesRemaining" variable which is initialized to "framesToProcess":
> 
> int framesRemaining = framesToProcess;
> while (framesRemaining > ) {
>     int framesThisTime = ....;
>     memcpy() stuff...
>     framesRemaining -= framesThisTime;
> }
> 
> You'll have to add appropriate checking for end of buffer and looping, but I'm quite sure the whole codeblock can be a lot simpler than it currently is...

roger. thanks for the feedback. I have updated the patch according to your comments.

>>>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:255
>>>> +    if (!needInterpolation) {
>>> 
>>> needInterpolation isn't used anywhere else, so remove it, and change to if (pitchRate == 1).
>> 
>> got it. thanks.
> 
> We can't *just* check (pitchRate == 1).  We also need to check that virtualReadIndex is an integer.  I think in the normal case where this optimization will be used, we're talking about simply loading up a buffer at the default playback rate (of 1) and playing it with noteOn().  In this case virtualReadIndex will be an integer.  But, if we're in the middle of fiddling with the playbackRate (tweaking it with a slider to change pitch), then we could hit a case where playback rate is (temporarily) 1, but the virtualReadIndex is at a sub-sample position.  We don't want to do a simple memcpy then.
> 
> So please check both that (pitchRate == 1 ) and that virtualReadIndex is an integer (has no fractional portion).

thanks. added.
Comment 9 Raymond Toy 2011-12-21 09:55:28 PST
Comment on attachment 120134 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=120134&action=review

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:258
> +            int framesThisTime = framesToProcess < framesToEnd ? framesToProcess : framesToEnd;

Could we say min(framesToProcess, framesToEnd)?

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:260
> +            ASSERT(framesThisTime >= 0);

Maybe it's enough to say framesThisTime = max(0, framesThisTime).  I think we're still guaranteed to make progress because this implies framesToEnd <= 0 which means readIndex >= endFrame, and the code at line 273 will make sure we exit the loop or make readIndex eventually less than endFrame.
Comment 10 Eric Seidel (no email) 2011-12-21 11:42:09 PST
Comment on attachment 120134 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=120134&action=review

Once crogers/other audio folk are OK with this I'm happy to pull the r+ trigger.

> Source/WebCore/ChangeLog:8
> +        No new tests. (OOPS!)

Can we test this?

> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:295
> +        while (framesToProcess--) {

This is a horribly long loop/function.  would be better to break this up into static inline helper functions in a second pass.
Comment 11 Wei James (wistoch) 2011-12-21 16:45:25 PST
Comment on attachment 120134 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=120134&action=review

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:258
>> +            int framesThisTime = framesToProcess < framesToEnd ? framesToProcess : framesToEnd;
> 
> Could we say min(framesToProcess, framesToEnd)?

ok. will update it. thanks

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:260
>> +            ASSERT(framesThisTime >= 0);
> 
> Maybe it's enough to say framesThisTime = max(0, framesThisTime).  I think we're still guaranteed to make progress because this implies framesToEnd <= 0 which means readIndex >= endFrame, and the code at line 273 will make sure we exit the loop or make readIndex eventually less than endFrame.

got it. thanks

>> Source/WebCore/webaudio/AudioBufferSourceNode.cpp:295
>> +        while (framesToProcess--) {
> 
> This is a horribly long loop/function.  would be better to break this up into static inline helper functions in a second pass.

this block is from the original source code. I didn't touch it. maybe we can optimize it later. thanks
Comment 12 Wei James (wistoch) 2011-12-21 19:00:41 PST
Created attachment 120256 [details]
Patch
Comment 13 Raymond Toy 2011-12-22 08:53:32 PST
Looks good.
Comment 14 Chris Rogers 2011-12-22 17:31:30 PST
Hi James, sorry it's taken a little while to really look at this.  I've been swamped :)

AudioBufferSourceNode is at the heart of many of the layout tests, so is getting pretty heavy testing Your patch is passing the existing layout tests (especially in webaudio/audiobuffersource-playbackrate.html).  So it shouldn't be necessary to write another layout test.  So in the ChangeLog, I would change:

No new tests. (OOPS!)

to:

Covered by existing webaudio layout tests, especially webaudio/audiobuffersource-playbackrate.html


I also applied and ran many web audio pages using your patch, so I'm confident it's ok.

But I agree with Eric's comment about the method being very long.  There's a very long code section which is copied/pasted: 
The following identical code exists in two different places:


if (!loop()) {
    // If we're not looping, then stop playing when we get to the end.
    m_isPlaying = false;

    if (framesToProcess > 0) {
        // We're not looping and we've reached the end of the sample data, but we still need to provide more output,
        // so generate silence for the remaining.
        memset(destinationL, 0, sizeof(float) * framesToProcess);

        if (isStereo)
            memset(destinationR, 0, sizeof(float) * framesToProcess);
    }

    finish();
    break;
}


Can we please create a separate method (possibly inline if you think that matters) something like this:

// Returns true if we're finished.
bool AudioBufferSourceNode::renderSilenceAndFinishIfNotLooping(float* destinationL, float* destinationR, size_t framesToProcess)
{
    if (!loop()) {
        // If we're not looping, then stop playing when we get to the end.
        m_isPlaying = false;

        if (framesToProcess > 0) {
            // We're not looping and we've reached the end of the sample data, but we still need to provide more output,
            // so generate silence for the remaining.
            memset(destinationL, 0, sizeof(float) * framesToProcess);

            if (destinationR)
                memset(destinationR, 0, sizeof(float) * framesToProcess);
        }

        finish();
        return true;
    }
    return false;
}


And then in the two places where it's used, something like:

    if (renderSilenceAndFinishIfNotLooping(destinationL, destinationR, framesToProcess))
        break;
Comment 15 Wei James (wistoch) 2011-12-22 22:10:34 PST
Created attachment 120435 [details]
Patch
Comment 16 Wei James (wistoch) 2011-12-22 22:12:58 PST
(In reply to comment #14)
> Hi James, sorry it's taken a little while to really look at this.  I've been swamped :)
> 
> AudioBufferSourceNode is at the heart of many of the layout tests, so is getting pretty heavy testing Your patch is passing the existing layout tests (especially in webaudio/audiobuffersource-playbackrate.html).  So it shouldn't be necessary to write another layout test.  So in the ChangeLog, I would change:
> 
> No new tests. (OOPS!)
> 
> to:
> 
> Covered by existing webaudio layout tests, especially webaudio/audiobuffersource-playbackrate.html
> 
> 
> I also applied and ran many web audio pages using your patch, so I'm confident it's ok.
> 
> But I agree with Eric's comment about the method being very long.  There's a very long code section which is copied/pasted: 
> The following identical code exists in two different places:
> 
> 
> if (!loop()) {
>     // If we're not looping, then stop playing when we get to the end.
>     m_isPlaying = false;
> 
>     if (framesToProcess > 0) {
>         // We're not looping and we've reached the end of the sample data, but we still need to provide more output,
>         // so generate silence for the remaining.
>         memset(destinationL, 0, sizeof(float) * framesToProcess);
> 
>         if (isStereo)
>             memset(destinationR, 0, sizeof(float) * framesToProcess);
>     }
> 
>     finish();
>     break;
> }
> 
> 
> Can we please create a separate method (possibly inline if you think that matters) something like this:
> 
> // Returns true if we're finished.
> bool AudioBufferSourceNode::renderSilenceAndFinishIfNotLooping(float* destinationL, float* destinationR, size_t framesToProcess)
> {
>     if (!loop()) {
>         // If we're not looping, then stop playing when we get to the end.
>         m_isPlaying = false;
> 
>         if (framesToProcess > 0) {
>             // We're not looping and we've reached the end of the sample data, but we still need to provide more output,
>             // so generate silence for the remaining.
>             memset(destinationL, 0, sizeof(float) * framesToProcess);
> 
>             if (destinationR)
>                 memset(destinationR, 0, sizeof(float) * framesToProcess);
>         }
> 
>         finish();
>         return true;
>     }
>     return false;
> }
> 
> 
> And then in the two places where it's used, something like:
> 
>     if (renderSilenceAndFinishIfNotLooping(destinationL, destinationR, framesToProcess))
>         break;

thanks. I have updated the patch based on your comments. thanks
Comment 17 Kenneth Russell 2012-01-03 13:22:31 PST
If Chris gives an unofficial LGTM on this I'll be happy to r+ it.
Comment 18 Chris Rogers 2012-01-03 14:35:13 PST
Looks good.  Thanks James!
Comment 19 Kenneth Russell 2012-01-03 15:16:28 PST
Comment on attachment 120435 [details]
Patch

rs=me
Comment 20 WebKit Review Bot 2012-01-03 17:28:41 PST
Comment on attachment 120435 [details]
Patch

Clearing flags on attachment: 120435

Committed r103994: <http://trac.webkit.org/changeset/103994>
Comment 21 WebKit Review Bot 2012-01-03 17:28:46 PST
All reviewed patches have been landed.  Closing bug.