Bug 73182 - Need SSE optimization for functions vfmul and vadd
Summary: Need SSE optimization for functions vfmul and vadd
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Web Audio (show other bugs)
Version: 528+ (Nightly build)
Hardware: All Unspecified
: P2 Normal
Assignee: Xingnan Wang
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-11-27 19:19 PST by Xingnan Wang
Modified: 2011-12-02 18:44 PST (History)
5 users (show)

See Also:


Attachments
Implement the SSE optimization for vsmul and vadd (5.68 KB, patch)
2011-11-27 21:22 PST, Xingnan Wang
no flags Details | Formatted Diff | Diff
Update the patch (Implement the SSE optimization for vsmul and vadd) (6.89 KB, patch)
2011-11-29 00:29 PST, Xingnan Wang
sam: review-
Details | Formatted Diff | Diff
Update the patch (fix code style issue and pull the if in vsmul out of loop) (7.14 KB, patch)
2011-11-29 22:21 PST, Xingnan Wang
no flags Details | Formatted Diff | Diff
Update the patch (Fix comment issues) (7.16 KB, patch)
2011-11-30 17:31 PST, Xingnan Wang
kbr: review-
Details | Formatted Diff | Diff
Patch (7.34 KB, patch)
2011-12-02 00:39 PST, Wei James (wistoch)
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Xingnan Wang 2011-11-27 19:19:15 PST
The VectorMath::vfmul() and VectorMath::vadd() are widely used in Web Audio process of which performance will enhance by the SSE optimization for the two function.
Comment 1 Xingnan Wang 2011-11-27 21:22:45 PST
Created attachment 116695 [details]
Implement the SSE optimization for vsmul and vadd
Comment 2 Xingnan Wang 2011-11-27 21:34:06 PST
As test from our side we get about 2x improvement on vadd and 2.5x on vsmul, also the performance of processing in ConvolverNode can be 10%~15% increasing.

We test in 
Intel(R) Core(TM) i7 CPU 870  @ 2.93GHz
Ubuntu 11.10
Comment 3 Raymond Toy 2011-11-28 12:15:15 PST
Thanks for your patch.  This looks good, except a few items.

First, there are a few style issues that need to be fixed.  (I can fix those.)

Second, in vsmul, you have an if statement inside while loop.  Can the if statements be lifted out so you get something like:

 if (((size_t)destP & 0x0F) != 0) {
    while (group--) {
      dest = _mm_mul_ps(*pSource, scale_);
      _mm_storeu_ps(destP, dest);
      sourceP += 4;
      destP += 4;
    }
  } else {
    while (group--) {
      pDest = reinterpret_cast<__m128*>(destP);
      *pDest = _mm_mul_ps(*pSource, scale_);
      sourceP += 4;
      destP += 4;
    }
  }

Same comment about the while/if in vadd, although that's a bit more complicated since source2P and destP may or may not be aligned.
Comment 4 Wei James (wistoch) 2011-11-28 16:06:03 PST
(In reply to comment #3)
> Thanks for your patch.  This looks good, except a few items.
> 
> First, there are a few style issues that need to be fixed.  (I can fix those.)
> 
> Second, in vsmul, you have an if statement inside while loop.  Can the if statements be lifted out so you get something like:
> 
>  if (((size_t)destP & 0x0F) != 0) {
>     while (group--) {
>       dest = _mm_mul_ps(*pSource, scale_);
>       _mm_storeu_ps(destP, dest);
>       sourceP += 4;
>       destP += 4;
>     }
>   } else {
>     while (group--) {
>       pDest = reinterpret_cast<__m128*>(destP);
>       *pDest = _mm_mul_ps(*pSource, scale_);
>       sourceP += 4;
>       destP += 4;
>     }
>   }
> 
> Same comment about the while/if in vadd, although that's a bit more complicated since source2P and destP may or may not be aligned.

Raymond, thanks for your comment. 

For the if statement inside while loop, we have thought of it and tested the performance of lifting the if statement out and found the performance difference is minor and the source code is much more complicated. So we decided to leave it be. 

Do you think it is necessary to lift it out? thanks
Comment 5 Raymond Toy 2011-11-28 16:43:19 PST
(In reply to comment #4)
> (In reply to comment #3)
> > Thanks for your patch.  This looks good, except a few items.
> > 
> > First, there are a few style issues that need to be fixed.  (I can fix those.)
> > 
> > Second, in vsmul, you have an if statement inside while loop.  Can the if statements be lifted out so you get something like:
> > 
> >  if (((size_t)destP & 0x0F) != 0) {
> >     while (group--) {
> >       dest = _mm_mul_ps(*pSource, scale_);
> >       _mm_storeu_ps(destP, dest);
> >       sourceP += 4;
> >       destP += 4;
> >     }
> >   } else {
> >     while (group--) {
> >       pDest = reinterpret_cast<__m128*>(destP);
> >       *pDest = _mm_mul_ps(*pSource, scale_);
> >       sourceP += 4;
> >       destP += 4;
> >     }
> >   }
> > 
> > Same comment about the while/if in vadd, although that's a bit more complicated since source2P and destP may or may not be aligned.
> 
> Raymond, thanks for your comment. 
> 
> For the if statement inside while loop, we have thought of it and tested the performance of lifting the if statement out and found the performance difference is minor and the source code is much more complicated. So we decided to leave it be. 
> 
> Do you think it is necessary to lift it out? thanks

How much difference did it make?  I don't think it's necessary to lift it out if the difference is minor.  A comment in the code saying that lifting it out doesn't help would be very nice, though, so that no one else will wonder if it helps or not.
Comment 6 Wei James (wistoch) 2011-11-28 16:47:05 PST
(In reply to comment #5)
> (In reply to comment #4)
> > (In reply to comment #3)
> > > Thanks for your patch.  This looks good, except a few items.
> > > 
> > > First, there are a few style issues that need to be fixed.  (I can fix those.)
> > > 
> > > Second, in vsmul, you have an if statement inside while loop.  Can the if statements be lifted out so you get something like:
> > > 
> > >  if (((size_t)destP & 0x0F) != 0) {
> > >     while (group--) {
> > >       dest = _mm_mul_ps(*pSource, scale_);
> > >       _mm_storeu_ps(destP, dest);
> > >       sourceP += 4;
> > >       destP += 4;
> > >     }
> > >   } else {
> > >     while (group--) {
> > >       pDest = reinterpret_cast<__m128*>(destP);
> > >       *pDest = _mm_mul_ps(*pSource, scale_);
> > >       sourceP += 4;
> > >       destP += 4;
> > >     }
> > >   }
> > > 
> > > Same comment about the while/if in vadd, although that's a bit more complicated since source2P and destP may or may not be aligned.
> > 
> > Raymond, thanks for your comment. 
> > 
> > For the if statement inside while loop, we have thought of it and tested the performance of lifting the if statement out and found the performance difference is minor and the source code is much more complicated. So we decided to leave it be. 
> > 
> > Do you think it is necessary to lift it out? thanks
> 
> How much difference did it make?  I don't think it's necessary to lift it out if the difference is minor.  A comment in the code saying that lifting it out doesn't help would be very nice, though, so that no one else will wonder if it helps or not.

good suggestion. we will test it again and add a comment in the code with the test result. 

thanks
Comment 7 Xingnan Wang 2011-11-29 00:29:30 PST
Created attachment 116909 [details]
Update the patch (Implement the SSE optimization for vsmul and vadd)
Comment 8 Xingnan Wang 2011-11-29 00:37:25 PST
(In reply to comment #7)
> Created an attachment (id=116909) [details]
> Update the patch (Implement the SSE optimization for vsmul and vadd)

Hi Raymond,
    We tested again by moving the if statement out of the loop in vadd and vsmul then we got about 10%~20% improvement in vadd compared with no lifting it outside, but the vsmul`s performance increased less than 1%, so we leave the if inside in vsmul. 

    By the way could you point out the style issues in our code, that should be very helpful for us, thank you.
Comment 9 Raymond Toy 2011-11-29 09:15:12 PST
(In reply to comment #8)
> (In reply to comment #7)
> > Created an attachment (id=116909) [details] [details]
> > Update the patch (Implement the SSE optimization for vsmul and vadd)
> 
> Hi Raymond,
>     We tested again by moving the if statement out of the loop in vadd and vsmul then we got about 10%~20% improvement in vadd compared with no lifting it outside, but the vsmul`s performance increased less than 1%, so we leave the if inside in vsmul. 
> 
>     By the way could you point out the style issues in our code, that should be very helpful for us, thank you.

What exactly do you mean by 10-20% improvement?  Does that mean that if the original non-SSE code took 1 sec, the first SSE code now takes .7 sec (you said 30% improvement previously), and now the new SSE code takes .6 sec (10 percentage points more) or now takes .63 sec (.7 sec - 10%)?  Just want to know what it means.  I agree with you that vsmul should be left as is and vadd should lift the code out.

The style guide is here:  http://www.webkit.org/coding/coding-style.html, if you haven't seen it.  The easiest way to see the style issues is to run Tools/Scripts/check-webkit-style while in the third_party/WebKit directory.  If that doesn't work, I'll manually go and point out the issues in the review.

Note that I do not have commit privileges, so the code will need to be reviewed again by someone who does.

But thanks for doing the test and updating the code!  It's a nice improvement.
Comment 10 Sam Weinig 2011-11-29 09:53:07 PST
Comment on attachment 116909 [details]
Update the patch (Implement the SSE optimization for vsmul and vadd)

View in context: https://bugs.webkit.org/attachment.cgi?id=116909&action=review

> Source/WebCore/platform/audio/VectorMath.cpp:76
> +        // If the sourceP address does not 16-byte align, the first several frames(at most three) should be processed seperately.

Missing space before (.

> Source/WebCore/platform/audio/VectorMath.cpp:77
> +        while ((((size_t)sourceP & 0x0F) != 0) && n) {

Please use static_cast instead of C-style cast.

> Source/WebCore/platform/audio/VectorMath.cpp:87
> +        __m128 *pSource, *pDest;

We prefer one declaration per line and the * goes next to the type.

> Source/WebCore/platform/audio/VectorMath.cpp:91
> +            pSource = reinterpret_cast<__m128*>((float*)sourceP);

Please use static_cast instead of C-style cast.

> Source/WebCore/platform/audio/VectorMath.cpp:93
> +            // There is less than 1% performance enhancement when pulling the if statement out of the loop, so leave it here.

It is not clear why you wouldn't want the 1% performance gain here.

> Source/WebCore/platform/audio/VectorMath.cpp:94
> +            if (((size_t)destP & 0x0F) != 0) {

Please use static_cast instead of C-style cast.

> Source/WebCore/platform/audio/VectorMath.cpp:136
> +        while ((((size_t)source1P & 0x0F) != 0) && n) {

Please use static_cast instead of C-style cast.

> Source/WebCore/platform/audio/VectorMath.cpp:147
> +        __m128 * pSource1, *pSource2, *pDest;
> +        __m128 source2, dest;

We prefer one declaration per line and the * goes next to the type.

> Source/WebCore/platform/audio/VectorMath.cpp:150
> +        bool source2Aligned = !((size_t)source2P & 0x0F);
> +        bool destAligned = !((size_t)destP & 0x0F);

Please use static_cast instead of C-style cast.

> Source/WebCore/platform/audio/VectorMath.cpp:155
> +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);
> +                pSource2 = reinterpret_cast<__m128*>((float*)source2P);

Please use static_cast instead of C-style cast.

> Source/WebCore/platform/audio/VectorMath.cpp:167
> +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);
> +                pSource2 = reinterpret_cast<__m128*>((float*)source2P);

Please use static_cast instead of C-style cast.

> Source/WebCore/platform/audio/VectorMath.cpp:178
> +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);

Please use static_cast instead of C-style cast.

> Source/WebCore/platform/audio/VectorMath.cpp:188
> +        } else if (!source2Aligned && !destAligned) // both source2 and dest not aligned 
> +        {

{ should go on the previous line.

> Source/WebCore/platform/audio/VectorMath.cpp:191
> +                source2 = _mm_loadu_ps(source2P);

Please use static_cast instead of C-style cast.
Comment 11 Wei James (wistoch) 2011-11-29 17:12:53 PST
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > Created an attachment (id=116909) [details] [details] [details]
> > > Update the patch (Implement the SSE optimization for vsmul and vadd)
> > 
> > Hi Raymond,
> >     We tested again by moving the if statement out of the loop in vadd and vsmul then we got about 10%~20% improvement in vadd compared with no lifting it outside, but the vsmul`s performance increased less than 1%, so we leave the if inside in vsmul. 
> > 
> >     By the way could you point out the style issues in our code, that should be very helpful for us, thank you.
> 
> What exactly do you mean by 10-20% improvement?  Does that mean that if the original non-SSE code took 1 sec, the first SSE code now takes .7 sec (you said 30% improvement previously), and now the new SSE code takes .6 sec (10 percentage points more) or now takes .63 sec (.7 sec - 10%)?  Just want to know what it means.  I agree with you that vsmul should be left as is and vadd should lift the code out.
> 
> The style guide is here:  http://www.webkit.org/coding/coding-style.html, if you haven't seen it.  The easiest way to see the style issues is to run Tools/Scripts/check-webkit-style while in the third_party/WebKit directory.  If that doesn't work, I'll manually go and point out the issues in the review.
> 
> Note that I do not have commit privileges, so the code will need to be reviewed again by someone who does.
> 
> But thanks for doing the test and updating the code!  It's a nice improvement.

thanks for the information, we will run the script and update the patch again to fix the style issues. 

for the performance detail, for example, if the time for non-SSE code vadd and vsmul are both 100, then the time for first SSE version are 50 and 40. And the time for second version SSE are 44 and 39.8 or so.
Comment 12 Wei James (wistoch) 2011-11-29 17:35:01 PST
(In reply to comment #10)
> (From update of attachment 116909 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=116909&action=review
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:76
> > +        // If the sourceP address does not 16-byte align, the first several frames(at most three) should be processed seperately.
> 
> Missing space before (.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:77
> > +        while ((((size_t)sourceP & 0x0F) != 0) && n) {
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:87
> > +        __m128 *pSource, *pDest;
> 
> We prefer one declaration per line and the * goes next to the type.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:91
> > +            pSource = reinterpret_cast<__m128*>((float*)sourceP);
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:93
> > +            // There is less than 1% performance enhancement when pulling the if statement out of the loop, so leave it here.
> 
> It is not clear why you wouldn't want the 1% performance gain here.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:94
> > +            if (((size_t)destP & 0x0F) != 0) {
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:136
> > +        while ((((size_t)source1P & 0x0F) != 0) && n) {
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:147
> > +        __m128 * pSource1, *pSource2, *pDest;
> > +        __m128 source2, dest;
> 
> We prefer one declaration per line and the * goes next to the type.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:150
> > +        bool source2Aligned = !((size_t)source2P & 0x0F);
> > +        bool destAligned = !((size_t)destP & 0x0F);
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:155
> > +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);
> > +                pSource2 = reinterpret_cast<__m128*>((float*)source2P);
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:167
> > +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);
> > +                pSource2 = reinterpret_cast<__m128*>((float*)source2P);
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:178
> > +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:188
> > +        } else if (!source2Aligned && !destAligned) // both source2 and dest not aligned 
> > +        {
> 
> { should go on the previous line.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:191
> > +                source2 = _mm_loadu_ps(source2P);
> 
> Please use static_cast instead of C-style cast.

thanks for the detailed explanation, we will fix the style issues. thanks
Comment 13 Xingnan Wang 2011-11-29 22:21:26 PST
Created attachment 117125 [details]
Update the patch (fix code style issue and pull the if in vsmul out of loop)
Comment 14 Xingnan Wang 2011-11-29 22:22:49 PST
(In reply to comment #10)
> (From update of attachment 116909 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=116909&action=review
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:76
> > +        // If the sourceP address does not 16-byte align, the first several frames(at most three) should be processed seperately.
> 
> Missing space before (.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:77
> > +        while ((((size_t)sourceP & 0x0F) != 0) && n) {
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:87
> > +        __m128 *pSource, *pDest;
> 
> We prefer one declaration per line and the * goes next to the type.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:91
> > +            pSource = reinterpret_cast<__m128*>((float*)sourceP);
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:93
> > +            // There is less than 1% performance enhancement when pulling the if statement out of the loop, so leave it here.
> 
> It is not clear why you wouldn't want the 1% performance gain here.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:94
> > +            if (((size_t)destP & 0x0F) != 0) {
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:136
> > +        while ((((size_t)source1P & 0x0F) != 0) && n) {
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:147
> > +        __m128 * pSource1, *pSource2, *pDest;
> > +        __m128 source2, dest;
> 
> We prefer one declaration per line and the * goes next to the type.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:150
> > +        bool source2Aligned = !((size_t)source2P & 0x0F);
> > +        bool destAligned = !((size_t)destP & 0x0F);
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:155
> > +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);
> > +                pSource2 = reinterpret_cast<__m128*>((float*)source2P);
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:167
> > +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);
> > +                pSource2 = reinterpret_cast<__m128*>((float*)source2P);
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:178
> > +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);
> 
> Please use static_cast instead of C-style cast.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:188
> > +        } else if (!source2Aligned && !destAligned) // both source2 and dest not aligned 
> > +        {
> 
> { should go on the previous line.
> 
> > Source/WebCore/platform/audio/VectorMath.cpp:191
> > +                source2 = _mm_loadu_ps(source2P);
> 
> Please use static_cast instead of C-style cast.

Hi Sam, 
    Thanks to your review and we update the patch by your comments.
Comment 15 Xingnan Wang 2011-11-29 22:43:20 PST
BTW, I use reinterpret_cast and const_cast not static_cast to instead the C-style cast because the static_cast is not suitable and causes invalid errors.

third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp: In function ‘void WebCore::VectorMath::vsmul(const float*, int, const float*, float*, int, size_t)’:
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:77:36: error: expected ‘(’ before ‘sourceP’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:77:46: error: invalid operands of types ‘const float*’ and ‘int’ to binary ‘operator&’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:77:58: error: expected ‘)’ before ‘{’ token
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:92:68: error: expected ‘(’ before ‘sourceP’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:92:75: error: invalid static_cast from type ‘const float*’ to type ‘float*’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:92:76: error: expected ‘)’ before ‘;’ token
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:95:36: error: expected ‘(’ before ‘destP’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:95:44: error: invalid operands of types ‘float*’ and ‘int’ to binary ‘operator&’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:95:50: error: expected ‘)’ before ‘{’ token
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:86:16: error: unused variable ‘pScale’ [-Werror=unused-variable]
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:88:18: error: unused variable ‘pDest’ [-Werror=unused-variable]
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:89:16: error: unused variable ‘dest’ [-Werror=unused-variable]
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp: In function ‘void WebCore::VectorMath::vadd(const float*, int, const float*, int, float*, int, size_t)’:
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:137:36: error: expected ‘(’ before ‘source1P’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:137:47: error: invalid operands of types ‘const float*’ and ‘int’ to binary ‘operator&’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:137:59: error: expected ‘)’ before ‘{’ token
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:150:52: error: expected ‘(’ before ‘source2P’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:150:63: error: invalid operands of types ‘const float*’ and ‘int’ to binary ‘operator&’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:150:68: error: expected ‘)’ before ‘;’ token
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:151:49: error: expected ‘(’ before ‘destP’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:151:57: error: invalid operands of types ‘float*’ and ‘int’ to binary ‘operator&’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:151:62: error: expected ‘)’ before ‘;’ token
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:155:73: error: expected ‘(’ before ‘source1P’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:155:81: error: invalid static_cast from type ‘const float*’ to type ‘float*’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:155:82: error: expected ‘)’ before ‘;’ token
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:156:73: error: expected ‘(’ before ‘source2P’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:156:81: error: invalid static_cast from type ‘const float*’ to type ‘float*’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:156:82: error: expected ‘)’ before ‘;’ token
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:167:73: error: expected ‘(’ before ‘source1P’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:167:81: error: invalid static_cast from type ‘const float*’ to type ‘float*’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:167:82: error: expected ‘)’ before ‘;’ token
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:168:73: error: expected ‘(’ before ‘source2P’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:168:81: error: invalid static_cast from type ‘const float*’ to type ‘float*’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:168:82: error: expected ‘)’ before ‘;’ token
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:179:73: error: expected ‘(’ before ‘source1P’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:179:81: error: invalid static_cast from type ‘const float*’ to type ‘float*’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:179:82: error: expected ‘)’ before ‘;’ token
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:190:73: error: expected ‘(’ before ‘source1P’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:190:81: error: invalid static_cast from type ‘const float*’ to type ‘float*’
third_party/WebKit/Source/WebCore/platform/audio/VectorMath.cpp:190:82: error: expected ‘)’ before ‘;’ token


(In reply to comment #14)
> (In reply to comment #10)
> > (From update of attachment 116909 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=116909&action=review
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:76
> > > +        // If the sourceP address does not 16-byte align, the first several frames(at most three) should be processed seperately.
> > 
> > Missing space before (.
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:77
> > > +        while ((((size_t)sourceP & 0x0F) != 0) && n) {
> > 
> > Please use static_cast instead of C-style cast.
(size_t)sourceP --> reinterpret_cast<size_t>(sourceP)
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:87
> > > +        __m128 *pSource, *pDest;
> > 
> > We prefer one declaration per line and the * goes next to the type.
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:91
> > > +            pSource = reinterpret_cast<__m128*>((float*)sourceP);
> > 
> > Please use static_cast instead of C-style cast.
(float*)sourceP --> const_cast<float*>sourceP
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:93
> > > +            // There is less than 1% performance enhancement when pulling the if statement out of the loop, so leave it here.
> > 
> > It is not clear why you wouldn't want the 1% performance gain here.
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:94
> > > +            if (((size_t)destP & 0x0F) != 0) {
> > 
> > Please use static_cast instead of C-style cast.
(size_t)destP --> reinterpret_cast<size_t>(destP)
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:136
> > > +        while ((((size_t)source1P & 0x0F) != 0) && n) {
> > 
> > Please use static_cast instead of C-style cast.
(size_t)source1P --> reinterpret_cast<size_t>(source1P)
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:147
> > > +        __m128 * pSource1, *pSource2, *pDest;
> > > +        __m128 source2, dest;
> > 
> > We prefer one declaration per line and the * goes next to the type.
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:150
> > > +        bool source2Aligned = !((size_t)source2P & 0x0F);
> > > +        bool destAligned = !((size_t)destP & 0x0F);
> > 
> > Please use static_cast instead of C-style cast.
(size_t)source2P --> reinterpret_cast<size_t>(source2P)
(size_t)destP --> reinterpret_cast<size_t>(destP)
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:155
> > > +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);
> > > +                pSource2 = reinterpret_cast<__m128*>((float*)source2P);
> > 
> > Please use static_cast instead of C-style cast.
(float*)source1P --> const_cast<float*>source1P
(float*)source2P --> const_cast<float*>source2P
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:167
> > > +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);
> > > +                pSource2 = reinterpret_cast<__m128*>((float*)source2P);
> > 
> > Please use static_cast instead of C-style cast.
(float*)source1P --> const_cast<float*>source1P
(float*)source2P --> const_cast<float*>source2P
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:178
> > > +                pSource1 = reinterpret_cast<__m128*>((float*)source1P);
> > 
> > Please use static_cast instead of C-style cast.
(float*)source1P --> const_cast<float*>source1P
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:188
> > > +        } else if (!source2Aligned && !destAligned) // both source2 and dest not aligned 
> > > +        {
> > 
> > { should go on the previous line.
> > 
> > > Source/WebCore/platform/audio/VectorMath.cpp:191
> > > +                source2 = _mm_loadu_ps(source2P);
> > 
> > Please use static_cast instead of C-style cast.
> 
> Hi Sam, 
>     Thanks to your review and we update the patch by your comments.
Comment 16 Raymond Toy 2011-11-30 11:17:31 PST
Comment on attachment 117125 [details]
Update the patch (fix code style issue and pull the if in vsmul out of loop)

View in context: https://bugs.webkit.org/attachment.cgi?id=117125&action=review

Just a few minor grammatical fixes.  Otherwise LGTM.

> platform/audio/VectorMath.cpp:84
> +        // Now the sourceP address aligned and start to apply SSE.

"address aligned" should be "address is aligned"

> platform/audio/VectorMath.cpp:112
> +        // Non-SSE handling for left frames which is less than 4.

"for left frames" should be "for remaining frames"

> platform/audio/VectorMath.cpp:150
> +        // Now the source1P address aligned and start to apply SSE.

"address aligned" -> "address is aligned"

> platform/audio/VectorMath.cpp:185
> +        } else if (!source2Aligned && destAligned) { // source2 not aligend but dest aligned 

"aligend" -> "aligned"

> platform/audio/VectorMath.cpp:209
> +        // Non-SSE handling for left frames which is less than 4.

"for left frames" -> "for remaining frames"
Comment 17 Xingnan Wang 2011-11-30 17:31:07 PST
Created attachment 117304 [details]
Update the patch (Fix comment issues)
Comment 18 Xingnan Wang 2011-11-30 17:32:41 PST
Comment on attachment 117125 [details]
Update the patch (fix code style issue and pull the if in vsmul out of loop)

View in context: https://bugs.webkit.org/attachment.cgi?id=117125&action=review

>> platform/audio/VectorMath.cpp:209
>> +        // Non-SSE handling for left frames which is less than 4.
> 
> "for left frames" -> "for remaining frames"

Hi Raymond,
    Thanks for your review, the patch is updated.
Comment 19 Xingnan Wang 2011-11-30 17:32:50 PST
View in context: https://bugs.webkit.org/attachment.cgi?id=117125&action=review

>> platform/audio/VectorMath.cpp:209
>> +        // Non-SSE handling for left frames which is less than 4.
> 
> "for left frames" -> "for remaining frames"

Hi Raymond,
    Thanks for your review, the patch is updated.
Comment 20 Raymond Toy 2011-12-01 11:43:37 PST
Ken, please review at your convenience.
Comment 21 Kenneth Russell 2011-12-01 14:34:54 PST
Comment on attachment 117304 [details]
Update the patch (Fix comment issues)

View in context: https://bugs.webkit.org/attachment.cgi?id=117304&action=review

Sounds fine especially based on Raymond's and Sam's earlier reviews, but it looks like the patch needs to be updated to top of tree WebKit. If this is correct, please do so and upload a new patch; if it seems to be some other problem, please mark r? again.

> ChangeLog:6
> +        Reviewed by Sam Weinig.

You should leave this line as "Reviewed by NOBODY (OOPS)!" in the future. The commit queue will rewrite it for you.
Comment 22 Wei James (wistoch) 2011-12-02 00:39:53 PST
Created attachment 117583 [details]
Patch
Comment 23 Wei James (wistoch) 2011-12-02 00:45:30 PST
(In reply to comment #21)
> (From update of attachment 117304 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=117304&action=review
> 
> Sounds fine especially based on Raymond's and Sam's earlier reviews, but it looks like the patch needs to be updated to top of tree WebKit. If this is correct, please do so and upload a new patch; if it seems to be some other problem, please mark r? again.
> 
> > ChangeLog:6
> > +        Reviewed by Sam Weinig.
> 
> You should leave this line as "Reviewed by NOBODY (OOPS)!" in the future. The commit queue will rewrite it for you.

Ken, thanks for your comments. patch has been updated to top of tree WebKit. 
I have marked r? and cq?. please help to review and if ok please commit it. thanks.
Comment 24 Kenneth Russell 2011-12-02 18:07:58 PST
Comment on attachment 117583 [details]
Patch

Thanks for updating. r=me
Comment 25 WebKit Review Bot 2011-12-02 18:43:58 PST
Comment on attachment 117583 [details]
Patch

Clearing flags on attachment: 117583

Committed r101894: <http://trac.webkit.org/changeset/101894>
Comment 26 WebKit Review Bot 2011-12-02 18:44:04 PST
All reviewed patches have been landed.  Closing bug.