Bug 54456

Summary: Optimizing lightning filter to ARM-neon SIMD instruction set
Product: WebKit Reporter: Zoltan Herczeg <zherczeg>
Component: SVGAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: barraclough, benjamin, darin, ggaren, krit, loki, oliver, simon.fraser, webkit.review.bot, zecke, zimmermann
Priority: P2 Keywords: Qt, QtTriaged
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Bug Depends on: 56182    
Bug Blocks:    
Attachments:
Description Flags
patch
none
updated draft patch
none
patch
none
patch
none
patch
none
patch
barraclough: review-
patch
zimmermann: review-
patch zimmermann: review+

Description Zoltan Herczeg 2011-02-15 06:50:37 PST
Do we allow to add platform specific optimizations to WebKit?
Comment 1 Zoltan Herczeg 2011-02-15 06:52:18 PST
Created attachment 82446 [details]
patch

This patch has not finished (I mean it is working, but expects some change before landing), I just want to know your opinion about such optimizations. Where to put them, etc.
Comment 2 WebKit Review Bot 2011-02-15 06:54:19 PST
Attachment 82446 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/platform/graphics/filters/F..." exit_code: 1

Source/WebCore/platform/graphics/filters/FELightingARMNeon.cpp:241:  Extra space before ( in function call  [whitespace/parens] [4]
Source/WebCore/platform/graphics/filters/FELightingARMNeon.cpp:457:  The parameter name "NL" adds no information, so it should be removed.  [readability/parameter_name] [5]
Source/WebCore/platform/graphics/filters/FELighting.cpp:439:  An else should appear on the same line as the preceding }  [whitespace/newline] [4]
Total errors found: 3 in 3 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Dirk Schulze 2011-02-16 03:06:48 PST
Comment on attachment 82446 [details]
patch

There should definitely be a possibility to add platform dependent code paths for filters. This code should be moved into platform dependent folders like we do in WebCore/platform/graphic/. We just need a way to handle this with the platform independent code path. I'd like to see a similar concept like for GraphicsContext. Means we have something like base classes FE* and the platform snippets somewhere else. We should avoid platform-#if s in these base classes as much as possible.
Comment 4 Dirk Schulze 2011-02-16 03:08:28 PST
Btw. Do you have some one who could review the ARM code when the patch is ready for landing? :-P
Comment 5 Zoltan Herczeg 2011-02-16 05:54:02 PST
Created attachment 82621 [details]
updated draft patch
Comment 6 WebKit Review Bot 2011-02-16 05:55:49 PST
Attachment 82621 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/WebCore.pri', u'Source/WebC..." exit_code: 1

Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:192:  Extra space before ( in function call  [whitespace/parens] [4]
Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:408:  The parameter name "NL" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 2 in 6 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 7 Zoltan Herczeg 2011-02-17 03:48:19 PST
Created attachment 82780 [details]
patch

Final patch
Comment 8 WebKit Review Bot 2011-02-17 03:50:30 PST
Attachment 82780 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/WebCore/ChangeLog', u'Source/WebCor..." exit_code: 1

Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:192:  Extra space before ( in function call  [whitespace/parens] [4]
Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:408:  The parameter name "NL" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 2 in 7 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 9 Zoltan Herczeg 2011-02-17 05:29:46 PST
There is some ARM assembly magic here, and I would like to hear your opinion about it.
Comment 10 Zoltan Herczeg 2011-02-21 00:13:37 PST
Any thoughts about this?
Comment 11 Gavin Barraclough 2011-02-21 11:22:46 PST
I sorry, but I'm afraid I don't have time right now to pick through this assembly in detail, to understand what it is actually doing.

I've glanced over, and at a higher level the patch as a whole looks absolutely fine to me.  I'd prefer to leave it for someone who knows this area of code to review, but if no one is available, then this looks fine to me, I'd be happy to r+.
Comment 12 Zoltan Herczeg 2011-02-21 12:21:58 PST
Thank you, Gavin
Comment 13 Gabor Loki 2011-02-21 13:19:38 PST
Comment on attachment 82780 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=82780&action=review

Let me do an informal review.


> Source/WebCore/ChangeLog:8
> +        Neon is the SIMD instruction set for arm. This instruction set

Please, use capital letters for ARM.

> Source/WebCore/platform/graphics/filters/FELighting.cpp:36
> +#if CPU(ARM) && ENABLE(ARM_NEON)

If NEON is enabled the compilers set __ARM_NEON__ define to 1. In this case, you should define the corresponding macro (ENABLE_ARM_NEON) in Platform.h.

> Source/WebCore/platform/graphics/filters/FELighting.cpp:395
> +    char buffer[sizeof(FELightingFloatArgumentsForNeon) + 16];
> +    FELightingFloatArgumentsForNeon* floatArguments = reinterpret_cast<FELightingFloatArgumentsForNeon*>((reinterpret_cast<long>(buffer) + 15) & ~0xf);

I like magic numbers, but others don't. Please, use a static int or make a comment for this alignment trick.

> Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:33
> +#include "config.h"
> +#include "FELightingNeonSIMD.h"
> +
> +#if CPU(ARM) && ENABLE(ARM_NEON)
> +
> +short s_FELightingConstantsForNeon[8 * 3] __attribute__((aligned(0x10))) = {
> +    // Alpha coefficients.

Do you miss the WebCore namespace?

In JavaScriptCore/wtf/Vector.h there is a group of macros for alignment (for all platforms). Well, It may be worthwhile to move those macros to StdLibExtras.h and use it here also.

> Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:193
> +
> +asm (
> +".globl " TOSTRING(neonDrawLighting) NL

I feel this will fall with RVCT. I can help you in this.

> Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.h:45
> +#ifndef FELightingNeonSIMD_h
> +#define FELightingNeonSIMD_h
> +
> +#include <wtf/Platform.h>
> +
> +#if CPU(ARM) && ENABLE(ARM_NEON)
> +
> +// Otherwise: Distant Light.
> +#define FLAG_POINT_LIGHT                 0x01
> +#define FLAG_SPOT_LIGHT                  0x02
> +#define FLAG_CONE_EXPONENT_IS_1          0x04
> +
> +// Otherwise: Diffuse light.
> +#define FLAG_SPECULAR_LIGHT              0x10
> +#define FLAG_DIFFUSE_CONST_IS_1          0x20
> +#define FLAG_SPECULAR_EXPONENT_IS_1      0x40
> +
> +// Must be aligned to 16 bytes.
> +struct FELightingFloatArgumentsForNeon {

Why don't we put these into the WebCore namespace?

> Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.h:84
> +extern short s_FELightingConstantsForNeon[8 * 3];

Use static int or comment for this magic numbers as well.


Well, I need one more day to understand the assembly code, but others looks fine.
Comment 14 Zoltan Herczeg 2011-02-22 04:11:33 PST
Created attachment 83296 [details]
patch

Updated according to Gabor's suggestions
Comment 15 Zoltan Herczeg 2011-02-22 04:13:45 PST
> In JavaScriptCore/wtf/Vector.h there is a group of macros for alignment (for all platforms). Well, It may be worthwhile to move those macros to StdLibExtras.h and use it here also.

I think this should be done in a followup patch. This refactor worth a separate patch.
Comment 16 WebKit Review Bot 2011-02-22 04:14:25 PST
Attachment 83296 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/wtf/Platform.h', u'S..." exit_code: 1

Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:197:  Extra space before ( in function call  [whitespace/parens] [4]
Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:413:  The parameter name "NL" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 2 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 17 Darin Adler 2011-02-22 14:11:09 PST
> In JavaScriptCore/wtf/Vector.h there is a group of macros for alignment (for all platforms). Well, It may be worthwhile to move those macros to StdLibExtras.h and use it here also.

Seems fine to move to a common place. But probably not StdLibExtras.h. That’s for declarations of things that are very closely related to standard library contents, and I don’t think the alignment helpers quite fit that definition. We can put them into a nicely named header.
Comment 18 Gabor Loki 2011-03-01 01:39:39 PST
Comment on attachment 83296 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=83296&action=review

> Source/JavaScriptCore/wtf/Platform.h:352
> +#if defined(__ARM_NEON__)
> +#define ENABLE_ARM_NEON 1
> +#endif

Hmm, maybe it should be WTF_CPU_ARM_NEON instead of ENABLE_ARM_NEON. ENABLE_* macros are mainly those kind of features which can be turn on/off irrespectively of build or host system.

> Source/WebCore/ChangeLog:9
> +        allows to speed-up the lighting filter by 4 times on arm

s/arm/ARM/

> Source/WebCore/platform/graphics/filters/FELighting.cpp:36
> +#if CPU(ARM) && ENABLE(ARM_NEON) && COMPILER(GCC)

If CPU(ARM_NEON) is used the condition will be sorter as well.

> Source/WebCore/platform/graphics/filters/FELighting.h:84
> +#if CPU(ARM) && ENABLE(ARM_NEON)
> +    void drawInteriorPixels(LightingData&, LightSource::PaintingData&);

You missed the COMPILER(GCC) part.


At last I had some time to walk through the assembly code, and I could say it is reasonable and optimal.
So, I would like to propose it to be r+ -ed with those little changes. ;)
Comment 19 Zoltan Herczeg 2011-03-01 02:33:41 PST
Created attachment 84203 [details]
patch
Comment 20 Zoltan Herczeg 2011-03-01 02:34:53 PST
Maciej prefered the CPU(ARM_NEON) form.
Comment 21 WebKit Review Bot 2011-03-01 02:37:00 PST
Attachment 84203 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/wtf/Platform.h', u'S..." exit_code: 1

Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:197:  Extra space before ( in function call  [whitespace/parens] [4]
Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:413:  The parameter name "NL" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 2 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 22 Benjamin Poulain 2011-03-01 02:47:58 PST
What is the reason not to use the intrinsics? I think the code is unmaintainable as it is.
Comment 23 Benjamin Poulain 2011-03-01 04:31:12 PST
Comment on attachment 84203 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=84203&action=review

> Source/WebCore/platform/graphics/filters/FELighting.cpp:312
> +#if CPU(ARM_NEON) && COMPILER(GCC)
> +        drawInteriorPixels(data, paintingData);
> +#else

Stricto sensu, this is not portable.
You can compile with Neon because you target Cortex CPUs, while at the same time you might deploy on some target without the Neon extension. A common Cortex CPU without Neon is the tegra.

What we do in Qt is using the hardware capabilities from the kernel to detect Neon at runtime (HWCAP_NEON).
Comment 24 Benjamin Poulain 2011-03-01 04:32:53 PST
And I think this needs good benchmarks. Adding Holger to have someone who care about that :)
Comment 25 Holger Freyther 2011-03-01 04:39:18 PST
(In reply to comment #24)
> And I think this needs good benchmarks. Adding Holger to have someone who care about that :)

Thanks. and the usual question. With a gcc 4.5/4.6 how much slower is the code if one is using the intrinsics of the compiler? 

For the inline assembly:
  - Any reason to not use GNU AS macros?
  - Instead of harcoding register names or concating strings you can use IIRC the %[NAME] syntax for register names and GCC wil fill in the blanks for you increasing the readability of the code.
Comment 26 Zoltan Herczeg 2011-03-01 04:45:15 PST
> Stricto sensu, this is not portable.
> You can compile with Neon because you target Cortex CPUs, while at the same time you might deploy on some target without the Neon extension. A common Cortex CPU without Neon is the tegra.
> 
> What we do in Qt is using the hardware capabilities from the kernel to detect Neon at runtime (HWCAP_NEON).

I don't understand you. If you tell -mfpu=neon to the compiler, it will generate some neon instructions. If you deploy it on a non-neon architecture, it will stop with a illegal instruction.

If you say -mfpu=vfp, it will not generate neon instructions, and the optimization will be disabled as well.

Runtime detection will not work here, since if you enable neon support, the compiler WILL generate neon instructions (independently from this patch), or if you disable it, it will NOT compile the assembly source code (because they are undefined instructions in that case). (Note: currently it is not supported to have different compiler optimizations for individual files)

If you unsure that the target platform have neon support, just disable it. This is my best advice for you.
Comment 27 Zoltan Herczeg 2011-03-01 04:49:30 PST
> Thanks. and the usual question. With a gcc 4.5/4.6 how much slower is the code if one is using the intrinsics of the compiler? 
> 
> For the inline assembly:
>   - Any reason to not use GNU AS macros?
>   - Instead of harcoding register names or concating strings you can use IIRC the %[NAME] syntax for register names and GCC wil fill in the blanks for you increasing the readability of the code.

I don't know about these. The current style follows the ARM assembly functions already in the WebKit.
Comment 28 Benjamin Poulain 2011-03-01 04:54:43 PST
(In reply to comment #26)
> Runtime detection will not work here, since if you enable neon support, the compiler WILL generate neon instructions (independently from this patch), or if you disable it, it will NOT compile the assembly source code (because they are undefined instructions in that case). (Note: currently it is not supported to have different compiler optimizations for individual files)

It is fine to have Neon code in your binary as long as you don't execute it if you don't have the extension.

It is the same idea as for SS[S]E. You compile some part with SIMD, and jump to that code them when the CPU can execute it.
Comment 29 Zoltan Herczeg 2011-03-02 04:37:00 PST
This work is raised a lot of interesting questions, and it is time to summarize them.

Future work:
- All alignment macros should go to a separate wtf header file.
- Extend Qt build system with allowing compile files with different compiler options.

Note: neither of them is blocking this patch.

Other issues:
- Make the code easier to understand. Naturally, it is still required to know the SVG standard to understand it.
  - I have added several comments.
- Using intrinsics: unfortunately intrinsics offers to access only a small subset of Advanced SIMD features, and nothing more. This is a whole function, including a mixture of ARM, VFP and Advanced SIMD instructions. According to an ARM expert, the normal assembly code is the best soultion for this at the moment.
  - Moreover, at the moment the GCC support for intrinsics is poor. Maybe it will change in the future, but it is not recommended to use at the moment.
- Building for neon. To enable this code, you need to pass -mfpu=neon compiler option, and in that case, you cannot deploy the resulting binary on a non-neon system anyway, since the compiler itself will generate neon opcodes. Thus, the current approach is safe now.

If you have any further questions, please let me know.

Please don't misunderstand me, I don't say we could not improve the things in the future, just that this patch is ready for landing. This function could be made dynamically detectable, or speeding up even further with SMP, but this is not the task of this patch. It just aims to offer a performace increase with -mfpu=neon compiler option.

I soon upload the hopefully final patch.
Comment 30 Zoltan Herczeg 2011-03-02 04:46:15 PST
Created attachment 84401 [details]
patch
Comment 31 WebKit Review Bot 2011-03-02 04:49:36 PST
Attachment 84401 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/wtf/Platform.h', u'S..." exit_code: 1

Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:203:  Extra space before ( in function call  [whitespace/parens] [4]
Source/WebCore/platform/graphics/filters/arm/FELightingNeonSIMD.cpp:453:  The parameter name "NL" adds no information, so it should be removed.  [readability/parameter_name] [5]
Total errors found: 2 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 32 Gabor Loki 2011-03-07 04:49:05 PST
> Other issues:
> - Make the code easier to understand. Naturally, it is still required to know the SVG standard to understand it.
>   - I have added several comments.

Although the asm code itself and the neonized algorithm were also readable before, but the comments are really needed for non-ARM observers.

> - Using intrinsics: unfortunately intrinsics offers to access only a small subset of Advanced SIMD features, and nothing more. This is a whole function, including a mixture of ARM, VFP and Advanced SIMD instructions. According to an ARM expert, the normal assembly code is the best soultion for this at the moment.
>   - Moreover, at the moment the GCC support for intrinsics is poor. Maybe it will change in the future, but it is not recommended to use at the moment.

I totally agree with you. The GCC's intrinsics does not offer accurate controlling of the NEON infrastructure. The most painful parts are the inaccessible register mapping and the instruction mixing.

> I soon upload the hopefully final patch.

I still think this patch is fine for landing.
Gavin, what do you think about it?
Comment 33 Holger Freyther 2011-03-08 01:21:03 PST
(In reply to comment #32)
> > Other issues:
> > - Make the code easier to understand. Naturally, it is still required to know the SVG standard to understand it.
> >   - I have added several comments.
> 
> Although the asm code itself and the neonized algorithm were also readable before, but the comments are really needed for non-ARM observers.

Readable and maintainable is surely subjective. Some people have no issue to read r0-r15 and remember what they are used for, same applies for %1 but then again a variable name is
something we were very used to from high level languages and GAS supports that. Anyway.
 

> I totally agree with you. The GCC's intrinsics does not offer accurate controlling of the NEON infrastructure. The most painful parts are the inaccessible register mapping and the instruction mixing.

Well, again a subjective matter. Projects like pixman have a copy of the code in intrinsics and then the handtuned result (based on the intrinsics version) as GCC still sucks.

> 
> > I soon upload the hopefully final patch.
> 
> I still think this patch is fine for landing.
> Gavin, what do you think about it?

Not so subjective. Could you post any benchmark data? How much faster than the C version? How much faster than a version compile for neon with tree vectorize and such?
Comment 34 Gavin Barraclough 2011-03-08 02:03:27 PST
Comment on attachment 84401 [details]
patch

Okay, I've done my best to review as neither a Neon coding nor 3D lighting expert.  I've followed the asm through the best I can, and it all seems reasonable (and rather impressively tight) to me.  The style seems good, a couple of things that I don't know whether we really have precedent for in WebKit style (e.g. the NL macro), but it all reads well to me, so I'm happy with the asm as is.

I'd like to suggest making some of the setup to this & data structures be designed to be shared with SIMD implementations for other architectures (e.g. SSE), but I don't know how helpful that would be (unclear without attempting the port exactly how much code could be shared anyway, so trying to generalize at this time may lead to wasted effort).  Refactoring as necessary would only be a small part of the task if someone does decide to port to SSE. :o)

One issue.  I don't think you should need to manually align the allocation of FELightingFloatArgumentsForNeon.  There is a macro WTF_ALIGN defined in wtf that I would hope would be able to do this for you.  You probably ought to be able to use this for s_FELightingConstantsForNeon, too.  Oh, and also, in allocating & populating neonData, I think we'd usually just use a literal initialization here (e.g.:

    FELightingPaintingDataForNeon neonData = {
        data.pixels->data(),
        data.widthDecreasedByOne - 1,
        data.heightDecreasedByOne - 1,
        0,
        0,
        0,
        floatArguments,
        s_FELightingConstantsForNeon
    };

).  Oh, and one final thing, I think a comment on 'getPowerCoefficients' to explain what it is trying to achieve wouldn't hurt. :-)  It looks to me like this method could probably be written more efficiently in integer code, using the macros from the std c libs to extract the exponent from a fp number?  I guess this isn't critical to the performance improvement in this patch, but still, the method could probably do with a little explanation.

I'm going to r- because I think you should take a quick look at the WTF_ALIGN macro & I do think getPowerCoefficients deserves a comment, but I agree that this patch looks very close to landing.
Comment 35 Zoltan Herczeg 2011-03-24 06:04:06 PDT
Created attachment 86762 [details]
patch
Comment 36 WebKit Review Bot 2011-03-24 06:05:49 PDT
Attachment 86762 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/wtf/Platform.h', u'S..." exit_code: 1

Source/WebCore/platform/graphics/filters/arm/FELightingNEON.cpp:205:  Extra space before ( in function call  [whitespace/parens] [4]
Source/WebCore/platform/graphics/filters/arm/FELightingNEON.cpp:455:  The parameter name "NL" adds no information, so it should be removed.  [readability/parameter_name] [5]
Source/WebCore/platform/graphics/filters/arm/FELightingNEON.h:27:  #ifndef header guard has wrong style, please use: WTF_FELightingNEON_h  [build/header_guard] [5]
Total errors found: 3 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 37 Zoltan Herczeg 2011-04-08 05:41:45 PDT
Apart from the style fix, shall I do something with the patch?
Comment 38 Nikolas Zimmermann 2011-04-09 00:00:32 PDT
Comment on attachment 86762 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=86762&action=review

I think the patch is great, if Gavin has no more concerns or any of the other folks familiar with the actual code I'd like to see it go in :-) Still some thoughs:

> Source/WebCore/platform/graphics/filters/FELighting.cpp:461
> +    // Set lightnig arguments.

typo: lighting.

>> Source/WebCore/platform/graphics/filters/arm/FELightingNEON.cpp:205
>> +
>> +asm (
> 
> Extra space before ( in function call  [whitespace/parens] [4]

You could add "// NOLINT" here, so it stops reporting this error.

> Source/WebCore/platform/graphics/filters/arm/FELightingNEON.cpp:290
> +    // Multiply the alpha values by the X and Y matricies.

typo: matrices.

>> Source/WebCore/platform/graphics/filters/arm/FELightingNEON.cpp:455
>> +);
> 
> The parameter name "NL" adds no information, so it should be removed.  [readability/parameter_name] [5]

Same here.

>> Source/WebCore/platform/graphics/filters/arm/FELightingNEON.h:27
>> +#ifndef FELightingNeonSIMD_h
> 
> #ifndef header guard has wrong style, please use: WTF_FELightingNEON_h  [build/header_guard] [5]

Seems like a style check bug? Did someone already report it?

> Source/WebCore/platform/graphics/filters/arm/FELightingNEON.h:44
> +#define FLAG_POINT_LIGHT                 0x01
> +#define FLAG_SPOT_LIGHT                  0x02
> +#define FLAG_CONE_EXPONENT_IS_1          0x04
> +
> +// Otherwise: Diffuse light.
> +#define FLAG_SPECULAR_LIGHT              0x10
> +#define FLAG_DIFFUSE_CONST_IS_1          0x20
> +#define FLAG_SPECULAR_EXPONENT_IS_1      0x40

Can't we use enums for this?
enum LightingFlags (
    FlagPointLight = 1 << 0,
..
};

> Source/WebCore/platform/graphics/filters/arm/FELightingNEON.h:86
> +extern short s_FELightingConstantsForNeon[];

No way we can use a "short feLightingConstantsForNeon()" free function here?
Comment 39 Zoltan Herczeg 2011-04-11 00:37:06 PDT
Created attachment 88974 [details]
patch
Comment 40 WebKit Review Bot 2011-04-11 00:40:51 PDT
Attachment 88974 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'Source/JavaScriptCore/wtf/Platform.h', u'S..." exit_code: 1

Source/WebCore/platform/graphics/filters/arm/FELightingNEON.h:27:  #ifndef header guard has wrong style, please use: WTF_FELightingNEON_h  [build/header_guard] [5]
Total errors found: 1 in 8 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 41 Zoltan Herczeg 2011-04-11 01:36:00 PDT
> > Source/WebCore/platform/graphics/filters/arm/FELightingNEON.h:44
> > +#define FLAG_POINT_LIGHT                 0x01
> > +#define FLAG_SPOT_LIGHT                  0x02
> > +#define FLAG_CONE_EXPONENT_IS_1          0x04
> > +
> > +// Otherwise: Diffuse light.
> > +#define FLAG_SPECULAR_LIGHT              0x10
> > +#define FLAG_DIFFUSE_CONST_IS_1          0x20
> > +#define FLAG_SPECULAR_EXPONENT_IS_1      0x40
> 
> Can't we use enums for this?
> enum LightingFlags (
>     FlagPointLight = 1 << 0,
> ..
> };

Unfortunately not. The preprocessor cannot stringize them.
Comment 42 Nikolas Zimmermann 2011-04-14 03:59:27 PDT
Comment on attachment 88974 [details]
patch

r=me. I heard no objections about the assembly since a while, so I think it can go on. Note: I have no clue of the actual asm code, so my review is only about structural things.
If Zoltan says it works I trust him :-)
Comment 43 Zoltan Herczeg 2011-04-14 05:12:51 PDT
Landed in http://trac.webkit.org/changeset/83835