GCC can perform auto-vectorization, but only does so at -O3 or when explicitly told to. If we add a flag that tells GCC to auto-vectorize specific functions, we can enable it on only those functions that would benefit from it, and thereby avoid the binary bloat or performance degradation that can come from using -O3 indiscriminately.
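As a rough illustration of the idea, here is a minimal sketch of a per-function vectorization hint built on GCC's `optimize` function attribute. The macro name `VECTORIZE_HINT` and the function `scale_add` are hypothetical and are not taken from the patch; the sketch only assumes that GCC accepts `optimize("tree-vectorize")` as the per-function equivalent of `-ftree-vectorize`.

```c
#include <stddef.h>

/* Hypothetical macro name, for illustration only.
 * On GCC it asks for -ftree-vectorize on the annotated function;
 * on other compilers it expands to nothing. */
#if defined(__GNUC__) && !defined(__clang__)
#define VECTORIZE_HINT __attribute__((optimize("tree-vectorize")))
#else
#define VECTORIZE_HINT
#endif

/* A simple loop of the kind the vectorizer can usually handle:
 * dst[i] += src[i] * k for every element. */
VECTORIZE_HINT
void scale_add(float* dst, const float* src, float k, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] += src[i] * k;
}
```

The attribute changes only how the function is compiled, not what it computes, so whether it actually helps has to be measured for each function.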
Created attachment 154346 [details] Patch
Comment on attachment 154346 [details] Patch
Attachment 154346 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/13343275
(In reply to comment #2)
> (From update of attachment 154346 [details])
> Attachment 154346 [details] did not pass chromium-ews (chromium-xvfb):
> Output: http://queues.webkit.org/results/13343275

Internal compiler error? Jeebers.. Update your compiler, guys! I guess I can add a check to only enable this on GCC 4.6 or newer.
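A version check of that kind could look roughly like the sketch below. The names `GCC_AT_LEAST`, `VECTORIZE_IF_SUPPORTED` and `sum_ints` are hypothetical and may not match what the patch uses; only the predefined macros `__GNUC__` and `__GNUC_MINOR__` come from GCC itself.

```c
/* Hypothetical helper: true when compiling with GCC >= major.minor. */
#if defined(__GNUC__) && !defined(__clang__)
#define GCC_AT_LEAST(major, minor) \
    (__GNUC__ > (major) || (__GNUC__ == (major) && __GNUC_MINOR__ >= (minor)))
#else
#define GCC_AT_LEAST(major, minor) 0
#endif

/* Only request per-function vectorization on GCC 4.6 or newer;
 * older or other compilers see an empty macro. */
#if GCC_AT_LEAST(4, 6)
#define VECTORIZE_IF_SUPPORTED __attribute__((optimize("tree-vectorize")))
#else
#define VECTORIZE_IF_SUPPORTED
#endif

/* Example use on a simple reduction loop. */
VECTORIZE_IF_SUPPORTED
int sum_ints(const int* values, int count)
{
    int total = 0;
    for (int i = 0; i < count; ++i)
        total += values[i];
    return total;
}
```

With this guard, compilers that fail the check simply build the function with the translation unit's normal optimization settings.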
Created attachment 154368 [details] Patch
I should add that the patches landed for bug #91398 and bug #92123 depend on this patch to take full advantage of their improvements.
This seems like a lot of use of this function. Is each of these a win? If so, how much? Just like LIKELY and UNLIKELY, I would expect we'd only want to use this on hot code paths where we knew it to be a win. Otherwise we're just setting ourselves up for overly constrained/confused compilers in the future.
(In reply to comment #6)
> This seems like a lot of use of this function. Is each of these a win? If so, how much? Just like LIKELY and UNLIKELY, I would expect we'd only want to use this on hot code paths where we knew it to be a win. Otherwise we're just setting ourselves up for overly constrained/confused compilers in the future.

I thought this was actually very little use at this point; it is only eight places. Two of the places just have many variant versions of the inner function.
Comment on attachment 154368 [details] Patch
Because it disables inlining, it is not a win-or-neutral situation: the code can easily get slower when the attribute is added. I think it is best to enable it on a case-by-case basis, backed by benchmark numbers.
(In reply to comment #8)
> (From update of attachment 154368 [details])
> Because it disables inlining, it is not a win-or-neutral situation: the code can easily get slower when the attribute is added.
>
> I think it is best to enable it on a case-by-case basis, backed by benchmark numbers.

Yes. I had some discussion on the gcc mailing list over the issue. They didn't recommend trying to enable optimization for individual functions. Using pragmas will not work either. At best we could identify files and enable extra optimization on a file-by-file basis. Sorry that I forgot to close the bug.