<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>107498</bug_id>
          
          <creation_ts>2013-01-21 20:46:17 -0800</creation_ts>
          <short_desc>TRANSFORMATION_MATRIX_USE_X86_64_SSE2 broken for 64-bit Windows builds</short_desc>
          <delta_ts>2013-01-22 13:21:37 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>WebCore Misc.</component>
          <version>528+ (Nightly build)</version>
          <rep_platform>Unspecified</rep_platform>
          <op_sys>Unspecified</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Justin Schuh">jschuh</reporter>
          <assigned_to name="Justin Schuh">jschuh</assigned_to>
          <cc>benjamin</cc>
    
    <cc>jamesr</cc>
    
    <cc>webkit.review.bot</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>812568</commentid>
    <comment_count>0</comment_count>
    <who name="Justin Schuh">jschuh</who>
    <bug_when>2013-01-21 20:46:17 -0800</bug_when>
    <thetext>Looks like a simple bug in this conditional macro:

72   #if CPU(X86_64) &amp;&amp; !PLATFORM(WINDOWS)
73   #define TRANSFORMATION_MATRIX_USE_X86_64_SSE2
74   #endif

The problem is that !PLATFORM(WINDOWS) doesn&apos;t do anything. So TRANSFORMATION_MATRIX_USE_X86_64_SSE2 gets defined but the compiler explodes on the alignment specifier. I assume it was meant to be PLATFORM(WIN), but COMPILER(MSVC) is probably more correct.

I could just fix the macro to exclude Windows, but the SSE intrinsics are all supported on Win64. So, is there any reason I can&apos;t just add an #if with a __declspec to set the correct alignment and enable this for Win64? (I ask, because at the moment I&apos;m just a few dependencies short of a build I can actually test with on Win64.)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>812572</commentid>
    <comment_count>1</comment_count>
      <attachid>183881</attachid>
    <who name="Justin Schuh">jschuh</who>
    <bug_when>2013-01-21 20:55:52 -0800</bug_when>
    <thetext>Created attachment 183881
Patch for discussion.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>812599</commentid>
    <comment_count>2</comment_count>
    <who name="Benjamin Poulain">benjamin</who>
    <bug_when>2013-01-21 22:00:53 -0800</bug_when>
    <thetext>I am sorry. I intended to use OS(WINDOWS).

We cannot enable that optimization as is for Windows because of alignment issues. Neither the allocator, nor the stack alignment gives us the right garanties for alignment.

Stricto sensu, it is not correct on Unix either without a custom new() with an aligned allocation. But in practice we know the ABI and how our allocator works on the platforms we support.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>812613</commentid>
    <comment_count>3</comment_count>
    <who name="Justin Schuh">jschuh</who>
    <bug_when>2013-01-21 22:20:30 -0800</bug_when>
    <thetext>Stack and struct/class member alignment can be guaranteed with __declspec(align), as in the patch I posted. And the 64-bit Windows heap guarantees 16-byte alignment, as does Chromium&apos;s 64-bit TCMalloc port.

That said, you&apos;re right that it would still be relying on quirks rather than explicit guarantees. So, if you&apos;re not comfortable with me enabling it for any Windows 64-bit build, do you have any objections to me doing so for just the 64-bit Windows Chromium port? (That is, assuming @jamesr isn&apos;t aware of something preventing this for 64-bit Chromium ports.)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>812618</commentid>
    <comment_count>4</comment_count>
    <who name="James Robinson">jamesr</who>
    <bug_when>2013-01-21 22:37:23 -0800</bug_when>
    <thetext>On Chromium Windows the allocator can be selected at runtime with an environment variable. Last I looked, we supported tcmalloc, jemalloc, and winheap at least. Is this true for the 64 bit build? Do we know the alignment properties of all of the possible allocators?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>812633</commentid>
    <comment_count>5</comment_count>
    <who name="Benjamin Poulain">benjamin</who>
    <bug_when>2013-01-21 23:02:52 -0800</bug_when>
    <thetext>(In reply to comment #3)
&gt; Stack and struct/class member alignment can be guaranteed with __declspec(align), as in the patch I posted.

This is not true on x86, but according to http://msdn.microsoft.com/en-us/library/aa290049(v=vs.71).aspx, you are right for x86_64.

&gt; And the 64-bit Windows heap guarantees 16-byte alignment, as does Chromium&apos;s 64-bit TCMalloc port.
&gt; 
&gt; That said, you&apos;re right that it would still be relying on quirks rather than explicit guarantees. So, if you&apos;re not comfortable with me enabling it for any Windows 64-bit build, do you have any objections to me doing so for just the 64-bit Windows Chromium port? (That is, assuming @jamesr isn&apos;t aware of something preventing this for 64-bit Chromium ports.)

I have no objection against enabling this on Windows. I just don&apos;t know enough about Windows x86_64 to r+.

If you enable this for Windows, can you introduce a new macro to do __attribute__((aligned (16))) || __declspec(align(16))?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>812663</commentid>
    <comment_count>6</comment_count>
    <who name="James Robinson">jamesr</who>
    <bug_when>2013-01-21 23:45:27 -0800</bug_when>
    <thetext>
&gt; 
&gt; &gt; And the 64-bit Windows heap guarantees 16-byte alignment, as does Chromium&apos;s 64-bit TCMalloc port.
&gt; &gt; 
&gt; &gt; That said, you&apos;re right that it would still be relying on quirks rather than explicit guarantees. So, if you&apos;re not comfortable with me enabling it for any Windows 64-bit build, do you have any objections to me doing so for just the 64-bit Windows Chromium port? (That is, assuming @jamesr isn&apos;t aware of something preventing this for 64-bit Chromium ports.)
&gt; 
&gt; I have no objection against enabling this on Windows. I just don&apos;t know enough about Windows x86_64 to r+.
&gt; 
&gt; If you enable this for Windows, can you introduce a new macro to do __attribute__((aligned (16))) || __declspec(align(16))?

__attribute goes after the type, __declspec goes before. You can write a function-like macro for members and locals, but it won&apos;t work for type declarations.I think standard practice is to have both pre and post macros.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>813092</commentid>
    <comment_count>7</comment_count>
    <who name="Justin Schuh">jschuh</who>
    <bug_when>2013-01-22 09:27:47 -0800</bug_when>
    <thetext>(In reply to comment #3)
&gt; I have no objection against enabling this on Windows. I just don&apos;t know enough about Windows x86_64 to r+.

Gotcha. If I do propose turning it on I&apos;ll make sure to find an appropriate reviewer. Sadly, I don&apos;t think anyone other than myself is actually working a Win64 port, so I doubt I can get *that* appropriate of a reviewer.


(In reply to comment #4)
&gt; On Chromium Windows the allocator can be selected at runtime with an environment variable.

Yup, but on win64 we don&apos;t support custom allocators yet so it&apos;s winheap only. That said, the only thing I&apos;m aware of us using the allocator switching for is winheap under instrumentation, so we&apos;re basically safe right now.

&gt; Last I looked, we supported tcmalloc, jemalloc, and winheap at least. Is this true for the 64 bit build? Do we know the alignment properties of all of the possible allocators?

You forgot Windows LFH. :) And yes, they all have a minimum 16-byte alignment on 64-bit. After we get a basic build up and running we&apos;ll eventually enable the custom allocators and test each, just to be sure.


(In reply to comment #6)
(In reply to comment #5)
&gt; &gt; If you enable this for Windows, can you introduce a new macro to do __attribute__((aligned (16))) || __declspec(align(16))?
&gt; 
&gt; __attribute goes after the type, __declspec goes before. You can write a function-like macro for members and locals, but it won&apos;t work for type declarations.I think standard practice is to have both pre and post macros.

So, add something like WTF_ALIGNED_TYPE_START(align) and WTF_ALIGNED_TYPE_END(align) to Alignment.h? Feels a bit horsey, but it&apos;s probably a better solution than anything more clever.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>813305</commentid>
    <comment_count>8</comment_count>
      <attachid>183881</attachid>
    <who name="James Robinson">jamesr</who>
    <bug_when>2013-01-22 12:17:33 -0800</bug_when>
    <thetext>Comment on attachment 183881
Patch for discussion.

I think this is fine. We can add a macro when we get a few more callers - at this point I think it&apos;s a bit premature.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>813319</commentid>
    <comment_count>9</comment_count>
    <who name="Benjamin Poulain">benjamin</who>
    <bug_when>2013-01-22 12:39:50 -0800</bug_when>
    <thetext>I was thinking of something like:

#define WTF_ALIGNED(declaration, alignment) \
    declaration __attribute__((aligned (alignment))); 

&gt; I think this is fine. We can add a macro when we get a few more callers - at this point I think it&apos;s a bit premature.

Fair enough, I don&apos;t have a problem with that.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>813320</commentid>
    <comment_count>10</comment_count>
    <who name="Justin Schuh">jschuh</who>
    <bug_when>2013-01-22 12:43:37 -0800</bug_when>
    <thetext>Since the verdict seems to be that this patch is fine, I&apos;m just going to cq+.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>813348</commentid>
    <comment_count>11</comment_count>
      <attachid>183881</attachid>
    <who name="WebKit Review Bot">webkit.review.bot</who>
    <bug_when>2013-01-22 13:21:34 -0800</bug_when>
    <thetext>Comment on attachment 183881
Patch for discussion.

Clearing flags on attachment: 183881

Committed r140455: &lt;http://trac.webkit.org/changeset/140455&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>813349</commentid>
    <comment_count>12</comment_count>
    <who name="WebKit Review Bot">webkit.review.bot</who>
    <bug_when>2013-01-22 13:21:37 -0800</bug_when>
    <thetext>All reviewed patches have been landed.  Closing bug.</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="1"
              isprivate="0"
          >
            <attachid>183881</attachid>
            <date>2013-01-21 20:55:52 -0800</date>
            <delta_ts>2013-01-22 13:21:34 -0800</delta_ts>
            <desc>Patch for discussion.</desc>
            <filename>bug-107498-20130121205247.patch</filename>
            <type>text/plain</type>
            <size>1669</size>
            <attacher name="Justin Schuh">jschuh</attacher>
            
              <data encoding="base64">SW5kZXg6IFNvdXJjZS9XZWJDb3JlL0NoYW5nZUxvZwo9PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBTb3VyY2UvV2Vi
Q29yZS9DaGFuZ2VMb2cJKHJldmlzaW9uIDE0MDM4MCkKKysrIFNvdXJjZS9XZWJDb3JlL0NoYW5n
ZUxvZwkod29ya2luZyBjb3B5KQpAQCAtMSwzICsxLDE4IEBACisyMDEzLTAxLTIxICBKdXN0aW4g
U2NodWggIDxqc2NodWhAY2hyb21pdW0ub3JnPgorCisgICAgICAgIFRSQU5TRk9STUFUSU9OX01B
VFJJWF9VU0VfWDg2XzY0X1NTRTIgYnJva2VuIGZvciA2NC1iaXQgV2luZG93cyBidWlsZHMKKyAg
ICAgICAgaHR0cHM6Ly9idWdzLndlYmtpdC5vcmcvc2hvd19idWcuY2dpP2lkPTEwNzQ5OAorCisg
ICAgICAgIFJldmlld2VkIGJ5IE5PQk9EWSAoT09QUyEpLgorICAgICAgICAKKyAgICAgICAgRW5h
YmxlIFRSQU5TRk9STUFUSU9OX01BVFJJWF9VU0VfWDg2XzY0X1NTRTIgZm9yIDY0LWJpdCBXaW5k
b3dzLgorCisgICAgICAgIE5vIG5ldyB0ZXN0cy4gQ292ZXJlZCBieSBleGlzdGluZyB0ZXN0cy4K
KworICAgICAgICAqIHBsYXRmb3JtL2dyYXBoaWNzL3RyYW5zZm9ybXMvVHJhbnNmb3JtYXRpb25N
YXRyaXguaDoKKyAgICAgICAgKFdlYkNvcmUpOgorICAgICAgICAoVHJhbnNmb3JtYXRpb25NYXRy
aXgpOgorCiAyMDEzLTAxLTIxICBEaXJrIFNjaHVsemUgIDxkc2NodWx6ZUBhZG9iZS5jb20+CiAK
ICAgICAgICAgQWRkIGJ1aWxkIGZsYWcgZm9yIENhbnZhcydzIFBhdGggb2JqZWN0IChkaXNhYmxl
ZCBieSBkZWZhdWx0KQpJbmRleDogU291cmNlL1dlYkNvcmUvcGxhdGZvcm0vZ3JhcGhpY3MvdHJh
bnNmb3Jtcy9UcmFuc2Zvcm1hdGlvbk1hdHJpeC5oCj09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0tIFNvdXJjZS9XZWJD
b3JlL3BsYXRmb3JtL2dyYXBoaWNzL3RyYW5zZm9ybXMvVHJhbnNmb3JtYXRpb25NYXRyaXguaAko
cmV2aXNpb24gMTQwMzQ5KQorKysgU291cmNlL1dlYkNvcmUvcGxhdGZvcm0vZ3JhcGhpY3MvdHJh
bnNmb3Jtcy9UcmFuc2Zvcm1hdGlvbk1hdHJpeC5oCSh3b3JraW5nIGNvcHkpCkBAIC02OSwxNSAr
NjksMjAgQEAgY2xhc3MgTGF5b3V0UmVjdDsKIGNsYXNzIEZsb2F0UmVjdDsKIGNsYXNzIEZsb2F0
UXVhZDsKIAotI2lmIENQVShYODZfNjQpICYmICFQTEFURk9STShXSU5ET1dTKQorI2lmIENQVShY
ODZfNjQpCiAjZGVmaW5lIFRSQU5TRk9STUFUSU9OX01BVFJJWF9VU0VfWDg2XzY0X1NTRTIKICNl
bmRpZgogCiBjbGFzcyBUcmFuc2Zvcm1hdGlvbk1hdHJpeCB7CiAgICAgV1RGX01BS0VfRkFTVF9B
TExPQ0FURUQ7CiBwdWJsaWM6CisKICNpZiBDUFUoQVBQTEVfQVJNVjdTKSB8fCBkZWZpbmVkKFRS
QU5TRk9STUFUSU9OX01BVFJJWF9VU0VfWDg2XzY0X1NTRTIpCisjaWYgQ09NUElMRVIoTVNWQykK
KyAgICBfX2RlY2xzcGVjKGFsaWduKDE2KSkgdHlwZWRlZiBkb3VibGUgTWF0cml4NFs0XVs0XTsK
KyNlbHNlCiAgICAgdHlwZWRlZiBkb3VibGUgTWF0cml4NFs0XVs0XSBfX2F0dHJpYnV0ZV9fKChh
bGlnbmVkICgxNikpKTsKKyNlbmRpZgogI2Vsc2UKICAgICB0eXBlZGVmIGRvdWJsZSBNYXRyaXg0
WzRdWzRdOwogI2VuZGlmCg==
</data>

          </attachment>
      

    </bug>

</bugzilla>