WebKit Bugzilla
RESOLVED FIXED
105719
Optimize TransformationMatrix::multiply() for x86_64
https://bugs.webkit.org/show_bug.cgi?id=105719
Benjamin Poulain
Reported
2012-12-24 07:17:42 PST
Use the hardware better :)
Attachments
Patch (9.34 KB, patch), 2012-12-24 10:18 PST, Benjamin Poulain, no flags
Patch (9.46 KB, patch), 2013-01-04 14:40 PST, Benjamin Poulain, no flags
Benjamin Poulain
Comment 1
2012-12-24 10:18:08 PST
Created attachment 180678: Patch
Benjamin Poulain
Comment 2
2013-01-02 13:21:04 PST
Comment on attachment 180678: Patch
Clearing flags on attachment: 180678
Committed r138640: <http://trac.webkit.org/changeset/138640>
Benjamin Poulain
Comment 3
2013-01-02 13:21:06 PST
All reviewed patches have been landed. Closing bug.
Dana Jansens
Comment 4
2013-01-03 08:47:44 PST
Looks like this change is causing crashes:
http://code.google.com/p/chromium/issues/detail?id=168173
Sorry, but it looks like it needs to be reverted. I've also confirmed locally that reverting this fixes the crashes.
WebKit Review Bot
Comment 5
2013-01-03 08:49:49 PST
Re-opened since this is blocked by bug 106019
Benjamin Poulain
Comment 6
2013-01-03 09:36:35 PST
Did you just revert based on a downstream issue? (This has been discussed recently on the mailing list.)

Can you give me more information about the problem? As far as I know, no WebKit test failed with the patch. How did you rule out your compiler or project settings?

I am honestly more than a little annoyed this was reverted without any information for me to work on.
Dana Jansens
Comment 7
2013-01-03 09:44:54 PST
We ran the test in gdb for quite some time but were unable to get much information about what was causing the crash. As you can see in the backtrace on the linked bug, the crash in gdb is happening at the entrance to the method. The pointers going into the method are all fine, of course, and gdb didn't have anything interesting to say about the two matrices; they look valid. I can provide the contents of them for you though, if you feel that will help.

This seems to point to a problem in the implementation of the method, which is surely going to cause a problem for all ports. These tests just happen to cause it to trigger reliably. This has really nothing to do with "downstream vs upstream" as far as I can tell. I don't know what project settings you're referring to that would change whether multiply() should crash or not given two matrices.

I'm sorry we don't have better test coverage on the webkit bots for functions like this, but I don't think it's that unusual for chromium bots to uncover bugs or problems that the webkit bots do not for that reason. We're looking into running the chromium compositor unit tests on the EWS bot, or on the canary waterfall, which would have helped here.

(gdb) frame 0
#0  0x00007ffff42dce67 in WebCore::TransformationMatrix::multiply (this=0x7fffffffd4c8, mat=...) at ../../third_party/WebKit/Source/WebCore/platform/graphics/transforms/TransformationMatrix.cpp:977
977	{
(gdb) p *this
$1 = {m_matrix = {[0] = {[0] = 1, [1] = 0, [2] = 0, [3] = 0}, [1] = {[0] = 0, [1] = 1, [2] = 0, [3] = 0}, [2] = {[0] = 0, [1] = 0, [2] = 1, [3] = 0}, [3] = {[0] = 0, [1] = 0, [2] = 0, [3] = 1}}}
(gdb) p mat
$2 = (const WebCore::TransformationMatrix &) @0xb64920: {m_matrix = {[0] = {[0] = 1, [1] = 0, [2] = 0, [3] = 0}, [1] = {[0] = 0, [1] = 1, [2] = 0, [3] = 0}, [2] = {[0] = 0, [1] = 0, [2] = 1, [3] = 0}, [3] = {[0] = 2, [1] = 0, [2] = 0, [3] = 1}}}
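Editorial note: for readers who want to check the two dumped matrices by hand, the following is a minimal scalar sketch of the product, assuming the ordering that the disassembly in this thread appears to implement (each result row i is the sum over k of mat[i][k] times row k of this). The names Matrix4, identityMatrix, multiplyReference and checkDumpedCase are hypothetical and are not WebKit API; this is not the patched code.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the 4x4 double matrix shown in the gdb dump.
using Matrix4 = std::array<std::array<double, 4>, 4>;

Matrix4 identityMatrix()
{
    Matrix4 m{}; // value-initialized: all zeros
    for (int i = 0; i < 4; ++i)
        m[i][i] = 1.0;
    return m;
}

// Scalar reference: result[i][j] = sum over k of mat[i][k] * self[k][j].
// The ordering is an assumption read off the posted disassembly.
Matrix4 multiplyReference(const Matrix4& self, const Matrix4& mat)
{
    Matrix4 result{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                result[i][j] += mat[i][k] * self[k][j];
    return result;
}

// Reproduces the dumped inputs: this is the identity, mat has [3][0] = 2.
void checkDumpedCase()
{
    Matrix4 self = identityMatrix();
    Matrix4 mat = identityMatrix();
    mat[3][0] = 2.0;
    // Multiplying by the identity must give back mat unchanged.
    assert(multiplyReference(self, mat) == mat);
}
```

With these inputs the arithmetic is trivial, which is consistent with the observation above that the values themselves look valid and the crash has to come from somewhere else.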
Benjamin Poulain
Comment 8
2013-01-03 09:49:57 PST
I suspect you simply have an alignment problem. There are many ways to screw that up despite completely correct code. Please give me the disassembly at the point of crash and the content of registers. What compiler are you using?
Dana Jansens
Comment 9
2013-01-03 09:53:26 PST
Compiler:
% clang++ --version
clang version 3.3 (trunk 170392)
Target: x86_64-unknown-linux-gnu
Thread model: posix

Registers:
rax 0xb64920 11946272
rbx 0x0 0
rcx 0x0 0
rdx 0x7fffffffd4c8 140737488344264
rsi 0x7fffffffd4c8 140737488344264
rdi 0x7fffffffd4c8 140737488344264
rbp 0x7fffffffd150 0x7fffffffd150
rsp 0x7fffffffc650 0x7fffffffc650
r8 0x0 0
r9 0xfffffff7 4294967287
r10 0x0 0
r11 0x0 0
r12 0x42dce0 4381920
r13 0x7fffffffdb50 140737488345936
r14 0x0 0
r15 0x0 0
rip 0x7ffff42dce67 0x7ffff42dce67 <WebCore::TransformationMatrix::multiply(WebCore::TransformationMatrix const&)+39>
eflags 0x10206 [ PF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0

Dump of assembler code for function WebCore::TransformationMatrix::multiply(WebCore::TransformationMatrix const&):
0x00007ffff42dce40 <+0>: sub $0xad8,%rsp
0x00007ffff42dce47 <+7>: mov %rdi,0x90(%rsp)
0x00007ffff42dce4f <+15>: mov %rsi,0x88(%rsp)
0x00007ffff42dce57 <+23>: mov 0x90(%rsp),%rsi
0x00007ffff42dce5f <+31>: mov %rsi,0x98(%rsp)
=> 0x00007ffff42dce67 <+39>: movapd (%rsi),%xmm0
0x00007ffff42dce6b <+43>: movapd %xmm0,0x70(%rsp)
0x00007ffff42dce71 <+49>: mov %rsi,%rdi
0x00007ffff42dce74 <+52>: add $0x20,%rdi
0x00007ffff42dce78 <+56>: mov %rdi,0xad0(%rsp)
0x00007ffff42dce80 <+64>: movapd 0x20(%rsi),%xmm0
0x00007ffff42dce85 <+69>: movapd %xmm0,0x60(%rsp)
0x00007ffff42dce8b <+75>: mov %rsi,%rax
0x00007ffff42dce8e <+78>: add $0x40,%rax
0x00007ffff42dce92 <+82>: mov %rax,0xac8(%rsp)
0x00007ffff42dce9a <+90>: movapd 0x40(%rsi),%xmm0
0x00007ffff42dce9f <+95>: movapd %xmm0,0x50(%rsp)
0x00007ffff42dcea5 <+101>: mov %rsi,%rcx
0x00007ffff42dcea8 <+104>: add $0x60,%rcx
0x00007ffff42dceac <+108>: mov %rcx,0xac0(%rsp)
0x00007ffff42dceb4 <+116>: movapd 0x60(%rsi),%xmm0
0x00007ffff42dceb9 <+121>: movapd %xmm0,0x40(%rsp)
0x00007ffff42dcebf <+127>: mov %rsi,%rdx
0x00007ffff42dcec2 <+130>: add $0x10,%rdx
0x00007ffff42dcec6 <+134>: mov %rdx,0xab8(%rsp)
0x00007ffff42dcece <+142>: movapd 0x10(%rsi),%xmm0
0x00007ffff42dced3 <+147>: movapd %xmm0,0x30(%rsp) 0x00007ffff42dced9 <+153>: mov %rsi,%r8 0x00007ffff42dcedc <+156>: add $0x30,%r8 0x00007ffff42dcee0 <+160>: mov %r8,0xab0(%rsp) 0x00007ffff42dcee8 <+168>: movapd 0x30(%rsi),%xmm0 0x00007ffff42dceed <+173>: movapd %xmm0,0x20(%rsp) 0x00007ffff42dcef3 <+179>: mov %rsi,%r9 0x00007ffff42dcef6 <+182>: add $0x50,%r9 0x00007ffff42dcefa <+186>: mov %r9,0xaa8(%rsp) 0x00007ffff42dcf02 <+194>: movapd 0x50(%rsi),%xmm0 0x00007ffff42dcf07 <+199>: movapd %xmm0,0x10(%rsp) 0x00007ffff42dcf0d <+205>: mov %rsi,%r10 0x00007ffff42dcf10 <+208>: add $0x70,%r10 0x00007ffff42dcf14 <+212>: mov %r10,0xaa0(%rsp) 0x00007ffff42dcf1c <+220>: movapd 0x70(%rsi),%xmm0 0x00007ffff42dcf21 <+225>: movapd %xmm0,(%rsp) 0x00007ffff42dcf26 <+230>: mov 0x88(%rsp),%r11 0x00007ffff42dcf2e <+238>: movsd (%r11),%xmm0 0x00007ffff42dcf33 <+243>: movsd %xmm0,0xa98(%rsp) 0x00007ffff42dcf3c <+252>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dcf41 <+257>: movapd %xmm0,0xa80(%rsp) 0x00007ffff42dcf4a <+266>: movapd %xmm0,-0x10(%rsp) 0x00007ffff42dcf50 <+272>: mov 0x88(%rsp),%r11 0x00007ffff42dcf58 <+280>: movsd 0x8(%r11),%xmm0 0x00007ffff42dcf5e <+286>: movsd %xmm0,0xa78(%rsp) 0x00007ffff42dcf67 <+295>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dcf6c <+300>: movapd %xmm0,0xa60(%rsp) 0x00007ffff42dcf75 <+309>: movapd %xmm0,-0x20(%rsp) 0x00007ffff42dcf7b <+315>: mov 0x88(%rsp),%r11 0x00007ffff42dcf83 <+323>: movsd 0x10(%r11),%xmm0 0x00007ffff42dcf89 <+329>: movsd %xmm0,0xa58(%rsp) 0x00007ffff42dcf92 <+338>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dcf97 <+343>: movapd %xmm0,0xa40(%rsp) 0x00007ffff42dcfa0 <+352>: movapd %xmm0,-0x30(%rsp) 0x00007ffff42dcfa6 <+358>: mov 0x88(%rsp),%r11 0x00007ffff42dcfae <+366>: movsd 0x18(%r11),%xmm0 0x00007ffff42dcfb4 <+372>: movsd %xmm0,0xa38(%rsp) 0x00007ffff42dcfbd <+381>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dcfc2 <+386>: movapd %xmm0,0xa20(%rsp) 0x00007ffff42dcfcb <+395>: movapd %xmm0,-0x40(%rsp) 0x00007ffff42dcfd1 <+401>: movapd 0x70(%rsp),%xmm0 
0x00007ffff42dcfd7 <+407>: movapd -0x10(%rsp),%xmm1 0x00007ffff42dcfdd <+413>: movapd %xmm0,0xa10(%rsp) 0x00007ffff42dcfe6 <+422>: movapd %xmm1,0xa00(%rsp) 0x00007ffff42dcfef <+431>: movapd 0xa10(%rsp),%xmm0 0x00007ffff42dcff8 <+440>: mulpd %xmm1,%xmm0 0x00007ffff42dcffc <+444>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd002 <+450>: movapd 0x60(%rsp),%xmm0 0x00007ffff42dd008 <+456>: movapd -0x20(%rsp),%xmm1 0x00007ffff42dd00e <+462>: movapd %xmm0,0x9f0(%rsp) 0x00007ffff42dd017 <+471>: movapd %xmm1,0x9e0(%rsp) 0x00007ffff42dd020 <+480>: movapd 0x9f0(%rsp),%xmm0 0x00007ffff42dd029 <+489>: mulpd %xmm1,%xmm0 0x00007ffff42dd02d <+493>: movapd %xmm0,-0x60(%rsp) 0x00007ffff42dd033 <+499>: movapd 0x50(%rsp),%xmm0 0x00007ffff42dd039 <+505>: movapd -0x30(%rsp),%xmm1 0x00007ffff42dd03f <+511>: movapd %xmm0,0x9d0(%rsp) 0x00007ffff42dd048 <+520>: movapd %xmm1,0x9c0(%rsp) 0x00007ffff42dd051 <+529>: movapd 0x9d0(%rsp),%xmm0 0x00007ffff42dd05a <+538>: mulpd %xmm1,%xmm0 0x00007ffff42dd05e <+542>: movapd %xmm0,-0x70(%rsp) 0x00007ffff42dd064 <+548>: movapd 0x40(%rsp),%xmm0 0x00007ffff42dd06a <+554>: movapd -0x40(%rsp),%xmm1 0x00007ffff42dd070 <+560>: movapd %xmm0,0x9b0(%rsp) 0x00007ffff42dd079 <+569>: movapd %xmm1,0x9a0(%rsp) 0x00007ffff42dd082 <+578>: movapd 0x9b0(%rsp),%xmm0 0x00007ffff42dd08b <+587>: mulpd %xmm1,%xmm0 0x00007ffff42dd08f <+591>: movapd %xmm0,-0x80(%rsp) 0x00007ffff42dd095 <+597>: movapd -0x50(%rsp),%xmm0 0x00007ffff42dd09b <+603>: movapd -0x60(%rsp),%xmm1 0x00007ffff42dd0a1 <+609>: movapd %xmm0,0x990(%rsp) 0x00007ffff42dd0aa <+618>: movapd %xmm1,0x980(%rsp) 0x00007ffff42dd0b3 <+627>: movapd 0x990(%rsp),%xmm0 0x00007ffff42dd0bc <+636>: addpd %xmm1,%xmm0 0x00007ffff42dd0c0 <+640>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd0c6 <+646>: movapd -0x70(%rsp),%xmm1 0x00007ffff42dd0cc <+652>: movapd %xmm0,0x970(%rsp) 0x00007ffff42dd0d5 <+661>: movapd %xmm1,0x960(%rsp) 0x00007ffff42dd0de <+670>: movapd 0x970(%rsp),%xmm0 0x00007ffff42dd0e7 <+679>: addpd %xmm1,%xmm0 
0x00007ffff42dd0eb <+683>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd0f1 <+689>: movapd -0x80(%rsp),%xmm1 0x00007ffff42dd0f7 <+695>: movapd %xmm0,0x950(%rsp) 0x00007ffff42dd100 <+704>: movapd %xmm1,0x940(%rsp) 0x00007ffff42dd109 <+713>: movapd 0x950(%rsp),%xmm0 0x00007ffff42dd112 <+722>: addpd %xmm1,%xmm0 0x00007ffff42dd116 <+726>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd11c <+732>: mov %rsi,0x938(%rsp) 0x00007ffff42dd124 <+740>: movapd %xmm0,0x920(%rsp) 0x00007ffff42dd12d <+749>: mov 0x938(%rsp),%r11 0x00007ffff42dd135 <+757>: movapd %xmm0,(%r11) 0x00007ffff42dd13a <+762>: movapd 0x30(%rsp),%xmm0 0x00007ffff42dd140 <+768>: movapd -0x10(%rsp),%xmm1 0x00007ffff42dd146 <+774>: movapd %xmm0,0x910(%rsp) 0x00007ffff42dd14f <+783>: movapd %xmm1,0x900(%rsp) 0x00007ffff42dd158 <+792>: movapd 0x910(%rsp),%xmm0 0x00007ffff42dd161 <+801>: mulpd %xmm1,%xmm0 0x00007ffff42dd165 <+805>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd16b <+811>: movapd 0x20(%rsp),%xmm0 0x00007ffff42dd171 <+817>: movapd -0x20(%rsp),%xmm1 0x00007ffff42dd177 <+823>: movapd %xmm0,0x8f0(%rsp) 0x00007ffff42dd180 <+832>: movapd %xmm1,0x8e0(%rsp) 0x00007ffff42dd189 <+841>: movapd 0x8f0(%rsp),%xmm0 0x00007ffff42dd192 <+850>: mulpd %xmm1,%xmm0 0x00007ffff42dd196 <+854>: movapd %xmm0,-0x60(%rsp) 0x00007ffff42dd19c <+860>: movapd 0x10(%rsp),%xmm0 0x00007ffff42dd1a2 <+866>: movapd -0x30(%rsp),%xmm1 0x00007ffff42dd1a8 <+872>: movapd %xmm0,0x8d0(%rsp) 0x00007ffff42dd1b1 <+881>: movapd %xmm1,0x8c0(%rsp) 0x00007ffff42dd1ba <+890>: movapd 0x8d0(%rsp),%xmm0 0x00007ffff42dd1c3 <+899>: mulpd %xmm1,%xmm0 0x00007ffff42dd1c7 <+903>: movapd %xmm0,-0x70(%rsp) 0x00007ffff42dd1cd <+909>: movapd (%rsp),%xmm0 0x00007ffff42dd1d2 <+914>: movapd -0x40(%rsp),%xmm1 0x00007ffff42dd1d8 <+920>: movapd %xmm0,0x8b0(%rsp) 0x00007ffff42dd1e1 <+929>: movapd %xmm1,0x8a0(%rsp) 0x00007ffff42dd1ea <+938>: movapd 0x8b0(%rsp),%xmm0 0x00007ffff42dd1f3 <+947>: mulpd %xmm1,%xmm0 0x00007ffff42dd1f7 <+951>: movapd %xmm0,-0x80(%rsp) 0x00007ffff42dd1fd <+957>: 
movapd -0x50(%rsp),%xmm0 0x00007ffff42dd203 <+963>: movapd -0x60(%rsp),%xmm1 0x00007ffff42dd209 <+969>: movapd %xmm0,0x890(%rsp) 0x00007ffff42dd212 <+978>: movapd %xmm1,0x880(%rsp) 0x00007ffff42dd21b <+987>: movapd 0x890(%rsp),%xmm0 0x00007ffff42dd224 <+996>: addpd %xmm1,%xmm0 0x00007ffff42dd228 <+1000>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd22e <+1006>: movapd -0x70(%rsp),%xmm1 0x00007ffff42dd234 <+1012>: movapd %xmm0,0x870(%rsp) 0x00007ffff42dd23d <+1021>: movapd %xmm1,0x860(%rsp) 0x00007ffff42dd246 <+1030>: movapd 0x870(%rsp),%xmm0 0x00007ffff42dd24f <+1039>: addpd %xmm1,%xmm0 0x00007ffff42dd253 <+1043>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd259 <+1049>: movapd -0x80(%rsp),%xmm1 0x00007ffff42dd25f <+1055>: movapd %xmm0,0x850(%rsp) 0x00007ffff42dd268 <+1064>: movapd %xmm1,0x840(%rsp) 0x00007ffff42dd271 <+1073>: movapd 0x850(%rsp),%xmm0 0x00007ffff42dd27a <+1082>: addpd %xmm1,%xmm0 0x00007ffff42dd27e <+1086>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd284 <+1092>: mov %rdx,0x838(%rsp) 0x00007ffff42dd28c <+1100>: movapd %xmm0,0x820(%rsp) 0x00007ffff42dd295 <+1109>: mov 0x838(%rsp),%rdx 0x00007ffff42dd29d <+1117>: movapd %xmm0,(%rdx) 0x00007ffff42dd2a1 <+1121>: mov 0x88(%rsp),%rdx 0x00007ffff42dd2a9 <+1129>: movsd 0x20(%rdx),%xmm0 0x00007ffff42dd2ae <+1134>: movsd %xmm0,0x818(%rsp) 0x00007ffff42dd2b7 <+1143>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dd2bc <+1148>: movapd %xmm0,0x800(%rsp) 0x00007ffff42dd2c5 <+1157>: movapd %xmm0,-0x10(%rsp) 0x00007ffff42dd2cb <+1163>: mov 0x88(%rsp),%rdx 0x00007ffff42dd2d3 <+1171>: movsd 0x28(%rdx),%xmm0 0x00007ffff42dd2d8 <+1176>: movsd %xmm0,0x7f8(%rsp) 0x00007ffff42dd2e1 <+1185>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dd2e6 <+1190>: movapd %xmm0,0x7e0(%rsp) 0x00007ffff42dd2ef <+1199>: movapd %xmm0,-0x20(%rsp) 0x00007ffff42dd2f5 <+1205>: mov 0x88(%rsp),%rdx 0x00007ffff42dd2fd <+1213>: movsd 0x30(%rdx),%xmm0 0x00007ffff42dd302 <+1218>: movsd %xmm0,0x7d8(%rsp) 0x00007ffff42dd30b <+1227>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dd310 
<+1232>: movapd %xmm0,0x7c0(%rsp) 0x00007ffff42dd319 <+1241>: movapd %xmm0,-0x30(%rsp) 0x00007ffff42dd31f <+1247>: mov 0x88(%rsp),%rdx 0x00007ffff42dd327 <+1255>: movsd 0x38(%rdx),%xmm0 0x00007ffff42dd32c <+1260>: movsd %xmm0,0x7b8(%rsp) 0x00007ffff42dd335 <+1269>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dd33a <+1274>: movapd %xmm0,0x7a0(%rsp) 0x00007ffff42dd343 <+1283>: movapd %xmm0,-0x40(%rsp) 0x00007ffff42dd349 <+1289>: movapd 0x70(%rsp),%xmm0 0x00007ffff42dd34f <+1295>: movapd -0x10(%rsp),%xmm1 0x00007ffff42dd355 <+1301>: movapd %xmm0,0x790(%rsp) 0x00007ffff42dd35e <+1310>: movapd %xmm1,0x780(%rsp) 0x00007ffff42dd367 <+1319>: movapd 0x790(%rsp),%xmm0 0x00007ffff42dd370 <+1328>: mulpd %xmm1,%xmm0 0x00007ffff42dd374 <+1332>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd37a <+1338>: movapd 0x60(%rsp),%xmm0 0x00007ffff42dd380 <+1344>: movapd -0x20(%rsp),%xmm1 0x00007ffff42dd386 <+1350>: movapd %xmm0,0x770(%rsp) 0x00007ffff42dd38f <+1359>: movapd %xmm1,0x760(%rsp) 0x00007ffff42dd398 <+1368>: movapd 0x770(%rsp),%xmm0 0x00007ffff42dd3a1 <+1377>: mulpd %xmm1,%xmm0 0x00007ffff42dd3a5 <+1381>: movapd %xmm0,-0x60(%rsp) 0x00007ffff42dd3ab <+1387>: movapd 0x50(%rsp),%xmm0 0x00007ffff42dd3b1 <+1393>: movapd -0x30(%rsp),%xmm1 0x00007ffff42dd3b7 <+1399>: movapd %xmm0,0x750(%rsp) 0x00007ffff42dd3c0 <+1408>: movapd %xmm1,0x740(%rsp) 0x00007ffff42dd3c9 <+1417>: movapd 0x750(%rsp),%xmm0 0x00007ffff42dd3d2 <+1426>: mulpd %xmm1,%xmm0 0x00007ffff42dd3d6 <+1430>: movapd %xmm0,-0x70(%rsp) 0x00007ffff42dd3dc <+1436>: movapd 0x40(%rsp),%xmm0 0x00007ffff42dd3e2 <+1442>: movapd -0x40(%rsp),%xmm1 0x00007ffff42dd3e8 <+1448>: movapd %xmm0,0x730(%rsp) 0x00007ffff42dd3f1 <+1457>: movapd %xmm1,0x720(%rsp) 0x00007ffff42dd3fa <+1466>: movapd 0x730(%rsp),%xmm0 0x00007ffff42dd403 <+1475>: mulpd %xmm1,%xmm0 0x00007ffff42dd407 <+1479>: movapd %xmm0,-0x80(%rsp) 0x00007ffff42dd40d <+1485>: movapd -0x50(%rsp),%xmm0 0x00007ffff42dd413 <+1491>: movapd -0x60(%rsp),%xmm1 0x00007ffff42dd419 <+1497>: movapd 
%xmm0,0x710(%rsp) 0x00007ffff42dd422 <+1506>: movapd %xmm1,0x700(%rsp) 0x00007ffff42dd42b <+1515>: movapd 0x710(%rsp),%xmm0 0x00007ffff42dd434 <+1524>: addpd %xmm1,%xmm0 0x00007ffff42dd438 <+1528>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd43e <+1534>: movapd -0x70(%rsp),%xmm1 0x00007ffff42dd444 <+1540>: movapd %xmm0,0x6f0(%rsp) 0x00007ffff42dd44d <+1549>: movapd %xmm1,0x6e0(%rsp) 0x00007ffff42dd456 <+1558>: movapd 0x6f0(%rsp),%xmm0 0x00007ffff42dd45f <+1567>: addpd %xmm1,%xmm0 0x00007ffff42dd463 <+1571>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd469 <+1577>: movapd -0x80(%rsp),%xmm1 0x00007ffff42dd46f <+1583>: movapd %xmm0,0x6d0(%rsp) 0x00007ffff42dd478 <+1592>: movapd %xmm1,0x6c0(%rsp) 0x00007ffff42dd481 <+1601>: movapd 0x6d0(%rsp),%xmm0 0x00007ffff42dd48a <+1610>: addpd %xmm1,%xmm0 0x00007ffff42dd48e <+1614>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd494 <+1620>: mov %rdi,0x6b8(%rsp) 0x00007ffff42dd49c <+1628>: movapd %xmm0,0x6a0(%rsp) 0x00007ffff42dd4a5 <+1637>: mov 0x6b8(%rsp),%rdx 0x00007ffff42dd4ad <+1645>: movapd %xmm0,(%rdx) 0x00007ffff42dd4b1 <+1649>: movapd 0x30(%rsp),%xmm0 0x00007ffff42dd4b7 <+1655>: movapd -0x10(%rsp),%xmm1 0x00007ffff42dd4bd <+1661>: movapd %xmm0,0x690(%rsp) 0x00007ffff42dd4c6 <+1670>: movapd %xmm1,0x680(%rsp) 0x00007ffff42dd4cf <+1679>: movapd 0x690(%rsp),%xmm0 0x00007ffff42dd4d8 <+1688>: mulpd %xmm1,%xmm0 0x00007ffff42dd4dc <+1692>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd4e2 <+1698>: movapd 0x20(%rsp),%xmm0 0x00007ffff42dd4e8 <+1704>: movapd -0x20(%rsp),%xmm1 0x00007ffff42dd4ee <+1710>: movapd %xmm0,0x670(%rsp) 0x00007ffff42dd4f7 <+1719>: movapd %xmm1,0x660(%rsp) 0x00007ffff42dd500 <+1728>: movapd 0x670(%rsp),%xmm0 0x00007ffff42dd509 <+1737>: mulpd %xmm1,%xmm0 0x00007ffff42dd50d <+1741>: movapd %xmm0,-0x60(%rsp) 0x00007ffff42dd513 <+1747>: movapd 0x10(%rsp),%xmm0 0x00007ffff42dd519 <+1753>: movapd -0x30(%rsp),%xmm1 0x00007ffff42dd51f <+1759>: movapd %xmm0,0x650(%rsp) 0x00007ffff42dd528 <+1768>: movapd %xmm1,0x640(%rsp) 0x00007ffff42dd531 
<+1777>: movapd 0x650(%rsp),%xmm0 0x00007ffff42dd53a <+1786>: mulpd %xmm1,%xmm0 0x00007ffff42dd53e <+1790>: movapd %xmm0,-0x70(%rsp) 0x00007ffff42dd544 <+1796>: movapd (%rsp),%xmm0 0x00007ffff42dd549 <+1801>: movapd -0x40(%rsp),%xmm1 0x00007ffff42dd54f <+1807>: movapd %xmm0,0x630(%rsp) 0x00007ffff42dd558 <+1816>: movapd %xmm1,0x620(%rsp) 0x00007ffff42dd561 <+1825>: movapd 0x630(%rsp),%xmm0 0x00007ffff42dd56a <+1834>: mulpd %xmm1,%xmm0 0x00007ffff42dd56e <+1838>: movapd %xmm0,-0x80(%rsp) 0x00007ffff42dd574 <+1844>: movapd -0x50(%rsp),%xmm0 0x00007ffff42dd57a <+1850>: movapd -0x60(%rsp),%xmm1 0x00007ffff42dd580 <+1856>: movapd %xmm0,0x610(%rsp) 0x00007ffff42dd589 <+1865>: movapd %xmm1,0x600(%rsp) 0x00007ffff42dd592 <+1874>: movapd 0x610(%rsp),%xmm0 0x00007ffff42dd59b <+1883>: addpd %xmm1,%xmm0 0x00007ffff42dd59f <+1887>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd5a5 <+1893>: movapd -0x70(%rsp),%xmm1 0x00007ffff42dd5ab <+1899>: movapd %xmm0,0x5f0(%rsp) 0x00007ffff42dd5b4 <+1908>: movapd %xmm1,0x5e0(%rsp) 0x00007ffff42dd5bd <+1917>: movapd 0x5f0(%rsp),%xmm0 0x00007ffff42dd5c6 <+1926>: addpd %xmm1,%xmm0 0x00007ffff42dd5ca <+1930>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd5d0 <+1936>: movapd -0x80(%rsp),%xmm1 0x00007ffff42dd5d6 <+1942>: movapd %xmm0,0x5d0(%rsp) 0x00007ffff42dd5df <+1951>: movapd %xmm1,0x5c0(%rsp) 0x00007ffff42dd5e8 <+1960>: movapd 0x5d0(%rsp),%xmm0 0x00007ffff42dd5f1 <+1969>: addpd %xmm1,%xmm0 0x00007ffff42dd5f5 <+1973>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd5fb <+1979>: mov %r8,0x5b8(%rsp) 0x00007ffff42dd603 <+1987>: movapd %xmm0,0x5a0(%rsp) 0x00007ffff42dd60c <+1996>: mov 0x5b8(%rsp),%rdx 0x00007ffff42dd614 <+2004>: movapd %xmm0,(%rdx) 0x00007ffff42dd618 <+2008>: mov 0x88(%rsp),%rdx 0x00007ffff42dd620 <+2016>: movsd 0x40(%rdx),%xmm0 0x00007ffff42dd625 <+2021>: movsd %xmm0,0x598(%rsp) 0x00007ffff42dd62e <+2030>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dd633 <+2035>: movapd %xmm0,0x580(%rsp) 0x00007ffff42dd63c <+2044>: movapd %xmm0,-0x10(%rsp) 
0x00007ffff42dd642 <+2050>: mov 0x88(%rsp),%rdx 0x00007ffff42dd64a <+2058>: movsd 0x48(%rdx),%xmm0 0x00007ffff42dd64f <+2063>: movsd %xmm0,0x578(%rsp) 0x00007ffff42dd658 <+2072>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dd65d <+2077>: movapd %xmm0,0x560(%rsp) 0x00007ffff42dd666 <+2086>: movapd %xmm0,-0x20(%rsp) 0x00007ffff42dd66c <+2092>: mov 0x88(%rsp),%rdx 0x00007ffff42dd674 <+2100>: movsd 0x50(%rdx),%xmm0 0x00007ffff42dd679 <+2105>: movsd %xmm0,0x558(%rsp) 0x00007ffff42dd682 <+2114>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dd687 <+2119>: movapd %xmm0,0x540(%rsp) 0x00007ffff42dd690 <+2128>: movapd %xmm0,-0x30(%rsp) 0x00007ffff42dd696 <+2134>: mov 0x88(%rsp),%rdx 0x00007ffff42dd69e <+2142>: movsd 0x58(%rdx),%xmm0 0x00007ffff42dd6a3 <+2147>: movsd %xmm0,0x538(%rsp) 0x00007ffff42dd6ac <+2156>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dd6b1 <+2161>: movapd %xmm0,0x520(%rsp) 0x00007ffff42dd6ba <+2170>: movapd %xmm0,-0x40(%rsp) 0x00007ffff42dd6c0 <+2176>: movapd 0x70(%rsp),%xmm0 0x00007ffff42dd6c6 <+2182>: movapd -0x10(%rsp),%xmm1 0x00007ffff42dd6cc <+2188>: movapd %xmm0,0x510(%rsp) 0x00007ffff42dd6d5 <+2197>: movapd %xmm1,0x500(%rsp) 0x00007ffff42dd6de <+2206>: movapd 0x510(%rsp),%xmm0 0x00007ffff42dd6e7 <+2215>: mulpd %xmm1,%xmm0 0x00007ffff42dd6eb <+2219>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd6f1 <+2225>: movapd 0x60(%rsp),%xmm0 0x00007ffff42dd6f7 <+2231>: movapd -0x20(%rsp),%xmm1 0x00007ffff42dd6fd <+2237>: movapd %xmm0,0x4f0(%rsp) 0x00007ffff42dd706 <+2246>: movapd %xmm1,0x4e0(%rsp) 0x00007ffff42dd70f <+2255>: movapd 0x4f0(%rsp),%xmm0 0x00007ffff42dd718 <+2264>: mulpd %xmm1,%xmm0 0x00007ffff42dd71c <+2268>: movapd %xmm0,-0x60(%rsp) 0x00007ffff42dd722 <+2274>: movapd 0x50(%rsp),%xmm0 0x00007ffff42dd728 <+2280>: movapd -0x30(%rsp),%xmm1 0x00007ffff42dd72e <+2286>: movapd %xmm0,0x4d0(%rsp) 0x00007ffff42dd737 <+2295>: movapd %xmm1,0x4c0(%rsp) 0x00007ffff42dd740 <+2304>: movapd 0x4d0(%rsp),%xmm0 0x00007ffff42dd749 <+2313>: mulpd %xmm1,%xmm0 0x00007ffff42dd74d <+2317>: movapd 
%xmm0,-0x70(%rsp) 0x00007ffff42dd753 <+2323>: movapd 0x40(%rsp),%xmm0 0x00007ffff42dd759 <+2329>: movapd -0x40(%rsp),%xmm1 0x00007ffff42dd75f <+2335>: movapd %xmm0,0x4b0(%rsp) 0x00007ffff42dd768 <+2344>: movapd %xmm1,0x4a0(%rsp) 0x00007ffff42dd771 <+2353>: movapd 0x4b0(%rsp),%xmm0 0x00007ffff42dd77a <+2362>: mulpd %xmm1,%xmm0 0x00007ffff42dd77e <+2366>: movapd %xmm0,-0x80(%rsp) 0x00007ffff42dd784 <+2372>: movapd -0x50(%rsp),%xmm0 0x00007ffff42dd78a <+2378>: movapd -0x60(%rsp),%xmm1 0x00007ffff42dd790 <+2384>: movapd %xmm0,0x490(%rsp) 0x00007ffff42dd799 <+2393>: movapd %xmm1,0x480(%rsp) 0x00007ffff42dd7a2 <+2402>: movapd 0x490(%rsp),%xmm0 0x00007ffff42dd7ab <+2411>: addpd %xmm1,%xmm0 0x00007ffff42dd7af <+2415>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd7b5 <+2421>: movapd -0x70(%rsp),%xmm1 0x00007ffff42dd7bb <+2427>: movapd %xmm0,0x470(%rsp) 0x00007ffff42dd7c4 <+2436>: movapd %xmm1,0x460(%rsp) 0x00007ffff42dd7cd <+2445>: movapd 0x470(%rsp),%xmm0 0x00007ffff42dd7d6 <+2454>: addpd %xmm1,%xmm0 0x00007ffff42dd7da <+2458>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd7e0 <+2464>: movapd -0x80(%rsp),%xmm1 0x00007ffff42dd7e6 <+2470>: movapd %xmm0,0x450(%rsp) 0x00007ffff42dd7ef <+2479>: movapd %xmm1,0x440(%rsp) 0x00007ffff42dd7f8 <+2488>: movapd 0x450(%rsp),%xmm0 0x00007ffff42dd801 <+2497>: addpd %xmm1,%xmm0 0x00007ffff42dd805 <+2501>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd80b <+2507>: mov %rax,0x438(%rsp) 0x00007ffff42dd813 <+2515>: movapd %xmm0,0x420(%rsp) 0x00007ffff42dd81c <+2524>: mov 0x438(%rsp),%rax 0x00007ffff42dd824 <+2532>: movapd %xmm0,(%rax) 0x00007ffff42dd828 <+2536>: movapd 0x30(%rsp),%xmm0 0x00007ffff42dd82e <+2542>: movapd -0x10(%rsp),%xmm1 0x00007ffff42dd834 <+2548>: movapd %xmm0,0x410(%rsp) 0x00007ffff42dd83d <+2557>: movapd %xmm1,0x400(%rsp) 0x00007ffff42dd846 <+2566>: movapd 0x410(%rsp),%xmm0 0x00007ffff42dd84f <+2575>: mulpd %xmm1,%xmm0 0x00007ffff42dd853 <+2579>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd859 <+2585>: movapd 0x20(%rsp),%xmm0 0x00007ffff42dd85f 
<+2591>: movapd -0x20(%rsp),%xmm1 0x00007ffff42dd865 <+2597>: movapd %xmm0,0x3f0(%rsp) 0x00007ffff42dd86e <+2606>: movapd %xmm1,0x3e0(%rsp) 0x00007ffff42dd877 <+2615>: movapd 0x3f0(%rsp),%xmm0 0x00007ffff42dd880 <+2624>: mulpd %xmm1,%xmm0 0x00007ffff42dd884 <+2628>: movapd %xmm0,-0x60(%rsp) 0x00007ffff42dd88a <+2634>: movapd 0x10(%rsp),%xmm0 0x00007ffff42dd890 <+2640>: movapd -0x30(%rsp),%xmm1 0x00007ffff42dd896 <+2646>: movapd %xmm0,0x3d0(%rsp) 0x00007ffff42dd89f <+2655>: movapd %xmm1,0x3c0(%rsp) 0x00007ffff42dd8a8 <+2664>: movapd 0x3d0(%rsp),%xmm0 0x00007ffff42dd8b1 <+2673>: mulpd %xmm1,%xmm0 0x00007ffff42dd8b5 <+2677>: movapd %xmm0,-0x70(%rsp) 0x00007ffff42dd8bb <+2683>: movapd (%rsp),%xmm0 0x00007ffff42dd8c0 <+2688>: movapd -0x40(%rsp),%xmm1 0x00007ffff42dd8c6 <+2694>: movapd %xmm0,0x3b0(%rsp) 0x00007ffff42dd8cf <+2703>: movapd %xmm1,0x3a0(%rsp) 0x00007ffff42dd8d8 <+2712>: movapd 0x3b0(%rsp),%xmm0 0x00007ffff42dd8e1 <+2721>: mulpd %xmm1,%xmm0 0x00007ffff42dd8e5 <+2725>: movapd %xmm0,-0x80(%rsp) 0x00007ffff42dd8eb <+2731>: movapd -0x50(%rsp),%xmm0 0x00007ffff42dd8f1 <+2737>: movapd -0x60(%rsp),%xmm1 0x00007ffff42dd8f7 <+2743>: movapd %xmm0,0x390(%rsp) 0x00007ffff42dd900 <+2752>: movapd %xmm1,0x380(%rsp) 0x00007ffff42dd909 <+2761>: movapd 0x390(%rsp),%xmm0 0x00007ffff42dd912 <+2770>: addpd %xmm1,%xmm0 0x00007ffff42dd916 <+2774>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd91c <+2780>: movapd -0x70(%rsp),%xmm1 0x00007ffff42dd922 <+2786>: movapd %xmm0,0x370(%rsp) 0x00007ffff42dd92b <+2795>: movapd %xmm1,0x360(%rsp) 0x00007ffff42dd934 <+2804>: movapd 0x370(%rsp),%xmm0 0x00007ffff42dd93d <+2813>: addpd %xmm1,%xmm0 0x00007ffff42dd941 <+2817>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dd947 <+2823>: movapd -0x80(%rsp),%xmm1 0x00007ffff42dd94d <+2829>: movapd %xmm0,0x350(%rsp) 0x00007ffff42dd956 <+2838>: movapd %xmm1,0x340(%rsp) 0x00007ffff42dd95f <+2847>: movapd 0x350(%rsp),%xmm0 0x00007ffff42dd968 <+2856>: addpd %xmm1,%xmm0 0x00007ffff42dd96c <+2860>: movapd %xmm0,-0x50(%rsp) 
0x00007ffff42dd972 <+2866>: mov %r9,0x338(%rsp) 0x00007ffff42dd97a <+2874>: movapd %xmm0,0x320(%rsp) 0x00007ffff42dd983 <+2883>: mov 0x338(%rsp),%rax 0x00007ffff42dd98b <+2891>: movapd %xmm0,(%rax) 0x00007ffff42dd98f <+2895>: mov 0x88(%rsp),%rax 0x00007ffff42dd997 <+2903>: movsd 0x60(%rax),%xmm0 0x00007ffff42dd99c <+2908>: movsd %xmm0,0x318(%rsp) 0x00007ffff42dd9a5 <+2917>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dd9aa <+2922>: movapd %xmm0,0x300(%rsp) 0x00007ffff42dd9b3 <+2931>: movapd %xmm0,-0x10(%rsp) 0x00007ffff42dd9b9 <+2937>: mov 0x88(%rsp),%rax 0x00007ffff42dd9c1 <+2945>: movsd 0x68(%rax),%xmm0 0x00007ffff42dd9c6 <+2950>: movsd %xmm0,0x2f8(%rsp) 0x00007ffff42dd9cf <+2959>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dd9d4 <+2964>: movapd %xmm0,0x2e0(%rsp) 0x00007ffff42dd9dd <+2973>: movapd %xmm0,-0x20(%rsp) 0x00007ffff42dd9e3 <+2979>: mov 0x88(%rsp),%rax 0x00007ffff42dd9eb <+2987>: movsd 0x70(%rax),%xmm0 0x00007ffff42dd9f0 <+2992>: movsd %xmm0,0x2d8(%rsp) 0x00007ffff42dd9f9 <+3001>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dd9fe <+3006>: movapd %xmm0,0x2c0(%rsp) 0x00007ffff42dda07 <+3015>: movapd %xmm0,-0x30(%rsp) 0x00007ffff42dda0d <+3021>: mov 0x88(%rsp),%rax 0x00007ffff42dda15 <+3029>: movsd 0x78(%rax),%xmm0 0x00007ffff42dda1a <+3034>: movsd %xmm0,0x2b8(%rsp) 0x00007ffff42dda23 <+3043>: shufpd $0x0,%xmm0,%xmm0 0x00007ffff42dda28 <+3048>: movapd %xmm0,0x2a0(%rsp) 0x00007ffff42dda31 <+3057>: movapd %xmm0,-0x40(%rsp) 0x00007ffff42dda37 <+3063>: movapd 0x70(%rsp),%xmm0 0x00007ffff42dda3d <+3069>: movapd -0x10(%rsp),%xmm1 0x00007ffff42dda43 <+3075>: movapd %xmm0,0x290(%rsp) 0x00007ffff42dda4c <+3084>: movapd %xmm1,0x280(%rsp) 0x00007ffff42dda55 <+3093>: movapd 0x290(%rsp),%xmm0 0x00007ffff42dda5e <+3102>: mulpd %xmm1,%xmm0 0x00007ffff42dda62 <+3106>: movapd %xmm0,-0x50(%rsp) 0x00007ffff42dda68 <+3112>: movapd 0x60(%rsp),%xmm0 0x00007ffff42dda6e <+3118>: movapd -0x20(%rsp),%xmm1 0x00007ffff42dda74 <+3124>: movapd %xmm0,0x270(%rsp) 0x00007ffff42dda7d <+3133>: movapd 
%xmm1,0x260(%rsp)
0x00007ffff42dda86 <+3142>: movapd 0x270(%rsp),%xmm0
0x00007ffff42dda8f <+3151>: mulpd %xmm1,%xmm0
0x00007ffff42dda93 <+3155>: movapd %xmm0,-0x60(%rsp)
0x00007ffff42dda99 <+3161>: movapd 0x50(%rsp),%xmm0
0x00007ffff42dda9f <+3167>: movapd -0x30(%rsp),%xmm1
0x00007ffff42ddaa5 <+3173>: movapd %xmm0,0x250(%rsp)
0x00007ffff42ddaae <+3182>: movapd %xmm1,0x240(%rsp)
0x00007ffff42ddab7 <+3191>: movapd 0x250(%rsp),%xmm0
0x00007ffff42ddac0 <+3200>: mulpd %xmm1,%xmm0
0x00007ffff42ddac4 <+3204>: movapd %xmm0,-0x70(%rsp)
0x00007ffff42ddaca <+3210>: movapd 0x40(%rsp),%xmm0
0x00007ffff42ddad0 <+3216>: movapd -0x40(%rsp),%xmm1
0x00007ffff42ddad6 <+3222>: movapd %xmm0,0x230(%rsp)
0x00007ffff42ddadf <+3231>: movapd %xmm1,0x220(%rsp)
0x00007ffff42ddae8 <+3240>: movapd 0x230(%rsp),%xmm0
0x00007ffff42ddaf1 <+3249>: mulpd %xmm1,%xmm0
0x00007ffff42ddaf5 <+3253>: movapd %xmm0,-0x80(%rsp)
0x00007ffff42ddafb <+3259>: movapd -0x50(%rsp),%xmm0
0x00007ffff42ddb01 <+3265>: movapd -0x60(%rsp),%xmm1
0x00007ffff42ddb07 <+3271>: movapd %xmm0,0x210(%rsp)
0x00007ffff42ddb10 <+3280>: movapd %xmm1,0x200(%rsp)
0x00007ffff42ddb19 <+3289>: movapd 0x210(%rsp),%xmm0
0x00007ffff42ddb22 <+3298>: addpd %xmm1,%xmm0
0x00007ffff42ddb26 <+3302>: movapd %xmm0,-0x50(%rsp)
0x00007ffff42ddb2c <+3308>: movapd -0x70(%rsp),%xmm1
0x00007ffff42ddb32 <+3314>: movapd %xmm0,0x1f0(%rsp)
0x00007ffff42ddb3b <+3323>: movapd %xmm1,0x1e0(%rsp)
0x00007ffff42ddb44 <+3332>: movapd 0x1f0(%rsp),%xmm0
0x00007ffff42ddb4d <+3341>: addpd %xmm1,%xmm0
0x00007ffff42ddb51 <+3345>: movapd %xmm0,-0x50(%rsp)
0x00007ffff42ddb57 <+3351>: movapd -0x80(%rsp),%xmm1
0x00007ffff42ddb5d <+3357>: movapd %xmm0,0x1d0(%rsp)
0x00007ffff42ddb66 <+3366>: movapd %xmm1,0x1c0(%rsp)
0x00007ffff42ddb6f <+3375>: movapd 0x1d0(%rsp),%xmm0
0x00007ffff42ddb78 <+3384>: addpd %xmm1,%xmm0
0x00007ffff42ddb7c <+3388>: movapd %xmm0,-0x50(%rsp)
0x00007ffff42ddb82 <+3394>: mov %rcx,0x1b8(%rsp)
0x00007ffff42ddb8a <+3402>: movapd %xmm0,0x1a0(%rsp)
0x00007ffff42ddb93 <+3411>: mov 0x1b8(%rsp),%rax
0x00007ffff42ddb9b <+3419>: movapd %xmm0,(%rax)
0x00007ffff42ddb9f <+3423>: movapd 0x30(%rsp),%xmm0
0x00007ffff42ddba5 <+3429>: movapd -0x10(%rsp),%xmm1
0x00007ffff42ddbab <+3435>: movapd %xmm0,0x190(%rsp)
0x00007ffff42ddbb4 <+3444>: movapd %xmm1,0x180(%rsp)
0x00007ffff42ddbbd <+3453>: movapd 0x190(%rsp),%xmm0
0x00007ffff42ddbc6 <+3462>: mulpd %xmm1,%xmm0
0x00007ffff42ddbca <+3466>: movapd %xmm0,-0x50(%rsp)
0x00007ffff42ddbd0 <+3472>: movapd 0x20(%rsp),%xmm0
0x00007ffff42ddbd6 <+3478>: movapd -0x20(%rsp),%xmm1
0x00007ffff42ddbdc <+3484>: movapd %xmm0,0x170(%rsp)
0x00007ffff42ddbe5 <+3493>: movapd %xmm1,0x160(%rsp)
0x00007ffff42ddbee <+3502>: movapd 0x170(%rsp),%xmm0
0x00007ffff42ddbf7 <+3511>: mulpd %xmm1,%xmm0
0x00007ffff42ddbfb <+3515>: movapd %xmm0,-0x60(%rsp)
0x00007ffff42ddc01 <+3521>: movapd 0x10(%rsp),%xmm0
0x00007ffff42ddc07 <+3527>: movapd -0x30(%rsp),%xmm1
0x00007ffff42ddc0d <+3533>: movapd %xmm0,0x150(%rsp)
0x00007ffff42ddc16 <+3542>: movapd %xmm1,0x140(%rsp)
0x00007ffff42ddc1f <+3551>: movapd 0x150(%rsp),%xmm0
0x00007ffff42ddc28 <+3560>: mulpd %xmm1,%xmm0
0x00007ffff42ddc2c <+3564>: movapd %xmm0,-0x70(%rsp)
0x00007ffff42ddc32 <+3570>: movapd (%rsp),%xmm0
0x00007ffff42ddc37 <+3575>: movapd -0x40(%rsp),%xmm1
0x00007ffff42ddc3d <+3581>: movapd %xmm0,0x130(%rsp)
0x00007ffff42ddc46 <+3590>: movapd %xmm1,0x120(%rsp)
0x00007ffff42ddc4f <+3599>: movapd 0x130(%rsp),%xmm0
0x00007ffff42ddc58 <+3608>: mulpd %xmm1,%xmm0
0x00007ffff42ddc5c <+3612>: movapd %xmm0,-0x80(%rsp)
0x00007ffff42ddc62 <+3618>: movapd -0x50(%rsp),%xmm0
0x00007ffff42ddc68 <+3624>: movapd -0x60(%rsp),%xmm1
0x00007ffff42ddc6e <+3630>: movapd %xmm0,0x110(%rsp)
0x00007ffff42ddc77 <+3639>: movapd %xmm1,0x100(%rsp)
0x00007ffff42ddc80 <+3648>: movapd 0x110(%rsp),%xmm0
0x00007ffff42ddc89 <+3657>: addpd %xmm1,%xmm0
0x00007ffff42ddc8d <+3661>: movapd %xmm0,-0x50(%rsp)
0x00007ffff42ddc93 <+3667>: movapd -0x70(%rsp),%xmm1
0x00007ffff42ddc99 <+3673>: movapd %xmm0,0xf0(%rsp)
0x00007ffff42ddca2 <+3682>: movapd %xmm1,0xe0(%rsp)
0x00007ffff42ddcab <+3691>: movapd 0xf0(%rsp),%xmm0
0x00007ffff42ddcb4 <+3700>: addpd %xmm1,%xmm0
0x00007ffff42ddcb8 <+3704>: movapd %xmm0,-0x50(%rsp)
0x00007ffff42ddcbe <+3710>: movapd -0x80(%rsp),%xmm1
0x00007ffff42ddcc4 <+3716>: movapd %xmm0,0xd0(%rsp)
0x00007ffff42ddccd <+3725>: movapd %xmm1,0xc0(%rsp)
0x00007ffff42ddcd6 <+3734>: movapd 0xd0(%rsp),%xmm0
0x00007ffff42ddcdf <+3743>: addpd %xmm1,%xmm0
0x00007ffff42ddce3 <+3747>: movapd %xmm0,-0x50(%rsp)
0x00007ffff42ddce9 <+3753>: mov %r10,0xb8(%rsp)
0x00007ffff42ddcf1 <+3761>: movapd %xmm0,0xa0(%rsp)
0x00007ffff42ddcfa <+3770>: mov 0xb8(%rsp),%rax
0x00007ffff42ddd02 <+3778>: movapd %xmm0,(%rax)
0x00007ffff42ddd06 <+3782>: mov %rsi,%rax
0x00007ffff42ddd09 <+3785>: add $0xad8,%rsp
0x00007ffff42ddd10 <+3792>: retq
Benjamin Poulain
Comment 10
2013-01-03 10:02:36 PST
There you go, %rsi is unaligned.
It is the this pointer here.
If it was allocated on the stack, you may have either a bug in your compiler, or someone messed up the alignment of the stack (likely at library boundaries). If it is allocated on the heap, what allocator are you using?
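To make the diagnosis concrete: movapd requires its memory operand to sit on a 16-byte boundary, and the this value gdb printed in frame 0 (0x7fffffffd4c8) does not. The following is only a minimal sketch of that arithmetic; isAlignedTo and reportedThis are hypothetical names for illustration, not part of WebKit.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not WebKit code): true when `address` is a multiple
// of `alignment`. `alignment` must be a power of two.
constexpr bool isAlignedTo(std::uint64_t address, std::size_t alignment)
{
    return (address & (alignment - 1)) == 0;
}

// The `this` value reported by gdb in frame 0 of the backtrace.
constexpr std::uint64_t reportedThis = 0x7fffffffd4c8ULL;

// 0x...c8 is 8 bytes past a 16-byte boundary, so an aligned SSE store faults.
static_assert(!isAlignedTo(reportedThis, 16), "not 16-byte aligned");

// It is still 8-byte aligned, which is all a plain double needs.
static_assert(isAlignedTo(reportedThis, 8), "8-byte aligned");
```

That the address is 8-byte but not 16-byte aligned fits the picture of storage that was laid out for plain doubles rather than for SSE loads and stores.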
Dana Jansens
Comment 11
2013-01-03 10:07:09 PST
(In reply to comment #10)
> There you go, %rsi is unaligned.
>
> It is the this pointer here.
>
> If it was allocated on the stack, you may have either a bug in your compiler, or someone messed up the alignment of the stack (likely at library boundaries).
It's allocated on the stack, in WebTransformOperations.cpp, which is inside the WebKit library.
(gdb) frame 2
#2 0x00007ffff367dd8d in WebKit::WebTransformOperations::apply (this=0xb63778) at ../../third_party/WebKit/Source/WebCore/platform/chromium/support/WebTransformOperations.cpp:95
91 WebTransformationMatrix WebTransformOperations::apply() const
92 {
93     WebTransformationMatrix toReturn;
94     for (size_t i = 0; i < m_private->operations.size(); ++i)
95         toReturn.multiply(m_private->operations[i].matrix);
96     return toReturn;
97 }
(gdb) frame 1
#1 0x00007ffff3680e2d in WebKit::WebTransformationMatrix::multiply (this=0x7fffffffd4c8, t=...) at ../../third_party/WebKit/Source/WebCore/platform/chromium/support/WebTransformationMatrix.cpp:97
95 void WebTransformationMatrix::multiply(const WebTransformationMatrix& t)
96 {
97     m_private.multiply(t.m_private);
98 }
Dana Jansens
Comment 12
2013-01-03 10:09:15 PST
Would registers/asm at frames 1 and 2 help point it out?
Benjamin Poulain
Comment 13
2013-01-03 10:13:51 PST
(In reply to comment #11)
> (In reply to comment #10)
> > There you go, %rsi is unaligned.
> >
> > It is the this pointer here.
> >
> > If it was allocated on the stack, you may have either a bug in your compiler, or someone messed up the alignment of the stack (likely at library boundaries).
>
> It's allocated on the stack, in WebTransformOperations.cpp, which is inside the WebKit library.
In the patch, the alignment is specified as 16 bytes on the stack:
typedef double Matrix4[4][4] __attribute__((aligned (16)));
This is not followed in Chromium for some reason. Maybe:
1) A compiler bug (the assembly you pasted looks like quite aggressive debug code; do you have the bug in release?).
2) One of your libraries specifies a different stack alignment?
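For readers unfamiliar with alignment attributes, here is a rough, portable analogue of what the typedef asks the compiler to guarantee. It is only a sketch: it uses standard C++11 alignas instead of the GCC-style attribute from the patch, and AlignedMatrix4, isProperlyAligned and checkMultiplyPrecondition are hypothetical names, not WebKit code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the matrix storage. The patch itself uses the
// __attribute__((aligned (16))) typedef; alignas is the standard spelling.
struct AlignedMatrix4 {
    alignas(16) double m[4][4];
};

// The type now advertises a 16-byte requirement to every compiler that
// sees this definition.
static_assert(alignof(AlignedMatrix4) == 16, "type requests 16-byte alignment");

// Runtime check that a particular instance really honors the request.
bool isProperlyAligned(const AlignedMatrix4& matrix)
{
    return reinterpret_cast<std::uintptr_t>(&matrix) % alignof(AlignedMatrix4) == 0;
}

// A debug-only precondition of this kind at the top of a SIMD routine would
// turn a silent fault on movapd into a readable assertion failure.
void checkMultiplyPrecondition(const AlignedMatrix4& matrix)
{
    assert(isProperlyAligned(matrix));
}
```

The guarantee only holds for objects the compiler places itself. Code that reserves bytes for the object under a different type never sees the requirement, which is one way a stack object can end up misaligned.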
James Robinson
Comment 14
2013-01-03 11:05:51 PST
(In reply to comment #13)
> 2) One of your libraries specifies a different stack alignment?
I think this is the issue. Will let you know when I verify. In the future, would you prefer this sort of thing be restricted at compile-time to not run on chromium instead of reverted? We can't really leave a crash in.
James Robinson
Comment 15
2013-01-03 11:36:33 PST
The issue is that WebTransformationMatrix aliases storage for WebCore::TransformationMatrix in a different library without enforcing the same alignment requirements. This is an ugly hack that isn't needed any more, so I'll just make it not alias at all. This will take a little bit of time (probably not much more than an hour).
Benjamin - feel free to reland this patch with the ASM version guarded behind !PLATFORM(CHROMIUM) if you want to land before I get around to this.
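A simplified illustration of this kind of mismatch, under stated assumptions: the types below are hypothetical stand-ins, not the real WebCore or Chromium classes, and the actual wrapper layout may differ. The point is only that a wrapper can reserve the right number of bytes while promising a weaker alignment than the type it stands in for.

```cpp
#include <cstddef>

// Hypothetical stand-in for the inner, SIMD-friendly matrix type.
struct InnerMatrix {
    alignas(16) double m[4][4];
};

// A wrapper that reserves the right number of bytes but only promises the
// alignment of double, so it may be placed 8 bytes off a 16-byte boundary.
struct LooseWrapper {
    double storage[4][4];
};

// A wrapper that carries over the inner type's alignment requirement.
struct StrictWrapper {
    alignas(InnerMatrix) unsigned char storage[sizeof(InnerMatrix)];
};

static_assert(sizeof(LooseWrapper) == sizeof(InnerMatrix),
              "same size, so treating one as the other appears to work");
static_assert(alignof(LooseWrapper) < alignof(InnerMatrix),
              "but the loose wrapper may sit on a weaker boundary");
static_assert(alignof(StrictWrapper) == alignof(InnerMatrix),
              "alignas closes the gap");
```

Dropping the aliasing altogether, as proposed above, avoids having to keep two declarations in sync in the first place.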
Benjamin Poulain
Comment 16
2013-01-03 11:58:36 PST
> In the future, would you prefer this sort of thing be restricted at compile-time to not run on chromium instead of reverted? We can't really leave a crash in.
I think #ifdefing Chromium would have been reasonable given this works everywhere except on that test. I was annoyed because the patch was reverted without any information. Dana promptly provided the missing pieces so I guess it's ok.
> Benjamin - feel free to reland this patch with the ASM version guarded behind !PLATFORM(CHROMIUM) if you want to land before I get around to this.
I'll wait a bit. It would be nice if Chromium could have the optimization too. I will land tomorrow if you don't get back to me. Ping me if I can help.
I'll also add an #ifdef for FastMalloc. Strictly speaking, we cannot assume natural alignment from malloc with other allocators.
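As an aside on the allocator point, one generic way to obtain 16-byte-aligned heap storage without relying on what a particular malloc guarantees is to over-allocate and align inside the block. This is a sketch only and not what the patch does (the patch guards on FastMalloc); AlignedAllocation and allocate16ByteAligned are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Hypothetical result type: keep `raw` for std::free, use `aligned` for data.
struct AlignedAllocation {
    void* raw;
    void* aligned;
};

// Over-allocates by 15 bytes, then lets std::align pick the first 16-byte
// boundary inside the block. Returns null pointers if malloc fails.
AlignedAllocation allocate16ByteAligned(std::size_t size)
{
    std::size_t space = size + 15; // worst case: 15 bytes of padding
    void* raw = std::malloc(space);
    if (!raw)
        return { nullptr, nullptr };

    void* cursor = raw;
    void* aligned = std::align(16, size, cursor, space);
    assert(aligned && reinterpret_cast<std::uintptr_t>(aligned) % 16 == 0);
    return { raw, aligned };
}
```

The caller has to free the raw pointer, not the aligned one, which is why the sketch returns both.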
Benjamin Poulain
Comment 17
2013-01-04 14:40:19 PST
Created attachment 181377
Patch
Benjamin Poulain
Comment 18
2013-01-04 16:35:12 PST
Comment on attachment 181377
Patch
Clearing flags on attachment: 181377
Committed r138866: <http://trac.webkit.org/changeset/138866>
Benjamin Poulain
Comment 19
2013-01-04 16:35:16 PST
All reviewed patches have been landed. Closing bug.