Bug 125275

Summary: FTL should use cvttsd2si directly for double-to-int32 conversions
Product: WebKit Reporter: Filip Pizlo <fpizlo>
Component: JavaScriptCoreAssignee: Filip Pizlo <fpizlo>
Status: RESOLVED FIXED    
Severity: Normal CC: barraclough, ggaren, mark.lam, mhahnenberg, msaboff, nrotem, oliver, sam
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Bug Depends on:    
Bug Blocks: 112840, 125283    
Attachments:
Description Flags
the patch msaboff: review+

Filip Pizlo
Reported 2013-12-04 18:50:20 PST
Patch forthcoming.
Attachments
the patch (25.84 KB, patch)
2013-12-04 22:49 PST, Filip Pizlo
msaboff: review+
Filip Pizlo
Comment 1 2013-12-04 22:49:28 PST
Created attachment 218490 [details] the patch
Nadav Rotem
Comment 2 2013-12-05 09:12:50 PST
Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it. If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency. I would zero the register before inserting a value into it.
Filip Pizlo
Comment 3 2013-12-05 17:34:35 PST
(In reply to comment #2) > Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it. If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency. For my own edification, can you explain what the downside to my current approach is? Under what conditions would this lead to a partial-register dependency, and why is that bad? I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits. But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway? > I would zero the register before inserting a value into it. Thanks for the feedback, I will do it! To make sure you mean: %1 = insertelement undef, %doubleValue, i32 0 %2 = insertelement %1, 0.0, i32 1 %3 = call @llvm.x86.sse2.cvttsd2si(%2) Right?
Filip Pizlo
Comment 4 2013-12-05 17:41:55 PST
(In reply to comment #3) > (In reply to comment #2) > > Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it. If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency. > > For my own edification, can you explain what the downside to my current approach is? Under what conditions would this lead to a partial-register dependency, and why is that bad? > > I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits. But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway? > > > I would zero the register before inserting a value into it. > > Thanks for the feedback, I will do it! To make sure you mean: > > %1 = insertelement undef, %doubleValue, i32 0 > %2 = insertelement %1, 0.0, i32 1 > %3 = call @llvm.x86.sse2.cvttsd2si(%2) > > Right? OK, I made this change and it seems to work. I will land with this change. In particular, here's what the relevant method looks like now: LValue sensibleDoubleToInt(LValue value) { RELEASE_ASSERT(isX86()); return call( x86SSE2CvtTSD2SIIntrinsic(), insertElement( insertElement(getUndef(vectorType(doubleType, 2)), value, int32Zero), doubleZero, int32One)); }
Filip Pizlo
Comment 5 2013-12-05 17:45:21 PST
Nadav Rotem
Comment 6 2013-12-05 20:36:50 PST
I(In reply to comment #3) > (In reply to comment #2) > > Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it. If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency. > > For my own edification, can you explain what the downside to my current approach is? Under what conditions would this lead to a partial-register dependency, and why is that bad? > > I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits. But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway? > > > I would zero the register before inserting a value into it. > > Thanks for the feedback, I will do it! To make sure you mean: > > %1 = insertelement undef, %doubleValue, i32 0 > %2 = insertelement %1, 0.0, i32 1 > %3 = call @llvm.x86.sse2.cvttsd2si(%2) > > Right? The IR looks right. The assembly looks okay ?
Filip Pizlo
Comment 7 2013-12-08 12:25:14 PST
(In reply to comment #6) > I(In reply to comment #3) > > (In reply to comment #2) > > > Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it. If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency. > > > > For my own edification, can you explain what the downside to my current approach is? Under what conditions would this lead to a partial-register dependency, and why is that bad? > > > > I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits. But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway? > > > > > I would zero the register before inserting a value into it. > > > > Thanks for the feedback, I will do it! To make sure you mean: > > > > %1 = insertelement undef, %doubleValue, i32 0 > > %2 = insertelement %1, 0.0, i32 1 > > %3 = call @llvm.x86.sse2.cvttsd2si(%2) > > > > Right? > > The IR looks right. The assembly looks okay ? Yup! And it looked identical to what it was before. Can you explain what the partial-register dependency problem is?
Nadav Rotem
Comment 8 2013-12-08 18:57:13 PST
> Can you explain what the partial-register dependency problem is? Oops, I confused CVTTSD2SI with CVTTSI2SD. This code looks good. The the problem with the 2SD instruction is that it modifies a part of the XMM register. So the out-of-order engine can't execute the instruction whenever it wants because we need to wait for the upper part of the register to be calculated before we can execute the 2SD instruction. By zeroing the upper part of the register we break this false dependency. Again, this does not apply to this instruction because we only read from the XMM register, not write into it.
Note You need to log in before you can comment on or make changes to this bug.