Bug 125275

Summary:

FTL should use cvttsd2si directly for double-to-int32 conversions

Product:

WebKit

Reporter:

Filip Pizlo <fpizlo>

Component:

JavaScriptCore

Assignee:

Filip Pizlo <fpizlo>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

barraclough, ggaren, mark.lam, mhahnenberg, msaboff, nrotem, oliver, sam

Priority:

Version:

528+ (Nightly build)

Hardware:

All

OS:

All

Bug Depends on:

Bug Blocks:

112840, 125283

Attachments:

Description	Flags
the patch	msaboff: review+

Filip Pizlo

Reported 2013-12-04 18:50:20 PST

Patch forthcoming.

Attachments
the patch (25.84 KB, patch) 2013-12-04 22:49 PST, Filip Pizlo	msaboff: review+	Details Formatted Diff Diff
View All Add attachment proposed patch, testcase, etc.

Filip Pizlo

Comment 1 2013-12-04 22:49:28 PST

Created attachment 218490 [details] the patch

Nadav Rotem

Comment 2 2013-12-05 09:12:50 PST

Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it. If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency. I would zero the register before inserting a value into it.

Filip Pizlo

Comment 3 2013-12-05 17:34:35 PST

(In reply to comment #2) > Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it. If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency. For my own edification, can you explain what the downside to my current approach is? Under what conditions would this lead to a partial-register dependency, and why is that bad? I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits. But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway? > I would zero the register before inserting a value into it. Thanks for the feedback, I will do it! To make sure you mean: %1 = insertelement undef, %doubleValue, i32 0 %2 = insertelement %1, 0.0, i32 1 %3 = call @llvm.x86.sse2.cvttsd2si(%2) Right?

Filip Pizlo

Comment 4 2013-12-05 17:41:55 PST

(In reply to comment #3) > (In reply to comment #2) > > Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it. If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency. > > For my own edification, can you explain what the downside to my current approach is? Under what conditions would this lead to a partial-register dependency, and why is that bad? > > I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits. But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway? > > > I would zero the register before inserting a value into it. > > Thanks for the feedback, I will do it! To make sure you mean: > > %1 = insertelement undef, %doubleValue, i32 0 > %2 = insertelement %1, 0.0, i32 1 > %3 = call @llvm.x86.sse2.cvttsd2si(%2) > > Right? OK, I made this change and it seems to work. I will land with this change. In particular, here's what the relevant method looks like now: LValue sensibleDoubleToInt(LValue value) { RELEASE_ASSERT(isX86()); return call( x86SSE2CvtTSD2SIIntrinsic(), insertElement( insertElement(getUndef(vectorType(doubleType, 2)), value, int32Zero), doubleZero, int32One)); }

Filip Pizlo

Comment 5 2013-12-05 17:45:21 PST

Landed in http://trac.webkit.org/changeset/160205

Nadav Rotem

Comment 6 2013-12-05 20:36:50 PST

I(In reply to comment #3) > (In reply to comment #2) > > Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it. If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency. > > For my own edification, can you explain what the downside to my current approach is? Under what conditions would this lead to a partial-register dependency, and why is that bad? > > I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits. But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway? > > > I would zero the register before inserting a value into it. > > Thanks for the feedback, I will do it! To make sure you mean: > > %1 = insertelement undef, %doubleValue, i32 0 > %2 = insertelement %1, 0.0, i32 1 > %3 = call @llvm.x86.sse2.cvttsd2si(%2) > > Right? The IR looks right. The assembly looks okay ?

Filip Pizlo

Comment 7 2013-12-08 12:25:14 PST

(In reply to comment #6) > I(In reply to comment #3) > > (In reply to comment #2) > > > Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it. If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency. > > > > For my own edification, can you explain what the downside to my current approach is? Under what conditions would this lead to a partial-register dependency, and why is that bad? > > > > I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits. But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway? > > > > > I would zero the register before inserting a value into it. > > > > Thanks for the feedback, I will do it! To make sure you mean: > > > > %1 = insertelement undef, %doubleValue, i32 0 > > %2 = insertelement %1, 0.0, i32 1 > > %3 = call @llvm.x86.sse2.cvttsd2si(%2) > > > > Right? > > The IR looks right. The assembly looks okay ? Yup! And it looked identical to what it was before. Can you explain what the partial-register dependency problem is?

Nadav Rotem

Comment 8 2013-12-08 18:57:13 PST

> Can you explain what the partial-register dependency problem is? Oops, I confused CVTTSD2SI with CVTTSI2SD. This code looks good. The the problem with the 2SD instruction is that it modifies a part of the XMM register. So the out-of-order engine can't execute the instruction whenever it wants because we need to wait for the upper part of the register to be calculated before we can execute the 2SD instruction. By zeroing the upper part of the register we break this false dependency. Again, this does not apply to this instruction because we only read from the XMM register, not write into it.

Note You need to log in before you can comment on or make changes to this bug.