Bug 125275 - FTL should use cvttsd2si directly for double-to-int32 conversions
Summary: FTL should use cvttsd2si directly for double-to-int32 conversions
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Filip Pizlo
URL:
Keywords:
Depends on:
Blocks: 112840 125283
  Show dependency treegraph
 
Reported: 2013-12-04 18:50 PST by Filip Pizlo
Modified: 2013-12-08 18:57 PST (History)
8 users (show)

See Also:


Attachments
the patch (25.84 KB, patch)
2013-12-04 22:49 PST, Filip Pizlo
msaboff: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Filip Pizlo 2013-12-04 18:50:20 PST
Patch forthcoming.
Comment 1 Filip Pizlo 2013-12-04 22:49:28 PST
Created attachment 218490 [details]
the patch
Comment 2 Nadav Rotem 2013-12-05 09:12:50 PST
Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it.  If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency.  I would zero the register before inserting a value into it.
Comment 3 Filip Pizlo 2013-12-05 17:34:35 PST
(In reply to comment #2)
> Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it.  If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency.

For my own edification, can you explain what the downside to my current approach is?  Under what conditions would this lead to a partial-register dependency, and why is that bad?

I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits.  But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway?

> I would zero the register before inserting a value into it.

Thanks for the feedback, I will do it!  To make sure you mean:

%1 = insertelement undef, %doubleValue, i32 0
%2 = insertelement %1, 0.0, i32 1
%3 = call @llvm.x86.sse2.cvttsd2si(%2)

Right?
Comment 4 Filip Pizlo 2013-12-05 17:41:55 PST
(In reply to comment #3)
> (In reply to comment #2)
> > Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it.  If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency.
> 
> For my own edification, can you explain what the downside to my current approach is?  Under what conditions would this lead to a partial-register dependency, and why is that bad?
> 
> I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits.  But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway?
> 
> > I would zero the register before inserting a value into it.
> 
> Thanks for the feedback, I will do it!  To make sure you mean:
> 
> %1 = insertelement undef, %doubleValue, i32 0
> %2 = insertelement %1, 0.0, i32 1
> %3 = call @llvm.x86.sse2.cvttsd2si(%2)
> 
> Right?

OK, I made this change and it seems to work.  I will land with this change.  In particular, here's what the relevant method looks like now:

    LValue sensibleDoubleToInt(LValue value)
    {
        RELEASE_ASSERT(isX86());
        return call(
            x86SSE2CvtTSD2SIIntrinsic(),
            insertElement(
                insertElement(getUndef(vectorType(doubleType, 2)), value, int32Zero),
                doubleZero, int32One));
    }
Comment 5 Filip Pizlo 2013-12-05 17:45:21 PST
Landed in http://trac.webkit.org/changeset/160205
Comment 6 Nadav Rotem 2013-12-05 20:36:50 PST
I(In reply to comment #3)
> (In reply to comment #2)
> > Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it.  If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency.
> 
> For my own edification, can you explain what the downside to my current approach is?  Under what conditions would this lead to a partial-register dependency, and why is that bad?
> 
> I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits.  But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway?
> 
> > I would zero the register before inserting a value into it.
> 
> Thanks for the feedback, I will do it!  To make sure you mean:
> 
> %1 = insertelement undef, %doubleValue, i32 0
> %2 = insertelement %1, 0.0, i32 1
> %3 = call @llvm.x86.sse2.cvttsd2si(%2)
> 
> Right?

The IR looks right. The assembly looks okay ?
Comment 7 Filip Pizlo 2013-12-08 12:25:14 PST
(In reply to comment #6)
> I(In reply to comment #3)
> > (In reply to comment #2)
> > > Filip, in sensibleDoubleToInt you are creating an undef vector and insert a value into it.  If you don't initialize the vector the RA can pick whatever vector register it wants, potentially causing a partial-register dependency.
> > 
> > For my own edification, can you explain what the downside to my current approach is?  Under what conditions would this lead to a partial-register dependency, and why is that bad?
> > 
> > I was leaving the second element of the vector as undef bits because I thought that this would be the best way to tell LLVM that it's OK to use a register that had garbage in those bits.  But I take it that initializing it to zero would anyway allow LLVM to do this because LLVM can see that the initialization of those bits to zero is dead anyway?
> > 
> > > I would zero the register before inserting a value into it.
> > 
> > Thanks for the feedback, I will do it!  To make sure you mean:
> > 
> > %1 = insertelement undef, %doubleValue, i32 0
> > %2 = insertelement %1, 0.0, i32 1
> > %3 = call @llvm.x86.sse2.cvttsd2si(%2)
> > 
> > Right?
> 
> The IR looks right. The assembly looks okay ?

Yup!  And it looked identical to what it was before.

Can you explain what the partial-register dependency problem is?
Comment 8 Nadav Rotem 2013-12-08 18:57:13 PST
> Can you explain what the partial-register dependency problem is?

Oops, I confused CVTTSD2SI with CVTTSI2SD. This code looks good. The the problem with the 2SD instruction is that it modifies a part of the XMM register. So the out-of-order engine can't execute the instruction whenever it wants because we need to wait for the upper part of the register to be calculated before we can execute the 2SD instruction.  By zeroing the upper part of the register we break this false dependency. Again, this does not apply to this instruction because we only read from the XMM register, not write into it.