Bug 93080 - 64-bit JSValues can be placed in XMM registers on X86
Summary: 64-bit JSValues can be placed in XMM registers on X86
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-08-03 01:58 PDT by Yuqiang Xian
Modified: 2013-11-05 08:42 PST (History)
4 users (show)

See Also:


Attachments
wip (55.01 KB, patch)
2012-08-03 02:00 PDT, Yuqiang Xian
no flags Details | Formatted Diff | Diff
perf result so far (5.91 KB, text/plain)
2012-08-03 02:03 PDT, Yuqiang Xian
no flags Details
patch (55.00 KB, patch)
2012-08-07 05:59 PDT, Yuqiang Xian
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yuqiang Xian 2012-08-03 01:58:16 PDT
On X86 we have limited number of general purpose registers and only 5 of them are left for allocation. In JSC a 64-bit JSValue will occupy 2 GPRs. Instead of simply spilling the GPRs into memory, we could pack the pair of GPRs holding a JSValue into an XMM register which has smaller usage pressure. The SSE (especially SSE4.1) support makes the packing and extraction of the data cheaper than storing to and reading from memory.
Comment 1 Yuqiang Xian 2012-08-03 02:00:55 PDT
Created attachment 156292 [details]
wip
Comment 2 Yuqiang Xian 2012-08-03 02:03:32 PDT
Created attachment 156293 [details]
perf result so far
Comment 3 Filip Pizlo 2012-08-03 09:25:33 PDT
Comment on attachment 156292 [details]
wip

I like this!  Looks good, so far.
Comment 4 Filip Pizlo 2012-08-03 09:36:03 PDT
Have you considered the following, related, optimization: if a node is known to only be used in contexts that are non-numerical, non-boolean, and the node is never used as the base of a memory access, then we can store the value of the node in an XMM register without every doing pack/unpack, and it may even be profitable to do so on 64-bit.

Example:

a: GetLocal(r42)
b: GetLocal(r43)
c: PutByOffset(@a, @a, @b)
... // no other uses of @b

In this case, @b can be stored in an XMM register since the only thing that PutByOffset is doing is storing it back to memory.

This is likely to also work on any platform that has 64-bit float registers.  It's likely to be most profitable if at the point of the GetLocal we are under register pressure.  But, it will require some changes to how we implement the code for things like PutByOffset - it'll have to be somehow agnostic to the source of the value being either in a GPR or in an XMM.
Comment 5 Yuqiang Xian 2012-08-07 02:03:33 PDT
(In reply to comment #4)
> Have you considered the following, related, optimization: if a node is known to only be used in contexts that are non-numerical, non-boolean, and the node is never used as the base of a memory access, then we can store the value of the node in an XMM register without every doing pack/unpack, and it may even be profitable to do so on 64-bit.
> 
> Example:
> 
> a: GetLocal(r42)
> b: GetLocal(r43)
> c: PutByOffset(@a, @a, @b)
> ... // no other uses of @b
> 
> In this case, @b can be stored in an XMM register since the only thing that PutByOffset is doing is storing it back to memory.
> 

Yes that can be helpful, while I guess we may need more work on the def-use analysis to detect such cases.

> This is likely to also work on any platform that has 64-bit float registers.  

That's true. This also applies to my current change. While I choose to enable it on X86 only now because it's the platform that suffers most from the lack of GPRs. I haven't yet evaluated the benefit on other platforms.

> It's likely to be most profitable if at the point of the GetLocal we are under register pressure.  But, it will require some changes to how we implement the code for things like PutByOffset - it'll have to be somehow agnostic to the source of the value being either in a GPR or in an XMM.

Agree. Even in current JIT (with this patch) we could have the code generator aware of the source of a JSValue. I choose not to do it right now as I want to keep this change simpler and haven't yet evaluated the potential benefit.
Comment 6 Yuqiang Xian 2012-08-07 05:59:45 PDT
Created attachment 156923 [details]
patch

Asking for a review. Thanks.
Comment 7 Yuqiang Xian 2012-08-22 00:32:10 PDT
ping...
Comment 8 Brent Fulgham 2013-10-08 14:14:13 PDT
Did anything ever happen with this? Seems like a reasonable idea.
Comment 9 Filip Pizlo 2013-10-08 14:42:24 PDT
(In reply to comment #8)
> Did anything ever happen with this? Seems like a reasonable idea.

It's not necessarily a good idea.  Moving between GPR's and XMM registers can be very expensive so most compilers do not try to use XMM registers when they run out of GPRs.  It's usually cheaper to spill to the stack.

This could only have been a win on x86-32, but I'm not sure we care about that so much anymore.
Comment 10 Csaba Osztrogonác 2013-11-05 08:42:59 PST
Comment on attachment 156923 [details]
patch

Cleared review? from attachment 156923 [details] so that this bug does not appear in http://webkit.org/pending-review.  If you would like this patch reviewed, please attach it to a new bug (or re-open this bug before marking it for review again).