Bug 19180 - speed up SunSpider by optimizing immediate number cases
Summary: speed up SunSpider by optimizing immediate number cases
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-05-21 22:14 PDT by Darin Adler
Modified: 2012-03-07 17:47 PST (History)
4 users (show)

See Also:


Attachments
patch for < operator (1.20 KB, patch)
2008-05-21 22:16 PDT, Darin Adler
no flags Details | Formatted Diff | Diff
patch for &, |, and ^ operators (4.56 KB, patch)
2008-05-26 13:49 PDT, Darin Adler
no flags Details | Formatted Diff | Diff
Patch for + and - (5.04 KB, patch)
2008-05-27 10:11 PDT, Alexey Proskuryakov
no flags Details | Formatted Diff | Diff
Patch for ++ and -- (5.26 KB, patch)
2008-05-28 03:27 PDT, Alexey Proskuryakov
no flags Details | Formatted Diff | Diff
Patch for >>, >>>, ==, ===, !=, !== (16.27 KB, patch)
2008-05-28 10:29 PDT, Alexey Proskuryakov
no flags Details | Formatted Diff | Diff
Patch for >>, ==, ===, !=, !== (16.03 KB, patch)
2008-05-28 11:47 PDT, Alexey Proskuryakov
no flags Details | Formatted Diff | Diff
Patch for >>, ==, ===, !=, !== (16.21 KB, patch)
2008-05-29 23:06 PDT, Alexey Proskuryakov
no flags Details | Formatted Diff | Diff
Patch for << and >>> (3.96 KB, patch)
2008-05-30 06:51 PDT, Alexey Proskuryakov
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Darin Adler 2008-05-21 22:14:07 PDT
I want to add immediate number special cases to hot opcodes. It seems to be working.
Comment 1 Darin Adler 2008-05-21 22:16:20 PDT
Created attachment 21285 [details]
patch for < operator
Comment 2 Maciej Stachowiak 2008-05-21 22:17:22 PDT
Comment on attachment 21285 [details]
patch for < operator

r=me
Comment 3 Darin Adler 2008-05-21 23:22:08 PDT
Comment on attachment 21285 [details]
patch for < operator

Clearing since I landed this.
Comment 4 Darin Adler 2008-05-26 13:49:35 PDT
Created attachment 21348 [details]
patch for &, |, and ^ operators
Comment 5 Alexey Proskuryakov 2008-05-27 08:14:21 PDT
I'm going to see if jsAdd can be similarly optimized - about 60% of number additions in SunSpider are for immediates.
Comment 6 Darin Adler 2008-05-27 09:09:48 PDT
(In reply to comment #5)
> I'm going to see if jsAdd can be similarly optimized - about 60% of number
> additions in SunSpider are for immediates.

For what it's worth, I tried that and failed.
Comment 7 Darin Adler 2008-05-27 09:10:02 PDT
Comment on attachment 21348 [details]
patch for &, |, and ^ operators

Clear flag since I landed this patch.
Comment 8 Alexey Proskuryakov 2008-05-27 10:11:24 PDT
Created attachment 21363 [details]
Patch for + and -

Hmm, makes me wonder if I got my bit operations wrong. This passes JSC tests, still rebuilding to run layout tests.

Oliver has a similar patch for ++, which we discussed a little yesterday.
Comment 9 Darin Adler 2008-05-27 11:12:44 PDT
OK. My original patch did ++, --, + and -, but there was a net slowdown for all of them.
Comment 10 Darin Adler 2008-05-27 11:13:36 PDT
Comment on attachment 21363 [details]
Patch for + and -

r=me
Comment 11 Alexey Proskuryakov 2008-05-27 12:55:59 PDT
So, I tried to land my patch, and re-measured performance with ToT, just in case. And it has turned into a 6% regression!

I've triple-checked the numbers, and it is just that: your patch is a 1% win, mine is 1.8% win, but together, they become a 6% regression.
Comment 12 Darin Adler 2008-05-27 13:42:33 PDT
(In reply to comment #11)
> I've triple-checked the numbers, and it is just that: your patch is a 1% win,
> mine is 1.8% win, but together, they become a 6% regression.

Which specific patch do you mean when you say "your patch"?
Comment 13 Alexey Proskuryakov 2008-05-27 13:53:20 PDT
I was talking about the latest one, which I didn't have in my tree when measuring originally, "patch for &, |, and ^ operators".
Comment 14 Alexey Proskuryakov 2008-05-28 02:55:27 PDT
Comment on attachment 21363 [details]
Patch for + and -

Landed this patch, it was a 2% improvement after a slight random change that should have been a slowdown - I've moved VM_CHECK_EXCEPTION() out of else block. Applying the same change to bitops regresses performance, though.

On disassembly, I saw a lot of changes everywhere in Machine::executePrivate, but I can't tell what gcc was thinking. There are also changes in completely unrelated places, such as base64 encoding and registerThread().
Comment 15 Alexey Proskuryakov 2008-05-28 03:27:22 PDT
Created attachment 21384 [details]
Patch for ++ and --
Comment 16 Maciej Stachowiak 2008-05-28 03:29:34 PDT
Comment on attachment 21384 [details]
Patch for ++ and --

r=me
Comment 17 Alexey Proskuryakov 2008-05-28 03:49:47 PDT
Comment on attachment 21384 [details]
Patch for ++ and --

Clearing since I landed this.
Comment 18 Alexey Proskuryakov 2008-05-28 05:19:01 PDT
From the looks of it, right shift should also benefit from such optimization, but I cannot get it to work yet. I do not see any other cases to optimize in this manner.
Comment 19 Alexey Proskuryakov 2008-05-28 10:29:28 PDT
Created attachment 21395 [details]
Patch for >>, >>>, ==, ===, !=, !==
Comment 20 Alexey Proskuryakov 2008-05-28 10:44:11 PDT
Comment on attachment 21395 [details]
Patch for >>, >>>, ==, ===, !=, !==

Oops, this is buggy...
Comment 21 Darin Adler 2008-05-28 10:47:37 PDT
Comment on attachment 21395 [details]
Patch for >>, >>>, ==, ===, !=, !==

 124         return reinterpret_cast<JSValue*>((reinterpret_cast<intptr_t>(val) >> ((reinterpret_cast<uintptr_t>(shift) >> 2) & 0x1f)) | NumberType);

Does this do the correct thing on numbers that have low bits set? What about a 1 shifted right by 1?
Comment 22 Alexey Proskuryakov 2008-05-28 11:47:50 PDT
Created attachment 21398 [details]
Patch for >>, ==, ===, !=, !==

Removed the fast path for >>> for now, as it was quite incorrect for negative numbers, and fixing this results in rather mixed performance changes.

(In reply to comment #21)
> Does this do the correct thing on numbers that have low bits set? 

As far as I can tell, it does.

> What about a 1 shifted right by 1?

That's (7 >> (7 >> 2)) | 3 == (7 >> 1) | 3 == 3, which is correct.
Comment 23 Alexey Proskuryakov 2008-05-29 23:06:28 PDT
Created attachment 21425 [details]
Patch for >>, ==, ===, !=, !==

Renamed v1/v2 to src1/src2, to make code match documentation.

I just realized that there is the same issue on PPC with << - I'll see if a fix for it can be combined with an optimization again, or will file a separate bug.
Comment 24 Oliver Hunt 2008-05-30 00:29:33 PDT
Comment on attachment 21425 [details]
Patch for >>, ==, ===, !=, !==

r=me, assuming layout and jsc tests pass.  And assuming you tested perf on current ToT :D
Comment 25 Alexey Proskuryakov 2008-05-30 02:16:36 PDT
Comment on attachment 21425 [details]
Patch for >>, ==, ===, !=, !==

Clearing, because I landed this.

(In reply to comment #24)
> And assuming you tested perf on current ToT :D

It was a larger speedup today - because r34204 was a random regression, and this patch brings absolute numbers back to what I saw before.
Comment 26 Alexey Proskuryakov 2008-05-30 06:51:03 PDT
Created attachment 21430 [details]
Patch for << and >>>

Randomness strikes again, but a win overall.

Many of these above cases can be further improved by using inline assembly - it is faster to try an arithmetic operation and check carry and overflow flags than to check if it can be run.
Comment 27 Darin Adler 2008-05-30 07:25:57 PDT
Comment on attachment 21430 [details]
Patch for << and >>>

If you fixed a regression on PowerPC, was there a regression test indicating it?

+        (KJS::JSImmediate::toTruncatedUInt32): Added. Same as getTruncatedInt32, but casts the result
+        to unsigned.

I'm concerned about the "to" vs. "get" in the name here. This class seems we use "get" to mean "convert with no possibility of error" and "to" to mean "convert with a possible exception or error".

r=me
Comment 28 Alexey Proskuryakov 2008-05-30 13:12:46 PDT
Comment on attachment 21430 [details]
Patch for << and >>>

Clearing review flag, since I landed this.

(In reply to comment #27)
> If you fixed a regression on PowerPC, was there a regression test indicating
> it?

Yes, it's one of the three Mozilla tests that I claimed to have fixed in the previous patch. Only two were fixed, in fact.

> I'm concerned about the "to" vs. "get" in the name here. This class seems we
> use "get" to mean "convert with no possibility of error" and "to" to mean
> "convert with a possible exception or error".

I'm rather confused about the naming in JSImmediate.h - some "get" functions also return a bool to indicate success. That said, this new function just implements "ToUInt32" algorithm from ECMA-232 for immediates as far as I can tell, and is not really a "get" in the sense that -1 turns into 4294967296. The "get" functions that we have do not cast the result like that. I wasn't sure what the word "truncated" meant here, but since this function does almost the same work as getTruncatedInt32, I kept it.
Comment 29 Darin Adler 2008-05-30 13:34:43 PDT
(In reply to comment #28)
> I'm rather confused about the naming in JSImmediate.h - some "get" functions
> also return a bool to indicate success. That said, this new function just
> implements "ToUInt32" algorithm from ECMA-232 for immediates as far as I can
> tell, and is not really a "get" in the sense that -1 turns into 4294967296. The
> "get" functions that we have do not cast the result like that. I wasn't sure
> what the word "truncated" meant here, but since this function does almost the
> same work as getTruncatedInt32, I kept it.

I can live with this. It's worth redoing the names at some point.

The "truncated" is left over from when immediate numbers could be non-integers. "Truncated" meant that the function was free to discard the fractional portion or values that didn't fit in a 32-bit integer.
Comment 30 Alexey Proskuryakov 2008-06-03 05:27:56 PDT
I think I'm done with this for now - do we need to keep the bug open?
Comment 31 Darin Adler 2008-06-03 09:26:01 PDT
Fine to close this if there are no more obvious cases to optimize.

But <= and >= haven't been done yet. Maybe there are others.
Comment 32 Gavin Barraclough 2012-03-07 17:47:57 PST
The JITs generate immediate arithmetic ops & compares.