Bug 48060 - Speed up op_jeq_null and op_jneq_null in JIT
Summary: Speed up op_jeq_null and op_jneq_null in JIT
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Enhancement
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-10-21 03:59 PDT by Gabor Loki
Modified: 2010-10-27 13:39 PDT (History)
7 users (show)

See Also:


Attachments
Speed up op_jeq_null and op_jneq_null in JIT (2.60 KB, patch)
2010-10-21 04:10 PDT, Gabor Loki
oliver: review-
oliver: commit-queue-
Details | Formatted Diff | Diff
Speed up op_jeq_null and op_jneq_null in JIT (3.63 KB, patch)
2010-10-27 06:21 PDT, Gabor Loki
no flags Details | Formatted Diff | Diff
Speed up op_jeq_null and op_jneq_null in JIT (3.63 KB, patch)
2010-10-27 06:37 PDT, Gabor Loki
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Gabor Loki 2010-10-21 03:59:35 PDT
For both opcodes (op_jeq_null, op_jneq_null) the NullTag and UndefinedTag are checked to control the jump. These values differ only in the lowest bit. So, we can use some arithmetic instructions instead of set and compare instruction sequences.
Comment 1 Gabor Loki 2010-10-21 04:10:52 PDT
Created attachment 71421 [details]
Speed up op_jeq_null and op_jneq_null in JIT

An artificial test says this construct is 1.1x faster on x86 and 1.25x faster on ARM.

SunSpider says nothing on x86, but it is 1.01x faster on ARM.
Comment 2 Oliver Hunt 2010-10-22 13:11:26 PDT
Comment on attachment 71421 [details]
Speed up op_jeq_null and op_jneq_null in JIT

I think we can do better than this.  Currently there's
        enum { Int32Tag =        0xffffffff };
        enum { CellTag =         0xfffffffe };
        enum { TrueTag =         0xfffffffd };
        enum { FalseTag =        0xfffffffc };
        enum { NullTag =         0xfffffffb };
        enum { UndefinedTag =    0xfffffffa };
        enum { EmptyValueTag =   0xfffffff9 };
        enum { DeletedValueTag = 0xfffffff8 };

If we reorder these as:
        enum { NullTag =         0xfffffffb };
        enum { UndefinedTag =    0xfffffffe };
        enum { Int32Tag =        0xfffffffd };
        enum { CellTag =         0xfffffffc };
        enum { TrueTag =         0xfffffffb };
        enum { FalseTag =        0xfffffffa };
        enum { EmptyValueTag =   0xfffffff9 };
        enum { DeletedValueTag = 0xfffffff8 };

Then we don't need any bit logic at all, and can replace it with
addJump(branch32(GreaterOrEqual, regT1, Imm32(JSValue::UndefinedTag)), target);

(or whichever is the unsigned >= condition)

Gavin is there any reason this would not work?
Comment 3 Gabor Loki 2010-10-22 13:29:03 PDT
> If we reorder these as:
>         enum { NullTag =         0xfffffffb };

It should be 0xffffffff (not 0xfffffffb)

>         enum { UndefinedTag =    0xfffffffe };
>         enum { Int32Tag =        0xfffffffd };
>         enum { CellTag =         0xfffffffc };
>         enum { TrueTag =         0xfffffffb };
>         enum { FalseTag =        0xfffffffa };
>         enum { EmptyValueTag =   0xfffffff9 };
>         enum { DeletedValueTag = 0xfffffff8 };

Anyway, it looks good to me, but we have to guard it with an ASSERT as well.
eg.: ASSERT(NullTag == 0xffffffff && UndefinedTag = 0xfffffffe);
or ASSERT(NullTag + 1 == 0 && UndefinedTag + 1 == NullTag);
Well, not a very nice one, but it is just an ASSERT.
Comment 4 Zoltan Herczeg 2010-10-22 14:07:20 PDT
>         enum { Int32Tag =        0xffffffff };
>         enum { CellTag =         0xfffffffe };

If my memory seves my correctly a lot of code depend on these two constants.

Perhaps we can move "down" the "undefined" like constants to the end of the list.
Comment 5 Gavin Barraclough 2010-10-22 14:16:06 PDT
You probably want Geoff or Sam to comment on this, not me.  I only remember a reliance on the values of true & false – (false | 1) == true, but there may well be other ordering issues.
Comment 6 Oliver Hunt 2010-10-22 14:16:49 PDT
(In reply to comment #4)
> >         enum { Int32Tag =        0xffffffff };
> >         enum { CellTag =         0xfffffffe };
> 
> If my memory seves my correctly a lot of code depend on these two constants.
> 
> Perhaps we can move "down" the "undefined" like constants to the end of the list.

Adding Sam and Geoff as they know the most about these tags.  To my knowledge though there aren't any real constraints on these values so we could look at maybe making use of these bits in more interesting ways
Comment 7 Gavin Barraclough 2010-10-22 14:20:29 PDT
If we do sort this, would be nice to get rid of DeletedValueTag.

We don't use a special tag for this in JSVALUE64 – the value just needs to not be normally possible – e.g. tag = CellTag payload = 1 or -1 should be fine.
Comment 8 Oliver Hunt 2010-10-22 14:26:28 PDT
(In reply to comment #4)
> >         enum { Int32Tag =        0xffffffff };
> >         enum { CellTag =         0xfffffffe };
> 
> If my memory seves my correctly a lot of code depend on these two constants.
> 
> Perhaps we can move "down" the "undefined" like constants to the end of the list.

I was going to say we couldn't do this, and then i forgot why, now i remembered -- the goal i had was to reduce the null + undefined check to a single branch with no additional logic.  If we moved them down rather than up we'd need to do a lower+upper bound check as we currently identify doubles simply by doing a < DeletedValueTag check.
Comment 9 Zoltan Herczeg 2010-10-22 23:58:20 PDT
> >         enum { Int32Tag =        0xffffffff };

I remember an optimization:

if ((val1.tag() & val2.tag()) == 0xffffffff) -> both are integer.

And CellTag has only one zero bit at the end of the bit stream:

ALWAYS_INLINE JIT::Jump JIT::emitJumpIfBothJSCells(RegisterID reg1, RegisterID reg2, RegisterID scratch)
{
    move(reg1, scratch);
    orPtr(reg2, scratch);
    return emitJumpIfJSCell(scratch);
}
Comment 10 Gabor Loki 2010-10-27 01:04:02 PDT
>         enum { NullTag =         0xffffffff };
>         enum { UndefinedTag =    0xfffffffe };
> 
> Then we don't need any bit logic at all, and can replace it with
> addJump(branch32(GreaterOrEqual, regT1, Imm32(JSValue::UndefinedTag)), target);

I took a closer look into emit_op_jneq_null and emit_op_jeq_null and there is no check for non-double values. Both functions check MasqueradesAsUndefined bit and NullTag or UndefinedTag to determine the source is null (or not). So, this suggestion will not work. Although we can insert an unsigned comparison to the LowestTag to check for double, but this case will add new comparisons which will influence the execution speed in different ways for different control flow.
Comment 11 Gabor Loki 2010-10-27 05:58:33 PDT
Ohh, the GreaterThanOrEqual comment misled me. No more comparison is needed (ignore my last comment). The patch is coming.
Comment 12 Gabor Loki 2010-10-27 06:08:25 PDT
> ALWAYS_INLINE JIT::Jump JIT::emitJumpIfBothJSCells(RegisterID reg1, RegisterID reg2, RegisterID scratch)

It was used by JSValue32 which is obsolete now. So, it can be safely removed.
Comment 13 Gabor Loki 2010-10-27 06:21:59 PDT
Created attachment 72025 [details]
Speed up op_jeq_null and op_jneq_null in JIT
Comment 14 WebKit Review Bot 2010-10-27 06:23:11 PDT
Attachment 72025 [details] did not pass style-queue:

Failed to run "['WebKitTools/Scripts/check-webkit-style']" exit_code: 1
JavaScriptCore/jit/JITOpcodes32_64.cpp:926:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
JavaScriptCore/jit/JITOpcodes32_64.cpp:950:  Tests for true/false, null/non-null, and zero/non-zero should all be done without equality comparisons.  [readability/comparison_to_zero] [5]
Total errors found: 2 in 3 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 15 Gabor Loki 2010-10-27 06:37:52 PDT
Created attachment 72026 [details]
Speed up op_jeq_null and op_jneq_null in JIT

Style issue is fixed.
Comment 16 Oliver Hunt 2010-10-27 09:49:28 PDT
Comment on attachment 72026 [details]
Speed up op_jeq_null and op_jneq_null in JIT

Do you have any performance numbers?
Comment 17 Gabor Loki 2010-10-27 12:54:48 PDT
> Do you have any performance numbers?

Same as above (on ARM):
Artificial test for j[n]?eq_null: 1.25x faster
SunSpider: 1.01x faster
  Nonrelated tests are faster, for example:
    nsieve-bits: 1.027x faster
    cordic: 1.053x faster
  No other significant change. There is no significant slowdown.

I am pretty sure that the benchmarks do not hit these cases very often.
Comment 18 Oliver Hunt 2010-10-27 12:56:32 PDT
Comment on attachment 72026 [details]
Speed up op_jeq_null and op_jneq_null in JIT

can we do similar for x86_64?
Comment 19 WebKit Commit Bot 2010-10-27 13:32:32 PDT
Comment on attachment 72026 [details]
Speed up op_jeq_null and op_jneq_null in JIT

Clearing flags on attachment: 72026

Committed r70699: <http://trac.webkit.org/changeset/70699>
Comment 20 WebKit Commit Bot 2010-10-27 13:32:39 PDT
All reviewed patches have been landed.  Closing bug.
Comment 21 Gabor Loki 2010-10-27 13:39:48 PDT
(JSValue32_64 on x86)
The same artificial test: 1.12x as fast
SunSpider still says nothing overall.
  3d-cude: 1.025x as fast
  base64: 1.019x as fast
  fasta: 1.011x as fast
  tagcloud: 1.005x as fast
  unpack-code: 1.004x as slow
No other significant change.