Bug 90982 - LLInt: Add op_get_by_id_proto
Summary: LLInt: Add op_get_by_id_proto
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Andy Wingo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-07-11 08:23 PDT by Andy Wingo
Modified: 2012-07-11 12:10 PDT (History)
3 users (show)

See Also:


Attachments
Patch (11.12 KB, patch)
2012-07-11 08:31 PDT, Andy Wingo
no flags Details | Formatted Diff | Diff
Patch (13.27 KB, patch)
2012-07-11 11:51 PDT, Andy Wingo
no flags Details | Formatted Diff | Diff
Patch (13.69 KB, patch)
2012-07-11 12:07 PDT, Andy Wingo
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andy Wingo 2012-07-11 08:23:30 PDT
The patch to be attached adds support for op_get_by_it_proto to the llint.

Before, the v8 benchmark results:

Richards: 783
DeltaBlue: 713
Crypto: 1397
RayTrace: 2415
EarleyBoyer: 3344
RegExp: 389
Splay: 3936
NavierStokes: 2015
----
Score (version 7): 1449

After:

Richards: 835
DeltaBlue: 778
Crypto: 1597
RayTrace: 2516
EarleyBoyer: 3340
RegExp: 398
Splay: 4311
NavierStokes: 2019
----
Score (version 7): 1531
Comment 1 Andy Wingo 2012-07-11 08:31:36 PDT
Created attachment 151713 [details]
Patch
Comment 2 Andy Wingo 2012-07-11 08:33:55 PDT
I note that those benchmark results are --useJIT=no.
Comment 3 Filip Pizlo 2012-07-11 10:21:27 PDT
Comment on attachment 151713 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=151713&action=review

> Source/JavaScriptCore/bytecode/CodeBlock.cpp:2049
> +                if (!curInstruction[4].u.structure || Heap::isMarked(curInstruction[4].u.structure.get()))
> +                    break;

What if the main structure is marked but the prototype structure isn't?  I guess this should be:

ASSERT(!!curInstruction[4].u.structure == !!curInstruction[5].u.structure); // The structures should either be both 0 or both non-0.
if (!curInstruction[4].u.structure)
    break;
if (Heap::isMarked(curInstruction[4].u.structure.get()) && Heap::isMarked(curInstruction[5].u.structure.get()))
    break;

Though, I think it would be even better if the clearing you're doing below flipped this back to op_get_by_id - that's the style I've been using for put_by_id.

> Source/JavaScriptCore/llint/LowLevelInterpreter.asm:508
> +_llint_op_get_by_id_proto:
> +    traceExecution()
> +    callSlowPath(_llint_slow_path_get_by_id_proto)
> +    dispatch(9)

So you get a speed-up even though this isn't inline?  Nifty.  We should inline it at some point.
Comment 4 Filip Pizlo 2012-07-11 10:22:21 PDT
(In reply to comment #3)
> (From update of attachment 151713 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=151713&action=review
> 
> > Source/JavaScriptCore/bytecode/CodeBlock.cpp:2049
> > +                if (!curInstruction[4].u.structure || Heap::isMarked(curInstruction[4].u.structure.get()))
> > +                    break;
> 
> What if the main structure is marked but the prototype structure isn't?  I guess this should be:
> 
> ASSERT(!!curInstruction[4].u.structure == !!curInstruction[5].u.structure); // The structures should either be both 0 or both non-0.
> if (!curInstruction[4].u.structure)
>     break;
> if (Heap::isMarked(curInstruction[4].u.structure.get()) && Heap::isMarked(curInstruction[5].u.structure.get()))
>     break;

To clarify, please fix this before landing - otherwise we could get annoying GC crashies.

> 
> Though, I think it would be even better if the clearing you're doing below flipped this back to op_get_by_id - that's the style I've been using for put_by_id.
> 
> > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:508
> > +_llint_op_get_by_id_proto:
> > +    traceExecution()
> > +    callSlowPath(_llint_slow_path_get_by_id_proto)
> > +    dispatch(9)
> 
> So you get a speed-up even though this isn't inline?  Nifty.  We should inline it at some point.
Comment 5 Andy Wingo 2012-07-11 11:06:34 PDT
(In reply to comment #3)
> (From update of attachment 151713 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=151713&action=review
> 
> > Source/JavaScriptCore/bytecode/CodeBlock.cpp:2049
> > +                if (!curInstruction[4].u.structure || Heap::isMarked(curInstruction[4].u.structure.get()))
> > +                    break;
> 
> What if the main structure is marked but the prototype structure isn't?

How is that possible?  Wouldn't a marked structure imply that its prototype is marked?  Or are these dead marks and not live marks?  Apologies for the naivete here.

> Though, I think it would be even better if the clearing you're doing below flipped this back to op_get_by_id - that's the style I've been using for put_by_id.

I wonder though, is there any benefit to flipping back to get_by_id just because the op_get_by_id_proto failed?  Consider: you fail your prediction so you go through the slow_path_get_by_id.  If it's going to predict something else, it will set the opcode and structure/offset fields appropriately.  If not, the cache is not updated.

It seems that the likelihood of a future get_by_id_proto cache hit after a miss is somewhat likely.  I was thinking of just leaving the state as get_by_id_proto, given that it dispatches fairly quickly to get_by_id if things are not right.

> > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:508
> > +_llint_op_get_by_id_proto:
> > +    traceExecution()
> > +    callSlowPath(_llint_slow_path_get_by_id_proto)
> > +    dispatch(9)
> 
> So you get a speed-up even though this isn't inline?  Nifty.  We should inline it at some point.

I think the speedup is mostly because we cut the number of dynamic dispatches in half (~20% of the time to ~10% of the time).  I was playing around with inline vs not inline for a couple other opcodes (get_string_length and get_array_length) and didn't see much difference there.

Incidentally, op_get_by_val is unfortunate in some ways because of the pointer-chasing you have to do just to determine that it is an array (getting at the jsArrayClassInfo through the global data through the code block).  In C++ it is more direct.  Would be nice to improve that if possible.  But that's another bug I guess!
Comment 6 Andy Wingo 2012-07-11 11:51:50 PDT
Created attachment 151743 [details]
Patch
Comment 7 Filip Pizlo 2012-07-11 11:52:24 PDT
(In reply to comment #5)
> (In reply to comment #3)
> > (From update of attachment 151713 [details] [details])
> > View in context: https://bugs.webkit.org/attachment.cgi?id=151713&action=review
> > 
> > > Source/JavaScriptCore/bytecode/CodeBlock.cpp:2049
> > > +                if (!curInstruction[4].u.structure || Heap::isMarked(curInstruction[4].u.structure.get()))
> > > +                    break;
> > 
> > What if the main structure is marked but the prototype structure isn't?
> 
> How is that possible?  Wouldn't a marked structure imply that its prototype is marked?  Or are these dead marks and not live marks?  Apologies for the naivete here.

You have two structures: curInstruction[4].u.structure and curInstruction[5].u.structure.  Let's called them S and T for short.

S points to a prototype P, and that prototype P at some point in time had structure T.  But in the future it may have structure T'.  That's the whole point of get_by_id_proto caching - to quickly check that P still has structure T and not some other structure T'.

Hence, you're right that S marks P and P will mark whatever structure it has, but that structure may no longer be T.

So if you don't mark T yourself, you'll have a get_by_id_proto cache that is checking if P has a garbage structure.  Then you may find yourself in the following situation:

0) P points to T (starting condition)
1) P changes structure to T'.  T is no longer reachable.
2) GC happens. T is freed.
3) P changed structure to T''.  T'' is allocated in the memory formerly occupied by T.
4) get_by_id_proto succeeds even though it should have failed

In particular, T'' may no longer have the property that the get_by_id_proto is trying to read.  The T->T' or T'->T'' transition may have deleted the property you were trying to read.

> 
> > Though, I think it would be even better if the clearing you're doing below flipped this back to op_get_by_id - that's the style I've been using for put_by_id.
> 
> I wonder though, is there any benefit to flipping back to get_by_id just because the op_get_by_id_proto failed?  Consider: you fail your prediction so you go through the slow_path_get_by_id.  If it's going to predict something else, it will set the opcode and structure/offset fields appropriately.  If not, the cache is not updated.

There is no performance benefit.  There's also no disadvantage, either.  This is a once-per GC event.  It happens rarely.

> 
> It seems that the likelihood of a future get_by_id_proto cache hit after a miss is somewhat likely.  I was thinking of just leaving the state as get_by_id_proto, given that it dispatches fairly quickly to get_by_id if things are not right.

This is a matter of style.  I'm asking you to stick to the style we have (cleared caches reverting to the bare form of the instruction) rather than introducing a new style.

> 
> > > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:508
> > > +_llint_op_get_by_id_proto:
> > > +    traceExecution()
> > > +    callSlowPath(_llint_slow_path_get_by_id_proto)
> > > +    dispatch(9)
> > 
> > So you get a speed-up even though this isn't inline?  Nifty.  We should inline it at some point.
> 
> I think the speedup is mostly because we cut the number of dynamic dispatches in half (~20% of the time to ~10% of the time).  I was playing around with inline vs not inline for a couple other opcodes (get_string_length and get_array_length) and didn't see much difference there.
> 
> Incidentally, op_get_by_val is unfortunate in some ways because of the pointer-chasing you have to do just to determine that it is an array (getting at the jsArrayClassInfo through the global data through the code block).  In C++ it is more direct.  Would be nice to improve that if possible.  But that's another bug I guess!
Comment 8 Filip Pizlo 2012-07-11 11:53:25 PDT
Comment on attachment 151743 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=151743&action=review

> Source/JavaScriptCore/bytecode/CodeBlock.cpp:2050
> +                ASSERT(Heap::isMarked(curInstruction[5].u.structure.get()));

I believe this assertion is wrong.  See my previous comment.
Comment 9 Andy Wingo 2012-07-11 11:56:15 PDT
(In reply to comment #8)
> (From update of attachment 151743 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=151743&action=review
> 
> > Source/JavaScriptCore/bytecode/CodeBlock.cpp:2050
> > +                ASSERT(Heap::isMarked(curInstruction[5].u.structure.get()));
> 
> I believe this assertion is wrong.  See my previous comment.

Indeed, thanks for the review.  Will fix.
Comment 10 Andy Wingo 2012-07-11 12:01:35 PDT
One more comment while I bake the patch:

(In reply to comment #7)
> > I wonder though, is there any benefit to flipping back to get_by_id just because the op_get_by_id_proto failed?  Consider: you fail your prediction so you go through the slow_path_get_by_id.  If it's going to predict something else, it will set the opcode and structure/offset fields appropriately.  If not, the cache is not updated.
> 
> There is no performance benefit.  There's also no disadvantage, either.  This is a once-per GC event.  It happens rarely.

There are other reasons to miss the inline cache, for example passing in an object with a different prototype.  If property access on that object won't be cached for whatever reason, there is no need to clear the get_by_id_proto cache.  But I won't insist on this point, it is a rare thing.
Comment 11 Andy Wingo 2012-07-11 12:07:05 PDT
Created attachment 151747 [details]
Patch
Comment 12 Andy Wingo 2012-07-11 12:10:30 PDT
I need to head out, but I'm getting a couple fast/js failures with this patch:

Regressions: Unexpected text diff mismatch : (2)
  fast/js/dictionary-no-cache.html = TEXT
  fast/js/dictionary-prototype-caching.html = TEXT

Clearing r? so it doesn't go through accidentally.  Thanks for the review though!