NEW 90982
LLInt: Add op_get_by_id_proto
https://bugs.webkit.org/show_bug.cgi?id=90982
Summary LLInt: Add op_get_by_id_proto
Andy Wingo
Reported 2012-07-11 08:23:30 PDT
The patch to be attached adds support for op_get_by_it_proto to the llint. Before, the v8 benchmark results: Richards: 783 DeltaBlue: 713 Crypto: 1397 RayTrace: 2415 EarleyBoyer: 3344 RegExp: 389 Splay: 3936 NavierStokes: 2015 ---- Score (version 7): 1449 After: Richards: 835 DeltaBlue: 778 Crypto: 1597 RayTrace: 2516 EarleyBoyer: 3340 RegExp: 398 Splay: 4311 NavierStokes: 2019 ---- Score (version 7): 1531
Attachments
Patch (11.12 KB, patch)
2012-07-11 08:31 PDT, Andy Wingo
no flags
Patch (13.27 KB, patch)
2012-07-11 11:51 PDT, Andy Wingo
no flags
Patch (13.69 KB, patch)
2012-07-11 12:07 PDT, Andy Wingo
no flags
Andy Wingo
Comment 1 2012-07-11 08:31:36 PDT
Andy Wingo
Comment 2 2012-07-11 08:33:55 PDT
I note that those benchmark results are --useJIT=no.
Filip Pizlo
Comment 3 2012-07-11 10:21:27 PDT
Comment on attachment 151713 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=151713&action=review > Source/JavaScriptCore/bytecode/CodeBlock.cpp:2049 > + if (!curInstruction[4].u.structure || Heap::isMarked(curInstruction[4].u.structure.get())) > + break; What if the main structure is marked but the prototype structure isn't? I guess this should be: ASSERT(!!curInstruction[4].u.structure == !!curInstruction[5].u.structure); // The structures should either be both 0 or both non-0. if (!curInstruction[4].u.structure) break; if (Heap::isMarked(curInstruction[4].u.structure.get()) && Heap::isMarked(curInstruction[5].u.structure.get())) break; Though, I think it would be even better if the clearing you're doing below flipped this back to op_get_by_id - that's the style I've been using for put_by_id. > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:508 > +_llint_op_get_by_id_proto: > + traceExecution() > + callSlowPath(_llint_slow_path_get_by_id_proto) > + dispatch(9) So you get a speed-up even though this isn't inline? Nifty. We should inline it at some point.
Filip Pizlo
Comment 4 2012-07-11 10:22:21 PDT
(In reply to comment #3) > (From update of attachment 151713 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=151713&action=review > > > Source/JavaScriptCore/bytecode/CodeBlock.cpp:2049 > > + if (!curInstruction[4].u.structure || Heap::isMarked(curInstruction[4].u.structure.get())) > > + break; > > What if the main structure is marked but the prototype structure isn't? I guess this should be: > > ASSERT(!!curInstruction[4].u.structure == !!curInstruction[5].u.structure); // The structures should either be both 0 or both non-0. > if (!curInstruction[4].u.structure) > break; > if (Heap::isMarked(curInstruction[4].u.structure.get()) && Heap::isMarked(curInstruction[5].u.structure.get())) > break; To clarify, please fix this before landing - otherwise we could get annoying GC crashies. > > Though, I think it would be even better if the clearing you're doing below flipped this back to op_get_by_id - that's the style I've been using for put_by_id. > > > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:508 > > +_llint_op_get_by_id_proto: > > + traceExecution() > > + callSlowPath(_llint_slow_path_get_by_id_proto) > > + dispatch(9) > > So you get a speed-up even though this isn't inline? Nifty. We should inline it at some point.
Andy Wingo
Comment 5 2012-07-11 11:06:34 PDT
(In reply to comment #3) > (From update of attachment 151713 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=151713&action=review > > > Source/JavaScriptCore/bytecode/CodeBlock.cpp:2049 > > + if (!curInstruction[4].u.structure || Heap::isMarked(curInstruction[4].u.structure.get())) > > + break; > > What if the main structure is marked but the prototype structure isn't? How is that possible? Wouldn't a marked structure imply that its prototype is marked? Or are these dead marks and not live marks? Apologies for the naivete here. > Though, I think it would be even better if the clearing you're doing below flipped this back to op_get_by_id - that's the style I've been using for put_by_id. I wonder though, is there any benefit to flipping back to get_by_id just because the op_get_by_id_proto failed? Consider: you fail your prediction so you go through the slow_path_get_by_id. If it's going to predict something else, it will set the opcode and structure/offset fields appropriately. If not, the cache is not updated. It seems that the likelihood of a future get_by_id_proto cache hit after a miss is somewhat likely. I was thinking of just leaving the state as get_by_id_proto, given that it dispatches fairly quickly to get_by_id if things are not right. > > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:508 > > +_llint_op_get_by_id_proto: > > + traceExecution() > > + callSlowPath(_llint_slow_path_get_by_id_proto) > > + dispatch(9) > > So you get a speed-up even though this isn't inline? Nifty. We should inline it at some point. I think the speedup is mostly because we cut the number of dynamic dispatches in half (~20% of the time to ~10% of the time). I was playing around with inline vs not inline for a couple other opcodes (get_string_length and get_array_length) and didn't see much difference there. Incidentally, op_get_by_val is unfortunate in some ways because of the pointer-chasing you have to do just to determine that it is an array (getting at the jsArrayClassInfo through the global data through the code block). In C++ it is more direct. Would be nice to improve that if possible. But that's another bug I guess!
Andy Wingo
Comment 6 2012-07-11 11:51:50 PDT
Filip Pizlo
Comment 7 2012-07-11 11:52:24 PDT
(In reply to comment #5) > (In reply to comment #3) > > (From update of attachment 151713 [details] [details]) > > View in context: https://bugs.webkit.org/attachment.cgi?id=151713&action=review > > > > > Source/JavaScriptCore/bytecode/CodeBlock.cpp:2049 > > > + if (!curInstruction[4].u.structure || Heap::isMarked(curInstruction[4].u.structure.get())) > > > + break; > > > > What if the main structure is marked but the prototype structure isn't? > > How is that possible? Wouldn't a marked structure imply that its prototype is marked? Or are these dead marks and not live marks? Apologies for the naivete here. You have two structures: curInstruction[4].u.structure and curInstruction[5].u.structure. Let's called them S and T for short. S points to a prototype P, and that prototype P at some point in time had structure T. But in the future it may have structure T'. That's the whole point of get_by_id_proto caching - to quickly check that P still has structure T and not some other structure T'. Hence, you're right that S marks P and P will mark whatever structure it has, but that structure may no longer be T. So if you don't mark T yourself, you'll have a get_by_id_proto cache that is checking if P has a garbage structure. Then you may find yourself in the following situation: 0) P points to T (starting condition) 1) P changes structure to T'. T is no longer reachable. 2) GC happens. T is freed. 3) P changed structure to T''. T'' is allocated in the memory formerly occupied by T. 4) get_by_id_proto succeeds even though it should have failed In particular, T'' may no longer have the property that the get_by_id_proto is trying to read. The T->T' or T'->T'' transition may have deleted the property you were trying to read. > > > Though, I think it would be even better if the clearing you're doing below flipped this back to op_get_by_id - that's the style I've been using for put_by_id. > > I wonder though, is there any benefit to flipping back to get_by_id just because the op_get_by_id_proto failed? Consider: you fail your prediction so you go through the slow_path_get_by_id. If it's going to predict something else, it will set the opcode and structure/offset fields appropriately. If not, the cache is not updated. There is no performance benefit. There's also no disadvantage, either. This is a once-per GC event. It happens rarely. > > It seems that the likelihood of a future get_by_id_proto cache hit after a miss is somewhat likely. I was thinking of just leaving the state as get_by_id_proto, given that it dispatches fairly quickly to get_by_id if things are not right. This is a matter of style. I'm asking you to stick to the style we have (cleared caches reverting to the bare form of the instruction) rather than introducing a new style. > > > > Source/JavaScriptCore/llint/LowLevelInterpreter.asm:508 > > > +_llint_op_get_by_id_proto: > > > + traceExecution() > > > + callSlowPath(_llint_slow_path_get_by_id_proto) > > > + dispatch(9) > > > > So you get a speed-up even though this isn't inline? Nifty. We should inline it at some point. > > I think the speedup is mostly because we cut the number of dynamic dispatches in half (~20% of the time to ~10% of the time). I was playing around with inline vs not inline for a couple other opcodes (get_string_length and get_array_length) and didn't see much difference there. > > Incidentally, op_get_by_val is unfortunate in some ways because of the pointer-chasing you have to do just to determine that it is an array (getting at the jsArrayClassInfo through the global data through the code block). In C++ it is more direct. Would be nice to improve that if possible. But that's another bug I guess!
Filip Pizlo
Comment 8 2012-07-11 11:53:25 PDT
Comment on attachment 151743 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=151743&action=review > Source/JavaScriptCore/bytecode/CodeBlock.cpp:2050 > + ASSERT(Heap::isMarked(curInstruction[5].u.structure.get())); I believe this assertion is wrong. See my previous comment.
Andy Wingo
Comment 9 2012-07-11 11:56:15 PDT
(In reply to comment #8) > (From update of attachment 151743 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=151743&action=review > > > Source/JavaScriptCore/bytecode/CodeBlock.cpp:2050 > > + ASSERT(Heap::isMarked(curInstruction[5].u.structure.get())); > > I believe this assertion is wrong. See my previous comment. Indeed, thanks for the review. Will fix.
Andy Wingo
Comment 10 2012-07-11 12:01:35 PDT
One more comment while I bake the patch: (In reply to comment #7) > > I wonder though, is there any benefit to flipping back to get_by_id just because the op_get_by_id_proto failed? Consider: you fail your prediction so you go through the slow_path_get_by_id. If it's going to predict something else, it will set the opcode and structure/offset fields appropriately. If not, the cache is not updated. > > There is no performance benefit. There's also no disadvantage, either. This is a once-per GC event. It happens rarely. There are other reasons to miss the inline cache, for example passing in an object with a different prototype. If property access on that object won't be cached for whatever reason, there is no need to clear the get_by_id_proto cache. But I won't insist on this point, it is a rare thing.
Andy Wingo
Comment 11 2012-07-11 12:07:05 PDT
Andy Wingo
Comment 12 2012-07-11 12:10:30 PDT
I need to head out, but I'm getting a couple fast/js failures with this patch: Regressions: Unexpected text diff mismatch : (2) fast/js/dictionary-no-cache.html = TEXT fast/js/dictionary-prototype-caching.html = TEXT Clearing r? so it doesn't go through accidentally. Thanks for the review though!
Note You need to log in before you can comment on or make changes to this bug.