Bug 21319 - Avoid restoring the caller's 'r' value in op_ret
Summary: Avoid restoring the caller's 'r' value in op_ret
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Maciej Stachowiak
Depends on:
Blocks: 20812
  Show dependency treegraph
Reported: 2008-10-02 16:04 PDT by Cameron Zwarich (cpst)
Modified: 2008-10-17 02:06 PDT (History)
4 users (show)

See Also:

avoid memory writes of callFrame as much as possible (69.82 KB, patch)
2008-10-14 09:43 PDT, Maciej Stachowiak
darin: review+
Details | Formatted Diff | Diff
take the low road (42.26 KB, patch)
2008-10-17 02:04 PDT, Maciej Stachowiak
zwarich: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Cameron Zwarich (cpst) 2008-10-02 16:04:01 PDT
Currently, op_ret restores the caller's 'r' value. This can be removed if we directly get 'r' from %edi in each of the CTI helper functions. There will be a few hiccups with exceptions, but this shouldn't be too hard.
Comment 1 Cameron Zwarich (cpst) 2008-10-04 11:13:07 PDT
I am unassigning this.
Comment 2 Maciej Stachowiak 2008-10-14 09:23:58 PDT
Rather than directly getting r from edi, I changed the CTI functions to expect it to be passed in as a parameter, which given the "fastcall" calling convention ends up passing in edx.

This avoids the memory writes of the call frame in op_call and op_ret, and likely also some meory reads in various CTI functions. The CTI arg slot for the call frame is now only used on entry and while an exception is in flight.

Comment 3 Maciej Stachowiak 2008-10-14 09:43:27 PDT
Created attachment 24342 [details]
avoid memory writes of callFrame as much as possible
Comment 4 Cameron Zwarich (cpst) 2008-10-14 09:46:32 PDT
I will do some additional performance testing of this patch (just to be sure) and then review it.
Comment 5 Darin Adler 2008-10-14 09:54:48 PDT
Comment on attachment 24342 [details]
avoid memory writes of callFrame as much as possible

Great change! Not only faster, but cleaner code too.

-    "call " SYMBOL_STRING(_ZN3JSC7Machine12cti_vm_throwEPPv) "\n"
+    "movl 0x38(%esp), %edx" "\n"
+    "call " SYMBOL_STRING(_ZN3JSC7Machine12cti_vm_throwEPPvPNS_9ExecStateE) "\n"

+            mov edi, [esp + 0x38];

I'd like the see a comment like this one:

    "movl 0x38(%esp), %edi" "\n" // Ox38 = 0x0E * 4, 0x0E = CTI_ARGS_callFrame (see assertion above)

on those lines of code.

It would make this patch smaller and easier to review if the emitCTICall() change was done first.

+    m_jit.movl_mr(8 + sizeof(void*), X86::ecx, X86::edx);

+    m_jit.movl_mr(8 + sizeof(void*), X86::ecx, X86::edx);

I don't understand the meaning of the magic number 8 here.

+        m_jit.cmpl_i32r(reinterpret_cast<intptr_t> (jsNull()), X86::edx);

Strange space after ">" on this line of code.

-        doSetReturnAddressVMThrowTrampoline(&CTI_RETURN_ADDRESS); \
+        ARG_setCallFrame(callFrame); \
+        doSetReturnAddressVMThrowTrampoline(&CTI_RETURN_ADDRESS);      \

Unexpected change in whitespace here.

+void Machine::cti_op_put_by_id_generic(void** args, CallFrame* callFrame)

Maybe we should use a type other than void** such as CTIArgs* so we don't have to do all the typecasting in the ARG_ macros.
Comment 6 Maciej Stachowiak 2008-10-14 10:06:04 PDT
We discussed using a type other than void** for the args parameter - if it were a class pointer that knew how to do the right kinds of lookup, it could allow getting rid of the macros entirely. That seems best done in a separate change.

The emitCTICall() change could in theory be done separately but is only a tiny part of this change, so doesn't seem worth it.

I applied all your other suggestions.
Comment 7 Maciej Stachowiak 2008-10-17 02:04:12 PDT
Created attachment 24455 [details]
take the low road

Since we can't rely on the fastcall calling convention after all, take another approach.
Comment 8 Cameron Zwarich (cpst) 2008-10-17 02:06:39 PDT
Comment on attachment 24455 [details]
take the low road

r=me (with ChangeLog of course)