Bug 165282

Summary: WebAssembly: handle and optimize wasm export => wasm import calls
Product: WebKit Reporter: JF Bastien <jfbastien>
Component: JavaScriptCoreAssignee: JF Bastien <jfbastien>
Status: RESOLVED FIXED    
Severity: Normal CC: commit-queue, fpizlo, ggaren, jfbastien, keith_miller, mark.lam, msaboff, saam, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 165118    
Bug Blocks: 161709, 162952, 166442, 166444, 166462, 166486, 166625, 166623    
Attachments:
Description Flags
WP
jfbastien: commit-queue-
WIP
jfbastien: commit-queue-
WIP
jfbastien: commit-queue-
WIP
jfbastien: commit-queue-
WIP
jfbastien: commit-queue-
WIP
jfbastien: commit-queue-
WIP
jfbastien: commit-queue-
WIP
jfbastien: commit-queue-
WIP
jfbastien: commit-queue-
patch
none
patch
saam: review+, saam: commit-queue-
patch none

Description JF Bastien 2016-12-01 15:15:55 PST
The spec mentions Function Exotic Object. These are the WebAssembly exports:
https://github.com/WebAssembly/design/blob/master/JS.md#exported-function-exotic-objects

When a WebAssembly function import is a Function Exotic Object then extra checks need to occur for signature match:
https://github.com/WebAssembly/design/blob/master/JS.md#webassemblyinstance-constructor

This is conveniently the amount of information we need to optimize through wasm ↔ wasm calls.

Right now (bug #165118) I'm ignoring the requirement, we should fix it later. One tricky thing will be to make sure that Instance lifetime is kept! Importing an Instance's export means keeping that entire Instance alive, but the importObject would only have given use the Instance's `exports` object so we need to map it back to its parent, and put it in a Vector<WriteBarrier> on the importing Instance.
Comment 1 Radar WebKit Bug Importer 2016-12-20 14:29:54 PST
<rdar://problem/29760322>
Comment 2 JF Bastien 2016-12-22 10:52:12 PST
Created attachment 297658 [details]
WP

Start this patch. Currently implemented:

Changes to the WebAssembly implementation:

  - Add a new JSType for WebAssemblyFunction, and use it when creating its
    structure. This will be used to quickly detect from wasm whether the import
    call is to another wasm module, or whether it's to JS.
  - Use cell instead of objects to detect wasm->wasm, the previous code was wrong.
  - Improve diagnostic printout for wasm stubs: print the full signature.
  - Properly check that wasm -> wasm calls have matching signatures. Add testing for this.
  - Test some of the wasm -> wasm calls, not working yet, and more tests need to
    get implemented.


Support changes:

  - Add a Proxy to Builder.js, which intercepts unknown property lookups. This
    creates way better error messages on typos than 'undefined is not a
    function', which happens semi-frequently as I typo opcode names (and which
    one is a typo is hard to find because we chain builders).
  - Add limited support for var{u}int64 (only the 32-bit values).
Comment 3 JF Bastien 2016-12-26 14:20:33 PST
Created attachment 297773 [details]
WIP

An update. This is still missing:
  - The wasm->wasm stub in WasmBinding.cpp, which is why the wasm-to-wasm.js test is currently disabled.
  - Currently crashing when trying to access memory. I'll debug this next.


wasm->wasm

Changes to the WebAssembly implementation:

  - Add a new JSType for WebAssemblyFunction, and use it when creating its
    structure. This will be used to quickly detect from wasm whether the import
    call is to another wasm module, or whether it's to JS.
  - Use cell instead of objects to detect wasm->wasm, the previous code was wrong.
  - Improve diagnostic printout for wasm stubs: print the full signature.
  - Properly check that wasm -> wasm calls have matching signatures. Add testing for this.
  - Test some of the wasm -> wasm calls, not working yet, and more tests need to
    get implemented.
  - Generate two stubs from the import stub generator: one for wasm->JS and one
    for wasm -> wasm. This is done at Module time. Which is called will only be
    known at Instance time, once we've received the import object. We want to
    avoid codegen at Instance time, so having both around is great.
  - Restore the WebAssembly global state (VM top Instance, and pinned registers)
    after call / call_indirect, and in the JS->wasm entry stub.

Support changes:

  - Add a Proxy to Builder.js, which intercepts unknown property lookups. This
    creates way better error messages on typos than 'undefined is not a
    function', which happens semi-frequently as I typo opcode names (and which
    one is a typo is hard to find because we chain builders).
  - Add limited support for var{u}int64 (only the 32-bit values).
  - Get rid of much of the function index space: we already have all of its
    information elsewhere, and as-is it provides no extra efficiency.
  - Store Wasm::Memory directly on JSWebAssemblyMemory, instead of through a
    unique_ptr. This removes an indirection and useless allocation.
  - Remove VM'a WebAssembly memory / size, and obtain them from Instance
    instead. This requires an extra indirection, but is cleaner.
Comment 4 JF Bastien 2016-12-26 17:54:57 PST
Created attachment 297775 [details]
WIP

- Fix B3 codegen of calls when the return type is Void.
- Pass in Instance properly when calling from JS.

Other fixes still necessary:
- Memory seems borked. $r12 is 0 in places it shouldn't I must be restoring the registers wrong.
- The element-data.js test also segfaults, probably the same issue.
Comment 5 JF Bastien 2016-12-26 19:55:20 PST
Created attachment 297777 [details]
WIP

Working on this on and off... I realized what an obvious bug was, fixed it. There are still bugs though. Here are some notes so i know where to start again. The codegen leading up to wasm->js entry looks like:

Generated JIT code for Wasm JS entrypoint:
    Code at [0x2d8702a1a440, 0x2d8702a1a4a0):
      0x2d8702a1a440: push %rbp
      0x2d8702a1a441: mov %rsp, %rbp
      0x2d8702a1a444: add $0xffffffffffffffe0, %rsp
      0x2d8702a1a448: mov $0x0, %rax
      0x2d8702a1a452: mov %rax, 0x18(%rbp)
      0x2d8702a1a456: mov $0x0, 0x10(%rbp)
      0x2d8702a1a45e: cmp $0x2, 0x20(%rbp)
      0x2d8702a1a462: jb 0x2d8702a1a49e
      0x2d8702a1a468: mov $0x1053f5578, %rax # address of vm.topJSWebAssemblyInstance. I've checked that this is correct.
      0x2d8702a1a472: mov (%rax), %rax       # Load it (it was saved by the C++ stub).
      0x2d8702a1a475: mov 0x28(%rax), %rcx   # Load WriteBarrier<JSWebAssemblyMemory> m_memory;
      0x2d8702a1a479: mov 0x18(%rcx), %rax   # From Wasm::Memory load void* m_memory.
      0x2d8702a1a47d: mov 0x20(%rcx), %ecx   # From Wasm::Memory load uint32_t m_size.
      0x2d8702a1a480: mov %rax, %r12         # Move to pinned memory register.
      0x2d8702a1a483: mov %rcx, %rbx         # Move to pinned size register.
      0x2d8702a1a486: mov 0x28(%rbp), %edi
      0x2d8702a1a489: mov 0x30(%rbp), %esi
      0x2d8702a1a48c: mov $0x2d8702a1a380, %r11
      0x2d8702a1a496: call *%r11
      0x2d8702a1a499: mov %rbp, %rsp
      0x2d8702a1a49c: pop %rbp
      0x2d8702a1a49d: ret 
      0x2d8702a1a49e: int3 

Or at least that's my understanding.

The following tests fail:
  Segmentation fault: 11  ../../current-debug/bin/jsc --useWebAssembly=1 -m .//function-tests/i32-load.js
  Segmentation fault: 11  ../../current-debug/bin/jsc --useWebAssembly=1 -m .//function-tests/i32-load8-s.js
  Segmentation fault: 11  ../../current-debug/bin/jsc --useWebAssembly=1 -m .//js-api/element-data.js
  Segmentation fault: 11  ../../current-debug/bin/jsc --useWebAssembly=1 -m .//js-api/test_Data.js
  Segmentation fault: 11  ../../current-debug/bin/jsc --useWebAssembly=1 -m .//js-api/test_memory.js
Comment 6 JF Bastien 2016-12-26 21:57:54 PST
Created attachment 297779 [details]
WIP

I found the bug! Everything was good, except I was destroying the memory before using it. Bad idea. All the tests now pass.

This patch is now missing:
 - WasmBinding.cpp:wasmToWasm implementation (this should be pretty straightforward).
 - Un-comment associated test wasm-to-wasm.js.
 - ChangeLog.

And then we should be good to go! Early reviews welcome.
Comment 7 Saam Barati 2016-12-27 15:41:01 PST
Comment on attachment 297779 [details]
WIP

View in context: https://bugs.webkit.org/attachment.cgi?id=297779&action=review

LGTM, but the some of the important bits aren't done yet, so I'll take a look again when you implement them.

> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:747
> +                    CCallHelpers::Call call = jit.call();

You need to store to the VM::topWasmInstance before this call, right?

> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:749
> +                        bool toJS = false; // Maybe external wasm -> wasm.

Style nit: I'm guilty of using a boolean for these things too, but an enum with two values is almost always better than a boolean.

> Source/JavaScriptCore/wasm/WasmBinding.cpp:260
> +    jit.emitFunctionPrologue();
> +    jit.store64(JIT::TrustedImm32(0), JIT::Address(GPRInfo::callFrameRegister, CallFrameSlot::codeBlock * static_cast<int>(sizeof(Register)))); // FIXME Stop using 0 as codeBlocks. https://bugs.webkit.org/show_bug.cgi?id=165321
> +    jit.storePtr(JIT::TrustedImmPtr(vm->webAssemblyToJSCallee.get()), JIT::Address(GPRInfo::callFrameRegister, CallFrameSlot::callee * static_cast<int>(sizeof(Register))));
> +
> +    // FIXME perform a stack check before updating SP. https://bugs.webkit.org/show_bug.cgi?id=165546
> +
> +    unsigned numberOfParameters = argCount + 1; // There is a "this" argument.
> +    unsigned numberOfRegsForCall = CallFrame::headerSizeInRegisters + numberOfParameters;
> +    unsigned numberOfBytesForCall = numberOfRegsForCall * sizeof(Register) - sizeof(CallerFrameAndPC);
> +    const unsigned stackOffset = WTF::roundUpToMultipleOf(stackAlignmentBytes(), numberOfBytesForCall);
> +    jit.subPtr(MacroAssembler::TrustedImm32(stackOffset), MacroAssembler::stackPointerRegister);
> +    JIT::Address calleeFrame = CCallHelpers::Address(MacroAssembler::stackPointerRegister, -static_cast<ptrdiff_t>(sizeof(CallerFrameAndPC)));
> +
> +    // FIXME implement this.
> +    jit.breakpoint();

Are you going to implement this as a tail call?

> Source/JavaScriptCore/wasm/WasmFormat.h:190
> +    Vector<SignatureIndex> moduleSignatureIndicesToUniquedSignatureIndices;

Nit: This is only used during compilation, so it shouldn't live on ModuleInformation which lives as long as a WebAssembly.Module lives.

> Source/JavaScriptCore/wasm/WasmMemory.h:50
> +        memset(&other, 0, sizeof(other));

Style: I think it's lest of a footgun to just store to the fields of 'other' to null them out.

> Source/JavaScriptCore/wasm/WasmSignature.cpp:55
> +String Signature::toString() const
> +{
> +    String result(makeString(returnType()));
> +    result.append(" (");
> +    for (SignatureArgCount arg = 0; arg < argumentCount(); ++arg) {
> +        if (arg)
> +            result.append(", ");
> +        result.append(makeString(argument(arg)));
> +    }
> +    result.append(')');
> +    return result;
> +}

Since you made this, you should just make ::dump() print the result of this function.

> Source/JavaScriptCore/wasm/js/WebAssemblyInstanceConstructor.cpp:122
> +                    return JSValue::encode(throwException(exec, throwScope, createJSWebAssemblyLinkError(exec, vm, ASCIILiteral("imported function's signature mismatches the provided WebAssembly function's signature"))));

Nit: "mismatches" => "does not match" or "doesn't match"
Comment 8 JF Bastien 2016-12-29 17:43:04 PST
Created attachment 297845 [details]
WIP

Rebase, address comments. Still need to do the rest of the patch.

> > Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:747
> > +                    CCallHelpers::Call call = jit.call();
> 
> You need to store to the VM::topWasmInstance before this call, right?

I don't think so: every function starts by creating a Value* for its Instance. I expect B3 to delete this materialization if it goes unused, and I expect it to be spilled or kept in callee-saved if it is live across a call (as I'm doing here).

The only inefficiency here is in leaf functions (assuming we don't inline current_memory / grow_memory), which we could annotate and save from the useless materialization if needed.

The generated code looks like:

  mov $0x1055f5578, %r14 # &vm.topJSWebAssemblyInstance
  mov (%r14), %r13       # topJSWebAssemblyInstance
  mov 0x48(%r13), %rax
  cmp $0x36, 0x5(%rax)
  jz 0x32a43381c018
  mov $0x0, %r11         # Patched at runtime
  call *%r11
  mov %r13, (%r14)       # Restore vm.topJSWebAssemblyInstance after a call
  mov 0x28(%r13), %rcx
  mov 0x18(%rcx), %rax   # void* m_memory
  mov 0x20(%rcx), %rcx   # uint64_t m_size
  mov %rax, %r12         # Pinned memory register
  mov %rcx, %rbx         # Pinned size register

Do I have this right?


> > Source/JavaScriptCore/wasm/WasmBinding.cpp:260
> > +    jit.emitFunctionPrologue();
> > +    jit.store64(JIT::TrustedImm32(0), JIT::Address(GPRInfo::callFrameRegister, CallFrameSlot::codeBlock * static_cast<int>(sizeof(Register)))); // FIXME Stop using 0 as codeBlocks. https://bugs.webkit.org/show_bug.cgi?id=165321
> > +    jit.storePtr(JIT::TrustedImmPtr(vm->webAssemblyToJSCallee.get()), JIT::Address(GPRInfo::callFrameRegister, CallFrameSlot::callee * static_cast<int>(sizeof(Register))));
> > +
> > +    // FIXME perform a stack check before updating SP. https://bugs.webkit.org/show_bug.cgi?id=165546
> > +
> > +    unsigned numberOfParameters = argCount + 1; // There is a "this" argument.
> > +    unsigned numberOfRegsForCall = CallFrame::headerSizeInRegisters + numberOfParameters;
> > +    unsigned numberOfBytesForCall = numberOfRegsForCall * sizeof(Register) - sizeof(CallerFrameAndPC);
> > +    const unsigned stackOffset = WTF::roundUpToMultipleOf(stackAlignmentBytes(), numberOfBytesForCall);
> > +    jit.subPtr(MacroAssembler::TrustedImm32(stackOffset), MacroAssembler::stackPointerRegister);
> > +    JIT::Address calleeFrame = CCallHelpers::Address(MacroAssembler::stackPointerRegister, -static_cast<ptrdiff_t>(sizeof(CallerFrameAndPC)));
> > +
> > +    // FIXME implement this.
> > +    jit.breakpoint();
> 
> Are you going to implement this as a tail call?

I think I can since the Instance is saved / restored by the caller. It only needs to:
  - Adjust vm.topJSWebAssemblyInstance.
  - Adjust pinned registers.
I don't think this function requires any stack space.
Comment 9 Saam Barati 2016-12-29 18:44:19 PST
(In reply to comment #8)
> Created attachment 297845 [details]
> WIP
> 
> Rebase, address comments. Still need to do the rest of the patch.
> 
> > > Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:747
> > > +                    CCallHelpers::Call call = jit.call();
> > 
> > You need to store to the VM::topWasmInstance before this call, right?
> 
> I don't think so: every function starts by creating a Value* for its
> Instance. I expect B3 to delete this materialization if it goes unused, and
> I expect it to be spilled or kept in callee-saved if it is live across a
> call (as I'm doing here).
You could be calling cross Instance here, right? If so, the cross Instance callee wants the topInstabcePointer to be correct.
> 
> The only inefficiency here is in leaf functions (assuming we don't inline
> current_memory / grow_memory), which we could annotate and save from the
> useless materialization if needed.
> 
> The generated code looks like:
> 
>   mov $0x1055f5578, %r14 # &vm.topJSWebAssemblyInstance
>   mov (%r14), %r13       # topJSWebAssemblyInstance
>   mov 0x48(%r13), %rax
>   cmp $0x36, 0x5(%rax)
>   jz 0x32a43381c018
>   mov $0x0, %r11         # Patched at runtime
>   call *%r11
>   mov %r13, (%r14)       # Restore vm.topJSWebAssemblyInstance after a call
>   mov 0x28(%r13), %rcx
>   mov 0x18(%rcx), %rax   # void* m_memory
>   mov 0x20(%rcx), %rcx   # uint64_t m_size
>   mov %rax, %r12         # Pinned memory register
>   mov %rcx, %rbx         # Pinned size register
> 
> Do I have this right?
> 
> 
> > > Source/JavaScriptCore/wasm/WasmBinding.cpp:260
> > > +    jit.emitFunctionPrologue();
> > > +    jit.store64(JIT::TrustedImm32(0), JIT::Address(GPRInfo::callFrameRegister, CallFrameSlot::codeBlock * static_cast<int>(sizeof(Register)))); // FIXME Stop using 0 as codeBlocks. https://bugs.webkit.org/show_bug.cgi?id=165321
> > > +    jit.storePtr(JIT::TrustedImmPtr(vm->webAssemblyToJSCallee.get()), JIT::Address(GPRInfo::callFrameRegister, CallFrameSlot::callee * static_cast<int>(sizeof(Register))));
> > > +
> > > +    // FIXME perform a stack check before updating SP. https://bugs.webkit.org/show_bug.cgi?id=165546
> > > +
> > > +    unsigned numberOfParameters = argCount + 1; // There is a "this" argument.
> > > +    unsigned numberOfRegsForCall = CallFrame::headerSizeInRegisters + numberOfParameters;
> > > +    unsigned numberOfBytesForCall = numberOfRegsForCall * sizeof(Register) - sizeof(CallerFrameAndPC);
> > > +    const unsigned stackOffset = WTF::roundUpToMultipleOf(stackAlignmentBytes(), numberOfBytesForCall);
> > > +    jit.subPtr(MacroAssembler::TrustedImm32(stackOffset), MacroAssembler::stackPointerRegister);
> > > +    JIT::Address calleeFrame = CCallHelpers::Address(MacroAssembler::stackPointerRegister, -static_cast<ptrdiff_t>(sizeof(CallerFrameAndPC)));
> > > +
> > > +    // FIXME implement this.
> > > +    jit.breakpoint();
> > 
> > Are you going to implement this as a tail call?
> 
> I think I can since the Instance is saved / restored by the caller. It only
> needs to:
>   - Adjust vm.topJSWebAssemblyInstance.
>   - Adjust pinned registers.
> I don't think this function requires any stack space.
Comment 10 JF Bastien 2016-12-29 23:54:22 PST
Created attachment 297848 [details]
WIP

Here's an updated WIP patch with the wasm->wasm thunk implemented. There seems to be an issue, maybe related to what Saam is asking.

All wasm function start with the following:

    m_instanceValue = m_currentBlock->appendNew<MemoryValue>(m_proc, Load, pointerType(), Origin(),
        m_currentBlock->appendNew<ConstPtrValue>(m_proc, Origin(), &m_vm.topJSWebAssemblyInstance));

I was therefore thinking that this Value* would be available
throughout the function: either it's in a callee-saved register,
or it's spilled before a call. In the code below it seems to be
in $rbx, and the address &vm.topJSWebAssemblyInstance in $r12.
Now that's surprising, because those are the pinned registers!
It's OK though because these functions don't have a Memory, so
B3IrGenerator starts with info.hasMemory() and doesn't pin these.

So it's clear that the registers which are usually pinned are getting
clobbered, but I'm not sure what the best approach is. A few thoughts:

1) In the wasm -> wasm thunk I don't know whether I'm calling
   another Module which uses Memory or not, so I have to restore its
   pinned registers just in case (or generate a bunch more,
   non-straight-line, code).
2) The thunk really only has one scratch register, and uses the
   usually-pinned registers because it assumes the caller doesn't
   care (and sure enough, pinning them unconditionally fixes the issues).
3) Modules which don't have a Memory could mark the usually-pinned
   registers as clobbered in addCall / addCallIndirect. They would still
   be free to use, but the caller would have to be careful around calls.


Here's generated wasm code, which calls another instance's wasm
function, from wasm-to-wasm-js:

Generated JIT code for WebAssembly function[0] I32 (I32):
        Code at [0x4cb604238460, 0x4cb604238500):
          0x4cb604238460: push %rbp
          0x4cb604238461: mov %rsp, %rbp
          0x4cb604238464: add $0xffffffffffffffd0, %rsp
          0x4cb604238468: mov %rbx, -0x10(%rbp)
          0x4cb60423846c: mov %r12, -0x8(%rbp)
          0x4cb604238470: mov $0x0, %rax
          0x4cb60423847a: mov %rax, 0x18(%rbp)
          0x4cb60423847e: mov $0x0, 0x10(%rbp)
          0x4cb604238486: mov $0x104df5578, %r12         # &vm.topJSWebAssemblyInstance
          0x4cb604238490: mov (%r12), %rbx               # m_instanceValue
          0x4cb604238494: mov %edi, %eax
          0x4cb604238496: mov $0xc0febeef00000000, %rdi
          0x4cb6042384a0: or %rax, %rdi
          0x4cb6042384a3: mov 0x48(%rbx), %rax
          0x4cb6042384a7: cmp $0x36, 0x5(%rax)
 +------- 0x4cb6042384ab: jz 0x4cb6042384d8
 |        0x4cb6042384b1: mov $0x0, %r11
 |        0x4cb6042384bb: call *%r11
 |    +-> 0x4cb6042384be: mov %rbx, (%r12)               # Crash here, because $r12 is now 0 (so is $rbx)
 |    |   0x4cb6042384c2: mov %rax, %rcx
 |    |   0x4cb6042384c5: shr $0x20, %rcx
 |    |   0x4cb6042384c9: xor %ecx, %eax
 |    |   0x4cb6042384cb: mov -0x10(%rbp), %rbx
 |    |   0x4cb6042384cf: mov -0x8(%rbp), %r12
 |    |   0x4cb6042384d3: mov %rbp, %rsp
 |    |   0x4cb6042384d6: pop %rbp
 |    |   0x4cb6042384d7: ret 
 +----|-> 0x4cb6042384d8: mov $0x0, %r11
      |   0x4cb6042384e2: call *%r11
      |   0x4cb6042384e5: int3                           # My breakpoint, for debugging
      +-- 0x4cb6042384e6: jmp 0x4cb6042384be


Here's the annotated thunk, so that WasmBinding.cpp:wasmToWasm is easier to understand:

Generated JIT code for WebAssembly->WebAssembly import[0]:
    Code at [0x4cb60421bea0, 0x4cb60421bee0):
      0x4cb60421bea0: mov $0x104df5578, %r10            # &vm.topJSWebAssemblyInstance
      0x4cb60421beaa: mov (%r10), %r10                  # Get the caller's Instance
      0x4cb60421bead: mov 0x48(%r10), %r10              # Import function
      0x4cb60421beb1: mov 0x28(%r10), %r12              # Callee's Instance
      0x4cb60421beb5: mov $0x104df5578, %r11            # &vm.topJSWebAssemblyInstance (again)
      0x4cb60421bebf: mov %r12, (%r11)                  # Set the callee's instance
      0x4cb60421bec2: mov 0x28(%r12), %r12              # Callee's JSWebAssemblyMemory
      0x4cb60421bec7: mov 0x20(%r12), %rbx              # Callee's memory size (in its pinned register)
      0x4cb60421becc: mov 0x18(%r12), %r12              # Callee's memory address (in its pinned register)
      0x4cb60421bed1: mov 0x30(%r10), %r10              # Callee code address
      0x4cb60421bed5: jmp *%r10

----------

Updated change notes:

Changes to the WebAssembly implementation:

  - Add a new JSType for WebAssemblyFunction, and use it when creating its
    structure. This will be used to quickly detect from wasm whether the import
    call is to another wasm module, or whether it's to JS.
  - Use cell instead of objects to detect wasm->wasm, the previous code was wrong.
  - Improve diagnostic printout for wasm stubs: print the full signature.
  - Properly check that wasm -> wasm calls have matching signatures. Add testing for this.
  - Test some of the wasm -> wasm calls, not working yet, and more tests need to
    get implemented.
  - Generate two stubs from the import stub generator: one for wasm->JS and one
    for wasm -> wasm. This is done at Module time. Which is called will only be
    known at Instance time, once we've received the import object. We want to
    avoid codegen at Instance time, so having both around is great.
  - Restore the WebAssembly global state (VM top Instance, and pinned registers)
    after call / call_indirect, and in the JS->wasm entry stub.
  - Pinned registers are now a global thing, not per-Memory, because the wasm ->
    wasm stubs are generated at Module time where we don't really have enough
    information to do the right thing (doing so would generate too much code).


Support changes:

  - Add a Proxy to Builder.js, which intercepts unknown property lookups. This
    creates way better error messages on typos than 'undefined is not a
    function', which happens semi-frequently as I typo opcode names (and which
    one is a typo is hard to find because we chain builders).
  - Add limited support for var{u}int64 (only the 32-bit values).
  - Get rid of much of the function index space: we already have all of its
    information elsewhere, and as-is it provides no extra efficiency.
  - Store Wasm::Memory directly on JSWebAssemblyMemory, instead of through a
    unique_ptr. This removes an indirection and useless allocation.
  - Remove VM'a WebAssembly memory / size, and obtain them from Instance
    instead. This requires an extra indirection, but is cleaner.
  - Cache the wasm entrypoint code's location on WebAssemblyFunction for faster
    access.
  - All Instance now have a valid JSWebAssemblyMemory when created, which allows
    the wasm -> wasm stub to assume it isn't null when trying to set the pinned
    registers.
Comment 11 JF Bastien 2016-12-30 11:52:13 PST
For comparison, here's the generated code if I always pin the registers, even if no memory is present:

Generated JIT code for WebAssembly function[0] I32 (I32):
    Code at [0x24a7fac391e0, 0x24a7fac39280):
      0x24a7fac391e0: push %rbp
      0x24a7fac391e1: mov %rsp, %rbp
      0x24a7fac391e4: add $0xffffffffffffffd0, %rsp
      0x24a7fac391e8: mov %r13, -0x10(%rbp)
      0x24a7fac391ec: mov %r14, -0x8(%rbp)
      0x24a7fac391f0: mov $0x0, %rax
      0x24a7fac391fa: mov %rax, 0x18(%rbp)
      0x24a7fac391fe: mov $0x0, 0x10(%rbp)
      0x24a7fac39206: mov $0x1055f5578, %r14         # It uses a different register here
      0x24a7fac39210: mov (%r14), %r13               # And here
      0x24a7fac39213: mov %edi, %eax
      0x24a7fac39215: mov $0xc0febeef00000000, %rdi
      0x24a7fac3921f: or %rax, %rdi
      0x24a7fac39222: mov 0x48(%r13), %rax
      0x24a7fac39226: cmp $0x36, 0x5(%rax)
      0x24a7fac3922a: jz 0x24a7fac39256
      0x24a7fac39230: mov $0x0, %r11
      0x24a7fac3923a: call *%r11
      0x24a7fac3923d: mov %r13, (%r14)               # Assumes these registers are callee-saved (they are) and uses them again!
      0x24a7fac39240: mov %rax, %rcx
      0x24a7fac39243: shr $0x20, %rcx
      0x24a7fac39247: xor %ecx, %eax
      0x24a7fac39249: mov -0x10(%rbp), %r13
      0x24a7fac3924d: mov -0x8(%rbp), %r14
      0x24a7fac39251: mov %rbp, %rsp
      0x24a7fac39254: pop %rbp
      0x24a7fac39255: ret 
      0x24a7fac39256: mov $0x0, %r11
      0x24a7fac39260: call *%r11
      0x24a7fac39263: jmp 0x24a7fac3923d
Generated JIT code for JavaScript->WebAssembly entrypoint[0] I32 (I32):
    Code at [0x24a7fac39280, 0x24a7fac392c0):
      0x24a7fac39280: push %rbp
      0x24a7fac39281: mov %rsp, %rbp
      0x24a7fac39284: add $0xffffffffffffffe0, %rsp
      0x24a7fac39288: mov $0x0, %rax
      0x24a7fac39292: mov %rax, 0x18(%rbp)
      0x24a7fac39296: mov $0x0, 0x10(%rbp)
      0x24a7fac3929e: cmp $0x1, 0x20(%rbp)
      0x24a7fac392a2: jb 0x24a7fac392bd
      0x24a7fac392a8: mov 0x28(%rbp), %edi
      0x24a7fac392ab: mov $0x24a7fac391e0, %r11
      0x24a7fac392b5: call *%r11
      0x24a7fac392b8: mov %rbp, %rsp
      0x24a7fac392bb: pop %rbp
      0x24a7fac392bc: ret 
      0x24a7fac392bd: int3
Comment 12 JF Bastien 2016-12-30 11:57:27 PST
I'm not sure it's even worth *not* pinning these registers for a Module which doesn't have a Memory. Is that ever a compelling usecase which we need to optimize for, and suffer my current headache?
Comment 13 JF Bastien 2016-12-30 12:11:01 PST
Created attachment 297859 [details]
WIP

Tiny update to always pin registers, even if we don't need to. The wasm-to-wasm.js test isn't passing yet, I must have messed something up. Will get back to it later today.
Comment 14 JF Bastien 2016-12-30 12:24:29 PST
Created attachment 297860 [details]
WIP

Actually I found the bug as I was walking away... I wasn't coercing 32-bit values in JavaScript properly, so it was using a negative value instead of a positive one as I wanted :-s

I think, with a ChangeLog, this patch is now good to go! Will submit one soon.
Comment 15 JF Bastien 2016-12-30 12:54:18 PST
Created attachment 297863 [details]
patch

Ready for review!
Comment 16 Saam Barati 2017-01-02 12:10:51 PST
Comment on attachment 297863 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=297863&action=review

Mostly LGTM, but I don't think we should require registers always be pinned. I don't think doing so actually make the code any simpler to reason about.

> Source/JavaScriptCore/jsc.cpp:2673
> +    SymbolTable* exportSymbolTable = SymbolTable::create(vm);
> +    auto* moduleStructure = InternalFunction::createSubclassStructure(exec, JSValue(), exec->lexicalGlobalObject()->WebAssemblyModuleStructure());
> +    RETURN_IF_EXCEPTION(scope, encodedJSValue());
> +    JSWebAssemblyModule* module = JSWebAssemblyModule::create(vm, moduleStructure, plan.takeModuleInformation(), plan.takeCallLinkInfos(), plan.takeWasmExitStubs(), exportSymbolTable, 0);
> +    Structure* instanceStructure = InternalFunction::createSubclassStructure(exec, JSValue(), globalObject->WebAssemblyInstanceStructure());
> +    RETURN_IF_EXCEPTION(scope, encodedJSValue());
> +    Identifier moduleKey = Identifier::fromUid(PrivateName(PrivateName::Description, "WebAssemblyInstance"));
> +    WebAssemblyModuleRecord* moduleRecord = WebAssemblyModuleRecord::create(exec, vm, globalObject->webAssemblyModuleRecordStructure(), moduleKey, module->moduleInformation());
> +    RETURN_IF_EXCEPTION(scope, encodedJSValue());
> +    JSWebAssemblyInstance* instance = JSWebAssemblyInstance::create(vm, instanceStructure, module, moduleRecord->getModuleNamespace(exec));
> +    RETURN_IF_EXCEPTION(scope, encodedJSValue());
> +    vm.topJSWebAssemblyInstance = instance;
> +
> +    if (!!module->moduleInformation().memory) {
> +        bool failed;
> +        Wasm::Memory memory(module->moduleInformation().memory.initial(), module->moduleInformation().memory.maximum(), failed);
> +        RELEASE_ASSERT(failed);
> +        instance->setMemory(vm, JSWebAssemblyMemory::create(vm, globalObject->WebAssemblyMemoryStructure(), WTFMove(memory)));
> +    }

Why is all this new stuff needed? I updated all tests that used this API and memory operations to the JS API.
Therefore, that RELEASE_ASSERT you're removing was really protecting us from new tests of that form from being added.

> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:260
> +    // FIXME we don't really need to pin registers here if there's no memory. It makes wasm -> wasm thunks simpler for now. https://bugs.webkit.org/show_bug.cgi?id=166623
> +        const PinnedRegisterInfo& pinnedRegs = PinnedRegisterInfo::get();
> +        m_memoryBaseGPR = pinnedRegs.baseMemoryPointer;
>          m_proc.pinRegister(m_memoryBaseGPR);
> -        ASSERT(!info.memory.pinnedRegisters().sizeRegisters[0].sizeOffset);
> -        m_memorySizeGPR = info.memory.pinnedRegisters().sizeRegisters[0].sizeRegister;
> -        for (const PinnedSizeRegisterInfo& regInfo : info.memory.pinnedRegisters().sizeRegisters)
> +        ASSERT(!pinnedRegs.sizeRegisters[0].sizeOffset);
> +        m_memorySizeGPR = pinnedRegs.sizeRegisters[0].sizeRegister;
> +        for (const PinnedSizeRegisterInfo& regInfo : pinnedRegs.sizeRegisters)
>              m_proc.pinRegister(regInfo.sizeRegister);

Style: Your indentation is off here.

> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:328
> +        B3::PatchpointValue* patchpoint = block->appendNew<B3::PatchpointValue>(proc, B3::Void, Origin());
> +        patchpoint->effects = Effects::none();
> +        patchpoint->effects.writesPinned = true;
> +        patchpoint->clobber(clobbers);
> +        patchpoint->appendVectorWithRep(arguments, ValueRep::SomeRegister);
> +
> +        patchpoint->setGenerator([&pinnedRegs] (CCallHelpers& jit, const B3::StackmapGenerationParams& params) {
> +            AllowMacroScratchRegisterUsage allowScratch(jit);
> +
> +            jit.move(params[0].gpr(), pinnedRegs.baseMemoryPointer);
> +            for (uint32_t i = 0; i < pinnedRegs.sizeRegisters.size(); ++i)
> +                jit.move(params[1 + i].gpr(), pinnedRegs.sizeRegisters[i].sizeRegister);
> +        });
> +    }

This is less efficient than the version I had. My version did most of the work in the MacroAssembler and required fewer moves. The reason for this is that it only took a Instance as an argument to the patchpoint, and then did all the work inside the patchpoint, so it could just load directly into the registers it needed to write to. Please revert back to the better version.

> Source/JavaScriptCore/wasm/WasmBinding.cpp:274
> +    // FIXME the following code assumes that all WebAssembly.Instance have the same pinned registers. https://bugs.webkit.org/show_bug.cgi?id=162952
> +    // Set up the callee's baseMemory register as well as the memory size registers.
> +    jit.loadPtr(JIT::Address(baseMemory, JSWebAssemblyInstance::offsetOfMemory()), baseMemory); // JSWebAssemblyMemory*.
> +    const auto& sizeRegs = pinnedRegs.sizeRegisters;
> +    ASSERT(sizeRegs.size() >= 1);
> +    ASSERT(sizeRegs[0].sizeRegister != baseMemory);
> +    ASSERT(sizeRegs[0].sizeRegister != scratch);
> +    ASSERT(!sizeRegs[0].sizeOffset); // The following code assumes we start at 0, and calculates subsequent size registers relative to 0.
> +    jit.loadPtr(JIT::Address(baseMemory, JSWebAssemblyMemory::offsetOfSize()), sizeRegs[0].sizeRegister); // Memory size.
> +    jit.loadPtr(JIT::Address(baseMemory, JSWebAssemblyMemory::offsetOfMemory()), baseMemory); // WasmMemory::void*.
> +    for (unsigned i = 1; i < sizeRegs.size(); ++i) {
> +        ASSERT(sizeRegs[i].sizeRegister != baseMemory);
> +        ASSERT(sizeRegs[i].sizeRegister != scratch);
> +        jit.add64(JIT::TrustedImm32(-sizeRegs[i].sizeOffset), sizeRegs[0].sizeRegister, sizeRegs[1].sizeRegister);
> +    }

This code is not correct.

> Source/JavaScriptCore/wasm/WasmMemory.h:51
> +    Memory() = default;

Not a fan of this. If you go with my suggestion of not having to always pin registers, this isn't needed.

> Source/JavaScriptCore/wasm/WasmSignature.h:85
> +    WTF::String toString() const;

Nit: WTF prefix not needed.

> Source/JavaScriptCore/wasm/js/WebAssemblyInstanceConstructor.cpp:91
> +    {
> +        // Always start with a dummy Memory, so that wasm -> wasm thunks avoid checking for a nullptr Memory when trying to set pinned registers.
> +        Wasm::Memory memory;
> +        instance->setMemory(vm, JSWebAssemblyMemory::create(vm, exec->lexicalGlobalObject()->WebAssemblyMemoryStructure(), WTFMove(memory)));
> +    }

I'm not sure this is the best design decision.

Why not have the check for nullptr be the check seeing if we need to materialize registers or not? Then we don't have to unconditionally pin registers. Maybe this won't be a common use case, but I could imagine some math-heavy functions being bogged down doing the extra materialization work (and also hurt by extra register pressure).

> Source/JavaScriptCore/wasm/js/WebAssemblyModuleRecord.cpp:-137
> -            // This should be guaranteed by module verification.
> -            RELEASE_ASSERT(instance->memory()); 

Then you can add this back.
Comment 17 Saam Barati 2017-01-02 12:23:08 PST
Comment on attachment 297863 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=297863&action=review

>> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:328
>> +    }
> 
> This is less efficient than the version I had. My version did most of the work in the MacroAssembler and required fewer moves. The reason for this is that it only took a Instance as an argument to the patchpoint, and then did all the work inside the patchpoint, so it could just load directly into the registers it needed to write to. Please revert back to the better version.

You can also make my code use the 3-operand add like you have in the wasmToWasm() stub.

>> Source/JavaScriptCore/wasm/WasmBinding.cpp:274
>> +    }
> 
> This code is not correct.

This is correct, sorry. I meant to delete this comment. I wrote this before I realized you pre-populated a JSWebAssemblyMemory, always. That said, I still think we shouldn't have to do this.
Comment 18 Saam Barati 2017-01-02 12:30:07 PST
(changing unicode in the title)
Comment 19 JF Bastien 2017-01-02 15:02:02 PST
Created attachment 297913 [details]
patch

Address comments.


> Mostly LGTM, but I don't think we should require registers always be pinned.
> I don't think doing so actually make the code any simpler to reason about.

We could check on wasm->wasm entry, but then:
 - Code would be more complex / slower.
 - Stuck at 1 scratch register.
 - Calls from wasm instances which don't pin registers couldn't be tail calls since they'd need to spill.

At the end of the day I just don't see wasm without a memory as a compelling use case. I could be wrong, but what's the point of using wasm that way?

I do think it makes the code more complicated to try to handle pinned registers properly.

 
> > Source/JavaScriptCore/jsc.cpp:2673
> > +    SymbolTable* exportSymbolTable = SymbolTable::create(vm);
> > +    auto* moduleStructure = InternalFunction::createSubclassStructure(exec, JSValue(), exec->lexicalGlobalObject()->WebAssemblyModuleStructure());
> > +    RETURN_IF_EXCEPTION(scope, encodedJSValue());
> > +    JSWebAssemblyModule* module = JSWebAssemblyModule::create(vm, moduleStructure, plan.takeModuleInformation(), plan.takeCallLinkInfos(), plan.takeWasmExitStubs(), exportSymbolTable, 0);
> > +    Structure* instanceStructure = InternalFunction::createSubclassStructure(exec, JSValue(), globalObject->WebAssemblyInstanceStructure());
> > +    RETURN_IF_EXCEPTION(scope, encodedJSValue());
> > +    Identifier moduleKey = Identifier::fromUid(PrivateName(PrivateName::Description, "WebAssemblyInstance"));
> > +    WebAssemblyModuleRecord* moduleRecord = WebAssemblyModuleRecord::create(exec, vm, globalObject->webAssemblyModuleRecordStructure(), moduleKey, module->moduleInformation());
> > +    RETURN_IF_EXCEPTION(scope, encodedJSValue());
> > +    JSWebAssemblyInstance* instance = JSWebAssemblyInstance::create(vm, instanceStructure, module, moduleRecord->getModuleNamespace(exec));
> > +    RETURN_IF_EXCEPTION(scope, encodedJSValue());
> > +    vm.topJSWebAssemblyInstance = instance;
> > +
> > +    if (!!module->moduleInformation().memory) {
> > +        bool failed;
> > +        Wasm::Memory memory(module->moduleInformation().memory.initial(), module->moduleInformation().memory.maximum(), failed);
> > +        RELEASE_ASSERT(failed);
> > +        instance->setMemory(vm, JSWebAssemblyMemory::create(vm, globalObject->WebAssemblyMemoryStructure(), WTFMove(memory)));
> > +    }
> 
> Why is all this new stuff needed? I updated all tests that used this API and
> memory operations to the JS API.
> Therefore, that RELEASE_ASSERT you're removing was really protecting us from
> new tests of that form from being added.

I wrote that code before your patches went in. You're right, with the new uses it's simpler so I deleted this and removed the EXPORT stuff.


> > Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:328
> > +        B3::PatchpointValue* patchpoint = block->appendNew<B3::PatchpointValue>(proc, B3::Void, Origin());
> > +        patchpoint->effects = Effects::none();
> > +        patchpoint->effects.writesPinned = true;
> > +        patchpoint->clobber(clobbers);
> > +        patchpoint->appendVectorWithRep(arguments, ValueRep::SomeRegister);
> > +
> > +        patchpoint->setGenerator([&pinnedRegs] (CCallHelpers& jit, const B3::StackmapGenerationParams& params) {
> > +            AllowMacroScratchRegisterUsage allowScratch(jit);
> > +
> > +            jit.move(params[0].gpr(), pinnedRegs.baseMemoryPointer);
> > +            for (uint32_t i = 0; i < pinnedRegs.sizeRegisters.size(); ++i)
> > +                jit.move(params[1 + i].gpr(), pinnedRegs.sizeRegisters[i].sizeRegister);
> > +        });
> > +    }
> 
> This is less efficient than the version I had. My version did most of the
> work in the MacroAssembler and required fewer moves. The reason for this is
> that it only took a Instance as an argument to the patchpoint, and then did
> all the work inside the patchpoint, so it could just load directly into the
> registers it needed to write to. Please revert back to the better version.

I thought that having the Value* around would allow B3 to move things more, and generate better code in the long run? The code I had was pretty similar to what Keith had written before for this.

In any case, you're right that what you wrote is pretty close to the thunks I also wrote, so I moved to this since it does generate better code at the moment, and from what you say it sounds like that's what's expected to be better in the future as well.


> > Source/JavaScriptCore/wasm/WasmSignature.h:85
> > +    WTF::String toString() const;
> 
> Nit: WTF prefix not needed.

It is needed because it's only forward-declared:

In file included from ../Source/JavaScriptCore/wasm/WasmSignature.cpp:27:
../Source/JavaScriptCore/wasm/WasmSignature.h:85:5: error: unknown type name 'String'; did you mean 'WTF::String'?
    String toString() const;
    ^~~~~~
    WTF::String
../Source/WTF/wtf/HashTraits.h:32:7: note: 'WTF::String' declared here
class String;
      ^

I could include the right header, but that's against the "include what you use" approach so I'd rather not.


> > Source/JavaScriptCore/wasm/js/WebAssemblyInstanceConstructor.cpp:91
> > +    {
> > +        // Always start with a dummy Memory, so that wasm -> wasm thunks avoid checking for a nullptr Memory when trying to set pinned registers.
> > +        Wasm::Memory memory;
> > +        instance->setMemory(vm, JSWebAssemblyMemory::create(vm, exec->lexicalGlobalObject()->WebAssemblyMemoryStructure(), WTFMove(memory)));
> > +    }
> 
> I'm not sure this is the best design decision.
> 
> Why not have the check for nullptr be the check seeing if we need to
> materialize registers or not? Then we don't have to unconditionally pin
> registers. Maybe this won't be a common use case, but I could imagine some
> math-heavy functions being bogged down doing the extra materialization work
> (and also hurt by extra register pressure).

I just don't see wasm without Memory being a serious use case. To do math-heavy things without a Memory you'd need to call into wasm, and then hold all of your state in locals / globals. Sure that can be done since we have infinite locals and globals, but wasm being a compilation target means that LLVM won't generate such code, at least not for the foreseeable future. You'd need to hand-write that code, and I don't think it'll happen soon.

On the other hand, I see wasm->wasm calls as a very serious use case! That's how you'll call libc and other things that your application depend on through dynamic linking. Having it be as fast as possible is super important IMO. That's why having a tail call, and fewer conditions in this code, are pretty important IMO.

In that case, why have the null check, ever?
Comment 20 Saam Barati 2017-01-02 16:16:50 PST
Comment on attachment 297913 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=297913&action=review

r=me with comments and some bugs found.

> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:309
> +        patchpoint->numGPScratchRegisters = 1;

This is no longer needed since you're code is doing the 3-operand add based off the zero size register.

> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:317
> +            jit.loadPtr(CCallHelpers::Address(baseMemory, JSWebAssemblyMemory::offsetOfSize()), sizeRegs[0].sizeRegister);

Can you add back your assert that this has a zero offset?

> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:320
> +            for (unsigned i = 1; i < sizeRegs.size(); ++i)
> +                jit.add64(CCallHelpers::TrustedImm32(-sizeRegs[i].sizeOffset), sizeRegs[0].sizeRegister, sizeRegs[1].sizeRegister);

This code here is wrong. It should not be writing to sizeRegs[1]. It should be writing to sizeRegs[i].sizeRegister. Given that this code is never run now, I wonder if we're just better off not pretending it can be run, and just asserting that it can't be run. Then we can get back to it when we implement > 1 pinned size register.

> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:841
> +        ExpressionType functionImport = m_currentBlock->appendNew<MemoryValue>(m_proc, Load, pointerType(), Origin(), m_instanceValue, JSWebAssemblyInstance::offsetOfImportFunction(functionIndex));
> +        ExpressionType jsTypeOfImport = m_currentBlock->appendNew<MemoryValue>(m_proc, Load8Z, Origin(), functionImport, JSCell::typeInfoTypeOffset());

Nit: I'd make these Value* instead of ExpressionType, since the very next statement uses Value*.

> Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:890
> +        // The call could have been to another WebAssembly instance, and / or could have modified our Memory.
> +        restoreWebAssemblyGlobalState(m_vm, m_info.memory, m_instanceValue, m_proc, continuation);

This code only needs to be emitted along the wasm path.

> Source/JavaScriptCore/wasm/WasmBinding.cpp:41
> +static void materializeImportJSCell(VM* vm, JIT& jit, unsigned importIndex, GPRReg gpr)

Style nit: I think we usually call "gpr" "result" in these types of functions.

> Source/JavaScriptCore/wasm/WasmBinding.cpp:43
> +    // Each JS -> wasm entry sets the WebAssembly.Instance whose export is being called. We're calling out of this Instance, and can therefore figure out the import being called.

Nit: This comment is somewhat confusing and I don't think adds much to help guide what the code is doing.

> Source/JavaScriptCore/wasm/WasmBinding.cpp:261
> +    // FIXME the following code assumes that all WebAssembly.Instance have the same pinned registers. https://bugs.webkit.org/show_bug.cgi?id=162952
> +    // Set up the callee's baseMemory register as well as the memory size registers.

We could easily support this in the future if we just had wasm callee generate this code for itself instead of the caller generate it. That might be a nicer architecture going forward regardless since it would also nicely handle the materialize pinned registers problem.

> Source/JavaScriptCore/wasm/WasmBinding.cpp:274
> +    for (unsigned i = 1; i < sizeRegs.size(); ++i) {
> +        ASSERT(sizeRegs[i].sizeRegister != baseMemory);
> +        ASSERT(sizeRegs[i].sizeRegister != scratch);
> +        jit.add64(JIT::TrustedImm32(-sizeRegs[i].sizeOffset), sizeRegs[0].sizeRegister, sizeRegs[1].sizeRegister);
> +    }

Same comment as earlier. This code shouldn't be storing to sizeRegs[1]. Also, this code is almost identical (or maybe exactly identical) to the code inside WasmB3IRGenerator. Maybe it's worth writing a common function for it?

> Source/JavaScriptCore/wasm/WasmFormat.h:262
> +        toJS,
> +        toWasm,

Style Nit: I'd make these capitalized. So:
"ToJs" and "ToWasm"

> Source/JavaScriptCore/wasm/WasmPlan.cpp:223
> +                    std::make_unique<B3::Compilation>(FINALIZE_CODE(linkBuffer, ("JavaScript->WebAssembly entrypoint[%i] %s", functionIndex, signatureDescription.ascii().data())), WTFMove(context.jsEntrypointByproducts));

Nice.

> Source/JavaScriptCore/wasm/js/WebAssemblyFunction.h:69
> +    WebAssemblyFunction(VM&, JSGlobalObject*, Structure*, Wasm::SignatureIndex, void*);

There is no need to pass in this void* directly since it can just be gotten of the JSWebAssemblyCallee inside finishCreation().

> Source/JavaScriptCore/wasm/js/WebAssemblyFunction.h:72
> +    void* m_wasmEntryPointCode; // Remove indirections for wasm -> wasm calls.

This comment is somewhat confusing. Maybe something like:
"Caching the code pointer allows the wasm -> wasm stub to do a single load and jump instead of having dependent loads." (or something like that)

> Source/JavaScriptCore/wasm/js/WebAssemblyInstanceConstructor.cpp:90
> +        // Always start with a dummy Memory, so that wasm -> wasm thunks avoid checking for a nullptr Memory when trying to set pinned registers.
> +        Wasm::Memory memory;
> +        instance->setMemory(vm, JSWebAssemblyMemory::create(vm, exec->lexicalGlobalObject()->WebAssemblyMemoryStructure(), WTFMove(memory)));

This makes me wonder if we're better off just having the callee instead of caller generate a wasm->wasm call thunk. Maybe we should do that in a later patch, or at least have a bug open to consider it.
Comment 21 JF Bastien 2017-01-02 17:01:52 PST
Created attachment 297918 [details]
patch

Address more comments. A few questions left:


> > Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:890
> > +        // The call could have been to another WebAssembly instance, and / or could have modified our Memory.
> > +        restoreWebAssemblyGlobalState(m_vm, m_info.memory, m_instanceValue, m_proc, continuation);
> 
> This code only needs to be emitted along the wasm path.

I don't think this is true: the JS code could modify Memory, or it could have called another Instance. We want to remove the double entry from JS to wasm, and then we'll need this code. No?


> > Source/JavaScriptCore/wasm/WasmBinding.cpp:274
> > +    for (unsigned i = 1; i < sizeRegs.size(); ++i) {
> > +        ASSERT(sizeRegs[i].sizeRegister != baseMemory);
> > +        ASSERT(sizeRegs[i].sizeRegister != scratch);
> > +        jit.add64(JIT::TrustedImm32(-sizeRegs[i].sizeOffset), sizeRegs[0].sizeRegister, sizeRegs[1].sizeRegister);
> > +    }
> 
> Same comment as earlier. This code shouldn't be storing to sizeRegs[1].
> Also, this code is almost identical (or maybe exactly identical) to the code
> inside WasmB3IRGenerator. Maybe it's worth writing a common function for it?

Yeah it's almost the same except:
 - Uses a different JIT type (AssemblyHelpers instead of CCallHelpers). Should I make these the same, or use a template?
 - This code also restores Instance differently (B3 has a Value* that is kept live in the function, this needs to figure it out from the import). It could be split up, but I have some asserts on which registers are used to avoid unexpected clobbers.

So I agree we could share things, but I'm not sure we want to.


> > Source/JavaScriptCore/wasm/js/WebAssemblyInstanceConstructor.cpp:90
> > +        // Always start with a dummy Memory, so that wasm -> wasm thunks avoid checking for a nullptr Memory when trying to set pinned registers.
> > +        Wasm::Memory memory;
> > +        instance->setMemory(vm, JSWebAssemblyMemory::create(vm, exec->lexicalGlobalObject()->WebAssemblyMemoryStructure(), WTFMove(memory)));
> 
> This makes me wonder if we're better off just having the callee instead of
> caller generate a wasm->wasm call thunk. Maybe we should do that in a later
> patch, or at least have a bug open to consider it.

Yeah I guess we could have an off-to-the-side entrypeint from "other wasm", in addition to the regular "entry from this wasm instance", for all of the exported functions. We'd need to also change the callee codegen, but then we wouldn't need to tail (or rather, we'd inline that tail).

That could be more efficient.
Comment 22 Saam Barati 2017-01-02 17:10:50 PST
(In reply to comment #21)
> Created attachment 297918 [details]
> patch
> 
> Address more comments. A few questions left:
> 
> 
> > > Source/JavaScriptCore/wasm/WasmB3IRGenerator.cpp:890
> > > +        // The call could have been to another WebAssembly instance, and / or could have modified our Memory.
> > > +        restoreWebAssemblyGlobalState(m_vm, m_info.memory, m_instanceValue, m_proc, continuation);
> > 
> > This code only needs to be emitted along the wasm path.
> 
> I don't think this is true: the JS code could modify Memory, or it could
> have called another Instance. We want to remove the double entry from JS to
> wasm, and then we'll need this code. No?
Yeah I'm wrong, I wasn't thinking about this correctly. Ignore my comment.

> 
> 
> > > Source/JavaScriptCore/wasm/WasmBinding.cpp:274
> > > +    for (unsigned i = 1; i < sizeRegs.size(); ++i) {
> > > +        ASSERT(sizeRegs[i].sizeRegister != baseMemory);
> > > +        ASSERT(sizeRegs[i].sizeRegister != scratch);
> > > +        jit.add64(JIT::TrustedImm32(-sizeRegs[i].sizeOffset), sizeRegs[0].sizeRegister, sizeRegs[1].sizeRegister);
> > > +    }
> > 
> > Same comment as earlier. This code shouldn't be storing to sizeRegs[1].
> > Also, this code is almost identical (or maybe exactly identical) to the code
> > inside WasmB3IRGenerator. Maybe it's worth writing a common function for it?
> 
> Yeah it's almost the same except:
>  - Uses a different JIT type (AssemblyHelpers instead of CCallHelpers).
> Should I make these the same, or use a template?
>  - This code also restores Instance differently (B3 has a Value* that is
> kept live in the function, this needs to figure it out from the import). It
> could be split up, but I have some asserts on which registers are used to
> avoid unexpected clobbers.
> 
> So I agree we could share things, but I'm not sure we want to.
Yeah, I agree that we probably don't want to share here. It seems like it'd make it more complicated.
(Side note: CCallHelpers inherits from AssemblyHelpers, so you'd just declare it as AssemblyHelpers)

> 
> 
> > > Source/JavaScriptCore/wasm/js/WebAssemblyInstanceConstructor.cpp:90
> > > +        // Always start with a dummy Memory, so that wasm -> wasm thunks avoid checking for a nullptr Memory when trying to set pinned registers.
> > > +        Wasm::Memory memory;
> > > +        instance->setMemory(vm, JSWebAssemblyMemory::create(vm, exec->lexicalGlobalObject()->WebAssemblyMemoryStructure(), WTFMove(memory)));
> > 
> > This makes me wonder if we're better off just having the callee instead of
> > caller generate a wasm->wasm call thunk. Maybe we should do that in a later
> > patch, or at least have a bug open to consider it.
> 
> Yeah I guess we could have an off-to-the-side entrypeint from "other wasm",
> in addition to the regular "entry from this wasm instance", for all of the
> exported functions. We'd need to also change the callee codegen, but then we
> wouldn't need to tail (or rather, we'd inline that tail).
> 
> That could be more efficient.
Yeah, let's open a bug and deal with it later.
Comment 23 JF Bastien 2017-01-02 17:30:41 PST
> > > > Source/JavaScriptCore/wasm/js/WebAssemblyInstanceConstructor.cpp:90
> > > > +        // Always start with a dummy Memory, so that wasm -> wasm thunks avoid checking for a nullptr Memory when trying to set pinned registers.
> > > > +        Wasm::Memory memory;
> > > > +        instance->setMemory(vm, JSWebAssemblyMemory::create(vm, exec->lexicalGlobalObject()->WebAssemblyMemoryStructure(), WTFMove(memory)));
> > > 
> > > This makes me wonder if we're better off just having the callee instead of
> > > caller generate a wasm->wasm call thunk. Maybe we should do that in a later
> > > patch, or at least have a bug open to consider it.
> > 
> > Yeah I guess we could have an off-to-the-side entrypeint from "other wasm",
> > in addition to the regular "entry from this wasm instance", for all of the
> > exported functions. We'd need to also change the callee codegen, but then we
> > wouldn't need to tail (or rather, we'd inline that tail).
> > 
> > That could be more efficient.
> Yeah, let's open a bug and deal with it later.

I think we can tag it as part of: https://bugs.webkit.org/show_bug.cgi?id=166486
I'll add a comment there.
Comment 24 WebKit Commit Bot 2017-01-02 17:58:36 PST
Comment on attachment 297918 [details]
patch

Clearing flags on attachment: 297918

Committed r210229: <http://trac.webkit.org/changeset/210229>
Comment 25 WebKit Commit Bot 2017-01-02 17:58:42 PST
All reviewed patches have been landed.  Closing bug.