Bug 169815 - WebAssembly: eliminate redundant ARM64 TLS load
Summary: WebAssembly: eliminate redundant ARM64 TLS load
Status: RESOLVED DUPLICATE of bug 169773
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on: 169611
Blocks: 159775
 
Reported: 2017-03-17 10:05 PDT by JF Bastien
Modified: 2017-03-17 10:56 PDT

See Also:


Attachments

Description JF Bastien 2017-03-17 10:05:08 PDT
This is a small optimization; I'm not sure it'll pay off much, but it's neat.

As part of bug #169611 we're moving the WebAssembly context to a TLS slot. On x86 that's a single load / store off the segment register, but on ARM64 it uses mrs + mask + {load,store}. The `mrs TPIDRRO_EL0` instruction, coupled with the mask and the address generation, simply returns the location of our TLS slot (the offset is defined as WTF_WASM_CONTEXT_KEY in wtf/FastTLS.h). That value is idempotent as long as we're executing in the same thread, and that's an invariant of WebAssembly: different instances are set in that context but the location is the same per thread.
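A rough sketch of the address computation described above, modeled in portable C++ rather than real ARM64 assembly. The thread register value is passed in as a parameter because TPIDRRO_EL0 cannot be read portably; the mask value and the slot index are assumptions for illustration and are not the real WTF_WASM_CONTEXT_KEY.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: the low bits of the thread register carry non-address
// information and must be cleared. The exact mask is illustrative.
constexpr std::uintptr_t kThreadRegisterLowBits = 0x7;

// Hypothetical slot index standing in for WTF_WASM_CONTEXT_KEY.
constexpr std::size_t kHypotheticalWasmContextSlot = 2;

// Models "mrs + mask + address generation": given the raw thread register
// value, return the address of TLS slot `slot`. No memory is accessed here;
// the load or store of the context is a separate step.
constexpr std::uintptr_t tlsSlotAddress(std::uintptr_t threadRegister, std::size_t slot)
{
    std::uintptr_t base = threadRegister & ~kThreadRegisterLowBits; // mask
    return base + slot * sizeof(void*);                             // address generation
}
```

For a fixed thread register value the function returns the same address on every call, which is the property that makes the computation a candidate for elimination.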

Right now this mrs+mask+memory combo is generated by the ARM64 macro assembler. This is inefficient. We could instead teach the compiler about the idempotent part (i.e. "get TLS slot #x") and then split off the load / store from that slot. For x86 that could mean combining both operations after the fact or keeping the same model we have now. For ARM64 that would allow us to eliminate redundant mrs+mask if profitable, or rematerialize them under register pressure.
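To illustrate the split, here is a minimal model in C++ (not JavaScriptCore code). All names are hypothetical: `getTLSSlot` stands in for the thread-invariant "get TLS slot #x" operation, and a counter tracks how often that operation runs. The fused version recomputes the slot address for every access, as the macro assembler does today; the split version computes it once and reuses it for both the load and the store.

```cpp
#include <cstddef>

// Hypothetical slot table and index, for illustration only.
constexpr std::size_t kSlotCount = 8;
constexpr std::size_t kHypotheticalContextSlot = 3;

thread_local void* tlsSlots[kSlotCount] = {};
thread_local int slotAddressComputations = 0; // counts simulated "mrs + mask"

// The thread-invariant part: returns the location of a TLS slot.
void** getTLSSlot(std::size_t slot)
{
    ++slotAddressComputations;
    return &tlsSlots[slot];
}

// Fused model: every load and every store recomputes the slot address.
void* swapContextFused(void* next)
{
    void* previous = *getTLSSlot(kHypotheticalContextSlot); // address + load
    *getTLSSlot(kHypotheticalContextSlot) = next;           // address + store
    return previous;
}

// Split model: the slot address is computed once, then the load and the
// store go through the same pointer.
void* swapContextSplit(void* next)
{
    void** slot = getTLSSlot(kHypotheticalContextSlot); // address only
    void* previous = *slot;                             // load
    *slot = next;                                       // store
    return previous;
}
```

In this model the fused swap performs two slot-address computations and the split swap performs one. A compiler that knows the address is thread-invariant could make the same choice, or recompute the address instead of keeping it live when registers are scarce.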
Comment 1 JF Bastien 2017-03-17 10:56:07 PDT
Fil thinks we just want to pin a register on ARM because the optimization I propose will likely do the same thing by hoisting the redundant load to the top of each function. May as well get rid of the load entirely.

Let's just do it as part of bug #169773 then.

*** This bug has been marked as a duplicate of bug 169773 ***