Bug 241905 - Add ldp and stp support to ARM64 and ARM64E offlineasm.
Summary: Add ldp and stp support to ARM64 and ARM64E offlineasm.
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Mark Lam
Keywords: InRadar
Depends on:
Reported: 2022-06-22 22:44 PDT by Mark Lam
Modified: 2022-06-23 13:43 PDT
CC List: 6 users

See Also:

EWS testing (23.78 KB, patch)
2022-06-23 10:03 PDT, Mark Lam
ews-feeder: commit-queue-

Description Mark Lam 2022-06-22 22:44:50 PDT
offlineasm used to emit this LLInt code:
    ".loc 1 996\n"        "ldr x19, [x0] \n"                   // LowLevelInterpreter.asm:996
    ".loc 1 997\n"        "ldr x20, [x0, #8] \n"               // LowLevelInterpreter.asm:997
    ".loc 1 998\n"        "ldr x21, [x0, #16] \n"              // LowLevelInterpreter.asm:998
    ".loc 1 999\n"        "ldr x22, [x0, #24] \n"              // LowLevelInterpreter.asm:999
    ".loc 1 1006\n"       "ldr d8, [x0, #80] \n"               // LowLevelInterpreter.asm:1006
    ".loc 1 1007\n"       "ldr d9, [x0, #88] \n"               // LowLevelInterpreter.asm:1007
    ".loc 1 1008\n"       "ldr d10, [x0, #96] \n"              // LowLevelInterpreter.asm:1008
    ".loc 1 1009\n"       "ldr d11, [x0, #104] \n"             // LowLevelInterpreter.asm:1009

Now, it emits this:
    ".loc 1 996\n"        "ldp x19, x20, [x0, #0] \n"          // LowLevelInterpreter.asm:996
    ".loc 1 997\n"        "ldp x21, x22, [x0, #16] \n"         // LowLevelInterpreter.asm:997
    ".loc 1 1001\n"       "ldp d8, d9, [x0, #80] \n"           // LowLevelInterpreter.asm:1001
    ".loc 1 1002\n"       "ldp d10, d11, [x0, #96] \n"         // LowLevelInterpreter.asm:1002

Also, some code kept recomputing the base address for every instruction in a sequence of loads/stores.  For example,
    ".loc 6 902\n"        "add x13, sp, x10, lsl #3 \n"        // WebAssembly.asm:902
                          "ldr x0, [x13, #48] \n"
                          "add x13, sp, x10, lsl #3 \n"
                          "ldr x1, [x13, #56] \n"
                          "add x13, sp, x10, lsl #3 \n"
                          "ldr x2, [x13, #64] \n"
                          "add x13, sp, x10, lsl #3 \n"
                          "ldr x3, [x13, #72] \n"

For such places, we observe that the base address is the same for every load/store instruction in the sequence, and precompute it in the LLInt asm code to help out offlineasm.  This allows offlineasm to emit this more efficient code instead:
    ".loc 6 896\n"        "add x10, sp, x10, lsl #3 \n"        // WebAssembly.asm:896
    ".loc 6 898\n"        "ldp x0, x1, [x10, #48] \n"          // WebAssembly.asm:898
                          "ldp x2, x3, [x10, #64] \n"
Comment 1 Mark Lam 2022-06-22 23:15:05 PDT
Pull request: https://github.com/WebKit/WebKit/pull/1716
Comment 2 Mark Lam 2022-06-23 10:03:52 PDT
Created attachment 460449
EWS testing
Comment 3 EWS 2022-06-23 13:25:43 PDT
Committed 251799@main (79eb5e92d4fc): <https://commits.webkit.org/251799@main>

Reviewed commits have been landed. Closing PR #1716 and removing active labels.
Comment 4 Radar WebKit Bug Importer 2022-06-23 13:26:13 PDT