offlineasm used to emit this LLInt code:

    ".loc 1 996\n"
    "ldr x19, [x0] \n"                 // LowLevelInterpreter.asm:996
    ".loc 1 997\n"
    "ldr x20, [x0, #8] \n"             // LowLevelInterpreter.asm:997
    ".loc 1 998\n"
    "ldr x21, [x0, #16] \n"            // LowLevelInterpreter.asm:998
    ".loc 1 999\n"
    "ldr x22, [x0, #24] \n"            // LowLevelInterpreter.asm:999
    ...
    ".loc 1 1006\n"
    "ldr d8, [x0, #80] \n"             // LowLevelInterpreter.asm:1006
    ".loc 1 1007\n"
    "ldr d9, [x0, #88] \n"             // LowLevelInterpreter.asm:1007
    ".loc 1 1008\n"
    "ldr d10, [x0, #96] \n"            // LowLevelInterpreter.asm:1008
    ".loc 1 1009\n"
    "ldr d11, [x0, #104] \n"           // LowLevelInterpreter.asm:1009
    ...

Now, it emits this:

    ".loc 1 996\n"
    "ldp x19, x20, [x0, #0] \n"        // LowLevelInterpreter.asm:996
    ".loc 1 997\n"
    "ldp x21, x22, [x0, #16] \n"       // LowLevelInterpreter.asm:997
    ...
    ".loc 1 1001\n"
    "ldp d8, d9, [x0, #80] \n"         // LowLevelInterpreter.asm:1001
    ".loc 1 1002\n"
    "ldp d10, d11, [x0, #96] \n"       // LowLevelInterpreter.asm:1002
    ...

Also, some code kept recomputing the base address within a sequence of load/store instructions. For example:

    ".loc 6 902\n"
    "add x13, sp, x10, lsl #3 \n"      // WebAssembly.asm:902
    "ldr x0, [x13, #48] \n"
    "add x13, sp, x10, lsl #3 \n"
    "ldr x1, [x13, #56] \n"
    "add x13, sp, x10, lsl #3 \n"
    "ldr x2, [x13, #64] \n"
    "add x13, sp, x10, lsl #3 \n"
    "ldr x3, [x13, #72] \n"
    ...

In such places, the base address is the same for every load/store instruction in the sequence, so we precompute it once in the LLInt asm code to help out offlineasm. This allows offlineasm to emit this more efficient code instead:

    ".loc 6 896\n"
    "add x10, sp, x10, lsl #3 \n"      // WebAssembly.asm:896
    ".loc 6 898\n"
    "ldp x0, x1, [x10, #48] \n"        // WebAssembly.asm:898
    "ldp x2, x3, [x10, #64] \n"
    ...
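To make the pairing condition concrete, here is a minimal Python sketch of when two adjacent 64-bit `ldr` instructions can be merged into one `ldp`. It is purely illustrative and is not how offlineasm (which is written in Ruby) produces the pairs; this report does not describe that mechanism. The names `parse_ldr`, `can_pair`, and `fuse_loads` are hypothetical, and the sketch only handles the simple `ldr reg, [base, #imm]` form shown above.

```python
import re

# Matches a 64-bit ARM64 load such as "ldr x19, [x0, #8]" or "ldr d8, [x0]".
LDR = re.compile(r"ldr\s+([xd]\d+),\s*\[(\w+)(?:,\s*#(\d+))?\]")


def parse_ldr(line):
    """Return (dest, base, offset) for a simple ldr, or None if it is anything else."""
    m = LDR.fullmatch(line.strip())
    if not m:
        return None
    dest, base, offset = m.groups()
    return dest, base, int(offset or 0)


def can_pair(first, second):
    """Two loads can become one ldp if they read adjacent 8-byte slots off the same base."""
    (dest1, base1, off1), (dest2, base2, off2) = first, second
    return (
        dest1[0] == dest2[0]        # both x registers or both d registers
        and dest1 != dest2          # ldp needs two distinct destinations
        and base1 == base2          # same base register
        and dest1 != base1          # first load must not clobber the base
        and off2 == off1 + 8        # consecutive 8-byte slots
        and off1 % 8 == 0
        and off1 <= 504             # upper bound of the 64-bit ldp immediate
    )


def fuse_loads(lines):
    """Greedily merge neighbouring pairable ldr instructions into ldp."""
    out = []
    i = 0
    while i < len(lines):
        first = parse_ldr(lines[i])
        second = parse_ldr(lines[i + 1]) if i + 1 < len(lines) else None
        if first and second and can_pair(first, second):
            out.append(f"ldp {first[0]}, {second[0]}, [{first[1]}, #{first[2]}]")
            i += 2
        else:
            out.append(lines[i].strip())
            i += 1
    return out
```

Feeding it the first four loads from the listing above yields the two `ldp` instructions shown in the new output. The same check also explains why the WebAssembly.asm sequence could not be paired until the base address was computed once: with an `add` between every load, no two `ldr` instructions are adjacent.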
Pull request: https://github.com/WebKit/WebKit/pull/1716
Created attachment 460449: EWS testing
Committed 251799@main (79eb5e92d4fc): <https://commits.webkit.org/251799@main> Reviewed commits have been landed. Closing PR #1716 and removing active labels.
<rdar://problem/95802476>