Bug 241856 - Change offlineasm to emit more efficient LLInt code.
Summary: Change offlineasm to emit more efficient LLInt code.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Mark Lam
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2022-06-22 09:26 PDT by Mark Lam
Modified: 2022-06-22 19:58 PDT (History)
1 user (show)

See Also:


Attachments
LLIntAssembly-before.h (9.51 MB, text/plain)
2022-06-22 10:46 PDT, Mark Lam
no flags Details
LLIntAssembly-before-wo-unnecessary-lsl.h (9.47 MB, text/plain)
2022-06-22 10:47 PDT, Mark Lam
no flags Details
LLIntAssembly-after.h (9.41 MB, text/plain)
2022-06-22 10:47 PDT, Mark Lam
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Lam 2022-06-22 09:26:28 PDT
1. Ruby treats numeric 0 as truthy.  However, there's a test in arm64LowerMalformedLoadStoreAddresses which assumes a value of 0 would be false.  As a result, we see offlineasm emit inefficient LLInt code like this:
    ".loc 3 821\n"        "movz x16, #0 \n"                    // LowLevelInterpreter64.asm:821
                          "add x13, x3, x16 \n"
                          "ldr x0, [x13] \n"

  ...  instead of this:
    ".loc 3 821\n"        "ldr x0, [x3] \n"                    // LowLevelInterpreter64.asm:821

   This patch fixes this.

2. offlineasm's emitARM64MoveImmediate chooses to use `movn` instead of `movz` based on whether a 64-bit value is negative or not.  Instead, it should be making that decision based on the number of halfwords (16-bits) in the value that is 0xffff vs 0.  As a result, offlineasm emits code like this:
    ".loc 1 1638\n"       "movn x27, #1, lsl #48 \n"           // LowLevelInterpreter.asm:1638
                          "movk x27, #0, lsl #32 \n"
                          "movk x27, #0, lsl #16 \n"
                          "movk x27, #0 \n"

  ...  instead of this:
    ".loc 1 1638\n"       "movz x27, #65534, lsl #48 \n"       // LowLevelInterpreter.asm:1638

   This patch fixes this.

3. offlineasm is trivially assuming the range of immediate offsets for ldr/str instructions is [-255..4095].  However, that's only the range for byte sized load-stores.  For 32-bit, the range is [-255..16380].  For 64-bit, the range is [-255..32760].  As a result, offlineasm emits code like this:
    ".loc 1 633\n"        "movn x16, #16383 \n"                // LowLevelInterpreter.asm:633
    ".loc 1 1518\n"       "and x3, x3, x16 \n"                 // LowLevelInterpreter.asm:1518
    ".loc 1 1519\n"       "movz x16, #16088 \n"                // LowLevelInterpreter.asm:1519
                          "add x17, x3, x16 \n"
                          "ldr x3, [x17] \n"

  ...  instead of this:
    ".loc 1 633\n"        "movn x17, #16383 \n"                // LowLevelInterpreter.asm:633
    ".loc 1 1518\n"       "and x3, x3, x17 \n"                 // LowLevelInterpreter.asm:1518
    ".loc 1 1519\n"       "ldr x3, [x3, #16088] \n"            // LowLevelInterpreter.asm:1519

   This patch fixes this for 64-bit and 32-bit load-stores.  16-bit load-stores also has a wider range, but for now, it will continue to use the conservative range.
   This patch also introduces an `isMalformedArm64LoadAStoreAddress` so that this range check can be done consistently in all the places that checks for it.

4. offlineasm is eagerly emitting no-op arguments in instructions, e.g. "lsl #0", and adding 0.  As a result, offlineasm emits code like this:
    ".loc 3 220\n"        "movz x13, #51168, lsl #0 \n"        // LowLevelInterpreter64.asm:220
                          "add x17, x1, x13, lsl #0 \n"
                          "ldr w4, [x17, #0] \n"

  ...  instead of this:
    ".loc 3 220\n"        "movz x13, #51168 \n"                // LowLevelInterpreter64.asm:220
                          "add x17, x1, x13 \n"
                          "ldr w4, [x17] \n"

   This unnecessary arguments are actually very common throughout the emitted LLIntAssembly.h.

   This patch removes these unnecessary arguments, which makes the emitted LLInt code more human readable due to less clutter.
Comment 1 Mark Lam 2022-06-22 10:03:03 PDT
PR: https://github.com/WebKit/WebKit/pull/1686
Comment 2 Mark Lam 2022-06-22 10:46:45 PDT
Created attachment 460419 [details]
LLIntAssembly-before.h
Comment 3 Mark Lam 2022-06-22 10:47:12 PDT
Created attachment 460420 [details]
LLIntAssembly-before-wo-unnecessary-lsl.h
Comment 4 Mark Lam 2022-06-22 10:47:35 PDT
Created attachment 460421 [details]
LLIntAssembly-after.h
Comment 5 EWS 2022-06-22 19:57:14 PDT
Committed r295766 (251771@main): <https://commits.webkit.org/251771@main>

Reviewed commits have been landed. Closing PR #1686 and removing active labels.
Comment 6 Radar WebKit Bug Importer 2022-06-22 19:58:22 PDT
<rdar://problem/95743853>