WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
Bug 163394
B3 needs a special WasmAddress Opcode
https://bugs.webkit.org/show_bug.cgi?id=163394
Summary
B3 needs a special WasmAddress Opcode
Keith Miller
Reported
2016-10-13 10:58:34 PDT
...
Attachments
Working but looking into lea failures
(17.63 KB, patch)
2016-10-13 11:00 PDT
,
Keith Miller
no flags
Details
Formatted Diff
Diff
Patch
(21.73 KB, patch)
2016-10-14 15:30 PDT
,
Keith Miller
no flags
Details
Formatted Diff
Diff
Patch for landing
(21.46 KB, patch)
2016-10-14 16:08 PDT
,
Keith Miller
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Keith Miller
Comment 1
2016-10-13 11:00:00 PDT
Created
attachment 291495
[details]
Working but looking into lea failures
Keith Miller
Comment 2
2016-10-13 11:12:12 PDT
For some reason the test case in the patch produces the following disassembly on x86_64: 0x21d1cec01000: push %rbp 0x21d1cec01001: mov %rsp, %rbp 0x21d1cec01004: cmp $0x0, %edi 0x21d1cec01007: ja 0x21d1cec0100f 0x21d1cec0100d: pop %rbp 0x21d1cec0100e: ret 0x21d1cec0100f: xor %rax, %rax 0x21d1cec01012: mov %eax, %ecx 0x21d1cec01014: shl $0x2, %ecx 0x21d1cec01017: add %rsi, %rcx 0x21d1cec0101a: mov $0x2a, (%rcx) 0x21d1cec01020: inc %eax 0x21d1cec01022: cmp %edi, %eax 0x21d1cec01024: jb 0x21d1cec01012 0x21d1cec0102a: jmp 0x21d1cec0100d I want to figure out why the add and mov are not being compressed into one instruction. It's also to bad that we can't fuse the shl, add, and mov since I expect that case to be common. Note, we can't fuse the shl with the others since it's 32-bit and the other values are 64-bit.
Filip Pizlo
Comment 3
2016-10-13 11:14:04 PDT
(In reply to
comment #2
)
> For some reason the test case in the patch produces the following > disassembly on x86_64: > > 0x21d1cec01000: push %rbp > 0x21d1cec01001: mov %rsp, %rbp > 0x21d1cec01004: cmp $0x0, %edi > 0x21d1cec01007: ja 0x21d1cec0100f > 0x21d1cec0100d: pop %rbp > 0x21d1cec0100e: ret > 0x21d1cec0100f: xor %rax, %rax > 0x21d1cec01012: mov %eax, %ecx > 0x21d1cec01014: shl $0x2, %ecx > 0x21d1cec01017: add %rsi, %rcx > 0x21d1cec0101a: mov $0x2a, (%rcx) > 0x21d1cec01020: inc %eax > 0x21d1cec01022: cmp %edi, %eax > 0x21d1cec01024: jb 0x21d1cec01012 > 0x21d1cec0102a: jmp 0x21d1cec0100d > > I want to figure out why the add and mov are not being compressed into one > instruction. > > It's also to bad that we can't fuse the shl, add, and mov since I expect > that case to be common. Note, we can't fuse the shl with the others since > it's 32-bit and the other values are 64-bit.
Can you please JSC_dumpGraphAtEachPhase=true?
Filip Pizlo
Comment 4
2016-10-13 11:20:57 PDT
(In reply to
comment #2
)
> For some reason the test case in the patch produces the following > disassembly on x86_64: > > 0x21d1cec01000: push %rbp > 0x21d1cec01001: mov %rsp, %rbp > 0x21d1cec01004: cmp $0x0, %edi > 0x21d1cec01007: ja 0x21d1cec0100f > 0x21d1cec0100d: pop %rbp > 0x21d1cec0100e: ret > 0x21d1cec0100f: xor %rax, %rax > 0x21d1cec01012: mov %eax, %ecx > 0x21d1cec01014: shl $0x2, %ecx
This is a 32-bit left-shift-by-2. There is no lea for that. Lea is only for Add(Shl(value, const), value). A lone Shl is better off as a lone Shl.
> 0x21d1cec01017: add %rsi, %rcx
You're emitting this instruction yourself. This add instruction is there because of the code you added to B3LowerToAir.
> 0x21d1cec0101a: mov $0x2a, (%rcx) > 0x21d1cec01020: inc %eax > 0x21d1cec01022: cmp %edi, %eax > 0x21d1cec01024: jb 0x21d1cec01012 > 0x21d1cec0102a: jmp 0x21d1cec0100d > > I want to figure out why the add and mov are not being compressed into one > instruction. > > It's also to bad that we can't fuse the shl, add, and mov since I expect > that case to be common. Note, we can't fuse the shl with the others since > it's 32-bit and the other values are 64-bit.
Keith Miller
Comment 5
2016-10-14 15:30:38 PDT
Created
attachment 291672
[details]
Patch
Filip Pizlo
Comment 6
2016-10-14 15:36:42 PDT
Comment on
attachment 291672
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=291672&action=review
> Source/JavaScriptCore/b3/B3LowerToAir.cpp:521 > + // We would further pattern match on the opcode of the pointer and > + // try to construct a more complete addressing mode. But since any > + // math on the pointer would be 32-bit there is not much we can do.
Remove this comment, since you already say specifically about the ARM thing.
> Source/JavaScriptCore/b3/B3LowerToAir.cpp:526 > + // FIXME: We should support ARM64 LDR 32-bit addressing, which will > + // allow us to fuse a Shl ptr, 2 into the address. Additionally, and > + // perhaps more importantly, it would allow us to avoid a truncating > + // move.
File a bug and reference it here.
> Source/JavaScriptCore/b3/B3LowerToAir.cpp:528 > + return Arg::index(Air::Tmp(wasmAddress->pinnedGPR()), tmp(pointer), 1, offset);
Do you have to say Air:: here?
> Source/JavaScriptCore/b3/testb3.cpp:13766 > +void testWasmAddress() > + {
Indentation.
Keith Miller
Comment 7
2016-10-14 16:08:06 PDT
Created
attachment 291677
[details]
Patch for landing
Keith Miller
Comment 8
2016-10-14 16:24:48 PDT
Committed
r207360
: <
http://trac.webkit.org/changeset/207360
>
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug