Bug 183243 - [MIPS] Optimize generated JIT code for loads/stores
Summary: [MIPS] Optimize generated JIT code for loads/stores
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: WebKit Nightly Build
Hardware: Other Linux
: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2018-03-01 05:47 PST by Stanislav Ocovaj
Modified: 2018-03-20 10:56 PDT (History)
11 users (show)

See Also:


Attachments
Patch (18.54 KB, patch)
2018-03-01 06:08 PST, Stanislav Ocovaj
no flags Details | Formatted Diff | Diff
Patch (20.66 KB, patch)
2018-03-07 02:23 PST, Stanislav Ocovaj
no flags Details | Formatted Diff | Diff
Patch (27.73 KB, patch)
2018-03-15 08:07 PDT, Stanislav Ocovaj
ysuzuki: review-
Details | Formatted Diff | Diff
Patch (27.73 KB, patch)
2018-03-15 09:13 PDT, Stanislav Ocovaj
no flags Details | Formatted Diff | Diff
Patch (26.56 KB, patch)
2018-03-15 09:15 PDT, Stanislav Ocovaj
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Stanislav Ocovaj 2018-03-01 05:47:59 PST
JIT currently generates three MIPS instructions for a load/store from/to an absolute address:
  lui adrTmpReg, address >> 16
  ori adrTmpReg, address & 0xffff
  lw dataReg, 0(adrTmpReg)
Since load/store instructions on MIPS have a 16-bit offset, lower 16 bits of the address can be encoded into the load/store and ori can be removed:
  lui adrTmpReg, (address + 0x8000) >> 16
  lw dataReg, (address & 0xffff)(adrTmpReg)
Comment 1 Stanislav Ocovaj 2018-03-01 06:08:52 PST
Created attachment 334812 [details]
Patch

The patch that implements the proposed optimizations. This improves the benchmark results by ~1-2%
Comment 2 Petar Jovanovic 2018-03-06 08:27:39 PST
Can someone look at this patch and review it?
Comment 3 Yusuke Suzuki 2018-03-06 08:54:40 PST
(In reply to Petar Jovanovic from comment #2)
> Can someone look at this patch and review it?

Please describe the details of this change in ChangeLog file, and set r? flag :)
Comment 4 Stanislav Ocovaj 2018-03-07 02:23:41 PST
Created attachment 335180 [details]
Patch

Rebased and added details to ChangeLog
Comment 5 Adrian Perez 2018-03-07 10:32:10 PST
Comment on attachment 335180 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=335180&action=review

Hi there! Though I am not officially a reviewer, I have been doing informal
reviews on MIPS-related patches. This patch looks good to me, and even when
shaving one instruction per load/store may not look like a lot, it probably
makes for measurable savings as there will be many load/store operations in
typical generated code (By th way: do you have any performance numbers with
and without the patch? — I am sure this is something we want to merge, I am
just being curious myself :-D).

I have just a small comment about using “uintptr_t” instead of
“uint32_t”; but it's not something that should block landing the
patch.

Thanks a lot for your contribution!

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:239
> +            uint32_t adr = reinterpret_cast<uint32_t>(address.m_ptr);

Small nit: I would use “uintptr_t” here. Technically it will be
equivalent, but IMHO it makes the intention of storing a pointer
as an integer value clearer.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:242
> +            if (imm.m_value >= -32768 && imm.m_value <= 32767)

I know that this statement was just moved around and was already
written this way, but maybe it would be nicer to use “INT16_MIN”
and “INT16_MAX” instead of writing the not-so-magic numbers in
the condition check.

I am fine leaving the patch as-is currently, and if we want to use
the constant names I would change that in a separate patch. At any
rate, I am just thinking out loud, and I am not sure whether this
is worth caring about — what do other reviewers think about this?

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:268
> +            uint32_t adr = reinterpret_cast<uint32_t>(address.m_ptr);

Same here about using “uintptr_t”.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:453
> +            uint32_t adr = reinterpret_cast<uint32_t>(dest.m_ptr);

Same here about using “uintptr_t”.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:507
> +            uint32_t adr = reinterpret_cast<uint32_t>(dest.m_ptr);

Same here about using “uintptr_t”.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:693
> +            uint32_t adr = reinterpret_cast<uint32_t>(address.m_ptr);

Same here about using “uintptr_t”.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:879
> +            uint32_t adr = reinterpret_cast<uint32_t>(address);

Same here about using “uintptr_t”.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:945
> +            uint32_t adr = reinterpret_cast<uint32_t>(address);

Same here about using “uintptr_t”.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:1111
> +            uint32_t adr = reinterpret_cast<uint32_t>(address);

Same here about using “uintptr_t”.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:1271
> +            uint32_t adr = reinterpret_cast<uint32_t>(address);

Same here about using “uintptr_t”.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:1290
> +            uint32_t adr = reinterpret_cast<uint32_t>(address);

Ditto.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:1478
> +            uint32_t adr = reinterpret_cast<uint32_t>(address);

Ditto.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:1496
> +            uint32_t adr = reinterpret_cast<uint32_t>(address);

Ditto.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:2091
> +            uint32_t adr = reinterpret_cast<uint32_t>(dest.m_ptr);

Ditto.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:2714
> +            uint32_t adr = reinterpret_cast<uint32_t>(address.m_value);

Ditto.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:2870
> +            uint32_t adr = reinterpret_cast<uint32_t>(address.m_value);

Ditto.
Comment 6 Stanislav Ocovaj 2018-03-15 08:07:21 PDT
Created attachment 335846 [details]
Patch
Comment 7 Stanislav Ocovaj 2018-03-15 08:14:57 PDT
I agree about the use of uintptr_t instead of uint32_t, I changed that. As for constants like 32767, they are used in many places in the file, unrelated to this patch, so I guess it would be better to fix that in a separate patch. I also added an optimization for 8-bit and 32-bit loads and stores with BaseIndex addressing, where a left shift instruction can be omitted if address.scale is 0.
Comment 8 Yusuke Suzuki 2018-03-15 09:00:54 PDT
Comment on attachment 335846 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=335846&action=review

Overall, looks good to me. But r- because I found one bug in this patch.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:269
> +            if ((adr >> 15) == ((adr+4) >> 15)) {

Style nits: use `(adr + 4)` instead of `(adr+4)`.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:280
> +                m_assembler.lw(dataTempRegister, addrTempRegister, (adr+4) & 0xffff);

Ditto.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:284
> +                m_assembler.sw(dataTempRegister, addrTempRegister, (adr+4) & 0xffff);

Ditto.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:709
> +               li   addrTemp, address
> +               li   immTemp, imm
> +               lw   dataTemp, 0(addrTemp)
> +               subu dataTemp, dataTemp, immTemp
> +               sw   dataTemp, 0(addrTemp)

We need to fix this to align to the actual code.

> Source/JavaScriptCore/assembler/MacroAssemblerMIPS.h:807
> +        }

It is wrong. ConvertibleLoadLabel's instructions need to be convertible to addPtr instruction sequence
Please check MIPSAssembler::replaceWithLoad too.
Comment 9 Stanislav Ocovaj 2018-03-15 09:13:51 PDT
Created attachment 335852 [details]
Patch
Comment 10 Stanislav Ocovaj 2018-03-15 09:15:14 PDT
Created attachment 335853 [details]
Patch
Comment 11 Yusuke Suzuki 2018-03-15 09:35:37 PDT
Comment on attachment 335853 [details]
Patch

r=me, nice!
Comment 12 Build Bot 2018-03-15 10:31:14 PDT
Comment on attachment 335853 [details]
Patch

Attachment 335853 [details] did not pass jsc-ews (mac):
Output: http://webkit-queues.webkit.org/results/6965799

New failing tests:
stress/ftl-put-by-id-setter-exception-interesting-live-state.js.no-cjit-collect-continuously
Comment 13 Stanislav Ocovaj 2018-03-16 07:23:55 PDT
Just for the record, with these optimizations I got the following results on a mips32r1 platform:
Benchmark  Original  Optimized  Improvement
Octane2    845       861        1.9%
Kraken     3306 ms   3247 ms    1.8%
JetStream  7.13      7.27       2.0%
Comment 14 Adrian Perez 2018-03-16 13:15:07 PDT
(In reply to Stanislav Ocovaj from comment #13)
> Just for the record, with these optimizations I got the following results on
> a mips32r1 platform:
> Benchmark  Original  Optimized  Improvement
> Octane2    845       861        1.9%
> Kraken     3306 ms   3247 ms    1.8%
> JetStream  7.13      7.27       2.0%

That's pretty good for such a relatively small change, good work!
Comment 15 Petar Jovanovic 2018-03-20 09:31:56 PDT
How do we get this change in?
Comment 16 Adrian Perez 2018-03-20 10:27:56 PDT
Comment on attachment 335853 [details]
Patch

The test failed in the EWS has to be unrelated to this patch, because
this only touches “MacroAssemblerMIPS.h” and the EWS is not running
on a MIPS box, so I'll just cq+ to get this landed. I'll keep an eye
on the bots but, as said, anything else than MIPS should be affected.
Comment 17 Adrian Perez 2018-03-20 10:32:49 PDT
(In reply to Petar Jovanovic from comment #15)
> How do we get this change in?

There is a positive review (r+) from Yusuke, and JSC is his area
of expertise, that's perfect. The EWS bot cancelled the commit-queue
flag due to a test case failure, so I have set it back to cq+ because
it seems completely unrelated. The commit-queue bot will apply the
patch, so you don't need to do anything else.

(Thanks for pinging in the bug, though; many of us are busy and
the reminder from Bugzilla usually works. Alternatively, you can
try to ping people in the IRC Freenode #webkit channel :D)
Comment 18 WebKit Commit Bot 2018-03-20 10:54:04 PDT
Comment on attachment 335853 [details]
Patch

Clearing flags on attachment: 335853

Committed r229766: <https://trac.webkit.org/changeset/229766>
Comment 19 WebKit Commit Bot 2018-03-20 10:54:05 PDT
All reviewed patches have been landed.  Closing bug.
Comment 20 Radar WebKit Bug Importer 2018-03-20 10:56:24 PDT
<rdar://problem/38670859>