Bug 192496 - Enable DFG on ARM/Linux again
Summary: Enable DFG on ARM/Linux again
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on: 193330
Blocks: 191163 192983
  Show dependency treegraph
 
Reported: 2018-12-07 04:23 PST by Dominik Inführ
Modified: 2019-01-11 12:41 PST (History)
13 users (show)

See Also:


Attachments
Patch (14.62 KB, patch)
2018-12-07 04:32 PST, Dominik Inführ
no flags Details | Formatted Diff | Diff
Patch (14.03 KB, patch)
2018-12-07 04:37 PST, Dominik Inführ
no flags Details | Formatted Diff | Diff
Patch (17.21 KB, patch)
2018-12-13 03:18 PST, Dominik Inführ
no flags Details | Formatted Diff | Diff
Patch (19.64 KB, patch)
2019-01-08 09:20 PST, Dominik Inführ
no flags Details | Formatted Diff | Diff
Patch (19.77 KB, patch)
2019-01-09 01:54 PST, Dominik Inführ
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dominik Inführ 2018-12-07 04:23:48 PST
Enable DFG on ARM/Linux again
Comment 1 Dominik Inführ 2018-12-07 04:32:55 PST
Created attachment 356799 [details]
Patch
Comment 2 Dominik Inführ 2018-12-07 04:37:04 PST
Created attachment 356800 [details]
Patch
Comment 3 Guillaume Emont 2018-12-07 08:23:46 PST
Comment on attachment 356800 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=356800&action=review

> Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:2017
>      loadp Callee[cfr], t1
>      andp MarkedBlockMask, t1
>      loadp MarkedBlockFooterOffset + MarkedBlock::Footer::m_vm[t1], t1

Would it make more sense to save t1 earlier than recalculate it (on a callee-saved reg or pushed on the stack)?
Comment 4 Dominik Inführ 2018-12-09 23:59:04 PST
LowLevelInterpreter64.asm also recalculates that reference. So I don't think this code is super hot and I am not sure where we would best store that value.
Comment 5 Guillaume Emont 2018-12-10 01:22:24 PST
(In reply to Dominik Inführ from comment #4)
> LowLevelInterpreter64.asm also recalculates that reference. So I don't think
> this code is super hot and I am not sure where we would best store that
> value.

Fair enough.
Comment 6 Yusuke Suzuki 2018-12-10 01:44:33 PST
Comment on attachment 356800 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=356800&action=review

> Source/JavaScriptCore/ChangeLog:10
> +        architectures. Enable DFG now again on ARM/Linux. Do not use register r11
> +        in compiled DFG code.

Please describe why r11 is avoided.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:143
> +#endif

Why?

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:171
> +#endif

Ditto.

> Source/JavaScriptCore/jit/CallFrameShuffler.cpp:58
> +#endif

Why?
Comment 7 Dominik Inführ 2018-12-10 05:26:59 PST
Comment on attachment 356800 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=356800&action=review

Thanks for taking a look at the patch! I agree, I should definitely add some more comments for that. I will certainly add them, when you agree with the approaches I took. Again, thanks for your time!

>> Source/JavaScriptCore/ChangeLog:10
>> +        in compiled DFG code.
> 
> Please describe why r11 is avoided.

True, I should definitely document that. I don't use it, since r11 is used as callee-saved register for metadataTable in LLInt.

>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:143
>> +#endif
> 
> Why?

I avoided to implement this since we don't have callee-saved floating-point registers on 32-bit architectures yet. AFAIU implementing this might be a bit complicated since general-purpose register are 32-bit wide for 32-bit architectures, while floating point registers are 64-bit wide.

>> Source/JavaScriptCore/jit/CallFrameShuffler.cpp:58
>> +#endif
> 
> Why?

RegisterSet::vmCalleeSaveRegisters() used to be the empty set on 32-bit architectures, now on ARM this returns r11 (so the metadataTable register). So the register r11 would be used by CallFrameShuffler and therefore needs to be saved/restored, it seemed the simplest to just avoid using this register in here. RegisterSet::vmCalleeSaveRegisters() used to be the empty set, so there shouldn't be fewer register available than before.
Comment 8 Yusuke Suzuki 2018-12-11 03:15:27 PST
Comment on attachment 356800 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=356800&action=review

>>> Source/JavaScriptCore/ChangeLog:10
>>> +        in compiled DFG code.
>> 
>> Please describe why r11 is avoided.
> 
> True, I should definitely document that. I don't use it, since r11 is used as callee-saved register for metadataTable in LLInt.

Make sense, please describe it in ChangeLog.

>>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:143
>>> +#endif
>> 
>> Why?
> 
> I avoided to implement this since we don't have callee-saved floating-point registers on 32-bit architectures yet. AFAIU implementing this might be a bit complicated since general-purpose register are 32-bit wide for 32-bit architectures, while floating point registers are 64-bit wide.

Could you add FIXME and comments about this here?
And let's put braces for `else`.

>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:171
>> +#endif
> 
> Ditto.

Ditto.

>>> Source/JavaScriptCore/jit/CallFrameShuffler.cpp:58
>>> +#endif
>> 
>> Why?
> 
> RegisterSet::vmCalleeSaveRegisters() used to be the empty set on 32-bit architectures, now on ARM this returns r11 (so the metadataTable register). So the register r11 would be used by CallFrameShuffler and therefore needs to be saved/restored, it seemed the simplest to just avoid using this register in here. RegisterSet::vmCalleeSaveRegisters() used to be the empty set, so there shouldn't be fewer register available than before.

OK, I see. Please describe the detail here as a comment.
In JSVALUE32_64 environment, we do not have CallFrameShuffleData::setupCalleeSaveRegisters, right?

> Source/JavaScriptCore/jit/GPRInfo.h:551
> +    static const GPRReg regT7 = ARMRegisters::r5;

Let's clean up some code in offline arm.rb too. (Note that ARM traditional JIT is removed now).

>> Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:2017
>>      loadp MarkedBlockFooterOffset + MarkedBlock::Footer::m_vm[t1], t1
> 
> Would it make more sense to save t1 earlier than recalculate it (on a callee-saved reg or pushed on the stack)?

looks ok.
Comment 9 Dominik Inführ 2018-12-13 03:18:21 PST
Created attachment 357221 [details]
Patch
Comment 10 Dominik Inführ 2018-12-14 01:58:53 PST
Hi, I think I've addressed all the comments you had. I've updated some code in offlineasm/arm.rb as well.

> In JSVALUE32_64 environment, we do not have CallFrameShuffleData::setupCalleeSaveRegisters, right?
Right.
Comment 11 Dominik Inführ 2019-01-08 09:20:08 PST
Created attachment 358603 [details]
Patch
Comment 12 Yusuke Suzuki 2019-01-08 11:19:44 PST
Comment on attachment 358603 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=358603&action=review

r=me with nits.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:142
> +            // FIXME: support callee-saved floating point registers on 32-bit architectures

Please file a bug with FIXME and add a url here.

> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:172
> +            // FIXME: support callee-saved floating point registers on 32-bit architectures

Ditto.

> Source/JavaScriptCore/jit/GPRInfo.h:529
>  #if CPU(ARM)

I think we should use #if CPU(ARM_THUMB2) here since the following information is not correct for CPU(ARM) from now.
Comment 13 Dominik Inführ 2019-01-09 01:54:25 PST
Created attachment 358683 [details]
Patch
Comment 14 Dominik Inführ 2019-01-09 01:56:47 PST
Thanks for your review! I've created a bug and changed `#if CPU(ARM)` as requested.
Comment 15 WebKit Commit Bot 2019-01-10 03:55:59 PST
Comment on attachment 358683 [details]
Patch

Clearing flags on attachment: 358683

Committed r239825: <https://trac.webkit.org/changeset/239825>
Comment 16 WebKit Commit Bot 2019-01-10 03:56:01 PST
All reviewed patches have been landed.  Closing bug.
Comment 17 Radar WebKit Bug Importer 2019-01-10 03:56:32 PST
<rdar://problem/47174867>
Comment 18 WebKit Commit Bot 2019-01-10 12:00:55 PST
Re-opened since this is blocked by bug 193330
Comment 19 Guillaume Emont 2019-01-10 12:06:00 PST
We had 118 test failures on the armv7 bot (and 117 on the armv7 softfpabi bot) after this landed.
Also, running all the tests took about 6h40, whereas IIRC it used to take about 2h15 on the same bot before the bytecode change landed (last time we had DFG for the platform), so we might have a performance issue too.

Link to test run: https://build.webkit.org/builders/JSCOnly%20Linux%20ARMv7%20Thumb2%20Release/builds/7209
Comment 20 Guillaume Emont 2019-01-11 08:51:58 PST
The issue was with the buildbot and not this patch: the buildbot needed to do a clean build. After doing one, the revision that landed this patch only yields 1 failure, which is a test that should be skipped on arm and mips as it time outs (that's not new, only the skip was changed to "not $jitTests").

See the test run:
https://build.webkit.org/builders/JSCOnly%20Linux%20ARMv7%20Thumb2%20Release/builds/7217

Also, the slowness was due to compiling with C_LOOP and running all the tests and test configurations as if we had DFG. Now in that last run, the tests ran in under 2h08m.

Please re-land that patch and then ping me so that I can trigger a buildbot clean build.
Comment 21 WebKit Commit Bot 2019-01-11 12:41:49 PST
Comment on attachment 358683 [details]
Patch

Clearing flags on attachment: 358683

Committed r239867: <https://trac.webkit.org/changeset/239867>
Comment 22 WebKit Commit Bot 2019-01-11 12:41:51 PST
All reviewed patches have been landed.  Closing bug.