RESOLVED FIXED Bug 192496
Enable DFG on ARM/Linux again
https://bugs.webkit.org/show_bug.cgi?id=192496
Summary Enable DFG on ARM/Linux again
Dominik Inführ
Reported 2018-12-07 04:23:48 PST
Enable DFG on ARM/Linux again
Attachments
Patch (14.62 KB, patch)
2018-12-07 04:32 PST, Dominik Inführ
no flags
Patch (14.03 KB, patch)
2018-12-07 04:37 PST, Dominik Inführ
no flags
Patch (17.21 KB, patch)
2018-12-13 03:18 PST, Dominik Inführ
no flags
Patch (19.64 KB, patch)
2019-01-08 09:20 PST, Dominik Inführ
no flags
Patch (19.77 KB, patch)
2019-01-09 01:54 PST, Dominik Inführ
no flags
Dominik Inführ
Comment 1 2018-12-07 04:32:55 PST
Dominik Inführ
Comment 2 2018-12-07 04:37:04 PST
Guillaume Emont
Comment 3 2018-12-07 08:23:46 PST
Comment on attachment 356800 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=356800&action=review > Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:2017 > loadp Callee[cfr], t1 > andp MarkedBlockMask, t1 > loadp MarkedBlockFooterOffset + MarkedBlock::Footer::m_vm[t1], t1 Would it make more sense to save t1 earlier than recalculate it (on a callee-saved reg or pushed on the stack)?
Dominik Inführ
Comment 4 2018-12-09 23:59:04 PST
LowLevelInterpreter64.asm also recalculates that reference. So I don't think this code is super hot and I am not sure where we would best store that value.
Guillaume Emont
Comment 5 2018-12-10 01:22:24 PST
(In reply to Dominik Inführ from comment #4) > LowLevelInterpreter64.asm also recalculates that reference. So I don't think > this code is super hot and I am not sure where we would best store that > value. Fair enough.
Yusuke Suzuki
Comment 6 2018-12-10 01:44:33 PST
Comment on attachment 356800 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=356800&action=review > Source/JavaScriptCore/ChangeLog:10 > + architectures. Enable DFG now again on ARM/Linux. Do not use register r11 > + in compiled DFG code. Please describe why r11 is avoided. > Source/JavaScriptCore/dfg/DFGOSRExit.cpp:143 > +#endif Why? > Source/JavaScriptCore/dfg/DFGOSRExit.cpp:171 > +#endif Ditto. > Source/JavaScriptCore/jit/CallFrameShuffler.cpp:58 > +#endif Why?
Dominik Inführ
Comment 7 2018-12-10 05:26:59 PST
Comment on attachment 356800 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=356800&action=review Thanks for taking a look at the patch! I agree, I should definitely add some more comments for that. I will certainly add them, when you agree with the approaches I took. Again, thanks for your time! >> Source/JavaScriptCore/ChangeLog:10 >> + in compiled DFG code. > > Please describe why r11 is avoided. True, I should definitely document that. I don't use it, since r11 is used as callee-saved register for metadataTable in LLInt. >> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:143 >> +#endif > > Why? I avoided to implement this since we don't have callee-saved floating-point registers on 32-bit architectures yet. AFAIU implementing this might be a bit complicated since general-purpose register are 32-bit wide for 32-bit architectures, while floating point registers are 64-bit wide. >> Source/JavaScriptCore/jit/CallFrameShuffler.cpp:58 >> +#endif > > Why? RegisterSet::vmCalleeSaveRegisters() used to be the empty set on 32-bit architectures, now on ARM this returns r11 (so the metadataTable register). So the register r11 would be used by CallFrameShuffler and therefore needs to be saved/restored, it seemed the simplest to just avoid using this register in here. RegisterSet::vmCalleeSaveRegisters() used to be the empty set, so there shouldn't be fewer register available than before.
Yusuke Suzuki
Comment 8 2018-12-11 03:15:27 PST
Comment on attachment 356800 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=356800&action=review >>> Source/JavaScriptCore/ChangeLog:10 >>> + in compiled DFG code. >> >> Please describe why r11 is avoided. > > True, I should definitely document that. I don't use it, since r11 is used as callee-saved register for metadataTable in LLInt. Make sense, please describe it in ChangeLog. >>> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:143 >>> +#endif >> >> Why? > > I avoided to implement this since we don't have callee-saved floating-point registers on 32-bit architectures yet. AFAIU implementing this might be a bit complicated since general-purpose register are 32-bit wide for 32-bit architectures, while floating point registers are 64-bit wide. Could you add FIXME and comments about this here? And let's put braces for `else`. >> Source/JavaScriptCore/dfg/DFGOSRExit.cpp:171 >> +#endif > > Ditto. Ditto. >>> Source/JavaScriptCore/jit/CallFrameShuffler.cpp:58 >>> +#endif >> >> Why? > > RegisterSet::vmCalleeSaveRegisters() used to be the empty set on 32-bit architectures, now on ARM this returns r11 (so the metadataTable register). So the register r11 would be used by CallFrameShuffler and therefore needs to be saved/restored, it seemed the simplest to just avoid using this register in here. RegisterSet::vmCalleeSaveRegisters() used to be the empty set, so there shouldn't be fewer register available than before. OK, I see. Please describe the detail here as a comment. In JSVALUE32_64 environment, we do not have CallFrameShuffleData::setupCalleeSaveRegisters, right? > Source/JavaScriptCore/jit/GPRInfo.h:551 > + static const GPRReg regT7 = ARMRegisters::r5; Let's clean up some code in offline arm.rb too. (Note that ARM traditional JIT is removed now). >> Source/JavaScriptCore/llint/LowLevelInterpreter32_64.asm:2017 >> loadp MarkedBlockFooterOffset + MarkedBlock::Footer::m_vm[t1], t1 > > Would it make more sense to save t1 earlier than recalculate it (on a callee-saved reg or pushed on the stack)? looks ok.
Dominik Inführ
Comment 9 2018-12-13 03:18:21 PST
Dominik Inführ
Comment 10 2018-12-14 01:58:53 PST
Hi, I think I've addressed all the comments you had. I've updated some code in offlineasm/arm.rb as well. > In JSVALUE32_64 environment, we do not have CallFrameShuffleData::setupCalleeSaveRegisters, right? Right.
Dominik Inführ
Comment 11 2019-01-08 09:20:08 PST
Yusuke Suzuki
Comment 12 2019-01-08 11:19:44 PST
Comment on attachment 358603 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=358603&action=review r=me with nits. > Source/JavaScriptCore/dfg/DFGOSRExit.cpp:142 > + // FIXME: support callee-saved floating point registers on 32-bit architectures Please file a bug with FIXME and add a url here. > Source/JavaScriptCore/dfg/DFGOSRExit.cpp:172 > + // FIXME: support callee-saved floating point registers on 32-bit architectures Ditto. > Source/JavaScriptCore/jit/GPRInfo.h:529 > #if CPU(ARM) I think we should use #if CPU(ARM_THUMB2) here since the following information is not correct for CPU(ARM) from now.
Dominik Inführ
Comment 13 2019-01-09 01:54:25 PST
Dominik Inführ
Comment 14 2019-01-09 01:56:47 PST
Thanks for your review! I've created a bug and changed `#if CPU(ARM)` as requested.
WebKit Commit Bot
Comment 15 2019-01-10 03:55:59 PST
Comment on attachment 358683 [details] Patch Clearing flags on attachment: 358683 Committed r239825: <https://trac.webkit.org/changeset/239825>
WebKit Commit Bot
Comment 16 2019-01-10 03:56:01 PST
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 17 2019-01-10 03:56:32 PST
WebKit Commit Bot
Comment 18 2019-01-10 12:00:55 PST
Re-opened since this is blocked by bug 193330
Guillaume Emont
Comment 19 2019-01-10 12:06:00 PST
We had 118 test failures on the armv7 bot (and 117 on the armv7 softfpabi bot) after this landed. Also, running all the tests took about 6h40, whereas IIRC it used to take about 2h15 on the same bot before the bytecode change landed (last time we had DFG for the platform), so we might have a performance issue too. Link to test run: https://build.webkit.org/builders/JSCOnly%20Linux%20ARMv7%20Thumb2%20Release/builds/7209
Guillaume Emont
Comment 20 2019-01-11 08:51:58 PST
The issue was with the buildbot and not this patch: the buildbot needed to do a clean build. After doing one, the revision that landed this patch only yields 1 failure, which is a test that should be skipped on arm and mips as it time outs (that's not new, only the skip was changed to "not $jitTests"). See the test run: https://build.webkit.org/builders/JSCOnly%20Linux%20ARMv7%20Thumb2%20Release/builds/7217 Also, the slowness was due to compiling with C_LOOP and running all the tests and test configurations as if we had DFG. Now in that last run, the tests ran in under 2h08m. Please re-land that patch and then ping me so that I can trigger a buildbot clean build.
WebKit Commit Bot
Comment 21 2019-01-11 12:41:49 PST
Comment on attachment 358683 [details] Patch Clearing flags on attachment: 358683 Committed r239867: <https://trac.webkit.org/changeset/239867>
WebKit Commit Bot
Comment 22 2019-01-11 12:41:51 PST
All reviewed patches have been landed. Closing bug.
Note You need to log in before you can comment on or make changes to this bug.