WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
150562
Update FTL to support UntypedUse operands for op_sub.
https://bugs.webkit.org/show_bug.cgi?id=150562
Summary
Update FTL to support UntypedUse operands for op_sub.
Mark Lam
Reported
2015-10-26 09:57:59 PDT
Patch coming.
Attachments
the patch.
(21.49 KB, patch)
2015-10-27 17:03 PDT
,
Mark Lam
ggaren
: review+
ggaren
: commit-queue-
Details
Formatted Diff
Diff
X86_64 benchmark result 1
(64.37 KB, text/plain)
2015-10-28 10:01 PDT
,
Mark Lam
no flags
Details
X86_64 benchmark result 2
(63.92 KB, text/plain)
2015-10-28 10:02 PDT
,
Mark Lam
no flags
Details
X86_64 benchmark result 3
(64.04 KB, text/plain)
2015-10-28 10:02 PDT
,
Mark Lam
no flags
Details
Patch for landing: incorporated Geoff's feedback, and added a JSRegress benchmark.
(26.04 KB, patch)
2015-10-28 11:25 PDT
,
Mark Lam
no flags
Details
Formatted Diff
Diff
Show Obsolete
(1)
View All
Add attachment
proposed patch, testcase, etc.
Mark Lam
Comment 1
2015-10-26 15:09:22 PDT
The patch is almost ready, but I'm seeing failed test i.e. got a bug. Will resolve that first before posting.
Mark Lam
Comment 2
2015-10-27 17:03:11 PDT
Created
attachment 264176
[details]
the patch.
Mark Lam
Comment 3
2015-10-27 17:04:53 PDT
Comment on
attachment 264176
[details]
the patch. This patch has passed the JSC tests. Still need to run benchmarks and layout tests.
Geoffrey Garen
Comment 4
2015-10-27 21:06:37 PDT
Comment on
attachment 264176
[details]
the patch. View in context:
https://bugs.webkit.org/attachment.cgi?id=264176&action=review
Code looks pretty good. I think I see one bug.
> Source/JavaScriptCore/ftl/FTLCompile.cpp:308 > +class BinaryArithGenerationContext {
Let's call this BinarySnippetRegisterContext.
> Source/JavaScriptCore/ftl/FTLCompile.cpp:330 > + void assignRegisters()
You can fold this function into the constructor.
> Source/JavaScriptCore/ftl/FTLCompile.cpp:345 > +#if CPU(X86_64) > + reservedRegisters.set(MacroAssembler::scratchRegister); > +#elif CPU(ARM64) > + reservedRegisters.set(MacroAssembler::dataTempRegister); > + reservedRegisters.set(MacroAssembler::memoryTempRegister); > +#endif
I think it would be nicer to give each MacroAssembler something like a static std::array of reserved registers, and have this code loop over that abstract array. That way, the definition of reserved registers lives with the assembler that reserves them. This could go in GPRInfo or MacroAssembler.
> Source/JavaScriptCore/ftl/FTLCompile.cpp:443 > + auto numberOfBytesUsedToPreserveReusedRegisters = > + allocator.preserveReusedRegistersByPushing(fastPathJIT, ScratchRegisterAllocator::ExtraStackSpace::NoExtraSpace); > + > + context.initializeRegisters(fastPathJIT);
This seems wrong. The code preserves all allocated scratch registers and then allocates new scratch registers. My understanding is that preservation is only correct if it happens after we have computed all the scratch registers we will use. I think you need to initializeRegisters before preserveReused...
Mark Lam
Comment 5
2015-10-27 21:20:35 PDT
Comment on
attachment 264176
[details]
the patch. View in context:
https://bugs.webkit.org/attachment.cgi?id=264176&action=review
>> Source/JavaScriptCore/ftl/FTLCompile.cpp:308 >> +class BinaryArithGenerationContext { > > Let's call this BinarySnippetRegisterContext.
OK, will do.
>> Source/JavaScriptCore/ftl/FTLCompile.cpp:330 >> + void assignRegisters() > > You can fold this function into the constructor.
Ok. Will do.
>> Source/JavaScriptCore/ftl/FTLCompile.cpp:345 >> +#endif > > I think it would be nicer to give each MacroAssembler something like a static std::array of reserved registers, and have this code loop over that abstract array. That way, the definition of reserved registers lives with the assembler that reserves them. This could go in GPRInfo or MacroAssembler.
OK, I'll look into it.
>> Source/JavaScriptCore/ftl/FTLCompile.cpp:443 >> + context.initializeRegisters(fastPathJIT); > > This seems wrong. The code preserves all allocated scratch registers and then allocates new scratch registers. My understanding is that preservation is only correct if it happens after we have computed all the scratch registers we will use. > > I think you need to initializeRegisters before preserveReused...
initializeRegisters() does not allocate scratch registers. It only initializes them. assignRegister() above (which will be folded into the BinarySnippetRegisterContext) did all needed allocations, and waits till after we have preserved their original values before we start writing into the registers here. Hence, there's no bug here.
Mark Lam
Comment 6
2015-10-28 10:01:44 PDT
Created
attachment 264220
[details]
X86_64 benchmark result 1
Mark Lam
Comment 7
2015-10-28 10:02:04 PDT
Created
attachment 264221
[details]
X86_64 benchmark result 2
Mark Lam
Comment 8
2015-10-28 10:02:47 PDT
Created
attachment 264222
[details]
X86_64 benchmark result 3 Benchmark results show no significant change.
Geoffrey Garen
Comment 9
2015-10-28 10:13:32 PDT
Can you add a benchmark that shows a change?
Mark Lam
Comment 10
2015-10-28 11:25:41 PDT
Created
attachment 264230
[details]
Patch for landing: incorporated Geoff's feedback, and added a JSRegress benchmark.
Mark Lam
Comment 11
2015-10-28 11:33:09 PDT
Here's the result of the newly added benchmark: base new ftl-object-sub 379.0623+-19.9868 ^ 236.6578+-3.0308 ^ definitely 1.6017x faster <geometric> 379.0623+-19.9868 ^ 236.6578+-3.0308 ^ definitely 1.6017x faster The 1.6x speed up is deceiving. Basically, this patch enables the FTL to compile functions that have ArithSub operating on UnusedType operands. The implementation of this ArithSub in the FTL is effectively the same as the DFG i.e. they both use the JITSubGenerator. In fact, the FTL is slightly slower because of the extra register setup and restoration work that it has too do to match the snippet's expectations. The real benefit of this patch is that it allows the FTL to compile that function at all. So, what I did in the benchmark was created a loop and some conditional logic that the FTL will optimize better than the DFG. Hence, what the benchmark is actually measuring is the speed benefits of the FTL optimizing that other logic, not the ArithSub itself. The difference is that with this patch, the FTL can now compile that function whereas before, the FTL cannot, and we had to run with the DFG version. Will land the patch shortly.
Mark Lam
Comment 12
2015-10-28 11:38:09 PDT
Thanks for the review. Landed in
r191683
: <
http://trac.webkit.org/r191683
>.
Filip Pizlo
Comment 13
2015-10-28 18:18:52 PDT
This looks like it regressed cdjs:
https://build.webkit.org/builders/Apple%20Mavericks%2032-bit%20JSC%20%28BuildAndTest%29/builds/13753/steps/webkit-32bit-jsc-test/logs/stdio
Mark Lam
Comment 14
2015-10-29 10:02:10 PDT
(In reply to
comment #13
)
> This looks like it regressed cdjs: >
https://build.webkit.org/builders/Apple%20Mavericks%2032
- > bit%20JSC%20%28BuildAndTest%29/builds/13753/steps/webkit-32bit-jsc-test/logs/ > stdio
I'm investigating, but I have not been able to reproduce this failure yet. Not all the bots are able to reproduce it either. Even the ones that do reproduce it does not do so consistently. I'll continue investigating.
Geoffrey Garen
Comment 15
2015-10-29 11:19:21 PDT
If we can't resolve in the next day or two, we should roll out. We have too many regressions accumulating.
Mark Lam
Comment 16
2015-10-29 13:07:21 PDT
Re-opening to address issue documented in
https://bugs.webkit.org/show_bug.cgi?id=150687
.
Mark Lam
Comment 17
2015-10-29 17:17:54 PDT
Found the issue. It was a dumb mistake. In BinarySnippetRegisterContext, I shuffle registers to make room for the tag registers and other reserved registers that the snippet expects. In BinarySnippetRegisterContext::restoreRegister(), I shuffled everything back ... except for the result register. If the result was ever assigned a scratch (e.g. the incoming result register assigned by LLVM is one the tag registers), then I need to copy the result back from its assigned scratch. Otherwise, we get the wrong result. I have a fix but I want to think it through more carefully first. Will upload a patch later tonight.
Mark Lam
Comment 18
2015-11-12 11:10:25 PST
With
https://bugs.webkit.org/show_bug.cgi?id=150712
, the remaining issues should be resolved.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug