This will be fun.
Current status:

- It runs a lot of JS code, but not all of it. It usually runs it a bit slower than if it was compiled with LLVM. The largest single benchmark slowdown is ~25% on mandreel, and the largest single benchmark speed-up is ~15% on imaging-gaussian-blur. More than half of the benchmarks are neutral, but the slowdowns currently dominate. We think we know all of the optimizations that we are missing, and most of them are trivial (like missing instruction selection patterns and missing strength reduction rules).
- Still need to finish gluing our FTL exception implementation to B3's notion of patchpoints. Currently any code that uses exceptions in FTL B3 will crash or do strange things, because most of the exception paths are unimplemented.
- Still need to finish fixing the tailcall implementation. It works for simple cases, but encounters an assertion in the callframe shuffler. I haven't investigated this yet.
- ARM64 is being worked on, but it's not done, yet.
- We anticipate having to implement some more sophisticated optimizations in B3, like load elimination, tail duplication, CSE with Phi insertion, sinking, and probably LSR. We don't think that we will need all of them, but probably some of them, to meet our performance goals.
- We anticipate having to beef up how the register allocator handles pressure. Currently it does the classic kind of spilling where every access to a spilled temporary turns into a stack access. That's suboptimal. My current thinking is that we want something like what GCC calls "reload", but much more focused on spill handling and rematerialization. Rematerialization is a big deal because the way JSC works, we end up with a lot of large constants, and they are easier to remat than spill.

We'll approach this by first filling in all coverage and getting FTL B3 to pass all tests. We're not there yet, but it will probably happen soon - maybe even this week. Then, we'll optimize the heck out of it.
We can turn on FTL B3, and remove FTL LLVM, if and when our Octane, JetStream, and Kraken scores with FTL B3 are as good as with FTL LLVM. It's still possible (but unlikely) that we won't get there, in which case we can explore other options, like slotting FTL B3 in as a middle tier.
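To make the spilling item above concrete, here is a rough toy model, not B3 or Air code. It assumes a made-up `Tmp` type and a hypothetical `rewrite_uses` helper, and it only looks at uses (the store at the definition of a spilled temporary is ignored). It contrasts classic spilling, where every use of a spilled temporary becomes a stack load, with rematerializing temporaries that are known constants.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tmp:
    """Hypothetical temporary; `constant` is set when its value is a known constant."""
    name: str
    constant: Optional[int] = None


def rewrite_uses(uses, spilled, remat):
    """Toy rewrite of a sequence of uses after spilling.

    Classic spilling (remat=False): every use of a spilled tmp is preceded
    by a stack load. With remat=True, spilled tmps that hold a constant are
    re-created with a move-immediate instead of being loaded.
    """
    out = []
    for tmp in uses:
        if tmp in spilled:
            if remat and tmp.constant is not None:
                out.append(("move_imm", tmp.name, tmp.constant))
            else:
                out.append(("load_stack", tmp.name))
        out.append(("use", tmp.name))
    return out


def count_stack_loads(code):
    """Count the stack loads in the rewritten code."""
    return sum(1 for inst in code if inst[0] == "load_stack")


# Assumed example: one large constant used twice, one ordinary value used once.
big = Tmp("big", 0x7FFF000000001234)
x = Tmp("x")
uses = [big, x, big]
spilled = {big, x}

classic = rewrite_uses(uses, spilled, remat=False)
with_remat = rewrite_uses(uses, spilled, remat=True)
print(count_stack_loads(classic), count_stack_loads(with_remat))
```

In this sketch the constant never needs a stack slot at all once it is rematerialized, which is the reason large constants are cheaper to remat than to spill. A real allocator would also have to weigh the cost of the move-immediate and decide which temporaries to spill in the first place.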
As of http://trac.webkit.org/changeset/194805, FTL B3 passes all JSC tests on Mac.
Created attachment 268633: current performance, before tuning
As of http://trac.webkit.org/changeset/195549, FTL B3 is as fast as FTL LLVM on X86_64. We can still make FTL B3 even better, but at this point, we should enable it on trunk on X86_64.
*** Bug 118400 has been marked as a duplicate of this bug. ***
*** Bug 127245 has been marked as a duplicate of this bug. ***
B3 is now enabled on all platforms that support the FTL JIT.
Comment on attachment 268633: current performance, before tuning
We've sped up a lot since then. See the blog post for the latest perf numbers.
FTL does use B3 as a backend! Yaaayyyy!