Bug 212511 - br_table is slow or it doesn't scale
Summary: br_table is slow or it doesn't scale
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: Other
Hardware: All All
: P2 Enhancement
Assignee: Tadeu Zagallo
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-05-28 23:30 PDT by Vlad Brezae
Modified: 2020-06-04 16:10 PDT (History)
9 users (show)

See Also:


Attachments
Patch (56.87 KB, patch)
2020-06-02 15:40 PDT, Tadeu Zagallo
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Vlad Brezae 2020-05-28 23:30:40 PDT
Consider this C file https://gist.github.com/BrzVlad/1b5a6bdb20205db1c970edae709d337e. Compile it to wasm using an emsdk (emsdk/upstream/emscripten/emcc -O2 huge-interp.c). Run it once as is and once with SWITCH_CASE defined to nothing. While on v8 you can see that the performance is more or less the same, on jsc the test case becomes 10x slower, which shouldn't happen. Looking into the generated native code, it shows that br_table is implemented by doing a lot of sequential comparisons rather than dispatching through a table.

This is significantly impacting the performance of the mono runtime (when it is running in interpreter mode)
Comment 1 Radar WebKit Bug Importer 2020-05-31 17:40:15 PDT
<rdar://problem/63813245>
Comment 2 Tadeu Zagallo 2020-06-02 15:40:38 PDT
Created attachment 400861 [details]
Patch
Comment 3 Mark Lam 2020-06-02 15:57:10 PDT
Comment on attachment 400861 [details]
Patch

r=me if you've verified that JetStream2 and Speedometer2 are not impacted.  Can you confirm that perf is not impacted?
Comment 4 Saam Barati 2020-06-02 17:05:52 PDT
Comment on attachment 400861 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=400861&action=review

> Source/JavaScriptCore/ChangeLog:14
> +        already have LICM in DFG, it should be ok to disable it for now.

but B3's LICM will probably hoist things DFG LICM can't.

Why not just disable it for now in Wasm code?
Comment 5 Tadeu Zagallo 2020-06-03 15:59:16 PDT
Comment on attachment 400861 [details]
Patch

Ran perf tests and the results were neutral on JS2 and Speedometer2. Talked with Saam offline and he agreed to landing this patch and working on improving the phase in the future.
Comment 6 EWS 2020-06-03 16:18:35 PDT
Committed r262523: <https://trac.webkit.org/changeset/262523>

All reviewed patches have been landed. Closing bug and clearing flags on attachment 400861 [details].
Comment 7 Yusuke Suzuki 2020-06-04 15:54:29 PDT
testb3 is failing after this change, can you take a look?
Comment 8 Filip Pizlo 2020-06-04 16:10:39 PDT
(In reply to Saam Barati from comment #4)
> Comment on attachment 400861 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=400861&action=review
> 
> > Source/JavaScriptCore/ChangeLog:14
> > +        already have LICM in DFG, it should be ok to disable it for now.
> 
> but B3's LICM will probably hoist things DFG LICM can't.
> 
> Why not just disable it for now in Wasm code?

I added B3’s LICM for fun and completeness. Not sad to see it disabled if it causes any issues at all whatsoever since its upside is believed to be something like zero. LICM is a cheesy optimization anyway (huge win for some benchmarks but zero effect on the overwhelming majority of loops) and the DFG’s is really good. So you ha e very few LICM opportunities and DFG finds almost all of them. The ones left to B3 are not that interesting. 

And in wasm I would assume that licm rarely finds something that is both profitable and legal to hoist. B3’s LICM can’t hoist checking opcodes so that immediately means no hoisting of memory accesses. Memory access hoisting is the most profitable, so that means it’s just not going to be a big perf winner.