199866 – Better cache our serialization of the outer TDZ environment when creating FunctionExecutables during bytecode generation

RESOLVED FIXED Bug 199866

Better cache our serialization of the outer TDZ environment when creating FunctionExecutables during bytecode generation

https://bugs.webkit.org/show_bug.cgi?id=199866

Summary Better cache our serialization of the outer TDZ environment when creating Fun...

Andrew Brogdon

Reported 2019-07-17 09:04:26 PDT

I've run into an issue attempting to load a large JS file (the compiled-to-JavaScript version of the Dart SDK) in Safari, and I'm unable to replicate the problem in other, non-WebKit browsers. Loading the file as if it were a single script with no requireJS-style wrapper or ES6 module import works perfectly well (~5MB of JS in <300ms). When I add the `define` wrapper needed for requireJS or add an export statement and attempt to load the file via an ES6 module, however, Safari freezes for ~10 seconds during hard reloads. I'm unable to interact with the page at all. Once the script is loaded, I can usually reload the page without incurring the delay (I'm guessing whatever in-memory representation is made of the script file hangs out in between loads). I knocked out a GitHub repo to demonstrate the issue at https://github.com/RedBrogdon/parsing_speed. It has four HTML files, all of which try different techniques for loading the file. They're marked for which have a delay in Safari/WebKit and which do not. I've tested that code with Safari v12.1.1 (14607.2.6.1.1) and the MiniBrowser shipped with build 247494 of WebKit. Both show the same behavior, which I've been unable to replicate with any other browser. I'm not a JS engine expert by any means, but I'm guessing there's a parsing optimization that's not available when the code in my file is moved out of global scope and into a function/module, and it's possible that the behavior I'm seeing is WebKit working as intended. If that's the case and you can suggest some way for me to modify my code (or even point me to a guide that explains when/how parsing can be optimized), I'd appreciate it.

Attachments
Repro for 10x slower initialization time in Figma (2.33 MB, application/zip) 2020-10-20 23:12 PDT, evan.exe	no flags	Details
repro for JSC cli of dart code (4.72 MB, application/x-javascript) 2020-10-21 10:31 PDT, Saam Barati	no flags	Details
WIP (18.25 KB, patch) 2020-10-21 15:56 PDT, Saam Barati	no flags	Details Formatted Diff Diff
WIP to fix the Dart issue (18.93 KB, patch) 2020-10-21 17:45 PDT, Saam Barati	no flags	Details Formatted Diff Diff
WIP (31.80 KB, patch) 2020-10-21 22:55 PDT, Saam Barati	no flags	Details Formatted Diff Diff
WIP (77.63 KB, patch) 2020-10-22 17:11 PDT, Saam Barati	no flags	Details Formatted Diff Diff
WIP (78.25 KB, patch) 2020-10-23 12:36 PDT, Saam Barati	no flags	Details Formatted Diff Diff
patch (5.68 MB, patch) 2020-10-23 19:43 PDT, Saam Barati	ews-feeder: commit-queue-	Details Formatted Diff Diff
patch without huge test case (83.04 KB, patch) 2020-10-23 19:44 PDT, Saam Barati	ews-feeder: commit-queue-	Details Formatted Diff Diff
patch (5.68 MB, patch) 2020-10-26 10:21 PDT, Saam Barati	no flags	Details Formatted Diff Diff
patch (5.68 MB, patch) 2020-10-26 10:22 PDT, Saam Barati	no flags	Details Formatted Diff Diff
same patch, but without huge test case (83.09 KB, patch) 2020-10-26 10:23 PDT, Saam Barati	tzagallo: review+	Details Formatted Diff Diff
patch (5.68 MB, patch) 2020-10-26 10:27 PDT, Saam Barati	no flags	Details Formatted Diff Diff
patch for landing (4.92 MB, patch) 2020-10-27 17:58 PDT, Saam Barati	no flags	Details Formatted Diff Diff
patch for landing (4.92 MB, patch) 2020-10-27 18:02 PDT, Saam Barati	no flags	Details Formatted Diff Diff
patch for landing (4.92 MB, patch) 2020-10-27 18:11 PDT, Saam Barati	no flags	Details Formatted Diff Diff
Show Obsolete (13) View All Add attachment proposed patch, testcase, etc.

Keith Miller

Comment 1 2019-07-19 18:13:39 PDT

Interesting, I tried to take a quick look and it looks like we are spending a lot of time in scope resolution code. I couldn't get the vanilla version to run in the JSC CLI. I'll try to take a look at this soon though.

Radar WebKit Bug Importer

Comment 2 2019-07-19 18:13:53 PDT

<rdar://problem/53333108>

Andrew Brogdon

Comment 3 2019-07-22 21:38:53 PDT

(In reply to Keith Miller from comment #1) > I'll try to take a look at this soon though. Awesome, thanks. If there's anything I can do to help track down the issue, let me know.

Andrew Brogdon

Comment 4 2019-08-07 08:39:10 PDT

Just checking in -- has anyone had time to look into the issue further?

Tadeu Zagallo

Comment 5 2019-11-21 12:52:12 PST

*** Bug 204462 has been marked as a duplicate of this bug. ***

Keith Miller

Comment 6 2019-11-21 12:56:25 PST

Wow, sorry about the delay on this! I completely forgot about it... I'll look into this now.

Andrew Brogdon

Comment 7 2019-12-03 13:13:53 PST

Thanks for taking a look at the issue!

Andrew Brogdon

Comment 8 2019-12-03 13:15:26 PST

As I mentioned above, if there's any additional info we can provide, I'm happy to track it down.

Keith Miller

Comment 9 2020-01-22 18:41:21 PST

So, I figured out what the problem is but I haven't had time to implement a fix yet. Basically, the problem is in how we track if we need a TDZ check. If a work around is helpful, I tried changing dart_sdk_es6.js to use "var" everywhere instead of "let"/"const" and it ran instantly in about 300ms in the JSC CLI.

Andrew Brogdon

Comment 10 2020-01-27 11:22:07 PST

(In reply to Keith Miller from comment #9) > So, I figured out what the problem is but I haven't had time to implement a > fix yet. Basically, the problem is in how we track if we need a TDZ check. > > If a work around is helpful, I tried changing dart_sdk_es6.js to use "var" > everywhere instead of "let"/"const" and it ran instantly in about 300ms in > the JSC CLI. Interesting -- and thanks for taking the time to diagnose it. I'll communicate the workaround back to the Dart team.

Andrew Brogdon

Comment 11 2020-03-19 15:49:09 PDT

An update from our end: the Dart SDK team has modified their development compiler to implement the workaround you suggested (replace "let/const" with "var") in its output. That change made its way into DartPad a few days ago, and we're no longer seeing the delay. Thanks for the help!

Keith Miller

Comment 12 2020-09-01 15:44:35 PDT

It would probably be worthwhile to, at least roughly, document what a working solution would be. I think we can have each scope contain a shared (with each nested closure) TDZ epoch object. This object would have the TDZ epoch for each let/const variable declared in that scope. At each closure we would make sure to record the current epoch lexically at that point. Then whenever we load a variable from a scope we would just check if the function's epoch is greater than the variable's. If it's not then we need a TDZ check.

evan.exe

Comment 13 2020-10-20 23:12:48 PDT

Created attachment 411960 [details] Repro for 10x slower initialization time in Figma I'm coming here from https://github.com/evanw/esbuild/issues/478. Attached are two builds of the Figma code base (https://www.figma.com/) bundled with the esbuild bundler. One is a normal build and loads in ~2000ms in Safari. The other one has a workaround where let/const/class statements are replaced with var statements and loads in ~200ms in Safari. See the readme for instructions. I hope this is helpful. Let me know if you need more information from me. (This code is fine to share publicly since it's basically a more minified version of the code that's served on figma.com when you use the app)

Mihai Parparita

Comment 14 2020-10-20 23:25:33 PDT

*** Bug 212962 has been marked as a duplicate of this bug. ***

Saam Barati

Comment 15 2020-10-21 10:31:44 PDT

Created attachment 412007 [details] repro for JSC cli of dart code If you remove the outermost IIFE, we compile bytecode in 40ms. If you put in the IIFE, we compile bytecode in 2000ms.

Saam Barati

Comment 16 2020-10-21 15:56:09 PDT

Created attachment 412040 [details] WIP 2400ms -> 40ms

Saam Barati

Comment 17 2020-10-21 17:37:23 PDT

It appears the Figma issue is different. Looking into it now.

Saam Barati

Comment 18 2020-10-21 17:45:02 PDT

Created attachment 412054 [details] WIP to fix the Dart issue

Saam Barati

Comment 19 2020-10-21 22:55:00 PDT

Created attachment 412068 [details] WIP Fixes our pathologies in both Dart/Sigma. Just requires some final touches to the patch tomorrow, then I'll have it ready for review.

Saam Barati

Comment 20 2020-10-22 16:20:14 PDT

(In reply to Saam Barati from comment #17) > It appears the Figma issue is different. Looking into it now. This is wrong. The fix is the same for both, but there are two things I'm optimizing, and Figma needed both to be fully performant, where Dart needed like 1.5 of the things.

Saam Barati

Comment 21 2020-10-22 17:11:15 PDT

Created attachment 412146 [details] WIP

Saam Barati

Comment 22 2020-10-23 12:36:51 PDT

Created attachment 412206 [details] WIP

Saam Barati

Comment 23 2020-10-23 19:43:29 PDT

Created attachment 412225 [details] patch

Saam Barati

Comment 24 2020-10-23 19:44:34 PDT

Created attachment 412226 [details] patch without huge test case might be easier to review without the ~5MB test case

Saam Barati

Comment 25 2020-10-26 09:46:32 PDT

Looking into the failures now

Saam Barati

Comment 26 2020-10-26 10:19:06 PDT

I broke TDZ checks as I refactored my patch, forgot to actually assign to a member field inside the constructor of the environment ;-)

Saam Barati

Comment 27 2020-10-26 10:21:10 PDT

Created attachment 412328 [details] patch

Saam Barati

Comment 28 2020-10-26 10:22:39 PDT

Created attachment 412329 [details] patch

Saam Barati

Comment 29 2020-10-26 10:23:18 PDT

Created attachment 412330 [details] same patch, but without huge test case

Saam Barati

Comment 30 2020-10-26 10:27:37 PDT

Created attachment 412333 [details] patch

Tadeu Zagallo

Comment 31 2020-10-26 17:29:34 PDT

Comment on attachment 412330 [details] same patch, but without huge test case View in context: https://bugs.webkit.org/attachment.cgi?id=412330&action=review r=me > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:366 > + m_parentScopeTDZVariables = parentScopeTDZVariables; This is not necessary, right? All this information is already in m_cachedVariablesUnderTDZ > Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:367 > + m_cachedVariablesUnderTDZ = m_parentScopeTDZVariables; Could this be implemented as just a link to the parent environment?

Saam Barati

Comment 32 2020-10-27 17:39:02 PDT

Comment on attachment 412330 [details] same patch, but without huge test case View in context: https://bugs.webkit.org/attachment.cgi?id=412330&action=review >> Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:366 >> + m_parentScopeTDZVariables = parentScopeTDZVariables; > > This is not necessary, right? All this information is already in m_cachedVariablesUnderTDZ Yeah, we can just turn this into an integer to know where to start the search in m_cachedVariablesUnderTDZ >> Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:367 >> + m_cachedVariablesUnderTDZ = m_parentScopeTDZVariables; > > Could this be implemented as just a link to the parent environment? Tadeu and I spoke offline about this. We could definitely move this to being a linked list/inverse tree like data structure, but we're going to stick with a vector for now, until the data says it's necessary

Saam Barati

Comment 33 2020-10-27 17:58:53 PDT

Created attachment 412487 [details] patch for landing

Saam Barati

Comment 34 2020-10-27 18:02:36 PDT

Created attachment 412488 [details] patch for landing

Saam Barati

Comment 35 2020-10-27 18:11:00 PDT

Created attachment 412490 [details] patch for landing

EWS

Comment 36 2020-10-28 12:25:19 PDT

Committed r269115: <https://trac.webkit.org/changeset/269115> All reviewed patches have been landed. Closing bug and clearing flags on attachment 412490 [details].

Darin Adler

Comment 37 2020-10-28 12:39:15 PDT

Comment on attachment 412330 [details] same patch, but without huge test case View in context: https://bugs.webkit.org/attachment.cgi?id=412330&action=review > Source/WTF/wtf/RefPtr.h:-200 > -template<typename T, typename U, typename V, typename X, typename Y, typename Z, typename = std::enable_if_t<!std::is_same<U, DumbPtrTraits<T>>::value || !std::is_same<Y, DumbPtrTraits<X>>::value>> > -inline void swap(RefPtr<T, U, V>& a, RefPtr<X, Y, Z>& b) > -{ > - a.swap(b); > -} This is intended to be a performance optimization, so calls to swap(a, b) when both are RefPtr don’t ref/deref. The change log says that this function template is no longer necessary. My guess is that it was never "necessary" but maybe it was an effective optimization. Could you clarify how you determined we didn’t need the optimization? Maybe move semantics mean this isn’t as important to optimize as it once was. Or maybe inlining causes all the ref/deref pairs to cancel each other out. Or maybe it is not big deal to churn reference because that additional code and execution time is not enough to be a problem. Or something else perhaps.

Andrew Brogdon

Comment 38 2020-10-29 00:12:05 PDT

Thanks for the fix, y'all!

Saam Barati

Comment 39 2020-10-29 15:58:10 PDT

(In reply to Darin Adler from comment #37) > Comment on attachment 412330 [details] > same patch, but without huge test case > > View in context: > https://bugs.webkit.org/attachment.cgi?id=412330&action=review > > > Source/WTF/wtf/RefPtr.h:-200 > > -template<typename T, typename U, typename V, typename X, typename Y, typename Z, typename = std::enable_if_t<!std::is_same<U, DumbPtrTraits<T>>::value || !std::is_same<Y, DumbPtrTraits<X>>::value>> > > -inline void swap(RefPtr<T, U, V>& a, RefPtr<X, Y, Z>& b) > > -{ > > - a.swap(b); > > -} > > This is intended to be a performance optimization, so calls to swap(a, b) > when both are RefPtr don’t ref/deref. The change log says that this function > template is no longer necessary. My guess is that it was never "necessary" > but maybe it was an effective optimization. Could you clarify how you > determined we didn’t need the optimization? Maybe move semantics mean this > isn’t as important to optimize as it once was. Or maybe inlining causes all > the ref/deref pairs to cancel each other out. Or maybe it is not big deal to > churn reference because that additional code and execution time is not > enough to be a problem. Or something else perhaps. I removed it because it was causing a compilation error. This function was also only available when not using DumbPtrTraits, so I'm not sure how it was ever much of an optimization given the limited use. I'd also expect the performance optimization to be more important for things with custom ref/deref functions, and for ThreadSafeRefCounted. For normal RefCounted, I'd expect the compiler to optimize it out. Let me see if I can get a POC of the compiler error I was encountering. I'm happy add this back if I can get it to compile

Saam Barati

Comment 40 2020-10-29 16:33:31 PDT

Here's a POC. C++ program: ``` #include <vector> #include <algorithm> namespace WTF { template <typename T> class RefPtr { public: RefPtr(T* ptr) : m_ptr(ptr) { } RefPtr& operator=(const RefPtr<T>& other) { m_ptr = other.m_ptr; return *this; } void swap(RefPtr<T>& other) { std::swap(m_ptr, other.m_ptr); } T* m_ptr; }; template<typename T, typename U> inline void swap(RefPtr<T>& a, RefPtr<U>& b) { a.swap(b); } } using WTF::RefPtr; namespace JavaScriptCore { void foo() { std::vector<RefPtr<int>> v; std::sort(v.begin(), v.end(), [&] (auto& a, auto& b) { return a.m_ptr < b.m_ptr; }); } } ``` And how to compile: $ clang++ test.cpp -stdlib=libc++ -std=c++17 And the compile error is many lines looking like this: ``` In file included from test.cpp:1: In file included from <.....>/usr/bin/../include/c++/v1/vector:274: In file included from <.....>/usr/bin/../include/c++/v1/__bit_reference:15: <....>/usr/bin/../include/c++/v1/algorithm:3945:17: error: call to 'swap' is ambiguous swap(*__first, *__last); ^~~~ <....>/usr/bin/../include/c++/v1/algorithm:4125:12: note: in instantiation of function template specialization 'std::__1::__sort<(lambda at test.cpp:43:35) &, WTF::RefPtr<int> *>' requested here _VSTD::__sort<_Comp_ref>(__first, __last, _Comp_ref(__comp)); ^ <....>/usr/bin/../include/c++/v1/algorithm:4158:12: note: in instantiation of function template specialization 'std::__1::sort<WTF::RefPtr<int> *, (lambda at test.cpp:43:35) &>' requested here _VSTD::sort<_Tp*, _Comp_ref>(__first.base(), __last.base(), __comp); ^ test.cpp:43:10: note: in instantiation of function template specialization 'std::__1::sort<WTF::RefPtr<int>, (lambda at test.cpp:43:35)>' requested here std::sort(v.begin(), v.end(), [&] (auto& a, auto& b) { ``` I gave up after a couple hours of trying to resolve this, given I was doubtful that the swap optimization bought us any perf.

Darin Adler

Comment 41 2020-10-30 13:01:43 PDT

Can we discuss this in another bug report? The language/library feature we are trying to take advantage of is here: https://en.cppreference.com/w/cpp/named_req/Swappable

Darin Adler

Comment 42 2020-10-30 13:06:54 PDT

(In reply to Saam Barati from comment #40) > template<typename T, typename U> > inline void swap(RefPtr<T>& a, RefPtr<U>& b) In the test you provided, changing the lines above to these fixes the error and restores the optimization: template<typename T> inline void swap(RefPtr<T>& a, RefPtr<T>& b) Maybe there are other possible solutions; this is just the first thing I tried.

Saam Barati

Comment 43 2020-11-02 12:52:38 PST

(In reply to Darin Adler from comment #41) > Can we discuss this in another bug report? The language/library feature we > are trying to take advantage of is here: > > https://en.cppreference.com/w/cpp/named_req/Swappable I opened https://bugs.webkit.org/show_bug.cgi?id=218454

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Mac

OS macOS 10.14

Product WebKit

Component JavaScriptCore

Assignee

Saam Barati

Reported

2019-07-17 09:04 PDT

Modified

2020-11-02 12:52 PST History

CC List

25 users Show

URL

Keywords InRadar

Duplicates (2)

204462 212962 View as bug list

Depends on

Blocks

benjamin
cdumez
chi187
cmarcelo
comexk
cyb.ai.815
darin
evan.exe
ews-watchlist
fpizlo
hi
jmuizelaar
joepeck
keith_miller
mark.lam
m.goleb+bugzilla
mihaip
mjs
msaboff
nikita
ross.kirsling
saam
tzagallo
webkit-bug-importer
ysuzuki