WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
271610
Web Inspector: Generating JS builtins for createInspectorInjectedScript is very slow
https://bugs.webkit.org/show_bug.cgi?id=271610
Summary
Web Inspector: Generating JS builtins for createInspectorInjectedScript is ve...
Philippe Normand
Reported
2024-03-24 09:15:10 PDT
Takes 30 to 40 seconds and blocks the build... That function nests so many other functions, BuiltinsCollection._parse_functions() is the bottleneck, using various regular expressions, looking for patterns on the whole file every time. Would it be possible to move some of these nested functions out? Was there a specific reason for the current design?
Attachments
Add attachment
proposed patch, testcase, etc.
Philippe Normand
Comment 1
2024-03-24 09:17:12 PDT
time python Source/JavaScriptCore/Scripts/generate-js-builtins.py --framework JavaScriptCore --output-directory /tmp/ --combined Source/JavaScriptCore/builtins/*.js Source/JavaSc riptCore/inspector/InjectedScriptSource.js ________________________________________________________ Executed in 43.86 secs fish external usr time 43.68 secs 394.00 micros 43.68 secs sys time 0.05 secs 91.00 micros 0.05 secs
Radar WebKit Bug Importer
Comment 2
2024-03-31 23:16:13 PDT
<
rdar://problem/125703146
>
Karl Dubost
Comment 3
2024-04-01 00:08:17 PDT
https://searchfox.org/wubkat/rev/473ca5f8512b88edd7e82c8783e7e09158f17ba1/Source/JavaScriptCore/Scripts/wkbuiltins/builtins_model.py#314-373
This is calling matches = [func for func in functionHeadRegExp.finditer(text)] which is functionHeadRegExp = re.compile(r"(?:@[\w|=\[\] \"\.]+\s*\n)*(?:async\s+)?function\s+\w+\s*\(.*?\)", re.MULTILINE | re.DOTALL) which then ends up in another for loop: for match in matches: The latest big change on this code was done on May 2022
https://bugs.webkit.org/show_bug.cgi?id=152294
https://github.com/WebKit/WebKit/commit/1d75ec6c3ee1575325c7af691084fb59a6e7587c
I wonder about this too.
https://searchfox.org/wubkat/rev/473ca5f8512b88edd7e82c8783e7e09158f17ba1/Source/JavaScriptCore/Scripts/wkbuiltins/builtins_model.py#163
Karl Dubost
Comment 4
2024-04-02 01:36:18 PDT
TLDR As a starting reference, on my laptop, I get: 16.54s user 0.04s system 99% cpu 16.715 total but most of that time is used by one file: InjectedScriptSource.js, if I remove it I get: 0.93s user 0.02s system 97% cpu 0.973 total More details: _parse_functions is relatively fast for most of the iterations, but indeed InjectedScriptSource.js is one of the bottlenecks. Large file. This is used here.
https://searchfox.org/wubkat/rev/473ca5f8512b88edd7e82c8783e7e09158f17ba1/Source/JavaScriptCore/Scripts/wkbuiltins/builtins_model.py#215-217
It chokes only on one iteration. Replacing the code by: import time tic = time.perf_counter() parsed_functions = self._parse_functions(text) toc = time.perf_counter() print(f"_parse_functions: {toc - tic:0.4f}s / {filename} / {len(text)} chars") The number of matches is given by length(matches) on matches = [func for func in functionHeadRegExp.finditer(text)] nb of matches: 6 / _parse_functions: 0.0461s / ArrayConstructor.js / 7841 chars nb of matches: 2 / _parse_functions: 0.0053s / ArrayIteratorPrototype.js / 3003 chars nb of matches: 34 / _parse_functions: 0.1062s / ArrayPrototype.js / 25784 chars nb of matches: 8 / _parse_functions: 0.0278s / AsyncFromSyncIteratorPrototype.js / 6395 chars nb of matches: 3 / _parse_functions: 0.0063s / AsyncFunctionPrototype.js / 2819 chars nb of matches: 21 / _parse_functions: 0.0587s / AsyncGeneratorPrototype.js / 11595 chars nb of matches: 1 / _parse_functions: 0.0001s / AsyncIteratorPrototype.js / 1460 chars nb of matches: 3 / _parse_functions: 0.0009s / FunctionPrototype.js / 2042 chars nb of matches: 4 / _parse_functions: 0.0081s / GeneratorPrototype.js / 3878 chars nb of matches: 1 / _parse_functions: 0.0003s / GlobalObject.js / 1676 chars nb of matches: 1 / _parse_functions: 0.0017s / GlobalOperations.js / 1985 chars nb of matches: 2 / _parse_functions: 0.0109s / InternalPromiseConstructor.js / 3159 chars nb of matches: 4 / _parse_functions: 0.0046s / IteratorHelpers.js / 3598 chars nb of matches: 1 / _parse_functions: 0.0001s / IteratorPrototype.js / 1447 chars nb of matches: 1 / _parse_functions: 0.0056s / MapConstructor.js / 2614 chars nb of matches: 2 / _parse_functions: 0.0024s / MapIteratorPrototype.js / 2331 chars nb of matches: 1 / _parse_functions: 0.0016s / MapPrototype.js / 1902 chars nb of matches: 20 / _parse_functions: 0.0521s / ModuleLoader.js / 24361 chars nb of matches: 1 / _parse_functions: 0.0003s / NumberConstructor.js / 1588 chars nb of matches: 2 / _parse_functions: 0.0066s / ObjectConstructor.js / 3223 chars nb of matches: 18 / _parse_functions: 0.0780s / PromiseConstructor.js / 13018 chars nb of matches: 29 / _parse_functions: 0.0746s / PromiseOperations.js / 21115 chars nb of matches: 3 / _parse_functions: 0.0078s / PromisePrototype.js / 3204 chars nb of matches: 5 / _parse_functions: 0.0203s / ProxyHelpers.js / 5748 chars nb of matches: 4 / _parse_functions: 0.0018s / ReflectObject.js / 2801 chars nb of matches: 13 / _parse_functions: 0.2080s / RegExpPrototype.js / 22590 chars nb of matches: 1 / _parse_functions: 0.0091s / RegExpStringIteratorPrototype.js / 2694 chars nb of matches: 2 / _parse_functions: 0.0018s / SetIteratorPrototype.js / 2219 chars nb of matches: 9 / _parse_functions: 0.0362s / SetPrototype.js / 9197 chars nb of matches: 4 / _parse_functions: 0.0049s / ShadowRealmPrototype.js / 3285 chars nb of matches: 1 / _parse_functions: 0.0038s / StringConstructor.js / 2200 chars nb of matches: 1 / _parse_functions: 0.0082s / StringIteratorPrototype.js / 2588 chars nb of matches: 29 / _parse_functions: 0.0356s / StringPrototype.js / 13145 chars nb of matches: 2 / _parse_functions: 0.0283s / TypedArrayConstructor.js / 4428 chars nb of matches: 14 / _parse_functions: 0.0224s / TypedArrayPrototype.js / 9332 chars nb of matches: 2 / _parse_functions: 0.0005s / WebAssembly.js / 1725 chars nb of matches: 20 / _parse_functions: 15.9046s / InjectedScriptSource.js / 62229 chars Let's see inside: matches = [func for func in functionHeadRegExp.finditer(text)] is not the bottleneck. matches perf: 0.0002s / matches: 6 matches perf: 0.0001s / matches: 2 matches perf: 0.0010s / matches: 34 matches perf: 0.0002s / matches: 8 matches perf: 0.0001s / matches: 3 matches perf: 0.0004s / matches: 21 matches perf: 0.0000s / matches: 1 matches perf: 0.0000s / matches: 3 matches perf: 0.0001s / matches: 4 matches perf: 0.0000s / matches: 1 matches perf: 0.0000s / matches: 1 matches perf: 0.0001s / matches: 2 matches perf: 0.0001s / matches: 4 matches perf: 0.0000s / matches: 1 matches perf: 0.0000s / matches: 1 matches perf: 0.0000s / matches: 2 matches perf: 0.0000s / matches: 1 matches perf: 0.0004s / matches: 20 matches perf: 0.0000s / matches: 1 matches perf: 0.0001s / matches: 2 matches perf: 0.0004s / matches: 18 matches perf: 0.0007s / matches: 29 matches perf: 0.0001s / matches: 3 matches perf: 0.0002s / matches: 5 matches perf: 0.0000s / matches: 4 matches perf: 0.0006s / matches: 13 matches perf: 0.0001s / matches: 1 matches perf: 0.0000s / matches: 2 matches perf: 0.0003s / matches: 9 matches perf: 0.0001s / matches: 4 matches perf: 0.0000s / matches: 1 matches perf: 0.0000s / matches: 1 matches perf: 0.0004s / matches: 29 matches perf: 0.0001s / matches: 2 matches perf: 0.0003s / matches: 14 matches perf: 0.0000s / matches: 2 matches perf: 0.0020s / matches: 20 Ah found the thing which takes time is the return statement. tuc = time.perf_counter() print(f"before list(map): {tuc - tic:0.8f}s") thingtoreturn = list(map(BuiltinFunction.fromString, functionStrings)) tec = time.perf_counter() print(f"before return: {tec - tic:0.8f}s") return thingtoreturn We get ---- matches perf: 0.00086213s / matches: 6 after for loop: 0.00204904s before list(map): 0.00206413s before return: 0.10147529s ---- matches perf: 0.00015438s / matches: 2 after for loop: 0.00035775s before list(map): 0.00036813s before return: 0.01159988s ---- matches perf: 0.00195908s / matches: 34 after for loop: 0.00354625s before list(map): 0.00356558s before return: 0.17573083s ---- matches perf: 0.00039850s / matches: 8 after for loop: 0.00065675s before list(map): 0.00066321s before return: 0.03363075s ---- matches perf: 0.00006025s / matches: 3 after for loop: 0.00013025s before list(map): 0.00013296s before return: 0.00832750s ---- matches perf: 0.00067088s / matches: 21 after for loop: 0.00126525s before list(map): 0.00127637s before return: 0.08663021s ---- matches perf: 0.00000875s / matches: 1 after for loop: 0.00003654s before list(map): 0.00004133s before return: 0.00020388s ---- matches perf: 0.00005858s / matches: 3 after for loop: 0.00018046s before list(map): 0.00020887s before return: 0.00143387s ---- matches perf: 0.00019867s / matches: 4 after for loop: 0.00035454s before list(map): 0.00036325s before return: 0.00981967s ---- matches perf: 0.00001000s / matches: 1 after for loop: 0.00002362s before list(map): 0.00002583s before return: 0.00036204s ---- matches perf: 0.00005350s / matches: 1 after for loop: 0.00012946s before list(map): 0.00013804s before return: 0.00283438s ---- matches perf: 0.00006054s / matches: 2 after for loop: 0.00014371s before list(map): 0.00014733s before return: 0.01643325s ---- matches perf: 0.00015025s / matches: 4 after for loop: 0.00038112s before list(map): 0.00039075s before return: 0.00595804s ---- matches perf: 0.00002438s / matches: 1 after for loop: 0.00005275s before list(map): 0.00005858s before return: 0.00020354s ---- matches perf: 0.00006971s / matches: 1 after for loop: 0.00013025s before list(map): 0.00013554s before return: 0.00682037s ---- matches perf: 0.00004017s / matches: 2 after for loop: 0.00008846s before list(map): 0.00009133s before return: 0.00295483s ---- matches perf: 0.00002750s / matches: 1 after for loop: 0.00007175s before list(map): 0.00007696s before return: 0.00219413s ---- matches perf: 0.00042275s / matches: 20 after for loop: 0.00098187s before list(map): 0.00098813s before return: 0.05855404s ---- matches perf: 0.00001100s / matches: 1 after for loop: 0.00005625s before list(map): 0.00006233s before return: 0.00034088s ---- matches perf: 0.00013438s / matches: 2 after for loop: 0.00021438s before list(map): 0.00021813s before return: 0.00783958s ---- matches perf: 0.00047179s / matches: 18 after for loop: 0.00107554s before list(map): 0.00108154s before return: 0.08648308s ---- matches perf: 0.00065763s / matches: 29 after for loop: 0.00149079s before list(map): 0.00151117s before return: 0.07941017s ---- matches perf: 0.00006600s / matches: 3 after for loop: 0.00018075s before list(map): 0.00020383s before return: 0.00815983s ---- matches perf: 0.00017421s / matches: 5 after for loop: 0.00040908s before list(map): 0.00042392s before return: 0.02092254s ---- matches perf: 0.00005058s / matches: 4 after for loop: 0.00010371s before list(map): 0.00011133s before return: 0.00197779s ---- matches perf: 0.00065625s / matches: 13 after for loop: 0.00158467s before list(map): 0.00159621s before return: 0.21502942s ---- matches perf: 0.00005500s / matches: 1 after for loop: 0.00012642s before list(map): 0.00012967s before return: 0.00934592s ---- matches perf: 0.00003154s / matches: 2 after for loop: 0.00007296s before list(map): 0.00007908s before return: 0.00186450s ---- matches perf: 0.00027429s / matches: 9 after for loop: 0.00065712s before list(map): 0.00066079s before return: 0.03788817s ---- matches perf: 0.00006625s / matches: 4 after for loop: 0.00014829s before list(map): 0.00015296s before return: 0.00489046s ---- matches perf: 0.00003221s / matches: 1 after for loop: 0.00008021s before list(map): 0.00008400s before return: 0.00396992s ---- matches perf: 0.00005129s / matches: 1 after for loop: 0.00012296s before list(map): 0.00012596s before return: 0.00840154s ---- matches perf: 0.00042504s / matches: 29 after for loop: 0.00095896s before list(map): 0.00096475s before return: 0.03761792s ---- matches perf: 0.00010217s / matches: 2 after for loop: 0.00024237s before list(map): 0.00024500s before return: 0.02967329s ---- matches perf: 0.00025683s / matches: 14 after for loop: 0.00061087s before list(map): 0.00061604s before return: 0.02288504s ---- matches perf: 0.00001508s / matches: 2 after for loop: 0.00003850s before list(map): 0.00004042s before return: 0.00047821s ---- matches perf: 0.00228304s / matches: 20 after for loop: 0.00531071s before list(map): 0.00531458s before return: 15.74802542s So the list(map(BuiltinFunction.fromString, functionStrings)) is the thing to improve.
Karl Dubost
Comment 5
2024-04-02 01:45:35 PDT
diving a bit more in the rabbit hole. import time print(f"----") tic = time.perf_counter() themap = map(BuiltinFunction.fromString, functionStrings) toc = time.perf_counter() print(f"how long for map: {toc - tic:0.8f}s") tec = time.perf_counter() thingtoreturn = list(themap) tuc = time.perf_counter() print(f"how long for list: {tuc - tec:0.8f}s") return thingtoreturn map(BuiltinFunction.fromString, functionStrings) is very fast. but list() is the slow bit. ---- how long for map: 0.00000067s how long for list: 0.04454333s ---- how long for map: 0.00000054s how long for list: 0.00522896s ---- how long for map: 0.00000025s how long for list: 0.10435229s ---- how long for map: 0.00000008s how long for list: 0.02742804s ---- how long for map: 0.00000029s how long for list: 0.00610137s ---- how long for map: 0.00000008s how long for list: 0.05791729s ---- how long for map: 0.00000013s how long for list: 0.00007237s ---- how long for map: 0.00000013s how long for list: 0.00085775s ---- how long for map: 0.00000017s how long for list: 0.00789221s ---- how long for map: 0.00000008s how long for list: 0.00032267s ---- how long for map: 0.00000038s how long for list: 0.00170508s ---- how long for map: 0.00000021s how long for list: 0.01071083s ---- how long for map: 0.00000017s how long for list: 0.00450279s ---- how long for map: 0.00000017s how long for list: 0.00004983s ---- how long for map: 0.00000029s how long for list: 0.00536333s ---- how long for map: 0.00000017s how long for list: 0.00225646s ---- how long for map: 0.00000008s how long for list: 0.00150821s ---- how long for map: 0.00000008s how long for list: 0.05064929s ---- how long for map: 0.00000029s how long for list: 0.00026858s ---- how long for map: 0.00000013s how long for list: 0.00655154s ---- how long for map: 0.00000025s how long for list: 0.07719392s ---- how long for map: 0.00000017s how long for list: 0.07323875s ---- how long for map: 0.00000008s how long for list: 0.00774633s ---- how long for map: 0.00000017s how long for list: 0.01983171s ---- how long for map: 0.00000017s how long for list: 0.00177975s ---- how long for map: 0.00000021s how long for list: 0.20562171s ---- how long for map: 0.00000063s how long for list: 0.00885329s ---- how long for map: 0.00000054s how long for list: 0.00178362s ---- how long for map: 0.00000033s how long for list: 0.03592921s ---- how long for map: 0.00000038s how long for list: 0.00455142s ---- how long for map: 0.00000017s how long for list: 0.00376550s ---- how long for map: 0.00000008s how long for list: 0.00770092s ---- how long for map: 0.00000008s how long for list: 0.03476367s ---- how long for map: 0.00000033s how long for list: 0.02836421s ---- how long for map: 0.00000021s how long for list: 0.02175638s ---- how long for map: 0.00000008s how long for list: 0.00043421s ---- how long for map: 0.00000017s how long for list: 15.65461208s
Karl Dubost
Comment 6
2024-04-02 02:08:59 PDT
map is an iterator, list will convert it to a real list, but it does it one at a time. Maybe the next step to try would be to replace the map with a list comprehension Nope… it's no better. In list(map(BuiltinFunction.fromString, functionStrings)) the functionStrings iterator for InjectedScriptSource.js has only 3 items, two short and the last one very long. [418, 254, 57055] The two firsts and the beginning of the third one which is a long list of functions. ###################### @linkTimeConstant function createObjectWithoutPrototype(/**/) { if (arguments.length % 2 !== 0) @throwRangeError("`createObjectWithoutPrototype` requires that each key argument be followed by a value, resulting in an even number of arguments."); let object = @Object.@create(null) for (let i = 0; i < arguments.length; i += 2) object[arguments[i]] = arguments[i + 1]; return object; } ###################### @linkTimeConstant function createArrayWithoutPrototype(/**/) { let array = new @Array(arguments.length); array.__proto__ = null; for (let i = 0; i < arguments.length; ++i) @arrayPush(array, arguments[i]); return array; } ###################### @linkTimeConstant @visibility=PrivateRecursive function createInspectorInjectedScript(InjectedScriptHost, inspectedGlobalObject, injectedScriptId) { function PrototypelessObjectBase() {} PrototypelessObjectBase.prototype = null; function toString(obj) { return @String(obj); } function toStringDescription(obj) { if (obj === 0 && 1 / obj < 0) return "-0"; if (isBigInt(obj)) return toString(obj) + "n"; return toString(obj); } etc…
Karl Dubost
Comment 7
2024-04-02 02:13:11 PDT
I wonder if it's because of the nesting of functions. ``` function createInspectorInjectedScript(InjectedScriptHost, inspectedGlobalObject, injectedScriptId) { function PrototypelessObjectBase() {} … a lot of code return injectedScript; } ```
Karl Dubost
Comment 8
2024-04-02 19:32:57 PDT
next step would be to checkout what `BuiltinFunction.fromString` is doing for this function createInspectorInjectedScript(InjectedScriptHost, inspectedGlobalObject, injectedScriptId) {…} Because this is basically the issue.
Karl Dubost
Comment 9
2024-04-02 19:49:24 PDT
This patch for
Bug 152294
might have introduced the issue:
https://github.com/WebKit/WebKit/commit/1d75ec6c3ee1575325c7af691084fb59a6e7587c#diff-226d8e95d34f44f09530c7ad6abf681a6cc386fe0b0f3884b53e0ae85afae049R56-R57
Karl Dubost
Comment 10
2024-04-02 21:41:55 PDT
If I remove the wrapper, the gain in speed is huge. The wrapper is: function createInspectorInjectedScript(InjectedScriptHost, inspectedGlobalObject, injectedScriptId) { } so with Source/JavaScriptCore/Scripts/wkbuiltins/builtins_model.py functionStrings = [text[start:end].strip() for (start, end) in functionBounds] import time print(f"----") tic = time.perf_counter() thingtoreturn = list(map(BuiltinFunction.fromString, functionStrings)) toc = time.perf_counter() print(f"how long for BuiltinFunction.fromString: {toc - tic:0.8f}s") return thingtoreturn % python3 Source/JavaScriptCore/Scripts/generate-js-builtins.py --framework JavaScriptCore --output-directory /tmp/ --combined Source/JavaScriptCore/inspector/InjectedScriptSource.js how long for BuiltinFunction.fromString: with wrapper: 15.52224183s without wrapper: 0.08558171s
Karl Dubost
Comment 11
2024-04-02 22:03:30 PDT
timing the different regex inside BuiltinFunction.fromString % python3 Source/JavaScriptCore/Scripts/generate-js-builtins.py --framework JavaScriptCore --output-directory /tmp/ --combined Source/JavaScriptCore/inspector/InjectedScriptSource.js -------- Inside fromString multilineCommentRegExp: 0.00000183s functionIntrinsicRegExp: 0.00029496s functionOverriddenNameRegExp: 0.00030350s functionNameRegExp: 0.00000129s function_name: createObjectWithoutPrototype functionVisibilityRegExp: 0.00030337s -------- Inside fromString multilineCommentRegExp: 0.00000067s functionIntrinsicRegExp: 0.00011783s functionOverriddenNameRegExp: 0.00011892s functionNameRegExp: 0.00000079s function_name: createArrayWithoutPrototype functionVisibilityRegExp: 0.00011775s -------- Inside fromString multilineCommentRegExp: 0.00001742s functionIntrinsicRegExp: 7.45201550s functionOverriddenNameRegExp: 5.22347075s functionNameRegExp: 0.00006025s function_name: createInspectorInjectedScript functionVisibilityRegExp: 0.00018467s how long for BuiltinFunction.fromString: 17.81850321s These are the two RegExp which are not performant. functionIntrinsicRegExp: 7.45201550s functionOverriddenNameRegExp: 5.22347075s functionIntrinsicRegExp = re.compile(r".*^@intrinsic=(\w+)", re.MULTILINE | re.DOTALL) functionOverriddenNameRegExp = re.compile(r".*^@overriddenName=(\".+\")$", re.MULTILINE | re.DOTALL)
Karl Dubost
Comment 12
2024-04-02 22:08:03 PDT
Removing the wrapper function in InjectedScriptSource.js and suddenly the performances are good. Inside fromString multilineCommentRegExp: 0.00000067s * functionIntrinsicRegExp: 0.00251513s * functionOverriddenNameRegExp: 0.00258671s functionNameRegExp: 0.00000108s function_name: CommandLineAPI functionVisibilityRegExp: 0.00245400s
Karl Dubost
Comment 13
2024-04-02 22:20:02 PDT
As a matter of fact There is no @intrinsic @overriddenName in InjectedScriptSource.js I have a fix, which reduces the full time to 6s. So there's another regex not performant. Let see.
Karl Dubost
Comment 14
2024-04-02 22:43:35 PDT
Ok the 3rd one is the nested functionVisibilityRegExp # This is ok. No issue visibilityMatch = functionVisibilityRegExp.search(function_source) if visibilityMatch: visibility = visibilityMatch.group(1) # This one is costly function_source = functionVisibilityRegExp.sub("", function_source) elif is_link_time_constant: visibility = "Private" Profiling the code: Full Time of execution: 33.5% functionIntrinsicRegExp = re.compile(r".*^@intrinsic=(\w+)", re.MULTILINE | re.DOTALL) 32.8% functionOverriddenNameRegExp = re.compile(r".*^@overriddenName=(\".+\")$", re.MULTILINE | re.DOTALL) 33.7% functionVisibilityRegExp = re.compile(r".*^@visibility=(\w+)", re.MULTILINE | re.DOTALL) Let's fix this :)
Karl Dubost
Comment 15
2024-04-02 23:44:47 PDT
Pull request:
https://github.com/WebKit/WebKit/pull/26777
Karl Dubost
Comment 16
2024-04-03 01:10:26 PDT
I have a better patch. main: 21.63s user 0.04s system 99% cpu 21.710 total my second patch: 0.10s user 0.03s system 84% cpu 0.157 total :) There's one little fix to check to be extra-sure.
EWS
Comment 17
2024-04-04 16:32:56 PDT
Committed
277085@main
(2a64b0eff0e3): <
https://commits.webkit.org/277085@main
> Reviewed commits have been landed. Closing PR #26777 and removing active labels.
Philippe Normand
Comment 18
2024-04-06 09:23:04 PDT
Thanks for fixing this, Karl!
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug