RESOLVED FIXED 259267
SerializedScriptValue could be faster by atomizing strings early in some cases
https://bugs.webkit.org/show_bug.cgi?id=259267
Summary SerializedScriptValue could be faster by atomizing strings early in some cases
Jarred Sumner
Reported 2023-07-17 01:23:14 PDT
There are a number of fast paths for JSON.stringify, JSON.parse, Object.entries, and more that could likely be re-used in WebCore::SerializedScriptValue, with some modifications. Some examples Instead of `getOwnPropertyNames`, in a number of cases `structure->forEachProperty(lambda)` could be used to iterate through an object quicker. The hard part of doing that here is dealing with stack overflow as carefully as the current code, but maybe a similar idea used in JSON.stringify could be reused here (check soft recursion limit) ``` indexStack.append(0); propertyStack.append(PropertyNameArray(vm, PropertyNameMode::Strings, PrivateSymbolMode::Exclude)); inObject->methodTable()->getOwnPropertyNames(inObject, m_lexicalGlobalObject, propertyStack.last(), DontEnumPropertiesMode::Exclude); ``` https://github.com/WebKit/WebKit/blob/7707bd0020f85ef8ab11f353f64b963d5f59f92e/Source/WebCore/bindings/js/SerializedScriptValue.cpp#L2391-L2394 The code for reading from arrays could use the indexing mode to find out if it only contains int32 or double and then have a faster path for reading/writing arrays of int32 or double Identifiers do a look up in the AtomString table, which means for property names, the string caching is duplicated with the calls to Identifier::fromString(vm, string), and also hashed an extra time. Storing an Identifier in CachedString may improve deserialization performance. ``` case ObjectStartVisitMember: { CachedStringRef cachedString; bool wasTerminator = false; if (!readStringData(cachedString, wasTerminator)) { if (!wasTerminator) goto error; JSObject* outObject = outputObjectStack.last(); outValue = outObject; outputObjectStack.removeLast(); break; } if (JSValue terminal = readTerminal()) { putProperty(outputObjectStack.last(), Identifier::fromString(vm, cachedString->string()), terminal); goto objectStartVisitMember; } stateStack.append(ObjectEndVisitMember); propertyNameStack.append(Identifier::fromString(vm, cachedString->string())); goto stateUnknown; } ``` Making the change to directly read them as Identifier yielded a 20% performance improvement to a deserialization microbenchmark in Bun: https://github.com/oven-sh/bun/pull/3655/files#diff-644fa389dd06a2df5a14ad4c234d546fd452170a9d57d28f4919d929fed14710R2973-R3003 After: cpu: Apple M1 Max runtime: bun 0.6.15 (arm64-darwin) benchmark time (avg) (min … max) p75 p99 p995 --------------------------------------------------- ----------------------------- deserialize 8.54 µs/iter (8.31 µs … 8.84 µs) 8.66 µs 8.84 µs 8.84 µs Before: cpu: Apple M1 Max runtime: bun 0.6.15 (arm64-darwin) benchmark time (avg) (min … max) p75 p99 p995 --------------------------------------------------- ----------------------------- deserialize 10.01 µs/iter (9.78 µs … 10.32 µs) 10.16 µs 10.32 µs 10.32 µs ``` There are more things that can be done here. constructEmptyObject() could pass the count of the number of properties (if that's available) when the size is less than the inline size limit. Structures potentially should be cached in a temporary buffer if it turns out that two objects have the same set of keys (ideally those would be in the constant pool too). If they were serialized that way, putDirectOffset could be used (for the named properties) and it would be very fast to clone final objects
Attachments
Radar WebKit Bug Importer
Comment 1 2023-07-24 01:24:17 PDT
Chris Dumez
Comment 2 2023-08-01 13:10:11 PDT
EWS
Comment 3 2023-08-01 20:36:07 PDT
Committed 266503@main (d2a72f55c356): <https://commits.webkit.org/266503@main> Reviewed commits have been landed. Closing PR #16285 and removing active labels.
Note You need to log in before you can comment on or make changes to this bug.