We have a memory optimization where the HTML parser will atomize any text node string that is all white space. The process for this is a bit expensive, since we must loop over all the characters in the string three times:

* First, to check that all the characters are white space
* Second, to hash the string when looking up the atom hash table
* Third, to check the string for equality with any existing atom hash table entry

Most white space strings we encounter have a limited form -- they have at most three or four runs of consecutive equal white space characters, e.g. it's common to see a newline followed by a number of space characters.

We can take advantage of this by compressing the white space string into a simple run-length encoded form while we check that the string is entirely white space. If we keep a cache of recently atomized white space strings that can be quickly looked up, keyed off the encoded form, we can re-use a previous result of atomizing an identical string and avoid the hashing and hash entry equality checks.

I have a WIP patch for this that is showing a 1% improvement on Speedometer 2 overall (due to 2-4% improvements on a few of the subtests) and no change to PLT5 on my local machine.
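To make the idea concrete, here is a minimal standalone sketch of the approach described above. It is not the patch itself: `WhitespaceCache`, `encode`, the packing of up to four runs at 16 bits each into a 64-bit code, the 64-entry direct-mapped cache, and the use of `std::unordered_set<std::string>` as a stand-in for the atom table are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical code layout: up to four runs, each packed as
// (character << 8) | runLength in 16 bits. 0 means "not encodable".
using WhitespaceCode = std::uint64_t;

struct Encoded {
    bool allWhitespace;
    WhitespaceCode code;
};

inline bool isHTMLSpace(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

// Single pass: checks that the text is all white space and builds the
// run-length code at the same time.
inline Encoded encode(std::string_view text)
{
    WhitespaceCode code = 0;
    unsigned runs = 0;
    bool fits = true;
    std::size_t i = 0;
    while (i < text.size()) {
        char c = text[i];
        if (!isHTMLSpace(c))
            return { false, 0 };
        std::size_t startOfRun = i;
        while (i < text.size() && text[i] == c)
            ++i;
        // Length of this run only: current position minus start of run,
        // not end of string minus start of run.
        std::size_t runLength = i - startOfRun;
        if (++runs > 4 || runLength > 255)
            fits = false; // keep scanning to finish the white space check
        if (fits)
            code = (code << 16) | (std::uint64_t(std::uint8_t(c)) << 8) | runLength;
    }
    return { true, fits ? code : 0 };
}

class WhitespaceCache {
public:
    // Returns the interned string, or nullptr if text is not all white space.
    const std::string* lookup(std::string_view text)
    {
        Encoded encoded = encode(text);
        if (!encoded.allWhitespace)
            return nullptr;
        if (!encoded.code)
            return intern(text); // too complex to encode: slow path

        std::uint64_t folded = encoded.code ^ (encoded.code >> 16) ^ (encoded.code >> 32);
        Entry& entry = m_entries[folded % m_entries.size()];
        if (entry.code == encoded.code) {
            ++m_hits; // reuse: no hashing, no equality check on the string
            return entry.atom;
        }
        entry.code = encoded.code;
        entry.atom = intern(text);
        return entry.atom;
    }

    std::size_t hits() const { return m_hits; }

private:
    struct Entry {
        WhitespaceCode code = 0;
        const std::string* atom = nullptr;
    };

    // Stand-in for the atom table; element pointers stay valid on rehash.
    const std::string* intern(std::string_view text)
    {
        return &*m_atoms.insert(std::string(text)).first;
    }

    std::array<Entry, 64> m_entries {};
    std::unordered_set<std::string> m_atoms;
    std::size_t m_hits = 0;
};
```

In this sketch, looking up the same newline-plus-indentation string twice goes through the atom table only the first time; the second lookup is resolved from the code alone. Because each packed run is nonzero, two different run sequences never produce the same code, so a code match is sufficient to reuse the cached entry.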
<rdar://problem/81601565>
Created attachment 435057 [details] Patch
Created attachment 435138 [details] Patch
Comment on attachment 435138 [details] Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=435138&action=review

r=me

> Source/WebCore/html/parser/HTMLConstructionSite.cpp:858
> +        code |= (end - startOfRun);

It should be `character - startOfRun`.
Created attachment 435169 [details] Patch
Committed r280772 (240356@main): <https://commits.webkit.org/240356@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 435169 [details].
Committed r280773 (240357@main): <https://commits.webkit.org/240357@main>
Comment on attachment 435169 [details] Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=435169&action=review

> Source/WebCore/html/parser/HTMLConstructionSite.cpp:896
> +        WTFLogAlways("reuse code %llx", code);

Will this create a lot of logging?

> Source/WebCore/html/parser/HTMLConstructionSite.cpp:901
> +        WTFLogAlways("override");

Ditto.

> Source/WebCore/html/parser/HTMLConstructionSite.cpp:906
> +        WTFLogAlways("replace code %llx", code);

Ditto.

> Source/WebCore/html/parser/HTMLConstructionSite.cpp:913
> +        WTFLogAlways("new code %llx", code);

Ditto.
(In reply to Per Arne Vollan from comment #8)
> Comment on attachment 435169 [details]
> Patch
>
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=435169&action=review
>
> > Source/WebCore/html/parser/HTMLConstructionSite.cpp:896
> > +        WTFLogAlways("reuse code %llx", code);
>
> Will this create a lot of logging?

Yes, Yusuke removed them shortly after this patch landed.