WebKit Bugzilla
RESOLVED FIXED
119921
Parsing HTML entities shouldn't call malloc
https://bugs.webkit.org/show_bug.cgi?id=119921
Ryosuke Niwa
Reported
2013-08-16 19:59:43 PDT
https://chromium.googlesource.com/chromium/blink/+/f7f0532523c8ba48374d500a81f4a2127253a6e9
I've seen the HTML entity parser show up on a number of the backtraces for malloc in profiles of top-1000000 sites. There's no reason to call malloc when parsing HTML entities. This CL removes all the calls to malloc in the common code paths.

This CL also untwists the code now that we don't need to support NEW_XML. Rather than having a templated function do the work, we now do the work in a normal function. Also, we no longer need to support XML's user-defined entities, so we can go back to assuming that decoded entities are at most four UTF-16 code units long, which removes the need for a variable-length output buffer. For good measure, I also replaced the buffer we use to recover from parse errors with a Vector that has inline capacity so that we don't need to call malloc for it in the common case.

This CL reduces the number of calls to malloc on http://thithtoolwin.com by 17%. This site isn't particularly pathological. It was just the straw that broke the camel's back and caused me to write this CL.
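The fixed-size output buffer described above can be illustrated with a minimal sketch. This is not the Blink or WebKit code: `DecodedEntity`, `appendCodePoint`, and `decodeDecimalReference` are hypothetical names invented for the example, and only decimal numeric references are handled. The point it shows is that when a decoded entity is known to fit in four UTF-16 code units, the result can live in a fixed array and be returned by value, so the common path never allocates.

```cpp
#include <array>
#include <cstddef>

// Hypothetical type, not WebKit's or Blink's actual class: a decoded entity
// held in a fixed-size array, so producing one never touches the heap.
struct DecodedEntity {
    // Upper bound taken from the bug description: four UTF-16 code units.
    static constexpr std::size_t maxLength = 4;
    std::array<char16_t, maxLength> units {};
    std::size_t length = 0;

    // Appends one code unit; silently ignores overflow in this sketch.
    void append(char16_t unit)
    {
        if (length < maxLength)
            units[length++] = unit;
    }
};

// Encodes one code point as UTF-16: a single unit, or a surrogate pair
// for code points above U+FFFF.
inline void appendCodePoint(DecodedEntity& out, char32_t codePoint)
{
    if (codePoint < 0x10000) {
        out.append(static_cast<char16_t>(codePoint));
        return;
    }
    codePoint -= 0x10000;
    out.append(static_cast<char16_t>(0xD800 + (codePoint >> 10)));
    out.append(static_cast<char16_t>(0xDC00 + (codePoint & 0x3FF)));
}

// Decodes the digits of a decimal numeric reference, e.g. "8364" from
// "&#8364;". Returns an empty result on malformed or out-of-range input.
// Simplification: a real HTML parser also handles named and hexadecimal
// references and the replacement rules in the HTML specification.
inline DecodedEntity decodeDecimalReference(const char* digits)
{
    DecodedEntity result;
    if (!*digits)
        return result;

    char32_t value = 0;
    for (const char* p = digits; *p; ++p) {
        if (*p < '0' || *p > '9')
            return result;
        value = value * 10 + static_cast<char32_t>(*p - '0');
        if (value > 0x10FFFF)
            return result;
    }
    appendCodePoint(result, value);
    return result;
}
```

In this sketch, calling `decodeDecimalReference("8364")` yields a single code unit (U+20AC), while a code point above U+FFFF yields a surrogate pair. Both results fit in the same stack-allocated struct, which is the property the description relies on to drop the variable-length buffer.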
Ahmad Saleem
Comment 1
2022-08-13 07:25:21 PDT
Seems like this is not merged:
https://github.com/WebKit/WebKit/blob/8afe31a018b11741abdf9b4d5bb973d7c1d9ff05/Source/WebCore/html/parser/HTMLEntityParser.cpp#L46
https://github.com/WebKit/WebKit/blob/8afe31a018b11741abdf9b4d5bb973d7c1d9ff05/Source/WebCore/html/parser/HTMLEntityParser.h#L33
https://github.com/WebKit/WebKit/blob/8afe31a018b11741abdf9b4d5bb973d7c1d9ff05/Source/WebCore/html/parser/HTMLTokenizer.cpp#L132
rniwa@webkit.org - Is something needed here? Thanks!
Darin Adler
Comment 2
2022-08-15 09:17:13 PDT
Even all these years later, I still think this is a good optimization in Chromium; we should do something similar in WebKit.
Radar WebKit Bug Importer
Comment 3
2023-05-29 10:36:25 PDT
<rdar://problem/109976279>
Darin Adler
Comment 4
2023-05-29 14:26:44 PDT
Pull request: https://github.com/WebKit/WebKit/pull/14462
EWS
Comment 5
2023-05-30 08:34:30 PDT
Committed 264675@main (6298382b1e4a): <https://commits.webkit.org/264675@main>

Reviewed commits have been landed. Closing PR #14462 and removing active labels.