Bug 41807

Summary: WebCore/benchmarks/parser/html-parser.html spends a lot of time in deprecatedParseURL
Product: WebKit Reporter: Eric Seidel (no email) <eric>
Component: Tools / TestsAssignee: Nobody <webkit-unassigned>
Severity: Normal CC: abarth, commit-queue, darin, mjs, sam
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Bug Depends on:    
Bug Blocks: 41123    
Description Flags
Sample while running the benchmark with the new treebuilder
Patch none

Description Eric Seidel (no email) 2010-07-07 15:38:18 PDT
326 WebCore::HTMLConstructionSite::createElement(WebCore::AtomicHTMLToken&)
                                          185 WebCore::Element::setAttributeMap(WTF::PassRefPtr<WebCore::NamedNodeMap>, WebCore::FragmentScriptingPermission)
                                            162 WebCore::StyledElement::attributeChanged(WebCore::Attribute*, bool)
                                              86 WebCore::HTMLAnchorElement::parseMappedAttribute(WebCore::Attribute*)
                                                64 WebCore::deprecatedParseURL(WebCore::String const&)
                                                  51 WebCore::deprecatedParseURL(WebCore::String const&)
                                                  10 WebCore::StringImpl::create(unsigned short const*, unsigned int)

It's all spent mallocing strings for URLs.

It's not clear to me why we're hitting the slow path in deprecatedParseURL.

I suspect this hits both TreeBuilders equally, and may be affecting general page load times as well.

I don't know if there is a non-deprecated parseURL to replace this with (or if that's even the right solution).
Comment 1 Eric Seidel (no email) 2010-07-07 15:40:22 PDT
Created attachment 60790 [details]
Sample while running the benchmark with the new treebuilder

We're not really fast enough to bother sharking yet.  We still have low hanging fruit.  Hence the sample instead of shark sample.
Comment 2 Eric Seidel (no email) 2010-07-07 15:46:07 PDT
Um.  We always hit the slow case:

    // Optimize for the likely case there there is nothing to strip.
    if (l == length) {
        int k;
        for (k = 0; k < length; k++) {
            if (characters[k] > '\r')
        if (k == length)
            return url;

'\r' is 13.  Every damn ASCII of use is > 13.

That code is just wrong.  I don't know what it's trying to do though.
Comment 3 Eric Seidel (no email) 2010-07-07 15:48:10 PDT
http://trac.webkit.org/changeset/13005 was where that code was added.
Comment 4 Eric Seidel (no email) 2010-07-07 15:49:53 PDT
Looks like that should be a < instead of a >.  Will prepare a patch.

PLT speedup here we come... :)
Comment 5 Eric Seidel (no email) 2010-07-07 16:03:46 PDT
Created attachment 60793 [details]
Comment 6 Adam Barth 2010-07-07 16:07:05 PDT
Comment on attachment 60793 [details]

Comment 7 Eric Seidel (no email) 2010-07-07 16:09:17 PDT
I should note, the original code which darin/maciej wrote was fine.  It broke since then and no one seemed to notice.  I didn't dig up the exact revision where code movement caused it to break though.
Comment 8 Darin Adler 2010-07-07 16:16:49 PDT
Comment on attachment 60793 [details]

I'm guessing that I broke this.
Comment 9 Darin Adler 2010-07-07 16:19:40 PDT
The incorrect optimization was brand new, just added 8 weeks ago <http://trac.webkit.org/changeset/59281/trunk/WebCore/css/CSSHelper.cpp>.
Comment 10 Eric Seidel (no email) 2010-07-07 16:29:36 PDT
I think this is one of the troubles of the PLT... its so fast it's hard to tell when you actually make a beneficial change. :)
Comment 11 Adam Barth 2010-07-07 16:35:54 PDT
Yeah, I wish we had more benchmarks like the parser benchmark.  It's sort of too bad that it's only a small fraction of what the PTL covers.
Comment 12 WebKit Commit Bot 2010-07-08 02:01:25 PDT
Comment on attachment 60793 [details]

Clearing flags on attachment: 60793

Committed r62771: <http://trac.webkit.org/changeset/62771>
Comment 13 WebKit Commit Bot 2010-07-08 02:01:30 PDT
All reviewed patches have been landed.  Closing bug.