Bug 111708 - SegmentedString copy constructor is 1% of total time for background html parser?
Summary: SegmentedString copy constructor is 1% of total time for background html parser?
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 111645
  Show dependency treegraph
 
Reported: 2013-03-07 03:53 PST by Eric Seidel (no email)
Modified: 2013-03-09 13:14 PST (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2013-03-07 03:53:14 PST
SegmentedString copy constructor is 1% of total time for background html parser?

Something is wrong here.

Running Time	Self		Symbol Name
18.6ms    0.9%	0.0	 	 WTF::Deque<WebCore::SegmentedSubstring, 0ul>::Deque(WTF::Deque<WebCore::SegmentedSubstring, 0ul> const&)
18.3ms    0.8%	0.0	 	  WebCore::SegmentedString::operator=(WebCore::SegmentedString const&)
18.3ms    0.8%	0.0	 	   WebCore::HTMLSourceTracker::start(WebCore::SegmentedString&, WebCore::HTMLTokenizer*, WebCore::HTMLToken&)
18.3ms    0.8%	0.0	 	    WebCore::BackgroundHTMLParser::pumpTokenizer()
18.3ms    0.8%	0.0	 	     WTF::BoundFunctionImpl<WTF::FunctionWrapper<void (WebCore::BackgroundHTMLParser::*)(WTF::String const&)>, void (WTF::WeakPtr<WebCore::BackgroundHTMLParser>, WTF::String)>::operator()()
18.3ms    0.8%	0.0	 	      WebCore::HTMLParserThread::runLoop()

That sample was taken with bug 107236 applied (so that I could actually see the parser amongst the rendering noise).
Comment 1 Eric Seidel (no email) 2013-03-07 04:41:12 PST
I feel like I fixed this identical bug back when I was doing all the malloc(0) removal in WebKit.
Comment 2 Eric Seidel (no email) 2013-03-07 04:44:09 PST
The bug I'm thinking of is bug 55005.  I suspect this is just a new variant of that. :(
Comment 3 Eric Seidel (no email) 2013-03-07 04:45:29 PST
I see.  That was never actually needed because bug 55091 solved things.
Comment 4 Eric Seidel (no email) 2013-03-09 02:06:41 PST
I added some logging, it looks like in parsing the whole of teh HTML5 spec, we only copy Deque from this callsite 10 times or so?  That makes me very surprised to see it be 2% of total time, as I did in my sample just now (other parts of gotten faster since I filed this bug).

Not sure what's up.  It's possible my methods are flawed.

I'll try changing Deque<SegmentedSubstring> m_substrings to use an inline capacity of 2 tomorrow and see if that makes things faster.  I worry since SegmentedSubstring can't be copied with memcpy, that will just shift the slowness elsewhere.

It's not immediately clear to me why HTMLSourceTracker needs to do this copying in the first place. :(
Comment 5 Adam Barth 2013-03-09 11:23:24 PST
> It's not immediately clear to me why HTMLSourceTracker needs to do this copying in the first place. :(

It needs to remember where in the input stream the token started so that it can later provide the source string that generated the token.
Comment 6 Eric Seidel (no email) 2013-03-09 13:14:39 PST
I tried changing to use a Deque with inline capacity 2, but that just causes VectorBuffer::swap to be hot.  Presumably from:

    template<typename T, size_t inlineCapacity>
    inline Deque<T, inlineCapacity>& Deque<T, inlineCapacity>::operator=(const Deque<T, inlineCapacity>& other)
    {
        // FIXME: This is inefficient if we're using an inline buffer and T is
        // expensive to copy since it will copy the buffer twice instead of once.
        Deque<T, inlineCapacity> copy(other);
        swap(copy);
        return *this;
    }

I suspect there is a way to get what HTMLSourceTracker wants w/o the malloc, I just have to study what it's trying to do more.