Bug 111708

Summary: SegmentedString copy constructor is 1% of total time for background html parser?
Product: WebKit
Reporter: Eric Seidel (no email) <eric>
Component: New Bugs
Assignee: Nobody <webkit-unassigned>
Status: RESOLVED WONTFIX
Severity: Normal
CC: abarth, annevk, tonyg
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 111645    

Eric Seidel (no email)
Reported 2013-03-07 03:53:14 PST
SegmentedString copy constructor is 1% of total time for background html parser? Something is wrong here.

Running Time    Self    Symbol Name
18.6ms  0.9%    0.0     WTF::Deque<WebCore::SegmentedSubstring, 0ul>::Deque(WTF::Deque<WebCore::SegmentedSubstring, 0ul> const&)
18.3ms  0.8%    0.0     WebCore::SegmentedString::operator=(WebCore::SegmentedString const&)
18.3ms  0.8%    0.0     WebCore::HTMLSourceTracker::start(WebCore::SegmentedString&, WebCore::HTMLTokenizer*, WebCore::HTMLToken&)
18.3ms  0.8%    0.0     WebCore::BackgroundHTMLParser::pumpTokenizer()
18.3ms  0.8%    0.0     WTF::BoundFunctionImpl<WTF::FunctionWrapper<void (WebCore::BackgroundHTMLParser::*)(WTF::String const&)>, void (WTF::WeakPtr<WebCore::BackgroundHTMLParser>, WTF::String)>::operator()()
18.3ms  0.8%    0.0     WebCore::HTMLParserThread::runLoop()

That sample was taken with bug 107236 applied (so that I could actually see the parser amongst the rendering noise).
Eric Seidel (no email)
Comment 1 2013-03-07 04:41:12 PST
I feel like I fixed this identical bug back when I was doing all the malloc(0) removal in WebKit.
Eric Seidel (no email)
Comment 2 2013-03-07 04:44:09 PST
The bug I'm thinking of is bug 55005. I suspect this is just a new variant of that. :(
Eric Seidel (no email)
Comment 3 2013-03-07 04:45:29 PST
I see. That was never actually needed because bug 55091 solved things.
Eric Seidel (no email)
Comment 4 2013-03-09 02:06:41 PST
I added some logging, and it looks like in parsing the whole of the HTML5 spec, we only copy the Deque from this callsite 10 times or so? That makes me very surprised to see it be 2% of total time, as I did in my sample just now (other parts have gotten faster since I filed this bug). Not sure what's up. It's possible my methods are flawed. I'll try changing Deque<SegmentedSubstring> m_substrings to use an inline capacity of 2 tomorrow and see if that makes things faster. I worry that since SegmentedSubstring can't be copied with memcpy, that will just shift the slowness elsewhere. It's not immediately clear to me why HTMLSourceTracker needs to do this copying in the first place. :(
Adam Barth
Comment 5 2013-03-09 11:23:24 PST
> It's not immediately clear to me why HTMLSourceTracker needs to do this copying in the first place. :(

It needs to remember where in the input stream the token started so that it can later provide the source string that generated the token.
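
To make the "remember where the token started" idea concrete, here is a minimal sketch that records offsets instead of copying the stream. It assumes the whole input lives in one contiguous std::string, which is not how SegmentedString works, and OffsetSourceTracker and its methods are hypothetical names, not WebKit API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical tracker: remembers only where a token starts and ends.
// Assumption: the input is a single contiguous buffer, so an offset is
// enough to find the token's source text again later.
class OffsetSourceTracker {
public:
    // Called when the tokenizer begins a token at `offset`.
    void start(std::size_t offset) { m_start = offset; }

    // Called when the tokenizer finishes the token at `offset` (exclusive).
    void end(std::size_t offset) { m_end = offset; }

    // Returns the source text that produced the token.
    std::string sourceForToken(const std::string& input) const
    {
        return input.substr(m_start, m_end - m_start);
    }

private:
    std::size_t m_start = 0;
    std::size_t m_end = 0;
};
```

With the input "<p>hi</p>", calling start(0) and end(3) makes sourceForToken return "<p>". A segmented input would need a segment index plus an offset rather than a single number.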
Eric Seidel (no email)
Comment 6 2013-03-09 13:14:39 PST
I tried changing to use a Deque with inline capacity 2, but that just causes VectorBuffer::swap to be hot. Presumably from:

    template<typename T, size_t inlineCapacity>
    inline Deque<T, inlineCapacity>& Deque<T, inlineCapacity>::operator=(const Deque<T, inlineCapacity>& other)
    {
        // FIXME: This is inefficient if we're using an inline buffer and T is
        // expensive to copy since it will copy the buffer twice instead of once.
        Deque<T, inlineCapacity> copy(other);
        swap(copy);
        return *this;
    }

I suspect there is a way to get what HTMLSourceTracker wants w/o the malloc, I just have to study what it's trying to do more.
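
The extra cost that the FIXME describes can be seen in a toy model. This is only a sketch: InlineBuffer and Counted are hypothetical stand-ins (not WTF::Deque or SegmentedSubstring), the buffer is a plain std::array with no heap fallback, and the element type deliberately has no move operations so every transfer shows up as a copy.

```cpp
#include <array>
#include <cstddef>
#include <utility>

static int copyCount = 0;  // incremented by every Counted copy (illustration only)

// Hypothetical element type that is "expensive to copy"; it has no move
// operations, so std::swap falls back to copies.
struct Counted {
    int value = 0;
    Counted() = default;
    Counted(const Counted& other) : value(other.value) { ++copyCount; }
    Counted& operator=(const Counted& other) { value = other.value; ++copyCount; return *this; }
};

// Hypothetical container whose storage is entirely inline.
template<typename T, std::size_t N>
class InlineBuffer {
public:
    InlineBuffer() = default;
    InlineBuffer(const InlineBuffer&) = default;  // copies each element once

    // Copy-and-swap: one copy into the temporary, then an elementwise swap.
    InlineBuffer& operator=(const InlineBuffer& other)
    {
        InlineBuffer copy(other);
        swap(copy);
        return *this;
    }

    // Alternative: assign each element directly from `other`.
    void assignDirect(const InlineBuffer& other)
    {
        for (std::size_t i = 0; i < N; ++i)
            m_items[i] = other.m_items[i];
    }

    // Inline storage cannot be exchanged by swapping a pointer, so every
    // element has to be swapped individually.
    void swap(InlineBuffer& other) { std::swap(m_items, other.m_items); }

private:
    std::array<T, N> m_items{};
};

// Number of element copies performed by copy-and-swap assignment.
int copiesForCopyAndSwap()
{
    InlineBuffer<Counted, 2> source, target;
    copyCount = 0;
    target = source;
    return copyCount;
}

// Number of element copies performed by direct elementwise assignment.
int copiesForDirectAssign()
{
    InlineBuffer<Counted, 2> source, target;
    copyCount = 0;
    target.assignDirect(source);
    return copyCount;
}
```

In this toy setup direct assignment costs one copy per element, while copy-and-swap pays for the copy constructor plus the elementwise swap. With a heap-allocated buffer the swap would just exchange pointers, which is why the pattern only becomes costly once the buffer is inline.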
Anne van Kesteren
Comment 7 2023-12-25 10:04:59 PST
Threaded HTML parser was removed.