Bug 107764

Summary: HTMLDocumentParser::insert should be aware of threaded parsing
Product: WebKit Reporter: Adam Barth <abarth>
Component: New BugsAssignee: Adam Barth <abarth>
Status: RESOLVED FIXED    
Severity: Normal CC: eric, ojan.autocc, tonyg, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 107753    
Attachments:
Description Flags
work in progress
none
Patch
none
Patch for landing none

Description Adam Barth 2013-01-23 17:16:57 PST
HTMLDocumentParser should be aware of threaded parsing
Comment 1 Adam Barth 2013-01-29 00:10:23 PST
I'm not sure what I had in mind for this bug.
Comment 2 Adam Barth 2013-01-29 00:14:36 PST
I remember now.
Comment 3 Adam Barth 2013-01-29 00:42:58 PST
Created attachment 185188 [details]
work in progress
Comment 4 Eric Seidel (no email) 2013-01-29 14:38:22 PST
Comment on attachment 185188 [details]
work in progress

View in context: https://bugs.webkit.org/attachment.cgi?id=185188&action=review

> Source/WebCore/html/parser/HTMLDocumentParser.cpp:81
> +    , m_token(m_options.useThreading ? 0 : adoptPtr(new HTMLToken))
> +    , m_tokenizer(m_options.useThreading ? 0 : HTMLTokenizer::create(m_options))

I'm not entirely sure what you're going for here, but I like the idea of controlling the lifetime of these objects more carefully when using the threaded parser.  That can allow us to better ASSERT that we're doing things in the order we expect (and on the right thread).
Comment 5 Adam Barth 2013-01-29 14:40:57 PST
(In reply to comment #4)
> (From update of attachment 185188 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=185188&action=review
> 
> > Source/WebCore/html/parser/HTMLDocumentParser.cpp:81
> > +    , m_token(m_options.useThreading ? 0 : adoptPtr(new HTMLToken))
> > +    , m_tokenizer(m_options.useThreading ? 0 : HTMLTokenizer::create(m_options))
> 
> I'm not entirely sure what you're going for here, but I like the idea of controlling the lifetime of these objects more carefully when using the threaded parser.  That can allow us to better ASSERT that we're doing things in the order we expect (and on the right thread).

Yeah, this part of the change isn't strictly needed, but it seems cleaner to have a tokenizer on the main thread only if we're actually using it.
Comment 6 Adam Barth 2013-01-30 14:18:18 PST
Created attachment 185570 [details]
Patch
Comment 7 Eric Seidel (no email) 2013-01-30 14:31:36 PST
Comment on attachment 185570 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=185570&action=review

I assume this causes a change in behavior?  Could you explain better in your ChangeLog how this affects tests (because it doesn't seem they can pass yet w/o implementing the speculation abort.)

> Source/WebCore/html/parser/HTMLDocumentParser.cpp:706
> +            if (isWaitingForScripts() || isStopped())

Could you explain this change?  It looks right.  I assume you were pumping when you didnt' expec to?

> Source/WebCore/html/parser/HTMLTreeBuilder.cpp:856
> +        if (m_parser->tokenizer())
> +            m_parser->tokenizer()->setState(HTMLTokenizerState::RCDATAState);

Should this just be setTokenizerState() on the parser, and the parser should handle the check?  That would make it possibly easier for you to compare the state after insert() with the expected state.
Comment 8 Adam Barth 2013-01-30 14:39:32 PST
> I assume this causes a change in behavior?  Could you explain better in your ChangeLog how this affects tests (because it doesn't seem they can pass yet w/o implementing the speculation abort.)

Oh, sorry.  This patch doesn't affect behavior.  I'll mention that in the ChangeLog.  It's just setting us up for the next patch which actually implements the speculation abort.  (The uberpatch is in bug 107753 if you want to see the whole picture.)

> > Source/WebCore/html/parser/HTMLDocumentParser.cpp:706
> > +            if (isWaitingForScripts() || isStopped())
> 
> Could you explain this change?  It looks right.  I assume you were pumping when you didnt' expec to?

Yeah, if processing tokens from the background thread causes the parser to stop, then we need to not process further tokens.  (We'll hit a crash from zero document.)

That's covered by fast/parser/iframe-onload-document-close-with-external-script* (or maybe it was a different one---I made this change based on one of the fast/parser tests.)

> > Source/WebCore/html/parser/HTMLTreeBuilder.cpp:856
> > +        if (m_parser->tokenizer())
> > +            m_parser->tokenizer()->setState(HTMLTokenizerState::RCDATAState);
> 
> Should this just be setTokenizerState() on the parser, and the parser should handle the check?  That would make it possibly easier for you to compare the state after insert() with the expected state.

You're a brilliant man.  Yes, that solves two problems.  If you don't mind, I'd like to make that change when we get smarter about not blowing away the speculative parsing buffer on every document.write (which is when we'll want to hook that call).
Comment 9 Adam Barth 2013-01-30 14:43:00 PST
Created attachment 185581 [details]
Patch for landing
Comment 10 Eric Seidel (no email) 2013-01-30 15:09:42 PST
Thank you for the explanations.
Comment 11 WebKit Review Bot 2013-01-30 15:17:35 PST
Comment on attachment 185581 [details]
Patch for landing

Clearing flags on attachment: 185581

Committed r141328: <http://trac.webkit.org/changeset/141328>
Comment 12 WebKit Review Bot 2013-01-30 15:17:40 PST
All reviewed patches have been landed.  Closing bug.