Bug 108027 - Enable the preload scanner on the background parser thread
Summary: Enable the preload scanner on the background parser thread
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: New Bugs (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Eric Seidel (no email)
URL:
Keywords:
Depends on: 107807 109861
Blocks: 106127
  Show dependency treegraph
 
Reported: 2013-01-27 00:24 PST by Eric Seidel (no email)
Modified: 2013-02-15 14:16 PST (History)
4 users (show)

See Also:


Attachments
Patch (12.08 KB, patch)
2013-01-27 01:19 PST, Eric Seidel (no email)
no flags Details | Formatted Diff | Diff
WIP: Compiles but crashes every test (16.73 KB, patch)
2013-02-14 15:32 PST, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (22.22 KB, patch)
2013-02-14 16:59 PST, Adam Barth
no flags Details | Formatted Diff | Diff
Patch (21.33 KB, patch)
2013-02-15 12:01 PST, Adam Barth
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Eric Seidel (no email) 2013-01-27 00:24:06 PST
Enable the preload scanner on the background parser thread
Comment 1 Eric Seidel (no email) 2013-01-27 01:19:48 PST
Created attachment 184903 [details]
Patch
Comment 2 Adam Barth 2013-01-27 17:15:27 PST
Comment on attachment 184903 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=184903&action=review

IMHO, we should use CompactHTMLTokens rather than HTMLTokens.  This patch means we're creating 2x the number of strings on the background thread (because both the preload scanner and the compact HTML token make strings for every tag and attribute).  Our profiles tell us that we're spending a significant amount of time creating strings, which means we don't want to double that work.

> Source/WebCore/html/parser/HTMLDocumentParser.cpp:476
> +    KURL documentURL = document()->url().copy();

Is this a thread-safe copy?

> Source/WebCore/html/parser/HTMLPreloadScanner.cpp:95
> +            String attributeName(iter->m_name.data(), iter->m_name.size());

Here we're making another copy of every attribute name and value.  If we used CompactHTMLToken, then we could use the String we've already created.
Comment 3 Eric Seidel (no email) 2013-01-27 17:24:38 PST
(In reply to comment #2)
> (From update of attachment 184903 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=184903&action=review
> 
> IMHO, we should use CompactHTMLTokens rather than HTMLTokens.  This patch means we're creating 2x the number of strings on the background thread (because both the preload scanner and the compact HTML token make strings for every tag and attribute).  Our profiles tell us that we're spending a significant amount of time creating strings, which means we don't want to double that work.

That's fine.  I took this path in order to keep the main-thread working as much as it was as possible.

Either we'll need to templatize the code, fork HTMLPreloadScanner, or use CompactHTMLTokens on teh main thread.  Likely we'll end up with the later, but it's slightly tricky as you'll want to do the compacting as late as possible so as to avoid making them if possible.  On the parser thread we obviously never avoid making them.

> > Source/WebCore/html/parser/HTMLDocumentParser.cpp:476
> > +    KURL documentURL = document()->url().copy();
> 
> Is this a thread-safe copy?


    // Makes a deep copy. Helpful only if you need to use a KURL on another
    // thread.  Since the underlying StringImpl objects are immutable, there's
    // no other reason to ever prefer copy() over plain old assignment.
    KURL copy() const;


> > Source/WebCore/html/parser/HTMLPreloadScanner.cpp:95
> > +            String attributeName(iter->m_name.data(), iter->m_name.size());
> 
> Here we're making another copy of every attribute name and value.  If we used CompactHTMLToken, then we could use the String we've already created.

Agreed.  The having to satisfy two masters is the problem.  I chose not to fork the preload scanner, but instead redesign it so that it was threadsafe while still useable from the main thread.
Comment 4 Adam Barth 2013-01-27 18:16:20 PST
It seems like you could have the shared logic work in terms of Strings and then have a controller-like code that calls the shared code starting from either an HTMLToken or a CompactHTMLToken.  You've looked at this code in more detail than I have, so maybe there's some reason why that wouldn't work.
Comment 5 Eric Seidel (no email) 2013-01-27 18:57:45 PST
(In reply to comment #4)
> It seems like you could have the shared logic work in terms of Strings and then have a controller-like code that calls the shared code starting from either an HTMLToken or a CompactHTMLToken.  You've looked at this code in more detail than I have, so maybe there's some reason why that wouldn't work.

Yes, I think there is some potential there.  We can discuss in person tomrorow.  I'm not planning on landing this tonight.
Comment 6 Adam Barth 2013-02-14 15:32:31 PST
Created attachment 188437 [details]
WIP: Compiles but crashes every test
Comment 7 Adam Barth 2013-02-14 16:59:51 PST
Created attachment 188452 [details]
Patch
Comment 8 Eric Seidel (no email) 2013-02-15 07:18:34 PST
Comment on attachment 188452 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=188452&action=review

> Source/WebCore/html/parser/HTMLDocumentParser.cpp:565
> +    config->preloadScanner = adoptPtr(new TokenPreloadScanner(document()->url().copy()));

Why create this from the main thread?
Comment 9 Adam Barth 2013-02-15 08:52:45 PST
(In reply to comment #8)
> (From update of attachment 188452 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=188452&action=review
> 
> > Source/WebCore/html/parser/HTMLDocumentParser.cpp:565
> > +    config->preloadScanner = adoptPtr(new TokenPreloadScanner(document()->url().copy()));
> 
> Why create this from the main thread?

Mostly to match the XSSAuditor and to be ready for future changes (like supporting rewinding the TokenPreloadScanner on document.write).
Comment 10 Adam Barth 2013-02-15 09:51:12 PST
Note: We need to move to a struct in any case because we'd otherwise need to send the documentURL.
Comment 11 Tony Gentilcore 2013-02-15 10:31:38 PST
Comment on attachment 188452 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=188452&action=review

> Source/WebCore/html/parser/BackgroundHTMLParser.h:87
> +    explicit BackgroundHTMLParser(PassRefPtr<WeakReference<BackgroundHTMLParser> >, PassOwnPtr<Configuration>);

Remove explicit, this has 2 args.
Comment 12 Adam Barth 2013-02-15 10:32:34 PST
Thanks for the review!
Comment 13 Tony Gentilcore 2013-02-15 10:32:43 PST
It would be nice to rebase so EWS can chomp on this.
Comment 14 Adam Barth 2013-02-15 11:03:54 PST
Yep!  Will do after bug 109861 lands.
Comment 15 Eric Seidel (no email) 2013-02-15 12:00:08 PST
Its not clear to me that sending everything over in one hunk is the right design. I think we may want to issue preloads sooner.  But this is definitely fine to get us going.  Thank you for finishing this.
Comment 16 Adam Barth 2013-02-15 12:01:49 PST
Created attachment 188612 [details]
Patch
Comment 17 Adam Barth 2013-02-15 12:03:39 PST
> Its not clear to me that sending everything over in one hunk is the right design. I think we may want to issue preloads sooner.  But this is definitely fine to get us going.  Thank you for finishing this.

Yeah, this patch is just a first step.  We'll definitely want to iterate from here.
Comment 18 WebKit Review Bot 2013-02-15 14:16:12 PST
Comment on attachment 188612 [details]
Patch

Clearing flags on attachment: 188612

Committed r143051: <http://trac.webkit.org/changeset/143051>
Comment 19 WebKit Review Bot 2013-02-15 14:16:17 PST
All reviewed patches have been landed.  Closing bug.