17480 – Implement speculative preloading

RESOLVED FIXED17480

Implement speculative preloading

https://bugs.webkit.org/show_bug.cgi?id=17480

Summary Implement speculative preloading

Antti Koivisto

Reported 2008-02-21 15:28:26 PST

Currently when the parser encounters an external script resource it halts parsing and continues only after the resource has been received from the network. A problem with this is that we are not issuing any more load requests to the network layer. Loading performance depends heavily on parallel loading to hide network latency. With parser halted we quickly run out of things to load leading to poor network bandwidth utilizations. This effect can be can reduced by introducing a side parser that parses the document source after the resource that halted the main parse. It searches for resources that are likely to be needed to complete the load and issues requests for them. This also makes it possible to schedule script and stylesheet requests (which block rendering) before any image requests.

Attachments
preliminary patch (64.87 KB, patch) 2008-02-21 18:18 PST, Antti Koivisto	no flags	Details Formatted Diff Diff
updated patch (66.96 KB, patch) 2008-02-26 23:47 PST, Antti Koivisto	no flags	Details Formatted Diff Diff
another updated patch (69.28 KB, patch) 2008-03-07 00:13 PST, Antti Koivisto	no flags	Details Formatted Diff Diff
yet another updated patch (70.05 KB, patch) 2008-03-11 17:19 PDT, Antti Koivisto	no flags	Details Formatted Diff Diff
Darin's comments etc. (79.87 KB, patch) 2008-03-12 00:25 PDT, Antti Koivisto	darin: review+	Details Formatted Diff Diff
Show Obsolete (4) View All Add attachment proposed patch, testcase, etc.

Antti Koivisto

Comment 1 2008-02-21 18:18:33 PST

Created attachment 19269 [details] preliminary patch The patch implements speculative preloading. It also implements reordering of resource loads to schedule scripts and stylesheets first.

Alexey Proskuryakov

Comment 2 2008-02-21 21:42:44 PST

See also: bug 7722, bug 11838, bug 17040.

Antti Koivisto

Comment 3 2008-02-26 23:47:10 PST

Created attachment 19398 [details] updated patch Added code to track how well preloads are utilized and debug output that shows some hit rate statistics after page load. Some minor bug fixes too and made it apply to current ToT.

Antti Koivisto

Comment 4 2008-03-07 00:13:29 PST

Created attachment 19580 [details] another updated patch - maintain preloads in DocLoader, not HTMLPreloadTokenizer - preparse document.write() output, a major performance win in some cases - fix reversed load ordering

Antti Koivisto

Comment 5 2008-03-11 17:19:02 PDT

Created attachment 19691 [details] yet another updated patch Some cleanups and more optimal loading for file: protocol. Performance looks ok in PLT too. Flipping the review bit to troll for comments.

Darin Adler

Comment 6 2008-03-11 17:32:23 PDT

Comment on attachment 19691 [details] yet another updated patch The basic structure of the code looks really good! +// FIXME main tokenizer includes this too, are we getting two copies of the code and data? We are! We should fix that. I am sad to see so much of the HTML tokenizing/parsing code copied and pasted, but I don't have a better suggestion. + class HTMLPreloadTokenizer : Noncopyable + { Please move brace up. + HTMLPreloadTokenizer(Document* doc); Please omit the name "doc" here. + void emitCharacter(UChar c); + + void tokenizeCSS(UChar c); And the "c" here. + , m_preloadTokenizer(0) Not needed. OwnPtr gets initialized to 0 automatically. bool referenced() const { return !m_clients.isEmpty(); } + enum PreloadResult { I think a blank line here would make this read better. +#if PRELOAD_DEBUG +#include "CString.h" +#endif +#if REQUEST_DEBUG +#include "CString.h" +#endif I think it's OK to include this header unconditionally in these places. + void registerPreload(CachedResource* resource); No need for the word resource here. +// Match the parallel connection count used by the networking layer +// FIXME should not hardcode something like this +const unsigned maxRequestsInFlightPerHost = 4; +// Having a limit might still help getting more important resources first +const unsigned maxRequestsInFlightForNonHTTPProtocols= 20; These should be "static const" rather than just "const". + case CachedResource::ImageResource: + return Low; + default: + ASSERT_NOT_REACHED(); + return Low; Moving the default case out of hte switch will cause us to get a warning if there are any unhandled cases. + KURL url(resource->url()); If you use a const KURL& you can save a little refcount thrashing. + bool isHTTP = url.protocol() == "http" || url.protocol() == "https"; This should use the new protocolIs function. For one thing, it makes us not be case sensitive. And it's also more efficient. + if (priority > Low || !isHTTP) + // Try to request important resources immediately + host->servePendingRequests(priority); + else + // Handle asynchronously so early low priority requests don't get scheduled before later high priority ones + scheduleServePendingRequests(); Please put braces around these multiline bodies, even though they are one statement (the comma makes it two lines). + enum Priority { + Low, Medium, High + }; I'd like this better all on a single line. + Priority determinePriority(const CachedResource* resource) const; No need for the word resource here. + void cancelPendingRequests(RequestQueue& requestsPending, DocLoader* docLoader); No need for the word "docLoader" here.

Antti Koivisto

Comment 7 2008-03-12 00:25:49 PDT

Created attachment 19694 [details] Darin's comments etc. - Implemented Darin's comments - Renamed the preloader class to PreloadScanner as suggested by Mitz - ChangeLogs - Some other minor tweaks

Darin Adler

Comment 8 2008-03-13 10:18:48 PDT

Comment on attachment 19694 [details] Darin's comments etc. +// FIXME: temporary, add PreloadScanner to all build files +#if PLATFORM(MAC) +#define PRELOAD_SCANNER_ENABLED 1 +#else +#define PRELOAD_SCANNER_ENABLED 0 +#endif I think this should go in the header, not the .cpp file. + // printf("DOCUMENT.WRITE PRELOAD\n"); Please don't check in commented-out printf. +#include "config.h" + +#include "PreloadScanner.h" Normally we don't leave a blank line there. + // FIXME there is a table for more exceptions in the spec We'd normally use a colon and use a sentence (capital first letter, period) in a comment like this one. It would be good to say which specification you're talking about. In general it seems that a comment saying that this was derived from the HTML 5 specification and which draft you used would be helpful. +#define PreloadScanner_h + +#include "AtomicString.h" +#include "PlatformString.h" +#include "SegmentedString.h" No need to include PlatformString.h here. +#include <wtf/ListHashSet.h> +#include <wtf/Noncopyable.h> +#include <wtf/OwnPtr.h> No reason to include ListHashSet.h or OwnPtr.h in this header. + unsigned m_preloadCount; + PreloadResult m_preloadResult; + bool m_requestedFromNetworkingLayer; Why are these protected rather than private? Please make things private unless there is a reason not to. +static const unsigned maxRequestsInFlightForNonHTTPProtocols= 20; +static const unsigned maxRequestsInFlightForNonHTTPProtocols= 10000; Missing space before "=". +#if REQUEST_DEBUG + KURL u(resource->url()); + printf("HOST %s COUNT %d RECEIVED %s\n", u.host().latin1().data(), m_requestsLoading.size(), resource->url().latin1().data()); +#endif I think we need to add easier ways to log things. All this x.latin1().data() and x.utf8().data() gets tiring after a while! These are all minor editorial comments, so r=me

Mike Belshe

Comment 9 2008-03-13 10:42:29 PDT

Note: I believe in CancelRequests, the ';' at the end of the if statement is a bug: + if (host->hasRequests()); + host->cancelRequests(docLoader);

Mike Belshe

Comment 10 2008-03-13 11:19:32 PDT

In PreloadScanner::tokenize if (!match || (!isWhitespace(tmpChar) && tmpChar != '>' & tmpChar != '/')) { I believe that should be a "&&"

Antti Koivisto

Comment 11 2008-03-13 14:33:48 PDT

r31038 plus fixes to issues spotted by Mike in r31041

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Enhancement

Classification Unclassified

Version 528+ (Nightly build)

Hardware All

OS OS X 10.5

Product WebKit

Component Page Loading

Assignee

Antti Koivisto

Reported

2008-02-21 15:28 PST

Modified

2008-03-13 14:33 PDT History

CC List

5 users Show

URL

Keywords

Depends on

Blocks