Bug 12279 - Two pass loading is needed when the network has high latency and low bandwidth
Summary: Two pass loading is needed when the network has high latency and low bandwidth
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: 420+
Hardware: Other Other
: P2 Normal
Assignee: David Carson
Depends on:
Reported: 2007-01-15 05:59 PST by David Carson
Modified: 2008-11-11 21:40 PST (History)
4 users (show)

See Also:

Incomplete patch (15.50 KB, patch)
2007-01-15 08:15 PST, David Carson
no flags Details | Formatted Diff | Diff
updated patch (15.09 KB, patch)
2007-01-19 06:45 PST, David Carson
no flags Details | Formatted Diff | Diff
patch for s60 webkit tot doubletree (81.23 KB, patch)
2007-02-21 10:29 PST, Bradley Morrison
no flags Details | Formatted Diff | Diff
patch for ToT (rev 19972) (18.04 KB, patch)
2007-03-05 18:42 PST, Grace Kloba
darin: review-
Details | Formatted Diff | Diff
updated patch according to review (rev 20051) (20.59 KB, patch)
2007-03-07 22:37 PST, Grace Kloba
hyatt: review-
Details | Formatted Diff | Diff
updated patch according to review (rev 20546) (21.19 KB, patch)
2007-03-27 23:08 PDT, Grace Kloba
hyatt: review+
Details | Formatted Diff | Diff
updated patch for fix layout tests (rev 20675) (20.86 KB, patch)
2007-04-02 23:42 PDT, Grace Kloba
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description David Carson 2007-01-15 05:59:28 PST
When loading some sites on devices that have high latency and low bandwidth (HL/LB), the user may not see content for quite some time. This is due the external javascript references and CSS references. When the parser hits the external JS reference, it blocks - which is the correct behaviour - and waits until the JS is available then executes the JS. Then continues parsing the markup. 
When using a HL/LB device, the user can be waiting too long, and they press cancel to stop loading thinking that the page is not loading. 
One way to address this issue is to load the page, but not execute any JS or block on external CSS requests, though, the content is still requested. This way the user sees some content. When all the content is available, the browser needs to reparse the HTML content executing the JS and applying the CSS.
This process does result in a FoUC (Flash of Unstyled Content), though, this is preferable to the user pressing cancel.
Comment 1 David Carson 2007-01-15 08:15:18 PST
Created attachment 12457 [details]
Incomplete patch

The attached patch is not complete, and presently does not compile. The code in FrameLoader is not correct as it was migrated from Frame.cpp. It had to be migrated as the frame loading code has been moved from Frame to FrameLoader.
Comment 2 David Carson 2007-01-19 06:45:17 PST
Created attachment 12562 [details]
updated patch

This new patch compiles fine. In this patch double tree is enabled for the  mac build. It is enabled via WebKit/config.h WTF_USE_DOUBLE_TREE
This patch works fine when there are no resources to load, but crashes otherwise.
Patch is Work In Progress (WIP)
Comment 3 Bradley Morrison 2007-02-21 10:29:33 PST
Created attachment 13300 [details]
patch for s60 webkit tot doubletree

You may find this patch by anttik an interesting reference. It implements the s60 doubletree.

This won't of course apply cleanly to WebKit ToT because:
   - our internal ToT is not quite up to date with WebKit ToT (but it;s close)
   - the patch touches some s60 specific files that won't be in WebKiT ToT (or so I'm guessing)
Comment 4 Grace Kloba 2007-03-05 18:42:17 PST
Created attachment 13483 [details]
patch for ToT (rev 19972)

This patch should work with ToT (rev 19972). It is enabled by WTF_USE_DOUBLE_TREE in WebCore/config.h for MAC. While running LayoutTests, I found some small problems.
1. It caused assertion for fast/dom/css-insert-import-rule.html. But I can reproduce the assertion without this patch. It is reported in bug http://bugs.webkit.org/show_bug.cgi?id=12973
2. When running script run-webkit-tests, I saw the following two fail cases.
     fast/events/updateLayoutForHitTest.html -> failed
     fast/events/window-events-bubble2.html -> failed
   But if I run with -1, I got all cases succeeded.
Comment 5 Darin Adler 2007-03-07 07:35:57 PST
Comment on attachment 13483 [details]
patch for ToT (rev 19972)

This looks great to me!

Very interesting work that will probably require more than one pass of review, and from more than just me. Here are my comments from my first time reading through:

Why are we turning the double tree on for the Macintosh platform? Is this feature helpful for the Macintosh desktop browser? If it is, then what browser would ever want this feature off? What about other WebKit clients on Macintosh? Maybe we need a way to opt in per-WebView for compatibility with existing programs that won't expect this behavior, like Dashboard for example.

I don't think it's clear that inFastDisplay is realted to the feature called DOUBLE_TREE. In particular, it doesn't seem to have anything to do with the name "double tree".Similarly, nothing about a function named needToSwitch or a data member named m_needSwitch connects it with the fast display or double tree names. The function name switchToFinalDocIfReady also is not clearly connected to the double tree or to fast mode. The names are too vague given context. There is more than one kind of switch that occurs in a frame loader, for example the switch from provisional state to committed state.

I'd like to see the various names -- for the entire feature, the mode where you haven't yet loaded enough to do "real" display, the transition from one mode to the other, and the two documents -- that are clearly related. Today the feature overall is called "double tree", the early mode is called "fast display", the transition is called "switch" and "switch to final doc".

+#include "Frame.h"
+#include "FrameLoader.h"

This is a strange change in Cache.cpp. That file already includes FrameLoader.h so we don't need to include it again.

Also, we typically only put #if around includes in cases where we have to, like major platform differences. This ifdef change would not require an ifdef.

+#if USE(DOUBLE_TREE)        
+    CachedResourceClientWalker n(m_clients);
+    while (CachedResourceClient *c = n.next())
+        c->notifyFinished(this);

No comment here to indicate why this is correct behavior. I guess the rule is that CSS always pretends to be done loading immediately, but that's not obvious so it needs some kind of comment.

This also has formatting. The * should be next to the class name, CachedResourceClient, not the variable name, c. The surrounding code was already incorrect; it's older code that we haven't gotten too yet. It's better to do it our new way for new code.

+    void replaceDoc(Document* doc) { m_doc = doc; }

Please name this replaceDocument, rather than replaceDoc. We're trying to abbreviate less. Even the class name, DocLoader, is slated to be changed (actually, we're going to merge the two classes DocLoader and DocumentLoader).

+    if (document->isHTMLDocument() &&

Why is this check, in FrameLoader::begin, correct? Why don't we want to do this for non-HTML documents?

+        if (m_needSwitch)
+            return;
+        else
+            switchToFinalDocIfReady();

This code shows some confusion of terminology. When I read that if I think that is says: "if you need switch don't switch, but if you don't need a switch go ahead and switch if ready".

Also, we don't use else statements after return in new WebKit code.

+    switch(cache->type()) {

We put space after switch keywords before te parentheses.

+        case CachedResource::CSSStyleSheet:
+        case CachedResource::Script:
+            m_needSwitch = true;
+            m_CachedRequests.add(cache);
+            cache->ref(this);
+            return true;
+        default:
+        break;
+    }

The break should be indented after default, but more importantly, we normally list all the cases in switch statements like this one. That way the gcc compiler can detect missing switch statements. This is very helpful when adding a new enumeration value. It helps you spot switch statements and make a decision about the right behavior. So please add the other cases here.

m_CachedRequests should not have a capital "C". But also I don't understand in what sense these requests are "cached". Is the cache in "cached" a reference to the WebCore::Cache object? I think we need a better name for these than "cached requests"; ideally one that makes it clear how these relate to the double tree, fast mode, and decision about when to switch.

+void FrameLoader::removeAllCachedRequests()
+    if (!m_CachedRequests.isEmpty()) {
+        HashSet<CachedResource*>::iterator end = m_CachedRequests.end();
+        for (HashSet<CachedResource*>::iterator it = m_CachedRequests.begin(); it != end; ++it)
+            (*it)->deref(this);
+        m_CachedRequests.clear();        
+    }

The isEmpty() check here is unnecessary. It will be quite efficient to just call end(), begin(), and clear() even if the set is empty.

+    if (m_CachedRequests.contains(script)) {
+        HashSet<CachedResource*>::iterator it = m_CachedRequests.find(script);
+        (*it)->deref(this);
+        m_CachedRequests.remove(it);
+        switchToFinalDocIfReady();        
+    }

This can be written more efficiently to do only a single hash table lookup.

    HashSet<CachedResource*>::iterator it = m_CachedRequests.find(script);
    if (it != m_CachedRequests.end()) {

+#define MAX_SRC_LENGTH  128*1024

Since this is C++, we use const for things like this rather than #define. And thus we don't use all capitals either. And we put such constants at the top of the file rather than in between functions.

I think that FrameLoader::switchToFinalDocIfReady has too much repeated code that's also in FrameLoader::begin. FrameLoader has some bad stuff like that in it already, but I'd like to avoid adding more. Instead we should consider a small bit of refactoring so we can share as much of the code as possible.

+            oldDoc->removeAllEventListenersFromAllNodes();

Why is this helpful? Who could have added event listeners? Do we actually let JavaScript run in the "fast display" mode? If so, how can that work? Wouldn't at least some documents have a problem if they were loaded twice?

+    class FrameLoader : Noncopyable, CachedResourceClient {

I'd like this to say "private CachedResourceClient". It's correct to use private inheritance for this, but it should be explicit. The one case where we don't bother saying "private" explicitly is for Noncopyable, because in that one special case it doesn't matter if it's private or public inheritance. So it sets a bad example!

+        // implementation for CachedResourceClient        
+        virtual void notifyFinished(CachedResource*);

This function should be private. We privately inherit from CachedResourceClient.

+        bool addCachedRequest(CachedResource*);
+        void removeAllCachedRequests();
+        void setUseFastDisplay(bool fastDisplay) { m_useFastDisplay = fastDisplay; }
+        void needToSwitch() { m_needSwitch = true; }
+        void switchToFinalDocIfReady();        

Can some of these functions be private? We want everything private unless it needs to be public.

+        // whether to use fastDislay, set by client

Typo "fast dislay".

+        String m_pendingSrc;        

We try to avoid abbreviations where possible, and "src" here is pretty terse.

What kinds of testing have you done so far on this? Is there any way to write tests to make sure we exercise all the code paths?

review- because at least some of these comments will need to be addressed before we can land
Comment 6 Grace Kloba 2007-03-07 22:37:22 PST
Created attachment 13537 [details]
updated patch according to review (rev 20051)

I have updated the patch according to the comment from Darin.

The patch originally set the flag for MAC build to make testing easier. I have changed it to depend on whether MOBILE is set as it is only needed when network has low bandwidth.

I also modified all the names. Hope they are more related this time.

I was planning to reuse begin() when switch. But current code is not suitable. Maybe we can re-factor them later to avoid duplicate code.

I did run through all 7416 LayoutTests with the flag on. There is an assertion as I mentioned before. There are three other fail cases. But they are all fine if I run the test with -1. I will continue look into them.
Comment 7 Dave Hyatt 2007-03-24 19:09:42 PDT
Comment on attachment 13537 [details]
updated patch according to review (rev 20051)

So one problem I have with this patch is that there is way too much generic terminology employed.

The term "fast display" really should be replaced with something a bit more specific in both the ifdef and the Document methods.

This isn't so much about "fast display."  Ideally our desktop Safari is displaying pages fast! ;)

I'd prefer USE(LOW_BANDWIDTH_DISPLAY) and inLowBandwidthDisplay/setLowBandwidthDisplay.

The methods addExternalRequest and removeAllExternalRequests are way too generically named.  I'd use addLowBandwidthDisplayRequest and removeAllLowbandwidthDisplayRequests.  I know this is a bit of a mouthful, but I think the current names don't provide any clue that they are only used with the low bandwidth display code.

Please ifdef the include of CachedResourceClient.h in FrameLoader.h.
Comment 8 Dave Hyatt 2007-03-24 19:10:53 PDT
And sorry for the big delay in review time.   We're in the big stabilization push now, so we're focusing more on reviewing the P1 bugs, etc.
Comment 9 Grace Kloba 2007-03-27 23:08:31 PDT
Created attachment 13840 [details]
updated patch according to review (rev 20546)

Thanks for the review. I have changed the names according to the suggestion.
Comment 10 Dave Hyatt 2007-03-27 23:14:08 PDT
Comment on attachment 13840 [details]
updated patch according to review (rev 20546)

Comment 11 Maciej Stachowiak 2007-04-01 06:41:58 PDT
Does this mode still cause assertion failures on the layout tests? If so then I do not think it should be r+.
Comment 12 Grace Kloba 2007-04-02 23:42:12 PDT
Created attachment 13932 [details]
updated patch for fix layout tests (rev 20675)

I have updated the patch so that it passed all layout tests except one, security/block-test. I suspect that the difference is caused by the nature of the two passes. Can someone inside confirm this? Thanks.
Comment 13 Darin Adler 2007-04-05 09:11:35 PDT
Comment on attachment 13932 [details]
updated patch for fix layout tests (rev 20675)

No real review, just noticed a mistake. removeAllLowbandwidthDisplayRequests should have a capital B
Comment 14 Darin Adler 2007-04-22 16:32:11 PDT
Sending        WebCore/ChangeLog
Sending        WebCore/dom/Document.cpp
Sending        WebCore/dom/Document.h
Sending        WebCore/html/HTMLTokenizer.cpp
Sending        WebCore/loader/Cache.cpp
Sending        WebCore/loader/CachedCSSStyleSheet.cpp
Sending        WebCore/loader/DocLoader.h
Sending        WebCore/loader/FrameLoader.cpp
Sending        WebCore/loader/FrameLoader.h
Transmitting file data .........
Committed revision 21009.
Comment 15 David Kilzer (:ddkilzer) 2007-12-15 06:09:06 PST
Currently broken in ToT WebKit.  See Bug 16433.

Comment 16 Arun Patole 2008-11-11 21:40:50 PST
Hello David/Grace/Darin

I know this bug is Resolved FIXED but just putting my comments and queries in case you are still tracking it.

Some comments on:

+ FrameLoader::switchOutLowBandwidthDisplayIfReady()
+ // drop the old doc
+   oldDoc = 0; 

Here we are dropping the old document, creating new and swapping the ownership.

Has anyone thought of not dropping the existing document and just inserting the JS and CSS at approriate place in existing document once all external JS and CSS are available.

Problem with dropping the existing doc and creating new:
I think if we have recursive external java script calls, lets say we have a page referring to external "script1.js". This script will be skipped in low bandwidth display. Once script1.js is available, switching out of low bandwidth display will happen. 

But, if script1.js calls another external JS script2.js, we will be blocked till we receive script2.js and on slow networks we might see blank page after low bandwidth display for some time I think.

Couldn't we just insert output of all scripts to the existing document which will avoid user seeing blank page in case of external calls.

Arun Patole