I have been seeing some pages not enter the Page Cache because of a "Main Document Error". Investigating this, it turns out that some of these "Main Document Errors" are because there was a subresource load triggered at the same time the navigation was attempted. Examples of this would be websites using different images for default, :hover, and :active states for navigational menu. Navigating using this menu could spawn a subresource load, only to be immediately stopped and prevent the page from entering the cache. I've seen this happen mostly with image loads, and those types of resources do not seem particularly important. The request is going to be cancelled anyways.
Created attachment 74432 [details] [PATCH] Proposed Change Some notes to follow.
• New PageCache logging output for the manual test would look like: > -------- > Determining if page can be cached: > +--- > Determining if frame can be cached navigating from (.../main-document-error-subresource-image-loading.html) to (.../resources/main-document-error-subresource-image-loading-next.html): > -Main document has an error > -BUT it was a cancellation and all loaders during the cancel were page cache acceptable (loading images). > Frame CAN be cached > +--- > Page CAN be cached > -------- > Finished creating CachedFrame for main frame url '.../main-document-error-subresource-image-loading.html' and DocumentLoader 0x107a98c00 • One concern with this patch. When returning back to page now in the page cache, the <img> will still look like the 1st image, but the DOM element's src attribute will have the value of the 2nd image. The DOM is out of sync. It might be more appropriate to refetch the image after returning to the page, but this behavior seems the same as an image taking a long time to timeout.
NOTE: I didn't make an automated test because of this comment in mac/DumpRenderTree.mm: // The back/forward cache is causing problems due to layouts during transition from one page to another. // So, turn it off for now, but we might want to turn it back on some day. [preferences setUsesPageCache:NO];
Comment on attachment 74432 [details] [PATCH] Proposed Change View in context: https://bugs.webkit.org/attachment.cgi?id=74432&action=review > WebCore/history/PageCache.cpp:259 > + && (documentLoader->mainDocumentError().isNull() || (documentLoader->mainDocumentError().isCancellation() && documentLoader->subresourceLoadersArePageCacheAcceptable())) What tells us that cancellation is specifically due to the subresource issue? It seems like any cancellation would return true here, not just the one you mention in the comment. Are cancellations solely due to leaving the page before subresources finish loading, or are there other causes of cancellation? > WebCore/loader/DocumentLoader.cpp:79 > + const ResourceLoaderSet copy = loaders; We normally don’t use const for local variables like this one. Rather than copying the set it would be more efficient to instead use a local vector and the copyToVector function. > WebKit/mac/WebCoreSupport/WebFrameLoaderClient.mm:388 > + // Save the target type, because that information may be checked later on. Is there anything else besides the target type that needs to be preserved? Maybe we should make a function that puts a new NSURLRequest in and preserves everything else. > WebKit/mac/WebCoreSupport/WebFrameLoaderClient.mm:389 > + ResourceRequestBase::TargetType type = request.targetType(); This can be ResourceRequest::TargetType. The use of ResourceRequestBase is an internal implementation detail that we normally should ignore.
Comment on attachment 74432 [details] [PATCH] Proposed Change ResourceRequest::targetType() is evil - it's only used for layering violations in Chromium. It's also not preserved after Objective C API delegate calls.
I've previously filed bug 48483 about targetType evilness, and there was some discussion in bug 48476.
Comment on attachment 74432 [details] [PATCH] Proposed Change View in context: https://bugs.webkit.org/attachment.cgi?id=74432&action=review >> WebCore/history/PageCache.cpp:259 >> + && (documentLoader->mainDocumentError().isNull() || (documentLoader->mainDocumentError().isCancellation() && documentLoader->subresourceLoadersArePageCacheAcceptable())) > > What tells us that cancellation is specifically due to the subresource issue? It seems like any cancellation would return true here, not just the one you mention in the comment. Are cancellations solely due to leaving the page before subresources finish loading, or are there other causes of cancellation? subresourceLoadersArePageCacheAcceptable only accesses a boolean set when there was a cancellation through stopLoading, and is later checked if the page can enter the page cache. I briefly looked around for other causes of cancellations, but I didn't make any conclusions. I guess another possibility would be the user explicitly stopping a load. In that case, should back/forward cause the user to go back to the partial load or not? However that only seems possible in a situation where the main resource hasn't completed loading yet, and this case I tried to only deal with subresource loads after the page has completed loading. I'll investigate this some more. >> WebCore/loader/DocumentLoader.cpp:79 >> + const ResourceLoaderSet copy = loaders; > > We normally don’t use const for local variables like this one. > > Rather than copying the set it would be more efficient to instead use a local vector and the copyToVector function. Sounds good. This copied the form of the two static functions above it (cancelAll and setAllDefersLoading). I'll change those as well. >> WebKit/mac/WebCoreSupport/WebFrameLoaderClient.mm:389 >> + ResourceRequestBase::TargetType type = request.targetType(); > > This can be ResourceRequest::TargetType. The use of ResourceRequestBase is an internal implementation detail that we normally should ignore. I'll look for an alternative to targetType. As Alexey mentioned offline, the choice of "image loads only" seems arbitrary.
This patch raises a question about what back behavior is intended to be: (1) Takes you back to a stale page, closely representing what you last saw that page to be. Potentially broken. (2) Take you back to a working page, no matter if the cost is a reload. It looks like so far, WebKit strongly favors (2). In this patch, I sided with (1) in cases where the page was unlikely to break in serious ways; hence the choice to allow this to happen only for image subresource loads. Given that this is a change in behavior, is this a change that we want to put into WebKit? Maybe it be hidden behind an ENABLE flag, such as BACK_IGNORES_LOADERS or HISTORY_IS_LAST_SEEN. The idea here is to improve back performance in cases where reloading the page would be costly (slow network, small cache). I chatted with Alexey about some ideas for this. He had a couple ideas points: • "I think that we shouldn't allow more brokenness than IE and Firefox" • https://bugs.webkit.org/show_bug.cgi?id=10199 • it might be appropriate to add target type to ResourceLoader Maybe there is also a more generic issue with subresources. So, instead of just image loads, it may make sense to ignore any subresource loads triggered by the user action. For example this test case was a user click triggers an image load and navigation that immediately cancels the load. Probably any load type that happened as a result of that user action can be ignored, since it won't have a chance of succeeding. I feel like I need to do more testing on this, to come up with some concrete examples and numbers to see if this is really a worthwhile improvement.
> I feel like I need to do more testing on this, to come up with some concrete > examples and numbers to see if this is really a worthwhile improvement. Yah, I see this all over the place with Desktop Safari. Many links on apple.com, tons of links on nytimes.com, any header link on engadget.com. I'll improve my logging to see exactly which images are loading to get a better picture, but in all cases where there was a main document error, it was because there was a cancellation and image loaders.
Most of the time this appears to be metrics gathering. An image load is triggered for statistics purposes. It is my understanding, that they would get cancelled. DocumentLoader cancels all subresource loaders, which triggers ResourceLoader::cancel, ::didCancel, and then ResourceHandle::cancel to stop the network connection. In any case, I do think that this is a worthwhile change to make. Or maybe we should allow these image loads to complete? If WebKit is truly preventing these metrics gathering requests. I'll attach some files for some basic logs doing some very basic browsing, so you can get an idea of what these image loads looks like.
Created attachment 75508 [details] [LOGGING] Patch that goes on top of previous patch Not for review. This is how I enabled my logging. It would check if a page would not have been cached due to a MainResourceError, and if that was a cancellation with only image loads. This also naively prints out the loaders (up until the first non-image loader) when the page was cancelled. I typically grep the syslog / stderr from run-safari for ">>>" to get my output.
Created attachment 75511 [details] [LOGS] After no more than 5 minutes of browsing popular sites Some simple browsing of popular sites. I see a lot of "metrics" loads.
> Most of the time this appears to be metrics gathering. Metrics gathering is supposed to be intercepted by PingLoader in ToT. It doesn't catch all cases, but given your empirical data, it seems that improving PingLoader could be a better approach - that would both make statistics work, and improve b/f cache behavior.
(In reply to comment #13) > > Most of the time this appears to be metrics gathering. > > Metrics gathering is supposed to be intercepted by PingLoader in ToT. It doesn't > catch all cases, but given your empirical data, it seems that improving PingLoader > could be a better approach - that would both make statistics work, and improve > b/f cache behavior. That sounds like an excellent idea. I'll look into that now.
So the challenge here is identifying which subresource loads here we should "convert" to PingLoader loads. I can't come up with a way to logically identify them. My first thought was to mark the last "user action" but where does this action start and end. Take a mouse click for example, there can be loads triggered anywhere in all kinds of events (onmousedown, onclick, onmouseup) of which WebCore / WebKit only gets sparse notifications at the WebHTMLView layer. Only on mouseup do we actually detect a WebCore::handleLinkClick and navigate. Considering there are already existing solutions for this (<a ping> and performing the request in pagehide/unload events) I wonder how important it is to detect and allow these image loads to complete. Anyone have thoughts on this?
Created attachment 75599 [details] [PATCH] Addressed Some Comments • Addressed Darin's style review comments. • To not use TargetType I look up the CacheResource and check if it is a CacheImage (isImage) - this means I have to move my code a little earlier in stopLoading because the resource could be evicted > What tells us that cancellation is specifically due to the subresource issue? It seems like any > cancellation would return true here, not just the one you mention in the comment. Are > cancellations solely due to leaving the page before subresources finish loading, or are there > other causes of cancellation? I looked at a lot of code that could cause cancellations. Lots of the code doesn't deal with history navigation at all, much of it deals with provisional document loaders. It is a bit unfortunate in those cases that this makes a loop through the resource loaders, but it breaks early. I haven't found a better place for this code either, which is why it remains in DocumentLoader::stopLoading. As far as I can tell, its okay for this code to run on any cancellation, even the main document via the user canceling the load. This could just be my opinion, but if I stopped a page from loading and it had a missing image, I still wouldn't mind if I clicked a link and came back to the page and it had the same missing image. But maybe this kind of behavior should be opt-in.
I like the idea. It makes sense to allow pages with only images missing in the back-forward cache as makes bf cache usable in many more cases without much danger of breaking anything. This patch should also check the pending resources in the documents CachedResourceLoader. The subresource list in the DocumentLoader is not reliable (In case multiple documents load the same resource in parallel, only one of the subresource lists has the resource. On the other hand some resource tpes are loaded using SubresourceLoaders directly and don't show up in the CachedResourceLoader. Yeah, it is a mess.) Shouldn't you somehow trigger loads for missing images after you restore the page from the cache? This is significant browser behavior change so maybe it should be a setting?
(In reply to comment #17) > This patch should also check the pending resources in the documents > CachedResourceLoader. The subresource list in the DocumentLoader is not > reliable (In case multiple documents load the same resource in parallel, only > one of the subresource lists has the resource. On the other hand some resource > types are loaded using SubresourceLoaders directly and don't show up in the > CachedResourceLoader. Yeah, it is a mess.) That is very good to know! However, in those cases (where the loaders do not show up in m_subresourceLoaders) the page is not prevented from entering the page cache. It was the check in DocumentLoader, which sets the main document error to a cancellation as a result of m_subresourceLoaders not being empty, that I saw prevent the page from being cached. Maybe that is indicative of a separate problem that some subresource loads are ignored? > Shouldn't you somehow trigger loads for missing images after you restore > the page from the cache? If the object is to return the user back to the state they were in, no. If the object is to return the user back to a working page no matter the change, then yes. This still feels like a policy decision. For back/forward loads we already have what used to be called "AllowStale" policy. That didn't make sense for the PageCache, because pages in incomplete states would never have been cached. > This is significant browser behavior change so maybe it should be a setting? Good point. I had mentioned a compile flag in comment #8, but a setting makes a lot of sense. Thanks for the comments.
(In reply to comment #18) > That is very good to know! However, in those cases (where the loaders do not > show up in m_subresourceLoaders) the page is not prevented from entering > the page cache. It was the check in DocumentLoader, which sets the main But perhaps there are resources pending that _should_ stop this page from going to the cache that are not showing up in m_subresourceLoaders? > > Shouldn't you somehow trigger loads for missing images after you restore > > the page from the cache? > > If the object is to return the user back to the state they were in, no. If the > object is to return the user back to a working page no matter the change, > then yes. This still feels like a policy decision. For back/forward loads we > already have what used to be called "AllowStale" policy. That didn't make > sense for the PageCache, because pages in incomplete states would never > have been cached. Without restarting the image loads you will get inconsistent behavior. In some cases hitting back will load the complete document (in case the page was not in the cache), in others the user will need to manually get the images by reloading. User doesn't know about the b/f cache so she won't know why this is. Restarting the image loads shouldn't slow down the cache restore at all, but it would avoid the need to manually reload.
Comment on attachment 75599 [details] [PATCH] Addressed Some Comments Clearing r? flag, as there was some discussion about this not being the best solution for open source at this time.