Summary: | Enable reuse of cached main resources | ||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Nate Chapin <japhet> | ||||||||||||||||||||||
Component: | WebCore Misc. | Assignee: | Nate Chapin <japhet> | ||||||||||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||||||||||
Severity: | Normal | CC: | abarth, alex, ap, beidson, dfarler, dglazkov, gtk-ews, koivisto, leviw, michaeln, ojan.autocc, philn, rniwa, skyul, webkit.review.bot, xan.lopez, zan | ||||||||||||||||||||||
Priority: | P2 | ||||||||||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||||||||||
Hardware: | Unspecified | ||||||||||||||||||||||||
OS: | Unspecified | ||||||||||||||||||||||||
Bug Depends on: | 107235 | ||||||||||||||||||||||||
Bug Blocks: | 107962, 108380, 108402, 108626 | ||||||||||||||||||||||||
Attachments: |
|
Description
Nate Chapin
2012-12-21 15:55:33 PST
Comment on attachment 180566 [details] Work in Progress Attachment 180566 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/15454595 New failing tests: http/tests/appcache/xhr-foreign-resource.html http/tests/appcache/foreign-fallback.html http/tests/appcache/remove-cache.html Comment on attachment 180566 [details] Work in Progress Attachment 180566 [details] did not pass mac-ews (mac): Output: http://queues.webkit.org/results/15446734 New failing tests: http/tests/inspector/resource-har-pages.html http/tests/appcache/foreign-fallback.html Created attachment 181392 [details]
patch
Comment on attachment 181392 [details] patch View in context: https://bugs.webkit.org/attachment.cgi?id=181392&action=review The non-AppCache parts look reasonable to me. I'm not very confident about the AppCache parts. Maybe we should have someone who knows more about AppCache look at them? > Source/WebCore/dom/Document.cpp:841 > + return documentElement() && documentElement()->hasTagName(htmlTag) && !documentElement()->getAttribute(manifestAttr).isEmpty(); getAttribute -> hasAttribute? > Source/WebCore/loader/cache/CachedRawResource.cpp:91 > + for (size_t i = 0; i < m_redirectChainRequests.size(); i++) { Can we assert that m_redirectChainRequests doesn't change while we're iterating it? Comment on attachment 181392 [details] patch Attachment 181392 [details] did not pass mac-ews (mac): Output: http://queues.webkit.org/results/15700350 New failing tests: http/tests/appcache/offline-access.html http/tests/appcache/main-resource-hash.html http/tests/inspector/appcache/appcache-iframe-manifests.html http/tests/appcache/reload.html http/tests/inspector/appcache/appcache-swap.html http/tests/appcache/top-frame-4.html http/tests/appcache/cyrillic-uri.html http/tests/appcache/crash-when-navigating-away-then-back.html http/tests/appcache/local-content.html http/tests/appcache/top-frame-3.html http/tests/appcache/remove-cache.html http/tests/appcache/xhr-foreign-resource.html Created attachment 181542 [details]
Address abarth's comments, fix mac crashes
Comment on attachment 181542 [details] Address abarth's comments, fix mac crashes View in context: https://bugs.webkit.org/attachment.cgi?id=181542&action=review > Source/WebCore/loader/MainResourceLoader.cpp:383 > + if (r.appCacheID()) This much makes sense to me, if the main resource was loaded from the appcache, don't memcache it. How is the equivalent achieved for the 'native' webcore case? Comment on attachment 181542 [details] Address abarth's comments, fix mac crashes View in context: https://bugs.webkit.org/attachment.cgi?id=181542&action=review >> Source/WebCore/loader/MainResourceLoader.cpp:383 >> + if (r.appCacheID()) > > This much makes sense to me, if the main resource was loaded from the appcache, don't memcache it. How is the equivalent achieved for the 'native' webcore case? maybeLoadFallbackForMainResponse() will return true if a load from appcache has begun. Comment on attachment 181542 [details]
Address abarth's comments, fix mac crashes
This patch LGTM. The only part I'm unsure about is the AppCache integration, but michealn seems happy with that part.
> >> Source/WebCore/loader/MainResourceLoader.cpp:383
> >> + if (r.appCacheID())
> >
> > This much makes sense to me, if the main resource was loaded from the appcache, don't memcache it. How is the equivalent achieved for the 'native' webcore case?
>
> maybeLoadFallbackForMainResponse() will return true if a load from appcache has begun.
I mean in these cases that occur prior to the 'fallback' case.
void maybeLoadMainResource(ResourceRequest&, SubstituteData&);
void maybeLoadMainResourceForRedirect(ResourceRequest&, SubstituteData&);
In the chromium case, the condition you've added covers main resources loaded in all of the above situations. What provisions are made to prevent the SubstitueData provided these other methods from being added to the memcache. (It might be that SubstitueData never goes into the memcache, but idk about that?)
(In reply to comment #10) > > >> Source/WebCore/loader/MainResourceLoader.cpp:383 > > >> + if (r.appCacheID()) > > > > > > This much makes sense to me, if the main resource was loaded from the appcache, don't memcache it. How is the equivalent achieved for the 'native' webcore case? > > > > maybeLoadFallbackForMainResponse() will return true if a load from appcache has begun. > > I mean in these cases that occur prior to the 'fallback' case. > void maybeLoadMainResource(ResourceRequest&, SubstituteData&); maybeLoadMainResource() will cause MainResourceLoader::m_substituteData to be valid if it triggers a load from appcache. In this case, we will never request a CachedResource for this load. > void maybeLoadMainResourceForRedirect(ResourceRequest&, SubstituteData&); > maybeLoadMainResourceForRedirect() will cause MainResourceLoader::m_substituteData to be valid in continueAfterNavigationPolicy(), where we will cancel the CachedResource load if the SubstituteData is valid. A cancelled CachedResource will not be reused. Similarly, for maybeLoadFallbackForMainError(), the CachedResource will show that an error occurred, and while the CachedResource will remain in the MemoryCache, it will not be reused due to the error. > > I mean in these cases that occur prior to the 'fallback' case.
> > void maybeLoadMainResource(ResourceRequest&, SubstituteData&);
>
> maybeLoadMainResource() will cause MainResourceLoader::m_substituteData to be valid if it triggers a load from appcache. In this case, we will never request a CachedResource for this load.
>
> > void maybeLoadMainResourceForRedirect(ResourceRequest&, SubstituteData&);
> >
>
> maybeLoadMainResourceForRedirect() will cause MainResourceLoader::m_substituteData to be valid in continueAfterNavigationPolicy(), where we will cancel the CachedResource load if the SubstituteData is valid. A cancelled CachedResource will not be reused.
>
> Similarly, for maybeLoadFallbackForMainError(), the CachedResource will show that an error occurred, and while the CachedResource will remain in the MemoryCache, it will not be reused due to the error.
Yup, thnx for the explanation.
Also, looks like SubstituteData is never added to the memcache... so in the 'native' case, an appcache retrieved resource (always in the form of SubstituteData) is never put there in any circumstance (main vs subresource / fallback vs not).
Looks good though I have no clue if the app cache stuff and API delegates get synthesized correctly. Is there a way to write a test that shows this working? Internals function to clear networking level cache or something? (In reply to comment #13) > Looks good though I have no clue if the app cache stuff and API delegates get synthesized correctly. > > Is there a way to write a test that shows this working? Internals function to clear networking level cache or something? There are a bunch of tests that hit the cache because the load multiple iframes with the same src. There are quite a few, including the 4 tests that had expectations change in this patch. I could also look into a test that explicitly covers this case, I suppose, though that would probably involve a php script that works exactly once, or perhaps Internals callbacks for cache hits. Created attachment 182218 [details]
+ test
Comment on attachment 182218 [details] + test Attachment 182218 [details] did not pass win-ews (win): Output: http://queues.webkit.org/results/15801217 It makes me a bit uneasy that a significant change like this is being made while we still have a lot of leaks, indicating architectural issues with code in trunk (see bug 106137). (In reply to comment #17) > It makes me a bit uneasy that a significant change like this is being made while we still have a lot of leaks, indicating architectural issues with code in trunk (see bug 106137). Is this a general concern, or a specific concern that http://trac.webkit.org/changeset/137607 caused some of those leaks? I don't know if r137607 specifically is to blame, but the loader appears to not be in a good shape now. Leaks are a particularly bad thing to overlook during major refactoring, because leaking can cover up lifetime issues that will turn into crashes once leaks get fixed. Comment on attachment 182218 [details] + test View in context: https://bugs.webkit.org/attachment.cgi?id=182218&action=review review- because of the bad cast I found > Source/WebCore/loader/FrameLoader.cpp:2890 > + // Main resource delegate messages are synthesized in the same way as main resource SubstituteData loads. For someone not as expert, this comment is a little oblique, like haiku. I would say something more like this: // Main resource delegate messages are synthesized, so we must not send them here. Better to be specific about the function or class that does the synthesizing. I’m not sure what the relevance of the SubstituteData comparison is, because this function doesn’t have code to handle SubstituteData as a special case. It’s sort of a riddle to compare the two here without explaining why that comparison is relevant to this code. > Source/WebCore/loader/MainResourceLoader.cpp:393 > + bool loadFallback = documentLoader()->applicationCacheHost()->maybeLoadFallbackForMainResponse(request(), r); This should be named didLoadFallback. A variable named loadFallback would hold “a fallback”, not a boolean that indicates whether fallback occurred. > Source/WebCore/loader/MainResourceLoader.cpp:397 > + if (r.appCacheID()) > + shouldRemoveResourceFromCache = true; Seems like this needs a why comment. > Source/WebCore/loader/MainResourceLoader.cpp:401 > + // The memory cache doesn't understand the application cache or its caching rules. So if a main resource is served > + // from the application cache, ensure we don't save the result for future use. Is this comment specific to the Chromium code, or is “did load fallback” also an implementation detail of the application cache? > Source/WebCore/loader/MainResourceLoader.cpp:573 > + // If the document specified an application cache manifest, it's risky to store in the memory cache > + // and deny the appcache the chance to intercept it in the future, so remove from the memory cache. Great comment, except for the use of the word “risky”. It seems this is either correct or incorrect, not “risky”. We don’t want to treat our software as if it’s a game of chance. > Source/WebCore/loader/MainResourceLoader.cpp:705 > + if (!loader()) { > + m_substituteDataLoadIdentifier = m_documentLoader->frame()->page()->progress()->createUniqueIdentifier(); > + frameLoader()->notifier()->assignIdentifierToInitialRequest(m_substituteDataLoadIdentifier, documentLoader(), request); > + frameLoader()->notifier()->dispatchWillSendRequest(documentLoader(), m_substituteDataLoadIdentifier, request, ResourceResponse()); > + } Very enigmatic code. Why does not having a loader indicate we are loading substitute data? Seems like a why comment is needed. > Source/WebCore/loader/MainResourceLoader.cpp:714 > if (loader()) > request = loader()->originalRequest(); > + if (equalIgnoringFragmentIdentifier(initialRequest.url(), request.url())) > + request.setURL(initialRequest.url()); The comment talks about a task, but then there are two sets of code that both seem to be trying to accomplish the task in two different ways. Why do we have to do both things? Is there a way to make that obvious, perhaps with more comment? > Source/WebCore/loader/cache/CachedRawResource.h:72 > + Vector<ResourceRequest> m_redirectChainRequests; > + Vector<ResourceResponse> m_redirectChainResponses; Vectors have quite a bit of overhead. I am worried this could have noticeable memory impact; maybe I’m mistaken. Is there a way we can make this more efficient? For example, we could have a Vector of structures containing request/response pairs instead of two vectors, which would save half of the overhead. Generally speaking, I’m worried about storing quite a bit more data here. But maybe CachedRawResource isn’t used as often as I think it is? > Source/WebCore/loader/cache/CachedResourceLoader.cpp:544 > + if ((existingResource->type() == CachedResource::MainResource || existingResource->type() == CachedResource::RawResource) && !static_cast<CachedRawResource*>(existingResource)->canReuse(request)) > return Reload; This casts existingResource to CachedRawResource even when the type is CachedResource::MainResource. Bad cast! I'll look more carefully at the rest of the comments this afternoon. (In reply to comment #20) > (From update of attachment 182218 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=182218&action=review > > > > Source/WebCore/loader/cache/CachedResourceLoader.cpp:544 > > + if ((existingResource->type() == CachedResource::MainResource || existingResource->type() == CachedResource::RawResource) && !static_cast<CachedRawResource*>(existingResource)->canReuse(request)) > > return Reload; > > This casts existingResource to CachedRawResource even when the type is CachedResource::MainResource. Bad cast! This is bad naming, not a bad cast. CachedResource::MainResources are loaded via CachedRawResource. Created attachment 182606 [details]
Address darin's comments
* Added some missing symbols for win and gtk.
* Made CachedRawResource::canReuse() virtual with a simple version on CachedResource to make the casting safer.
* Renamed m_substituteDataLoadIdentifier to better match its usage
* Reworked comments and reorder some code to try to make things clearer
Comment on attachment 182606 [details] Address darin's comments Attachment 182606 [details] did not pass gtk-ews (gtk): Output: http://queues.webkit.org/results/15873583 Created attachment 182621 [details]
Fix symbols.filter
Created attachment 182625 [details]
Fix symbols.filter again
Comment on attachment 182625 [details]
Fix symbols.filter again
Nice, r=me!
Comment on attachment 182625 [details] Fix symbols.filter again Clearing flags on attachment: 182625 Committed r139683: <http://trac.webkit.org/changeset/139683> All reviewed patches have been landed. Closing bug. Rolled out in r139683. This broke a bunch of webkit_unit_tests. Here's an example hideously-long url with logs: http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=@ToT - chromium.org&testType=webkit_unit_tests&tests=WebViewTest.SetCompositionFromExistingText,WebPageNewSerializeTest.PageWithFrames,WebFrameTest.ContentText,WebViewTest.ExtendSelectionAndDelete,WebPageNewSerializeTest.BlankFrames,WebPageSerializerTest.MultipleFrames,WebFrameTest.FrameForEnteredContext (In reply to comment #30) > Here's an example hideously-long url with logs: > > http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=@ToT - chromium.org&testType=webkit_unit_tests&tests=WebViewTest.SetCompositionFromExistingText,WebPageNewSerializeTest.PageWithFrames,WebFrameTest.ContentText,WebViewTest.ExtendSelectionAndDelete,WebPageNewSerializeTest.BlankFrames,WebPageSerializerTest.MultipleFrames,WebFrameTest.FrameForEnteredContext For a webkit.org failure, see http://build.webkit.org/builders/Chromium%20Mac%20Release%20%28Tests%29/builds/31711/steps/webkit-unit-tests/logs/stdio In other news, it appears chromium ews bots don't correct turn red when webkit_unit_tests fail? Will be investigating more carefully tomorrow. (In reply to comment #31) > (In reply to comment #30) > > Here's an example hideously-long url with logs: > > > > http://test-results.appspot.com/dashboards/flakiness_dashboard.html#group=@ToT - chromium.org&testType=webkit_unit_tests&tests=WebViewTest.SetCompositionFromExistingText,WebPageNewSerializeTest.PageWithFrames,WebFrameTest.ContentText,WebViewTest.ExtendSelectionAndDelete,WebPageNewSerializeTest.BlankFrames,WebPageSerializerTest.MultipleFrames,WebFrameTest.FrameForEnteredContext > > For a webkit.org failure, see > http://build.webkit.org/builders/Chromium%20Mac%20Release%20%28Tests%29/builds/31711/steps/webkit-unit-tests/logs/stdio > > In other news, it appears chromium ews bots don't correct turn red when webkit_unit_tests fail? Will be investigating more carefully tomorrow. It appears that these tests failed because of gaps in chromium's test infrastructure, which lives in chromium.org. The good news is that we should be able to just reland https://bugs.webkit.org/attachment.cgi?id=182625&action=review as it stands. The bad news is that it will need to wait for a chromium DEPS roll. The review for the chromium patch is at https://codereview.chromium.org/11940002/ Thanks for working to figure this out! Comment on attachment 182625 [details] Fix symbols.filter again Chromium test infrastructure fixed in http://src.chromium.org/viewvc/chrome?view=rev&revision=177276, rolled into chromium DEPS in http://trac.webkit.org/changeset/139956 I've tested everything I can locally, and I think this is ready to go back into the commit queue. Comment on attachment 182625 [details] Fix symbols.filter again Rejecting attachment 182625 [details] from commit-queue. Failed to run "['/mnt/git/webkit-commit-queue/Tools/Scripts/webkit-patch', '--status-host=queues.webkit.org', '-..." exit_code: 1 cwd: /mnt/git/webkit-commit-queue /mnt/git/webkit-commit-queue/ChangeLog neither lists a valid reviewer nor contains the string "Unreviewed" or "Rubber stamp" (case insensitive). Full output: http://queues.webkit.org/results/15945020 Comment on attachment 182625 [details] Fix symbols.filter again Clearing flags on attachment: 182625 Committed r140005: <http://trac.webkit.org/changeset/140005> All reviewed patches have been landed. Closing bug. Re-opened since this is blocked by bug 107235 Created attachment 184782 [details] Without caching on chromium The diffs from the previous version should be confined to disabling caching for chromium in CachedResourceLoader::requestResource() and chromium's TestExpectations. There are some open questions as to how main resource caching changes chromium's process selection strategy for navigations, and it seems prudent to allow some extra time to research those concerns. See https://bugs.webkit.org/show_bug.cgi?id=107962. Created attachment 185004 [details]
Merged to trunk
Comment on attachment 185004 [details]
Merged to trunk
ok. I think it's fine to take an incremental approach here.
Comment on attachment 185004 [details] Merged to trunk Attachment 185004 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/16182067 New failing tests: http/tests/cache/cached-main-resource.html http/tests/w3c/webperf/submission/resource-timing/html/test_resource_script_types.html Comment on attachment 185004 [details] Merged to trunk Attachment 185004 [details] did not pass chromium-ews (chromium-xvfb): Output: http://queues.webkit.org/results/16154717 New failing tests: http/tests/cache/cached-main-resource.html http/tests/w3c/webperf/submission/resource-timing/html/test_resource_script_types.html Created attachment 185260 [details]
Patch for landing
Comment on attachment 185260 [details] Patch for landing Clearing flags on attachment: 185260 Committed r141136: <http://trac.webkit.org/changeset/141136> All reviewed patches have been landed. Closing bug. It appears that this patch broke Apple's internal performance test suite. Pages are no longer loading property after a while :( Not sure this is very helpful, but I can attest to that anecdotally. Since this landed, every once in a blue moon a tab I have left open will refuse to fetch the latest content when I refresh it. It gives a browser cached result every time. I have to restart WebKit, causing the tab to reload with the right response from the remote server. It happens very infrequently after being left open for a while. (In reply to comment #48) > Not sure this is very helpful, but I can attest to that anecdotally. Since this landed, every once in a blue moon a tab I have left open will refuse to fetch the latest content when I refresh it. It gives a browser cached result every time. I have to restart WebKit, causing the tab to reload with the right response from the remote server. It happens very infrequently after being left open for a while. If you can find a reproducible case (or at least the request/response headers that triggered the problem), I'd be interested to look at it. It looks like r141136 might be causing inspector/debugger/debugger-reload-on-pause.html to crash on Apple Lion Debug WK1 Tests. First seen in build #6316. http://build.webkit.org/results/Apple%20Lion%20Debug%20WK1%20(Tests)/r141136%20(6316)/results.html http://build.webkit.org/results/Apple%20Lion%20Debug%20WK1%20(Tests)/r141136%20(6316)/inspector/debugger/debugger-reload-on-pause-crash-log.txt http://build.webkit.org/results/Apple%20Lion%20Debug%20WK1%20(Tests)/r141305%20(6354)/results.html http://build.webkit.org/results/Apple%20Lion%20Debug%20WK1%20(Tests)/r141305%20(6354)/inspector/debugger/debugger-reload-on-pause-crash-log.txt Seems like maybe sticky caching behavior can be reproduced with a simple PHP script? The following url gives a different timestamp with every refresh on shipping Safari as well as Chrome and FF. But in WebKit it remains the same through refreshes/shift-refreshes. http://www.whitewhale.net/webkit/memory_cache/ Also note that I asked about the status of bug 17998 today, but I see you recently filed 108402 about 304s on main resource requests, so maybe that old bug can be closed as a duplicate. |