Bug 199499 - Make storing cross-origin top-level prefetches in HTTP cache optional
Summary: Make storing cross-origin top-level prefetches in HTTP cache optional
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Rob Buis
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-07-04 07:51 PDT by Rob Buis
Modified: 2019-12-22 13:24 PST (History)
13 users (show)

See Also:


Attachments
Patch (10.79 KB, patch)
2019-07-04 07:56 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (10.87 KB, patch)
2019-07-05 00:04 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (10.97 KB, patch)
2019-07-05 09:53 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (11.11 KB, patch)
2019-07-10 01:27 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (14.36 KB, patch)
2019-07-10 05:53 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews100 for mac-highsierra (3.19 MB, application/zip)
2019-07-10 07:02 PDT, EWS Watchlist
no flags Details
Patch (15.93 KB, patch)
2019-07-10 07:06 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (17.17 KB, patch)
2019-07-18 02:36 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (17.00 KB, patch)
2019-07-18 04:56 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (17.00 KB, patch)
2019-07-18 06:41 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews117 for mac-highsierra (2.82 MB, application/zip)
2019-07-18 10:43 PDT, EWS Watchlist
no flags Details
Patch (17.00 KB, patch)
2019-07-18 11:18 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews117 for mac-highsierra (2.81 MB, application/zip)
2019-07-18 14:18 PDT, EWS Watchlist
no flags Details
Patch (17.04 KB, patch)
2019-07-22 05:19 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (23.96 KB, patch)
2019-07-24 02:33 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (24.16 KB, patch)
2019-07-24 07:12 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (23.96 KB, patch)
2019-07-25 04:14 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (23.99 KB, patch)
2019-07-25 06:32 PDT, Rob Buis
no flags Details | Formatted Diff | Diff
Patch (24.01 KB, patch)
2019-07-25 12:05 PDT, Rob Buis
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Rob Buis 2019-07-04 07:51:52 PDT
Currently when we navigate we check if there is a cross-origin top-level prefetch we can use for the navigation. The current solution uses the HTTP cache as a way to store the prefetch and to immediately use it for the navigation. However this solution fails in case the prefetch is not cacheable in the first place. Instead of this, skip storing in the HTTP cache and simulate a HTTP Cache entry to process the load.
Comment 1 Rob Buis 2019-07-04 07:56:34 PDT
Created attachment 373466 [details]
Patch
Comment 2 Rob Buis 2019-07-05 00:04:59 PDT
Created attachment 373481 [details]
Patch
Comment 3 Rob Buis 2019-07-05 09:53:16 PDT
Created attachment 373507 [details]
Patch
Comment 4 youenn fablet 2019-07-09 10:02:23 PDT
Comment on attachment 373507 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=373507&action=review

> Source/WebKit/ChangeLog:12
> +        not cacheable.

That might be often ok but I wonder whether there are some corner cases which would benefit for some kind of cache validation.
For instance, say the prefetched response has a Vary header and the prefetched request and real navigation request have a different header in this Vary header.
The main case might be if Vary header contains Cookie as the prefetched request will not have it while the navigation request will have it.

With the current model, a page should probably never try to prefetch another page with Vary header Cookie.
If that happens, I am unclear what we should do. The safest approach might be to not use the prefetch at all.

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:228
>                      m_cache->store(request, entry->response, entry->releaseBuffer(), nullptr);

It might read better if we have just one !entry->redirectRequest.isNull() check.

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:246
> +        loader->retrieveCacheEntryInternal(WTFMove(entry), request);

request can be moved here.

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:250
> +void NetworkResourceLoader::retrieveCacheEntryInternal(std::unique_ptr<NetworkCache::Entry> entry, WebCore::ResourceRequest request)

entry should probably be an r-value.
request has well, or maybe a const reference.

> LayoutTests/http/tests/cache/resources/prefetched-main-resource-iframe.php:-3
> -    header('Cache-Control: max-age=3600');

I think it would be good to validate that a prefetched entry that is used and is cacheable actually goes in the cache.
This was guaranteed previously but with this patch, this is no longer the case.
Comment 5 Kinuko Yasuda 2019-07-10 00:47:56 PDT
> For instance, say the prefetched response has a Vary header and the prefetched request and real navigation request have a different header in this Vary header.
> The main case might be if Vary header contains Cookie as the prefetched request will not have it while the navigation request will have it.

Fyi chromium's current impl also checks Vary headers when matching with prefetched requests, while skipping other validations (up to a certain period). I feel not reusing the prefetched resources if a Vary header Cookie exists makes sense, while this might need a spec discussion to conclude.
Comment 6 Rob Buis 2019-07-10 01:27:58 PDT
Created attachment 373827 [details]
Patch
Comment 7 Rob Buis 2019-07-10 05:53:03 PDT
Created attachment 373831 [details]
Patch
Comment 8 EWS Watchlist 2019-07-10 07:02:03 PDT
Comment on attachment 373831 [details]
Patch

Attachment 373831 [details] did not pass mac-ews (mac):
Output: https://webkit-queues.webkit.org/results/12707380

New failing tests:
http/tests/cache/link-prefetch-main-resource-skip-disk-cache.html
Comment 9 EWS Watchlist 2019-07-10 07:02:04 PDT
Created attachment 373834 [details]
Archive of layout-test-results from ews100 for mac-highsierra

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews100  Port: mac-highsierra  Platform: Mac OS X 10.13.6
Comment 10 Rob Buis 2019-07-10 07:06:22 PDT
Created attachment 373835 [details]
Patch
Comment 11 Rob Buis 2019-07-10 11:56:26 PDT
Comment on attachment 373507 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=373507&action=review

>> Source/WebKit/ChangeLog:12
>> +        not cacheable.
> 
> That might be often ok but I wonder whether there are some corner cases which would benefit for some kind of cache validation.
> For instance, say the prefetched response has a Vary header and the prefetched request and real navigation request have a different header in this Vary header.
> The main case might be if Vary header contains Cookie as the prefetched request will not have it while the navigation request will have it.
> 
> With the current model, a page should probably never try to prefetch another page with Vary header Cookie.
> If that happens, I am unclear what we should do. The safest approach might be to not use the prefetch at all.

This came up in a different discussion thread. I think the best place for this is a new bug.

>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:228
>>                      m_cache->store(request, entry->response, entry->releaseBuffer(), nullptr);
> 
> It might read better if we have just one !entry->redirectRequest.isNull() check.

Makes sense, done.

>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:246
>> +        loader->retrieveCacheEntryInternal(WTFMove(entry), request);
> 
> request can be moved here.

Done.

>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:250
>> +void NetworkResourceLoader::retrieveCacheEntryInternal(std::unique_ptr<NetworkCache::Entry> entry, WebCore::ResourceRequest request)
> 
> entry should probably be an r-value.
> request has well, or maybe a const reference.

Done.

>> LayoutTests/http/tests/cache/resources/prefetched-main-resource-iframe.php:-3
>> -    header('Cache-Control: max-age=3600');
> 
> I think it would be good to validate that a prefetched entry that is used and is cacheable actually goes in the cache.
> This was guaranteed previously but with this patch, this is no longer the case.

I did that by splitting up the test into one prefetched resource with Cache-Control and one without. Let me know if you can think of a better way though....
Comment 12 youenn fablet 2019-07-17 12:03:37 PDT
Comment on attachment 373835 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=373835&action=review

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:218
> +                std::unique_ptr<NetworkCache::Entry> cacheEntry;

I would remove this line and add 'auto cacheEntry = ' below.

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:219
> +                ResourceRequest requestCopy { request };

Ditto here, I would inline it like loader->retrieveCacheEntryInternal(WTFMove(cacheEntry), ResourceRequest { request })

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:-221
> -                } else

Could call 'return;' here and remove the else statement.

> LayoutTests/http/tests/cache/link-prefetch-main-resource-skip-disk-cache.html:14
> +<link rel="prefetch" href="http://localhost:8000/cache/resources/prefetched-main-resource-skip-disk-cache.php" onload="loadAfterPrefetch();">

We will probably want to remove unload event handler for prefetch.
I would tend to not use that one if possible.
Also, I would tend to try writing it as a testharness test so that we can later upstream it to WPT.
If testharness is not convenient, we should discuss improvements in WPT GitHub.
Comment 13 Rob Buis 2019-07-18 02:36:35 PDT
Created attachment 374368 [details]
Patch
Comment 14 EWS Watchlist 2019-07-18 02:39:37 PDT
Attachment 374368 [details] did not pass style-queue:


ERROR: /Volumes/Data/StyleQueue/WebKit/LayoutTests/platform/mac-wk1/TestExpectations:736:  Path does not exist.  [test/expectations] [5]
Total errors found: 1 in 11 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 15 Rob Buis 2019-07-18 04:56:09 PDT
Created attachment 374376 [details]
Patch
Comment 16 Rob Buis 2019-07-18 06:41:07 PDT
Created attachment 374381 [details]
Patch
Comment 17 EWS Watchlist 2019-07-18 10:43:51 PDT
Comment on attachment 374381 [details]
Patch

Attachment 374381 [details] did not pass mac-debug-ews (mac):
Output: https://webkit-queues.webkit.org/results/12766899

New failing tests:
storage/indexeddb/pending-version-change-on-exit-private.html
Comment 18 EWS Watchlist 2019-07-18 10:43:53 PDT
Created attachment 374395 [details]
Archive of layout-test-results from ews117 for mac-highsierra

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews117  Port: mac-highsierra  Platform: Mac OS X 10.13.6
Comment 19 Rob Buis 2019-07-18 11:18:08 PDT
Created attachment 374397 [details]
Patch
Comment 20 Rob Buis 2019-07-18 13:31:03 PDT
Comment on attachment 373835 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=373835&action=review

>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:218
>> +                std::unique_ptr<NetworkCache::Entry> cacheEntry;
> 
> I would remove this line and add 'auto cacheEntry = ' below.

Done.

>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:219
>> +                ResourceRequest requestCopy { request };
> 
> Ditto here, I would inline it like loader->retrieveCacheEntryInternal(WTFMove(cacheEntry), ResourceRequest { request })

Done.

>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:-221
>> -                } else
> 
> Could call 'return;' here and remove the else statement.

Done.

>> LayoutTests/http/tests/cache/link-prefetch-main-resource-skip-disk-cache.html:14
>> +<link rel="prefetch" href="http://localhost:8000/cache/resources/prefetched-main-resource-skip-disk-cache.php" onload="loadAfterPrefetch();">
> 
> We will probably want to remove unload event handler for prefetch.
> I would tend to not use that one if possible.
> Also, I would tend to try writing it as a testharness test so that we can later upstream it to WPT.
> If testharness is not convenient, we should discuss improvements in WPT GitHub.

Done, and moved the test to http/wpt. The timeouts are tricky but the best I can think of at the moment. Other ideas? WebDriver? postMessage?
Comment 21 EWS Watchlist 2019-07-18 14:18:36 PDT
Comment on attachment 374397 [details]
Patch

Attachment 374397 [details] did not pass mac-debug-ews (mac):
Output: https://webkit-queues.webkit.org/results/12767999

New failing tests:
storage/indexeddb/pending-version-change-on-exit-private.html
Comment 22 EWS Watchlist 2019-07-18 14:18:38 PDT
Created attachment 374417 [details]
Archive of layout-test-results from ews117 for mac-highsierra

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews117  Port: mac-highsierra  Platform: Mac OS X 10.13.6
Comment 23 Rob Buis 2019-07-22 05:19:48 PDT
Created attachment 374596 [details]
Patch
Comment 24 Rob Buis 2019-07-22 13:57:35 PDT
(In reply to youenn fablet from comment #4)
> That might be often ok but I wonder whether there are some corner cases
> which would benefit for some kind of cache validation.
> For instance, say the prefetched response has a Vary header and the
> prefetched request and real navigation request have a different header in
> this Vary header.
> The main case might be if Vary header contains Cookie as the prefetched
> request will not have it while the navigation request will have it.
> 
> With the current model, a page should probably never try to prefetch another
> page with Vary header Cookie.
> If that happens, I am unclear what we should do. The safest approach might
> be to not use the prefetch at all.

I started a bug for this (https://bugs.webkit.org/show_bug.cgi?id=200000).
Comment 25 youenn fablet 2019-07-23 08:49:50 PDT
Comment on attachment 374596 [details]
Patch

Looks almost ready to me.
Some comments below.

View in context: https://bugs.webkit.org/attachment.cgi?id=374596&action=review

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:220
>              if (auto entry = session->prefetchCache().take(request.url())) {

Let's add a FIXME with the bugzilla link to do some validation for the prefetch entry, for instance the Vary header.

> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:223
> +                    loader->retrieveCacheEntryInternal(WTFMove(cacheEntry), ResourceRequest {request});

s/ {request}/ { request }/

> LayoutTests/http/tests/cache/resources/prefetched-main-resource-iframe.php:3
>      header("Access-Control-Allow-Origin: http://127.0.0.1:8000");

Why do we need Access-Control-Allow-Origin header for the prefetch case?

> LayoutTests/http/tests/cache/resources/prefetched-main-resource.php:3
> +header("Access-Control-Allow-Origin: http://127.0.0.1:8000");

Why do we need ACAO here?
Isn't prefetched-main-resource.php be loaded using http://localhost8000/cache/resources/prefetched-main-resource.php URL?

> LayoutTests/http/tests/cache/resources/prefetched-main-resource.php:13
> +      document.getElementById('log').innerText = 'FAIL';

Maybe we should make sure to have this 'FAIL' be different from below 'FAIL'.
Change it to 'FAIL: resource is not in the disk cache.' might help debugging.

> LayoutTests/http/wpt/prefetch/resources/main-resource-skip-disk-cache.py:2
> +    headers = [("Content-Type", "text/html")]

This might depend on how we do cache validation, setting Cache-Control: no-store would probably defeat using the prefetch.
In that case, the prefetch code should probably cancel the load whenever receiving the response.

> LayoutTests/http/wpt/prefetch/resources/main-resource-skip-disk-cache.py:7
> +  var ret = '%(prefetch)s';

s/ret/result/

> LayoutTests/http/wpt/prefetch/resources/main-resource-skip-disk-cache.py:9
> +    fetch('%(url)s').then(function(response) {

You need to await here so that ret can go to 'FAIL'.
Otherwise you will call postMessage and fetch will then change ret.

> LayoutTests/http/wpt/prefetch/resources/navigate-skip-disk-cache.html:10
> +  setTimeout(() => { window.location = get_host_info().HTTP_REMOTE_ORIGIN + "/WebKit/prefetch/resources/main-resource-skip-disk-cache.py" }, 500);

If we would like to move away from 500, one possibility would be to add an Internals API that would tell when the link prefetch is done.
Comment 26 Rob Buis 2019-07-24 02:33:07 PDT
Created attachment 374769 [details]
Patch
Comment 27 Rob Buis 2019-07-24 07:12:12 PDT
Created attachment 374773 [details]
Patch
Comment 28 Rob Buis 2019-07-24 13:39:11 PDT
Comment on attachment 374596 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=374596&action=review

>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:220
>>              if (auto entry = session->prefetchCache().take(request.url())) {
> 
> Let's add a FIXME with the bugzilla link to do some validation for the prefetch entry, for instance the Vary header.

Done.

>> Source/WebKit/NetworkProcess/NetworkResourceLoader.cpp:223
>> +                    loader->retrieveCacheEntryInternal(WTFMove(cacheEntry), ResourceRequest {request});
> 
> s/ {request}/ { request }/

Done.

>> LayoutTests/http/tests/cache/resources/prefetched-main-resource-iframe.php:3
>>      header("Access-Control-Allow-Origin: http://127.0.0.1:8000");
> 
> Why do we need Access-Control-Allow-Origin header for the prefetch case?

You are right, I removed it.

>> LayoutTests/http/tests/cache/resources/prefetched-main-resource.php:3
>> +header("Access-Control-Allow-Origin: http://127.0.0.1:8000");
> 
> Why do we need ACAO here?
> Isn't prefetched-main-resource.php be loaded using http://localhost8000/cache/resources/prefetched-main-resource.php URL?

Ditto.

>> LayoutTests/http/tests/cache/resources/prefetched-main-resource.php:13
>> +      document.getElementById('log').innerText = 'FAIL';
> 
> Maybe we should make sure to have this 'FAIL' be different from below 'FAIL'.
> Change it to 'FAIL: resource is not in the disk cache.' might help debugging.

Done.

>> LayoutTests/http/wpt/prefetch/resources/main-resource-skip-disk-cache.py:2
>> +    headers = [("Content-Type", "text/html")]
> 
> This might depend on how we do cache validation, setting Cache-Control: no-store would probably defeat using the prefetch.
> In that case, the prefetch code should probably cancel the load whenever receiving the response.

Do you mean, when a prefetch gets a response with Cache-Control: no-store, it should cancel the prefetch?

>> LayoutTests/http/wpt/prefetch/resources/main-resource-skip-disk-cache.py:7
>> +  var ret = '%(prefetch)s';
> 
> s/ret/result/

Done.

>> LayoutTests/http/wpt/prefetch/resources/main-resource-skip-disk-cache.py:9
>> +    fetch('%(url)s').then(function(response) {
> 
> You need to await here so that ret can go to 'FAIL'.
> Otherwise you will call postMessage and fetch will then change ret.

Done.

>> LayoutTests/http/wpt/prefetch/resources/navigate-skip-disk-cache.html:10
>> +  setTimeout(() => { window.location = get_host_info().HTTP_REMOTE_ORIGIN + "/WebKit/prefetch/resources/main-resource-skip-disk-cache.py" }, 500);
> 
> If we would like to move away from 500, one possibility would be to add an Internals API that would tell when the link prefetch is done.

I added such an API. Unfortunately that makes it WPT incompatible. Still not sue how to fix that, WebDriver maybe?
Comment 29 youenn fablet 2019-07-24 14:31:41 PDT
Comment on attachment 374773 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=374773&action=review

> Source/WebCore/testing/Internals.h:32
> +#include "EventListener.h"

We probably do not need that one.

> Source/WebCore/testing/Internals.h:36
> +#include "JSEventListener.h"

Is this one needed to build JSInternals.cpp?
Could we instead do "class EventListener"?
Comment 30 Rob Buis 2019-07-25 04:14:59 PDT
Created attachment 374887 [details]
Patch
Comment 31 Rob Buis 2019-07-25 06:32:54 PDT
Created attachment 374888 [details]
Patch
Comment 32 WebKit Commit Bot 2019-07-25 08:37:16 PDT
Comment on attachment 374888 [details]
Patch

Clearing flags on attachment: 374888

Committed r247821: <https://trac.webkit.org/changeset/247821>
Comment 33 WebKit Commit Bot 2019-07-25 08:37:18 PDT
All reviewed patches have been landed.  Closing bug.
Comment 34 Radar WebKit Bug Importer 2019-07-25 08:38:43 PDT
<rdar://problem/53543632>
Comment 36 Truitt Savell 2019-07-25 10:55:34 PDT
Reverted r247821 for reason:

Caused two crashing Layout Tests

Committed r247825: <https://trac.webkit.org/changeset/247825>
Comment 37 Rob Buis 2019-07-25 12:05:26 PDT
Created attachment 374898 [details]
Patch
Comment 38 WebKit Commit Bot 2019-07-26 00:30:01 PDT
Comment on attachment 374898 [details]
Patch

Clearing flags on attachment: 374898

Committed r247860: <https://trac.webkit.org/changeset/247860>
Comment 39 WebKit Commit Bot 2019-07-26 00:30:04 PDT
All reviewed patches have been landed.  Closing bug.