Bug 169274

Summary: [link preload] Double downloads of preloaded CSS
Product: WebKit Reporter: Yoav Weiss <yoav>
Component: Page LoadingAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: beidson, buildbot, cdumez, commit-queue, dbates, japhet, koivisto, rniwa
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Archive of layout-test-results from ews102 for mac-elcapitan
none
Archive of layout-test-results from ews104 for mac-elcapitan-wk2
none
Archive of layout-test-results from ews116 for mac-elcapitan
none
Archive of layout-test-results from ews124 for ios-simulator-wk2
none
Patch
none
Archive of layout-test-results from ews103 for mac-elcapitan
none
Archive of layout-test-results from ews106 for mac-elcapitan-wk2
none
Archive of layout-test-results from ews114 for mac-elcapitan
none
Patch
none
Archive of layout-test-results from ews113 for mac-elcapitan
none
Patch
none
Patch
none
Patch none

Description Yoav Weiss 2017-03-07 05:22:49 PST
Following STP's exposure of preload as an experimental feature, I got a report[1] saying that we're double-downloading CSS in some cases.

Looking into that, it seems like the fact that header-based preload requests are sent out before the document's charset was determined is causing issues in matching of resources coming out of the MemoryCache (in CachedResourceLoader::determineRevalidationPolicy).

Is it possible to avoid determining the charset of resources until later on? Requiring that charset at request time is difficult for preloaded resources, unless they are deliberately delayed.

[1] https://twitter.com/voxpelli/status/834485581132480512
Comment 1 Yoav Weiss 2017-03-07 06:18:24 PST
Created attachment 303647 [details]
Patch
Comment 2 Yoav Weiss 2017-03-07 06:19:59 PST
I've uploaded a patch which, in case of a preloaded resource, sets the encoding when taking the resource out of MemoryCache instead of reloading it. Let me know what you think of the approach.
Comment 3 Antti Koivisto 2017-03-07 07:44:44 PST
Comment on attachment 303647 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=303647&action=review

> Source/WebCore/loader/cache/CachedResourceLoader.cpp:929
> +    if (textDecoder && !textDecoder->hasEqualEncodingForCharset(cachedResourceRequest.charset())) {
> +        if (!existingResource->isLinkPreload())
> +            return Reload;
> +        existingResource->setEncoding(cachedResourceRequest.charset());
> +    }

What happens if there are multiple requests for the same link-preloaded resource with different charsets?

<link rel="preload" href="foo.js" as="script">
...
<script src="foo.js" charset=utf-8></script>
<script src="foo.js" charset=latin1></script>
Comment 4 Yoav Weiss 2017-03-07 07:59:23 PST
(In reply to comment #3)
> Comment on attachment 303647 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=303647&action=review
> 
> > Source/WebCore/loader/cache/CachedResourceLoader.cpp:929
> > +    if (textDecoder && !textDecoder->hasEqualEncodingForCharset(cachedResourceRequest.charset())) {
> > +        if (!existingResource->isLinkPreload())
> > +            return Reload;
> > +        existingResource->setEncoding(cachedResourceRequest.charset());
> > +    }
> 
> What happens if there are multiple requests for the same link-preloaded
> resource with different charsets?
> 
> <link rel="preload" href="foo.js" as="script">
> ...
> <script src="foo.js" charset=utf-8></script>
> <script src="foo.js" charset=latin1></script>

I agree that in that case, we'll enforce two different encodings on the same resource, and that can result in us sending the wrongly decoded text to the second resource.

Maybe we also need to discard any cached decoded text in the related CachedResources?
Comment 5 Antti Koivisto 2017-03-07 08:21:34 PST
Maybe you can track if the preload has been already been used and if so require full charset match (or reload)? Or maybe just reset the link-preload bit on the first match?

A better, more complicated solution would be to support multiple charsets on CachedResource level. Clients would provide charset as a parameter on all accesses to decoded strings.
Comment 6 Yoav Weiss 2017-03-07 09:21:49 PST
Created attachment 303664 [details]
Patch
Comment 7 Yoav Weiss 2017-03-07 09:23:48 PST
(In reply to comment #5)
> Maybe you can track if the preload has been already been used and if so
> require full charset match (or reload)? Or maybe just reset the link-preload
> bit on the first match?

Yeah, took that route and turned off the flag when reusing a charset mismatch.

> 
> A better, more complicated solution would be to support multiple charsets on
> CachedResource level. Clients would provide charset as a parameter on all
> accesses to decoded strings.
Comment 8 Yoav Weiss 2017-03-07 10:07:31 PST
(In reply to comment #7)
> (In reply to comment #5)
> > Maybe you can track if the preload has been already been used and if so
> > require full charset match (or reload)? Or maybe just reset the link-preload
> > bit on the first match?
> 
> Yeah, took that route and turned off the flag when reusing a charset
> mismatch.
> 
> > 
> > A better, more complicated solution would be to support multiple charsets on
> > CachedResource level. Clients would provide charset as a parameter on all
> > accesses to decoded strings.

I'm thinking about taking the second route, as it seems cleaner (thinking about it some more, turning off the flag in the charset case seems arbitrary).

Looking at the various CachedResource types it seems like:
* CachedCSSStyleSheet never reuses decoded text.
* Scripts do reuse previously decoded text.
* SVG never reuses decoded text.
* XSLT do reuse previously decoded text.

So, I think that simply getting rid of the internal representation of the decoded text whenever there's an encoding change can be enough to maintain charset correctness.

Let me know what you think.
Comment 9 Antti Koivisto 2017-03-07 11:03:51 PST
Comment on attachment 303664 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=303664&action=review

> Source/WebCore/loader/cache/CachedResource.h:235
> -    void setLinkPreload() { m_isLinkPreload = true; }
> +    enum class IsLinkPreload { LinkPreload, NotLinkPreload };
> +    void setLinkPreload(IsLinkPreload isLinkPreload) { m_isLinkPreload = ((isLinkPreload == IsLinkPreload::LinkPreload) ? true : false); }

Enum seem unnecessary. Maybe just setIsLinkPreload(bool)?
Comment 10 Antti Koivisto 2017-03-07 11:05:09 PST
> So, I think that simply getting rid of the internal representation of the
> decoded text whenever there's an encoding change can be enough to maintain
> charset correctness.
> 
> Let me know what you think.

Yeah, that would be nice but just fixing the immediate problem is good too. You can try that out next.
Comment 11 Yoav Weiss 2017-03-07 11:34:46 PST
Created attachment 303689 [details]
Patch
Comment 12 Yoav Weiss 2017-03-07 11:38:03 PST
(In reply to comment #10)
> > So, I think that simply getting rid of the internal representation of the
> > decoded text whenever there's an encoding change can be enough to maintain
> > charset correctness.
> > 
> > Let me know what you think.
> 
> Yeah, that would be nice but just fixing the immediate problem is good too.
> You can try that out next.

I uploaded a new patch with that approach. Let me know if that works, or I should revert to the LinkPreload flag approach.
Comment 13 Yoav Weiss 2017-03-07 11:56:05 PST
Looks like test failures are related. Looking into it.
Comment 14 Build Bot 2017-03-07 12:14:26 PST
Comment on attachment 303689 [details]
Patch

Attachment 303689 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/3260663

New failing tests:
imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/script-charset-01.html
Comment 15 Build Bot 2017-03-07 12:14:31 PST
Created attachment 303695 [details]
Archive of layout-test-results from ews102 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews102  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 16 Build Bot 2017-03-07 12:19:04 PST
Comment on attachment 303689 [details]
Patch

Attachment 303689 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/3260675

New failing tests:
imported/w3c/web-platform-tests/html/syntax/parsing-html-fragments/the-input-byte-stream-007.html
imported/w3c/web-platform-tests/html/syntax/parsing-html-fragments/the-input-byte-stream-001.html
Comment 17 Build Bot 2017-03-07 12:19:08 PST
Created attachment 303696 [details]
Archive of layout-test-results from ews104 for mac-elcapitan-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-elcapitan-wk2  Platform: Mac OS X 10.11.6
Comment 18 Build Bot 2017-03-07 12:40:02 PST
Comment on attachment 303689 [details]
Patch

Attachment 303689 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/3260722

New failing tests:
imported/w3c/web-platform-tests/html/syntax/parsing-html-fragments/the-input-byte-stream-009.html
http/tests/preload/preload-encoding.php
fast/loader/cache-encoding.html
imported/w3c/web-platform-tests/html/syntax/parsing-html-fragments/the-input-byte-stream-015.html
imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/script-charset-01.html
imported/w3c/web-platform-tests/html/syntax/parsing-html-fragments/the-input-byte-stream-030.html
imported/w3c/web-platform-tests/html/syntax/parsing-html-fragments/the-input-byte-stream-016.html
Comment 19 Build Bot 2017-03-07 12:40:06 PST
Created attachment 303701 [details]
Archive of layout-test-results from ews116 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews116  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 20 Build Bot 2017-03-07 12:55:05 PST
Comment on attachment 303689 [details]
Patch

Attachment 303689 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/3260809

New failing tests:
imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/script-charset-01.html
imported/w3c/web-platform-tests/html/syntax/parsing-html-fragments/the-input-byte-stream-009.html
imported/w3c/web-platform-tests/html/syntax/parsing-html-fragments/the-input-byte-stream-018.html
Comment 21 Build Bot 2017-03-07 12:55:10 PST
Created attachment 303705 [details]
Archive of layout-test-results from ews124 for ios-simulator-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews124  Port: ios-simulator-wk2  Platform: Mac OS X 10.11.6
Comment 22 Yoav Weiss 2017-03-07 20:31:06 PST
Created attachment 303768 [details]
Patch
Comment 23 Build Bot 2017-03-07 21:35:33 PST
Comment on attachment 303768 [details]
Patch

Attachment 303768 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/3263401

New failing tests:
imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/script-charset-01.html
Comment 24 Build Bot 2017-03-07 21:35:39 PST
Created attachment 303779 [details]
Archive of layout-test-results from ews103 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews103  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 25 Build Bot 2017-03-07 21:47:47 PST
Comment on attachment 303768 [details]
Patch

Attachment 303768 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/3263441

New failing tests:
imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/script-charset-01.html
Comment 26 Build Bot 2017-03-07 21:47:52 PST
Created attachment 303780 [details]
Archive of layout-test-results from ews106 for mac-elcapitan-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews106  Port: mac-elcapitan-wk2  Platform: Mac OS X 10.11.6
Comment 27 Build Bot 2017-03-07 22:00:58 PST
Comment on attachment 303768 [details]
Patch

Attachment 303768 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/3263462

New failing tests:
imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/script-charset-01.html
fast/loader/cache-encoding.html
http/tests/preload/preload-encoding.php
Comment 28 Build Bot 2017-03-07 22:01:04 PST
Created attachment 303781 [details]
Archive of layout-test-results from ews114 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews114  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 29 Yoav Weiss 2017-03-07 23:37:17 PST
Created attachment 303786 [details]
Patch
Comment 30 Build Bot 2017-03-08 01:06:51 PST
Comment on attachment 303786 [details]
Patch

Attachment 303786 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/3264198

New failing tests:
imported/w3c/web-platform-tests/html/semantics/scripting-1/the-script-element/script-charset-01.html
fast/loader/cache-encoding.html
http/tests/preload/preload-encoding.php
Comment 31 Build Bot 2017-03-08 01:06:55 PST
Created attachment 303793 [details]
Archive of layout-test-results from ews113 for mac-elcapitan

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews113  Port: mac-elcapitan  Platform: Mac OS X 10.11.6
Comment 32 Yoav Weiss 2017-03-08 03:03:10 PST
Created attachment 303806 [details]
Patch
Comment 33 Yoav Weiss 2017-03-08 05:13:56 PST
Antti - The tests pass now without reloading CachedResources and performing charset conversions on them when needed. PTAL?
Comment 34 Yoav Weiss 2017-03-09 12:23:07 PST
Created attachment 303948 [details]
Patch
Comment 35 Antti Koivisto 2017-03-09 12:57:14 PST
Comment on attachment 303948 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=303948&action=review

> Source/WebCore/loader/cache/CachedResource.h:235
> +    bool isUnknownCharset() { return m_unknownCharset; }

hasUnknownEncoding()/m_hasUnknownEncoding would match setEncoding and WebKit style better.
Comment 36 Yoav Weiss 2017-03-09 13:29:27 PST
Created attachment 303978 [details]
Patch
Comment 37 Yoav Weiss 2017-03-09 13:30:06 PST
(In reply to comment #35)
> Comment on attachment 303948 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=303948&action=review
> 
> > Source/WebCore/loader/cache/CachedResource.h:235
> > +    bool isUnknownCharset() { return m_unknownCharset; }
> 
> hasUnknownEncoding()/m_hasUnknownEncoding would match setEncoding and WebKit
> style better.

changed
Comment 38 Yoav Weiss 2017-03-09 13:30:38 PST
Comment on attachment 303978 [details]
Patch

Thanks for reviewing! :)
Comment 39 WebKit Commit Bot 2017-03-09 14:29:05 PST
The commit-queue encountered the following flaky tests while processing attachment 303978 [details]:

media/modern-media-controls/macos-fullscreen-media-controls/macos-fullscreen-media-controls-buttons-styles.html bug 168317 (author: graouts@apple.com)
The commit-queue is continuing to process your patch.
Comment 40 WebKit Commit Bot 2017-03-09 14:29:43 PST
Comment on attachment 303978 [details]
Patch

Clearing flags on attachment: 303978

Committed r213672: <http://trac.webkit.org/changeset/213672>
Comment 41 WebKit Commit Bot 2017-03-09 14:29:50 PST
All reviewed patches have been landed.  Closing bug.