Add Link header support for preload.
Created attachment 296554 [details] Patch
Comment on attachment 296554 [details] Patch Attachment 296554 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/2655791 New failing tests: http/tests/preload/dynamic_remove_preload_href.html
Created attachment 296566 [details] Archive of layout-test-results from ews100 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews100 Port: mac-yosemite Platform: Mac OS X 10.10.5
Comment on attachment 296554 [details] Patch Attachment 296554 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/2655814 New failing tests: http/tests/preload/dynamic_remove_preload_href.html
Created attachment 296568 [details] Archive of layout-test-results from ews114 for mac-yosemite The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews114 Port: mac-yosemite Platform: Mac OS X 10.10.5
Comment on attachment 296554 [details] Patch Attachment 296554 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/2655897 New failing tests: http/tests/preload/dynamic_remove_preload_href.html
Created attachment 296569 [details] Archive of layout-test-results from ews104 for mac-yosemite-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews104 Port: mac-yosemite-wk2 Platform: Mac OS X 10.10.5
Comment on attachment 296554 [details] Patch Attachment 296554 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/2655931 New failing tests: http/tests/preload/dynamic_remove_preload_href.html
Created attachment 296573 [details] Archive of layout-test-results from ews125 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews125 Port: ios-simulator-wk2 Platform: Mac OS X 10.11.6
Created attachment 296593 [details] Patch
Created attachment 296683 [details] Patch
Comment on attachment 296683 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=296683&action=review > Source/WebCore/loader/LinkHeader.cpp:39 > +static bool isWhitespace(CharType chr) > +{ > + return (chr == ' ') || (chr == '\t'); > +} This function should have a better name. We probably already have a function that checks for tab or space. > Source/WebCore/loader/LinkHeader.cpp:45 > +static bool isValidURLChar(CharType chr) > +{ > + return !isWhitespace(chr) && chr != '>'; > +} Why do we think a character is valid for a URL if it's not ' ', '\t' or '>'? URLs can have spaces and '>', and they ignore tabs while parsing. > Source/WebCore/loader/LinkLoader.cpp:161 > +std::optional<std::unique_ptr<LinkPreloadResourceClient>> LinkLoader::preloadIfNeeded(const LinkRelAttribute& relAttribute, const URL& href, Document& document, const String& as, std::unique_ptr already has a value that indicates it is empty. The std::optional seems excessive. > Source/WebCore/loader/LinkLoader.cpp:162 > + const String& crossOriginMode, LinkLoader* loader, LinkLoaderClient* client) This doesn't need to be on a new line. Also, making client a pointer instead of a reference seems like a step in the wrong direction. This function has one call site, and it could pass a reference. > Source/WebCore/loader/LinkLoader.cpp:164 > + fprintf(stderr, "preloadIfNeeded\n"); Do we really want to print to stderr every time this function is called? > LayoutTests/http/tests/preload/resources/download_resources_from_header.php:2 > +header("Link: <../resources/dummy.js>; rel=preload; as=script", false); We need more tests with non-ASCII characters, invalid URLs, attempts at making CORS requests from link headers, null characters, vertical tabs, URLs with '>' in them, etc.
Comment on attachment 296683 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=296683&action=review >> Source/WebCore/loader/LinkHeader.cpp:39 >> +} > > This function should have a better name. We probably already have a function that checks for tab or space. The only one I found is a static inline one in html/track/WebVTTParser.h I could move this one to platform/network/HTTPParsers and reuse it in there, if you think it's a good idea. In the mean time, I changed its name to isSpaceOrTab. >> Source/WebCore/loader/LinkHeader.cpp:45 >> +} > > Why do we think a character is valid for a URL if it's not ' ', '\t' or '>'? URLs can have spaces and '>', and they ignore tabs while parsing. The function's name was misleading. It was aimed at finding the end of the URL part of the header. At the same time, it might be better/safer to only read URLs up to the first invalid char. I changed the function accordingly. >> Source/WebCore/loader/LinkLoader.cpp:161 >> +std::optional<std::unique_ptr<LinkPreloadResourceClient>> LinkLoader::preloadIfNeeded(const LinkRelAttribute& relAttribute, const URL& href, Document& document, const String& as, > > std::unique_ptr already has a value that indicates it is empty. The std::optional seems excessive. True. Removed the std::optional >> Source/WebCore/loader/LinkLoader.cpp:162 >> + const String& crossOriginMode, LinkLoader* loader, LinkLoaderClient* client) > > This doesn't need to be on a new line. Also, making client a pointer instead of a reference seems like a step in the wrong direction. This function has one call site, and it could pass a reference. OK regarding the new line. Regarding making LinkLoaderClient a reference, this function is called from both loadLinksFromHeader and from loadLink. In loadLinksFromHeader there is no client (as it doesn't make sense to register load/error events in this context), which is the reason I chose to pass it as a pointer, and pass nullptr when a client is missing. >> Source/WebCore/loader/LinkLoader.cpp:164 >> + fprintf(stderr, "preloadIfNeeded\n"); > > Do we really want to print to stderr every time this function is called? No, I don't. Omission from debugging phase :/ >> LayoutTests/http/tests/preload/resources/download_resources_from_header.php:2 >> +header("Link: <../resources/dummy.js>; rel=preload; as=script", false); > > We need more tests with non-ASCII characters, invalid URLs, attempts at making CORS requests from link headers, null characters, vertical tabs, URLs with '>' in them, etc. Good call. Adding a test for invalid URLs. What do you want to test with CORS requests?
Created attachment 297011 [details] Patch
Friendly ping :)
Comment on attachment 297011 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=297011&action=review > Source/WebCore/loader/LinkHeader.cpp:42 > +static bool isValidURLChar(CharType chr) This seems very strange. Determining if a character is valid for a URL without context isn't something you should be able to do. For example, if you have an invalid surrogate pair with a valid begin and an invalid end that is one of the characters that is determined here to be "invalid" then this would naively think the second half of an invalid surrogate pair is an invalid character when in fact the whole thing is something that will be URL encoded to be the replacement character. It's bad enough that CSP has its own definition of what parsing part of a URL from a header should be (and maybe we should align with that "parsing") that is separate from the URL spec. Let's not add a third definition of what a URL is supposed to look like. > Source/WebCore/loader/LinkHeader.cpp:95 > +static bool parseURL(CharType*& position, CharType* end, String& url) end should definitely be const I don't like the name "parseURL". This function doesn't parse the URL, it just finds the end of the URL. This should return a std::optional<String> > Source/WebCore/loader/LinkHeader.cpp:122 > + ASSERT(position <= end); This assertion should be at every function that has a position and an end. It seems like an iterator type would be better than raw pointers in this whole patch. StringView.h has code point and code unit iterators. > Source/WebCore/loader/LinkHeader.cpp:138 > +static bool parseParameterDelimiter(CharType*& position, CharType* end, bool& isValid) This should return std::optional<bool> instead of a bool with a bool& parameter. > Source/WebCore/loader/LinkHeader.cpp:274 > + else if (name == LinkParameterAnchor) It looks like a switch would be better here. > Source/WebCore/loader/LinkHeader.h:78 > + LinkHeader& operator[](size_t i) { return m_headerSet[i]; } This doesn't have bounds checks. > Source/WebCore/loader/LinkHeader.h:82 > + template <typename CharType> typename CharacterType > Source/WebCore/loader/LinkHeader.h:83 > + void init(CharType* headerValue, unsigned len); unsigned len -> size_t length Or even better, use an iterator type. See below. > Source/WebCore/loader/LinkLoader.cpp:99 > + if (url == baseURL) What if url and baseURL only differ in the fragment? > LayoutTests/http/tests/preload/resources/download_resources_from_header.php:2 > +header("Link: <../resources/dummy.js>; rel=preload; as=script", false); Can we put a complete absolute URL here? We should add a test with a URL that looks like http://localhost:8000/preload/resources/something.html on a page with CSP that doesn't allow loading from localhost. We should also add a test with an invalid URL like http://host:invalidport/ just to see what happens. It's hard to make a URL fail to parse when only giving the path. A fragment-only URL might also be interesting. Also a lowercase "link", a missing colon, no <> or URL, <> but no characters inside it, and a missing space after the colon. Also what about emojis, invalid surrogate pairs, and null characters in the middle of the URL?
Comment on attachment 297011 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=297011&action=review >> Source/WebCore/loader/LinkHeader.cpp:42 >> +static bool isValidURLChar(CharType chr) > > This seems very strange. Determining if a character is valid for a URL without context isn't something you should be able to do. For example, if you have an invalid surrogate pair with a valid begin and an invalid end that is one of the characters that is determined here to be "invalid" then this would naively think the second half of an invalid surrogate pair is an invalid character when in fact the whole thing is something that will be URL encoded to be the replacement character. It's bad enough that CSP has its own definition of what parsing part of a URL from a header should be (and maybe we should align with that "parsing") that is separate from the URL spec. Let's not add a third definition of what a URL is supposed to look like. OK. Since URL validation is not critical here (and will happen further down the pipe), I've changed this to just detect ">" as the URL boundary. >> Source/WebCore/loader/LinkHeader.cpp:95 >> +static bool parseURL(CharType*& position, CharType* end, String& url) > > end should definitely be const > I don't like the name "parseURL". This function doesn't parse the URL, it just finds the end of the URL. > This should return a std::optional<String> Good point regarding end. Changed parseURL into FindURLBoundaries with an optional return value. >> Source/WebCore/loader/LinkHeader.cpp:122 >> + ASSERT(position <= end); > > This assertion should be at every function that has a position and an end. > It seems like an iterator type would be better than raw pointers in this whole patch. StringView.h has code point and code unit iterators. Good point regarding the assertion. Added everywhere else. Regarding switching to StringView's iterators I tried to convert the logic to that, but it seems to be missing some concepts needed here (skipWhile, skipExactly and/or getting an iterator for a position in the middle of a StringView). For now I kept it using pointers, but if we want to go the iterator route, it'd require adding a new iterator type (e.g. ParserStringView) which will include the required concepts. Let me know if you think this is the route we should take and if so, where should such a class be located. >> Source/WebCore/loader/LinkHeader.cpp:138 >> +static bool parseParameterDelimiter(CharType*& position, CharType* end, bool& isValid) > > This should return std::optional<bool> instead of a bool with a bool& parameter. OK >> Source/WebCore/loader/LinkHeader.cpp:274 >> + else if (name == LinkParameterAnchor) > > It looks like a switch would be better here. switched >> Source/WebCore/loader/LinkHeader.h:78 >> + LinkHeader& operator[](size_t i) { return m_headerSet[i]; } > > This doesn't have bounds checks. This as well as size() don't seem to be required. Removed both >> Source/WebCore/loader/LinkHeader.h:82 >> + template <typename CharType> > > typename CharacterType OK >> Source/WebCore/loader/LinkHeader.h:83 >> + void init(CharType* headerValue, unsigned len); > > unsigned len -> size_t length > Or even better, use an iterator type. See below. switched to size_t for now >> Source/WebCore/loader/LinkLoader.cpp:99 >> + if (url == baseURL) > > What if url and baseURL only differ in the fragment? Good point. Switched to use equalIgnoringFragmentIdentifier >> LayoutTests/http/tests/preload/resources/download_resources_from_header.php:2 >> +header("Link: <../resources/dummy.js>; rel=preload; as=script", false); > > Can we put a complete absolute URL here? We should add a test with a URL that looks like http://localhost:8000/preload/resources/something.html on a page with CSP that doesn't allow loading from localhost. We should also add a test with an invalid URL like http://host:invalidport/ just to see what happens. It's hard to make a URL fail to parse when only giving the path. A fragment-only URL might also be interesting. Also a lowercase "link", a missing colon, no <> or URL, <> but no characters inside it, and a missing space after the colon. Also what about emojis, invalid surrogate pairs, and null characters in the middle of the URL? Added tests for CSP blocked, invalid port, fragment-only, lower-case link, missing colon, missing spaces, missing URL, etc. (Invalid port passes the isPreloaded() test, but I'm guessing that it's being blocked further down the stack) I'm not sure how to add emojis, invalid surrogate pairs and null chars in PHP... Any examples you can point me to?
Created attachment 299128 [details] Patch
Comment on attachment 299128 [details] Patch Attachment 299128 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/2908914 New failing tests: imported/w3c/web-platform-tests/html/browsers/history/the-location-interface/location-protocol-setter-non-broken.html
Created attachment 299131 [details] Archive of layout-test-results from ews103 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-elcapitan Platform: Mac OS X 10.11.6
(In reply to comment #17) > >> Source/WebCore/loader/LinkHeader.cpp:122 > >> + ASSERT(position <= end); > > > > This assertion should be at every function that has a position and an end. > > It seems like an iterator type would be better than raw pointers in this whole patch. StringView.h has code point and code unit iterators. > > Good point regarding the assertion. Added everywhere else. > > Regarding switching to StringView's iterators I tried to convert the logic > to that, but it seems to be missing some concepts needed here (skipWhile, > skipExactly and/or getting an iterator for a position in the middle of a > StringView). > For now I kept it using pointers, but if we want to go the iterator route, > it'd require adding a new iterator type (e.g. ParserStringView) which will > include the required concepts. Let me know if you think this is the route we > should take and if so, where should such a class be located. Yeah, it's probably not worth adding another iterator type just for this. > >> Source/WebCore/loader/LinkHeader.cpp:274 > >> + else if (name == LinkParameterAnchor) > > > > It looks like a switch would be better here. > > switched lol > >> LayoutTests/http/tests/preload/resources/download_resources_from_header.php:2 > >> +header("Link: <../resources/dummy.js>; rel=preload; as=script", false); > > > > Can we put a complete absolute URL here? We should add a test with a URL that looks like http://localhost:8000/preload/resources/something.html on a page with CSP that doesn't allow loading from localhost. We should also add a test with an invalid URL like http://host:invalidport/ just to see what happens. It's hard to make a URL fail to parse when only giving the path. A fragment-only URL might also be interesting. Also a lowercase "link", a missing colon, no <> or URL, <> but no characters inside it, and a missing space after the colon. Also what about emojis, invalid surrogate pairs, and null characters in the middle of the URL? > > Added tests for CSP blocked, invalid port, fragment-only, lower-case link, > missing colon, missing spaces, missing URL, etc. > (Invalid port passes the isPreloaded() test, but I'm guessing that it's > being blocked further down the stack) > > I'm not sure how to add emojis, invalid surrogate pairs and null chars in > PHP... Any examples you can point me to? Now that I think about it, HTTP headers can only contain 8-bit values, so we can't have 16-bit characters in an HTTP header. EWS failure looks unrelated. I'll re-review this soon.
Comment on attachment 299128 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=299128&action=review > Source/WebCore/loader/LinkHeader.cpp:115 > +static bool validFieldEnd(CharacterType*& position, CharacterType * const end) extra space after CharacterType. I see 9 instances of this. I guess we don't have a lot of T* const in WebKit, but we should. It's more WebKit-style than T * const. > Source/WebCore/loader/LinkHeader.cpp:215 > + completeQuotes = false; This should be initialized by the caller. > Source/WebCore/loader/LinkHeader.cpp:262 > + if (hasQuotes) > + ++valueStart; > + if (completeQuotes) > + --valueEnd; There should be some checks or at least assertions to make sure we don't go out of bounds. We should be cautious with out bounds checks. What if there is only a single quote? (It would probably fail earlier, but still) if (valueStart < valueEnd && hasQuotes) ++valueStart; if (valueStart < valueEnd && completeQuotes) --valueEnd; > Source/WebCore/loader/LinkHeader.cpp:309 > + : m_isValid(true) This should be initialized in the header. bool m_isValid { true };
Comment on attachment 299128 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=299128&action=review >> Source/WebCore/loader/LinkHeader.cpp:115 >> +static bool validFieldEnd(CharacterType*& position, CharacterType * const end) > > extra space after CharacterType. I see 9 instances of this. > I guess we don't have a lot of T* const in WebKit, but we should. It's more WebKit-style than T * const. changed >> Source/WebCore/loader/LinkHeader.cpp:215 >> + completeQuotes = false; > > This should be initialized by the caller. moved initialization to the caller >> Source/WebCore/loader/LinkHeader.cpp:262 >> + --valueEnd; > > There should be some checks or at least assertions to make sure we don't go out of bounds. We should be cautious with out bounds checks. What if there is only a single quote? (It would probably fail earlier, but still) > > if (valueStart < valueEnd && hasQuotes) > ++valueStart; > if (valueStart < valueEnd && completeQuotes) > --valueEnd; Added ASSERT as well as a single quote test case >> Source/WebCore/loader/LinkHeader.cpp:309 >> + : m_isValid(true) > > This should be initialized in the header. > bool m_isValid { true }; done
Created attachment 299178 [details] Patch
Comment on attachment 299178 [details] Patch Thanks for reviewing! :)
Comment on attachment 299178 [details] Patch Clearing flags on attachment: 299178 Committed r210914: <http://trac.webkit.org/changeset/210914>
All reviewed patches have been landed. Closing bug.
We should add some more tests like this: Link: <<url>> LiNk: <url> Link: <url>> etc.
Also duplicate key-value pairs in the list: Link: <../resources/test.mp4>; rel=preload; as=media; as=font
(In reply to comment #28) > We should add some more tests like this: > Link: <<url>> > LiNk: <url> > Link: <url>> > etc. Added to https://bugs.webkit.org/show_bug.cgi?id=167366
(In reply to comment #29) > Also duplicate key-value pairs in the list: > Link: <../resources/test.mp4>; rel=preload; as=media; as=font That test would be significantly easier to add once https://bugs.webkit.org/show_bug.cgi?id=159678 lands