Bug 137718 - URLs that start with http:/// and https:/// lose two slashes when parsed, causing assertion failure and inconsistent behavior
Summary: URLs that start with http:/// and https:/// lose two slashes when parsed, cau...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Platform (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: mitz
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2014-10-14 14:30 PDT by mitz
Modified: 2014-10-14 16:59 PDT (History)
1 user (show)

See Also:


Attachments
Ignore additional slashes immediately after the :// (10.21 KB, patch)
2014-10-14 14:40 PDT, mitz
ap: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description mitz 2014-10-14 14:30:52 PDT
<rdar://problem/18029894>

Given a URL that starts with http:/// or https:///, URL::parse() removes exactly two slashes from the URL. There are a couple of undesirable side effects to this:
1. It makes parsing non-idempotent for such URL with four or more slashes, which in turn causes an assertion failure in the already-parsed URL constructor when trying to follow <a href="http:////webkit.org">
2. If the URL has sufficiently many slashes after the colon, then even after repeated reparsing it will contain more than two, and therefore CFNetwork will consider its host to be localhost, which is incorrect
Comment 1 mitz 2014-10-14 14:40:58 PDT
Created attachment 239824 [details]
Ignore additional slashes immediately after the ://
Comment 2 Alexey Proskuryakov 2014-10-14 15:14:42 PDT
Comment on attachment 239824 [details]
Ignore additional slashes immediately after the ://

View in context: https://bugs.webkit.org/attachment.cgi?id=239824&action=review

The code change looks right to me, yet some of the test changes are surprising.

> LayoutTests/fast/loader/url-parse-1-expected.txt:27
>  http:/	http:/		/
>  http://	http:/		/
>  http:///	http:/		/
> -http:////	http://		//
> +http:////	http:/		/

These all look strange, something seems to be going wrong. Perhaps a test bug?

> LayoutTests/fast/url/host-expected.txt:36
> -FAIL canonicalize('http:////:@//') should be http:////. Was http://:@//.
> +FAIL canonicalize('http:////:@//') should be http:////. Was http:////:@//.

I don't like this change, we now produce a very ambiguous URL here.

> LayoutTests/fast/url/host-expected.txt:39
> +FAIL canonicalize('http:////asdf@//') should be http://asdf@//. Was http:////asdf@//.

Ditto.

> LayoutTests/fast/url/invalid-urls-utf8-expected.txt:6
>  FAIL src should be http:///. Was http:/.

One slash again. Why?

> LayoutTests/fast/url/invalid-urls-utf8-expected.txt:11
> -PASS src is expected
> -PASS src is expected
> -PASS src is expected
> -PASS src is expected
> +FAIL src should be ftp:///. Was ftp://.
> +FAIL src should be gopher:///. Was gopher://.
> +FAIL src should be ws:///. Was ws://.
> +FAIL src should be wss:///. Was wss://.

This test is strange, why is it expecting three slashes? Perhaps we should change the expectation.
Comment 3 mitz 2014-10-14 15:46:11 PDT
Comment on attachment 239824 [details]
Ignore additional slashes immediately after the ://

View in context: https://bugs.webkit.org/attachment.cgi?id=239824&action=review

>> LayoutTests/fast/loader/url-parse-1-expected.txt:27
>> +http:////	http:/		/
> 
> These all look strange, something seems to be going wrong. Perhaps a test bug?

The URL spec calls for parsing to fail on all of the above. I’m not sure what’s the best thing for URL::parse() to do here, but the http://// case seems to have progressed in that now it won’t change when reparsed.

>> LayoutTests/fast/url/host-expected.txt:36
>> +FAIL canonicalize('http:////:@//') should be http:////. Was http:////:@//.
> 
> I don't like this change, we now produce a very ambiguous URL here.

We now begin the search for authority at the second colon. We conclude that the host and port are empty but credentials are present and invalidate the URL, therefore it’s not rewritten.

>> LayoutTests/fast/url/host-expected.txt:39
>> +FAIL canonicalize('http:////asdf@//') should be http://asdf@//. Was http:////asdf@//.
> 
> Ditto.

This one is also considered invalid now.
Comment 4 mitz 2014-10-14 16:59:12 PDT
Fixed in <http://trac.webkit.org/r174712>.