Bug 161364

Summary: URLParser should handle relative URLs that start with //
Product: WebKit Reporter: Alex Christensen <achristensen>
Component: New BugsAssignee: Alex Christensen <achristensen>
Status: RESOLVED FIXED    
Severity: Normal CC: annevk
Priority: P2    
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch darin: review+

Description Alex Christensen 2016-08-29 17:17:24 PDT
URLParser should handle relative URLs that start with //
Comment 1 Alex Christensen 2016-08-29 17:24:37 PDT
Created attachment 287357 [details]
Patch
Comment 2 Alex Christensen 2016-08-29 17:25:32 PDT
An unspecified difference in URL behavior can be seen by executing this in all browsers:
alert(new URL("//", "https://example.com/path1/path2/index.html"))
Comment 3 Alex Christensen 2016-08-29 17:27:16 PDT
Comment on attachment 287357 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=287357&action=review

> Source/WebCore/platform/URLParser.cpp:300
> +                copyURLPartsUntil(base, URLPart::SchemeEnd);

This is not in the spec, but it needs to be.  Such URLs are quite common.

> Source/WebCore/platform/URLParser.cpp:301
> +                m_buffer.append("://");

I'm not sure if this is "correct" behavior, or if "correct" behavior exists.
Comment 4 Alex Christensen 2016-08-29 21:08:04 PDT
The web platform tests have a similar test for file URLs:
    "input": "//",
    "base": "file:///tmp/mock/path",
    ...
File URLs have special states in the state machine, but there are no such tests for non-file URLs.
Comment 5 Anne van Kesteren 2016-08-30 01:06:54 PDT
Going through the state machine of https://url.spec.whatwg.org/#concept-basic-url-parser I get:

* https://url.spec.whatwg.org/#scheme-start-state "/" -> no scheme state 
* https://url.spec.whatwg.org/#no-scheme-state "/" -> relative state
* https://url.spec.whatwg.org/#relative-state "/" consumed -> relative slash state
* https://url.spec.whatwg.org/#relative-slash-state "/" consumed -> special authority ignore slashes state
* https://url.spec.whatwg.org/#special-authority-ignore-slashes-state EOF -> authority state
* https://url.spec.whatwg.org/#authority-state EOF -> host state
* https://url.spec.whatwg.org/#host-state -> EOF and /url/ is special -> failure

This seems to match Firefox, which throws.
Comment 6 Alex Christensen 2016-08-30 12:26:02 PDT
I think we need to change the spec in this case.  Copying the base url's scheme is not in the spec, but it's needed to successfully search anything on google.com.

By the way, Anne, I'm collecting a bunch of feedback on the url spec.  I'll let you know once I have a parser working well enough to be more sure about my requested changes.  Consider this the first.
Comment 7 Alex Christensen 2016-08-30 12:36:57 PDT
I'm going to go ahead and assume the spec will be changed to match my code and land this so I can move ahead with finding other problems.  If it is changed differently, I'll change my new URLParser, which isn't used by default yet.
http://trac.webkit.org/changeset/205194
Comment 8 Alex Christensen 2016-08-30 18:52:03 PDT
(In reply to comment #7)
> I'm going to go ahead and assume the spec will be changed to match my code
> and land this so I can move ahead with finding other problems.  If it is
> changed differently, I'll change my new URLParser, which isn't used by
> default yet.
This could be interpreted in a way that is antagonizing.  This isn't in harmony with the cooperative nature of open web standards, and that was not the intent.  My work so far is experimental, and I hope to contribute positively towards tests, standards, and implementations.
Comment 9 Anne van Kesteren 2016-08-31 10:30:12 PDT
(In reply to comment #6)
> I think we need to change the spec in this case.  Copying the base url's
> scheme is not in the spec, but it's needed to successfully search anything
> on google.com.

That happens first thing in https://url.spec.whatwg.org/#relative-state, no?


> By the way, Anne, I'm collecting a bunch of feedback on the url spec.  I'll
> let you know once I have a parser working well enough to be more sure about
> my requested changes.  Consider this the first.

Sounds great, looking forward to it. Please do file them as issues at https://github.com/whatwg/url/issues/new so they don't get lost. Really pleased to see some work in this area.

Note https://bugzilla.mozilla.org/show_bug.cgi?id=1064700#c7 for changes Firefox is considering relative to the specification, that would move Firefox away from the specification & Safari, but closer to Chrome & Edge. The change seems relatively sensible (more normalization) so I'm hoping you all would be on board with that too.
Comment 10 Alex Christensen 2016-08-31 14:27:17 PDT
(In reply to comment #9)
> (In reply to comment #6)
> > I think we need to change the spec in this case.  Copying the base url's
> > scheme is not in the spec, but it's needed to successfully search anything
> > on google.com.
> 
> That happens first thing in https://url.spec.whatwg.org/#relative-state, no?
Yep, I had overlooked that and done this in a different place, which had a similar effect for the few test cases I'm using so far.
> 
> 
> > By the way, Anne, I'm collecting a bunch of feedback on the url spec.  I'll
> > let you know once I have a parser working well enough to be more sure about
> > my requested changes.  Consider this the first.
> 
> Sounds great, looking forward to it. Please do file them as issues at
> https://github.com/whatwg/url/issues/new so they don't get lost. Really
> pleased to see some work in this area.
> 
> Note https://bugzilla.mozilla.org/show_bug.cgi?id=1064700#c7 for changes
> Firefox is considering relative to the specification, that would move
> Firefox away from the specification & Safari, but closer to Chrome & Edge.
> The change seems relatively sensible (more normalization) so I'm hoping you
> all would be on board with that too.
I think it's a little too early for me to respond to that Firefox change right now.