Bug 165155

Summary: Web Inspector: Use URL constructor to better handle all kinds of URLs
Product: WebKit Reporter: Joseph Pecoraro <joepeck>
Component: Web InspectorAssignee: Devin Rousso <hi>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, ap, commit-queue, hi, inspector-bugzilla-changes, joepeck, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: All   
OS: All   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch
none
Patch none

Description Joseph Pecoraro 2016-11-29 13:44:32 PST
Summary:
parseURL should use URL constructor to better handle all kinds of URLs

Currently our parseURL URL Utility function returns an object with these properties: (in error cases they are null)

    scheme, host, port, path, queryString, fragment, lastPathComponent

The URL constructor, when passed without a base URL, returns an object with these properties: (in error cases this throws an exception)

    protocol, username, password, host, hostname, port, origin, pathname, search, hash (and searchParams special object)

We should convert over to using the URL constructor to avoid our own logic which may not (and does not always) match WebKit's URL parsing.

Notes:
- See LayoutTests/inspector/unit-tests/url-utilities.html for some examples
Comment 1 Joseph Pecoraro 2016-11-29 13:45:25 PST
This affects parseURL and absoluteURL and maybe other Utilities.

Note we have some special cases for parseDataURL which may still need to exist.
Comment 2 Devin Rousso 2019-08-16 12:31:48 PDT
Created attachment 376516 [details]
Patch

This has one major difference, which is that "http://example.com" WILL have a path of "/" (it's implicit).
Comment 3 Alex Christensen 2019-08-16 12:43:11 PDT
Comment on attachment 376516 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=376516&action=review

I like the idea of this.

> Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:110
> +    if (url.startsWith("data:")) {
> +        result.scheme = "data";
> +        return result;

Won't path be empty after this change but not before this change?

> Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:117
> +        match = url.match(/^(?<protocol>\w+:)\/\/(?:(?<username>[^:@]+)(?::(?<password>[^@]+))?@)?(?<hostname>[^/#:?]+)?(?::(?<port>[\d]+))?(?<pathname>\/[^#?]*)?(?<search>\?[^#]*)?(?<hash>#.*)?$/i);

Can we not?  If a URL is invalid, it's invalid.

> Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:142
> +        result.host = match.hostname.toLowerCase();

This shouldn't be needed any more.  The URLParser does this for you.

> Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:153
> +        result.path = resolveDotsInPath(match.pathname);

The URLParser resolves dots in the path for you.  You shouldn't need this any more.
Comment 4 Devin Rousso 2019-08-16 14:18:30 PDT
Created attachment 376540 [details]
Patch
Comment 5 Devin Rousso 2019-08-16 14:20:19 PDT
Created attachment 376541 [details]
Patch
Comment 6 Joseph Pecoraro 2019-08-16 15:31:34 PDT
Comment on attachment 376541 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=376541&action=review

> LayoutTests/inspector/unit-tests/url-utilities-expected.txt:40
> +PASS: port should be: 'null'
> +PASS: origin should be: 'http://example.com'

So the default port gets stripped?

We should have a test right next to this with a non-default port then. Like:

    "http://example.com:42"

> LayoutTests/inspector/unit-tests/url-utilities-expected.txt:219
> +Test Invalid: http://user@pass:example.com/

This is invalid? Then there don't appear to be any tests that cover user:pass now.
Comment 7 Joseph Pecoraro 2019-08-16 15:33:09 PDT
> > Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:153
> > +        result.path = resolveDotsInPath(match.pathname);
> 
> The URLParser resolves dots in the path for you.  You shouldn't need this
> any more.

I wonder if this will cause us problems anywhere. Particularly when generating folder paths related to "Group by Path" or Source Maps.
Comment 8 Alex Christensen 2019-08-16 17:24:14 PDT
Comment on attachment 376541 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=376541&action=review

> Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:109
> +    if (url && url.startsWith("data:")) {
> +        result.scheme = "data";
> +        return result;

I still don't like this.

> Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:123
> +        if (parsed.password)

What about http://:password@host/?  This should probably be outside its containing if statement, and you should probably add that test.

>> LayoutTests/inspector/unit-tests/url-utilities-expected.txt:219
>> +Test Invalid: http://user@pass:example.com/
> 
> This is invalid? Then there don't appear to be any tests that cover user:pass now.

This is indeed invalid.  According to URL parsing, pass is the host and example.com is the port, which is nonnumeric.  This shows a benefit of using a real URL parser instead of a regular expression.
Comment 9 Joseph Pecoraro 2019-08-19 17:55:14 PDT
> >> LayoutTests/inspector/unit-tests/url-utilities-expected.txt:219
> >> +Test Invalid: http://user@pass:example.com/
> > 
> > This is invalid? Then there don't appear to be any tests that cover user:pass now.
> 
> This is indeed invalid.  According to URL parsing, pass is the host and
> example.com is the port, which is nonnumeric.  This shows a benefit of using
> a real URL parser instead of a regular expression.

Awesome, but then we still need tests to cover the `user:pass@host` form.
Comment 10 Devin Rousso 2019-08-19 21:45:45 PDT
(In reply to Joseph Pecoraro from comment #9)
> > >> LayoutTests/inspector/unit-tests/url-utilities-expected.txt:219
> > >> +Test Invalid: http://user@pass:example.com/
> > > 
> > > This is invalid? Then there don't appear to be any tests that cover user:pass now.
> > 
> > This is indeed invalid.  According to URL parsing, pass is the host and example.com is the port, which is nonnumeric.  This shows a benefit of using a real URL parser instead of a regular expression.
> 
> Awesome, but then we still need tests to cover the `user:pass@host` form.
We do already have a test for `http://user:pass@example.com/` (both `parseURL` and `WI.displayNameForURL`).

We could add more, but frankly at this point it seems a bit "overkill" given that we're basically just renaming properties of `URL` to match an existing format.  It's not like we're doing anything overly special/unique to the result of `new URL`, so in reality the tests for `parseURL` are more testing `new URL` than `parseURL`.
Comment 11 Joseph Pecoraro 2019-08-19 22:04:59 PDT
> We could add more, but frankly at this point it seems a bit "overkill" given
> that we're basically just renaming properties of `URL` to match an existing
> format.  It's not like we're doing anything overly special/unique to the
> result of `new URL`, so in reality the tests for `parseURL` are more testing
> `new URL` than `parseURL`.

The intent is to see what our API produces. If our API was incorrectly handling things then we'd want to know what parts of our code-base might have relied on incorrect inputs. Fortunately we seem to have done pretty well!
Comment 12 Joseph Pecoraro 2019-08-19 22:07:00 PDT
Comment on attachment 376541 [details]
Patch

r=me though there are calls for a few small additional tests (like the case Alex points out, and a port).
Comment 13 Devin Rousso 2019-08-19 22:08:26 PDT
Created attachment 376744 [details]
Patch
Comment 14 Devin Rousso 2019-08-19 22:11:11 PDT
Created attachment 376746 [details]
Patch
Comment 15 WebKit Commit Bot 2019-08-20 00:04:09 PDT
Comment on attachment 376746 [details]
Patch

Clearing flags on attachment: 376746

Committed r248895: <https://trac.webkit.org/changeset/248895>
Comment 16 WebKit Commit Bot 2019-08-20 00:04:10 PDT
All reviewed patches have been landed.  Closing bug.
Comment 17 Radar WebKit Bug Importer 2019-08-20 00:05:19 PDT
<rdar://problem/54500957>