Summary: parseURL should use URL constructor to better handle all kinds of URLs Currently our parseURL URL Utility function returns an object with these properties: (in error cases they are null) scheme, host, port, path, queryString, fragment, lastPathComponent The URL constructor, when passed without a base URL, returns an object with these properties: (in error cases this throws an exception) protocol, username, password, host, hostname, port, origin, pathname, search, hash (and searchParams special object) We should convert over to using the URL constructor to avoid our own logic which may not (and does not always) match WebKit's URL parsing. Notes: - See LayoutTests/inspector/unit-tests/url-utilities.html for some examples
This affects parseURL and absoluteURL and maybe other Utilities. Note we have some special cases for parseDataURL which may still need to exist.
Created attachment 376516 [details] Patch This has one major difference, which is that "http://example.com" WILL have a path of "/" (it's implicit).
Comment on attachment 376516 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=376516&action=review I like the idea of this. > Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:110 > + if (url.startsWith("data:")) { > + result.scheme = "data"; > + return result; Won't path be empty after this change but not before this change? > Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:117 > + match = url.match(/^(?<protocol>\w+:)\/\/(?:(?<username>[^:@]+)(?::(?<password>[^@]+))?@)?(?<hostname>[^/#:?]+)?(?::(?<port>[\d]+))?(?<pathname>\/[^#?]*)?(?<search>\?[^#]*)?(?<hash>#.*)?$/i); Can we not? If a URL is invalid, it's invalid. > Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:142 > + result.host = match.hostname.toLowerCase(); This shouldn't be needed any more. The URLParser does this for you. > Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:153 > + result.path = resolveDotsInPath(match.pathname); The URLParser resolves dots in the path for you. You shouldn't need this any more.
Created attachment 376540 [details] Patch
Created attachment 376541 [details] Patch
Comment on attachment 376541 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=376541&action=review > LayoutTests/inspector/unit-tests/url-utilities-expected.txt:40 > +PASS: port should be: 'null' > +PASS: origin should be: 'http://example.com' So the default port gets stripped? We should have a test right next to this with a non-default port then. Like: "http://example.com:42" > LayoutTests/inspector/unit-tests/url-utilities-expected.txt:219 > +Test Invalid: http://user@pass:example.com/ This is invalid? Then there don't appear to be any tests that cover user:pass now.
> > Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:153 > > + result.path = resolveDotsInPath(match.pathname); > > The URLParser resolves dots in the path for you. You shouldn't need this > any more. I wonder if this will cause us problems anywhere. Particularly when generating folder paths related to "Group by Path" or Source Maps.
Comment on attachment 376541 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=376541&action=review > Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:109 > + if (url && url.startsWith("data:")) { > + result.scheme = "data"; > + return result; I still don't like this. > Source/WebInspectorUI/UserInterface/Base/URLUtilities.js:123 > + if (parsed.password) What about http://:password@host/? This should probably be outside its containing if statement, and you should probably add that test. >> LayoutTests/inspector/unit-tests/url-utilities-expected.txt:219 >> +Test Invalid: http://user@pass:example.com/ > > This is invalid? Then there don't appear to be any tests that cover user:pass now. This is indeed invalid. According to URL parsing, pass is the host and example.com is the port, which is nonnumeric. This shows a benefit of using a real URL parser instead of a regular expression.
> >> LayoutTests/inspector/unit-tests/url-utilities-expected.txt:219 > >> +Test Invalid: http://user@pass:example.com/ > > > > This is invalid? Then there don't appear to be any tests that cover user:pass now. > > This is indeed invalid. According to URL parsing, pass is the host and > example.com is the port, which is nonnumeric. This shows a benefit of using > a real URL parser instead of a regular expression. Awesome, but then we still need tests to cover the `user:pass@host` form.
(In reply to Joseph Pecoraro from comment #9) > > >> LayoutTests/inspector/unit-tests/url-utilities-expected.txt:219 > > >> +Test Invalid: http://user@pass:example.com/ > > > > > > This is invalid? Then there don't appear to be any tests that cover user:pass now. > > > > This is indeed invalid. According to URL parsing, pass is the host and example.com is the port, which is nonnumeric. This shows a benefit of using a real URL parser instead of a regular expression. > > Awesome, but then we still need tests to cover the `user:pass@host` form. We do already have a test for `http://user:pass@example.com/` (both `parseURL` and `WI.displayNameForURL`). We could add more, but frankly at this point it seems a bit "overkill" given that we're basically just renaming properties of `URL` to match an existing format. It's not like we're doing anything overly special/unique to the result of `new URL`, so in reality the tests for `parseURL` are more testing `new URL` than `parseURL`.
> We could add more, but frankly at this point it seems a bit "overkill" given > that we're basically just renaming properties of `URL` to match an existing > format. It's not like we're doing anything overly special/unique to the > result of `new URL`, so in reality the tests for `parseURL` are more testing > `new URL` than `parseURL`. The intent is to see what our API produces. If our API was incorrectly handling things then we'd want to know what parts of our code-base might have relied on incorrect inputs. Fortunately we seem to have done pretty well!
Comment on attachment 376541 [details] Patch r=me though there are calls for a few small additional tests (like the case Alex points out, and a port).
Created attachment 376744 [details] Patch
Created attachment 376746 [details] Patch
Comment on attachment 376746 [details] Patch Clearing flags on attachment: 376746 Committed r248895: <https://trac.webkit.org/changeset/248895>
All reviewed patches have been landed. Closing bug.
<rdar://problem/54500957>